New Centrino - Benchmark Seems Too Low
Hey Everybody,
I'm new to the board. I just got a new Dell 600m Centrino laptop with the 1.3 GHz processor. I was thinking that because of its huge L2 cache and supposedly high IPC it would do very well at the Prime95 benchmark. These are the numbers I got:
-------------------------------------------------
[size=1]Intel(R) Pentium(R) M processor 1300MHz
CPU speed: 1295.75 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: unknown
L2 cache size: unknown
L1 cache line size: unknown
L2 cache line size: unknown
Prime95 version 23.2, RdtscTiming=1
Best time for 384K FFT length: 47.045 ms.
Best time for 448K FFT length: 56.487 ms.
Best time for 512K FFT length: 63.443 ms.
Best time for 640K FFT length: 81.330 ms.
Best time for 768K FFT length: 99.770 ms.
Best time for 896K FFT length: 118.425 ms.
Best time for 1024K FFT length: 132.536 ms.
Best time for 1280K FFT length: 176.199 ms.
Best time for 1536K FFT length: 214.005 ms.
Best time for 1792K FFT length: 257.252 ms.
Best time for 2048K FFT length: 285.545 ms.[/size]
-------------------------------------------------------
Computer Specs:
Dell Inspiron 600m
384MB RAM
1.3 GHz processor
I put the Power Setting on "Always On", which I believe keeps the processor running at full speed. Do these numbers seem really low to anyone else?
/Mike King
[i]Mod edit: Removed a few hundred lines of 3DMark stuff... :) RESULTS 3DMark Score 4956[/i] |
My loser Celeron for comparison...
[code]Mobile Intel(R) Celeron(R) CPU 1.50GHz
CPU speed: 1495.62 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.2, RdtscTiming=1
Best time for 384K FFT length: 26.221 ms.
Best time for 448K FFT length: 30.991 ms.
Best time for 512K FFT length: 35.972 ms.
Best time for 640K FFT length: 46.071 ms.
Best time for 768K FFT length: 57.197 ms.
Best time for 896K FFT length: 72.777 ms.
Best time for 1024K FFT length: 83.713 ms.
Best time for 1280K FFT length: 107.117 ms.
Best time for 1536K FFT length: 132.089 ms.
Best time for 1792K FFT length: 166.544 ms.
Best time for 2048K FFT length: 173.676 ms.[/code]
Mike and I have been working on this for several days... His Pentium-M has 1MB L2 cache and SSE2... I don't think it is in SpeedStep mode... WCPUID (realtime frequency option on) shows 1.3GHz... On this particular CPU SpeedStep is 600MHz... Note that his cache size and cache lines are not reported... That is the only thing I can think of... I know this particular CPU has 32K data L1 and 32K instruction L1 and 1024K L2... |
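For a rough clock-for-clock comparison of the two machines, the 384K timings can be converted into CPU cycles per iteration. This is only a back-of-the-envelope sketch: the times and clock speeds are copied from the two benchmark listings, while the helper name `cycles_per_iteration` is made up for illustration and is not part of Prime95.

```python
# Numbers copied from the two benchmark listings (384K FFT line).
PENTIUM_M = {"mhz": 1295.75, "ms_384k": 47.045}
CELERON = {"mhz": 1495.62, "ms_384k": 26.221}


def cycles_per_iteration(time_ms: float, clock_mhz: float) -> float:
    """Convert a wall-clock time in ms into CPU clock cycles (hypothetical helper)."""
    return time_ms * 1e-3 * clock_mhz * 1e6


pm_cycles = cycles_per_iteration(PENTIUM_M["ms_384k"], PENTIUM_M["mhz"])
cel_cycles = cycles_per_iteration(CELERON["ms_384k"], CELERON["mhz"])

# A ratio above 1 means the Pentium-M spends more of that quantity
# than the Celeron on the same FFT length.
time_ratio = PENTIUM_M["ms_384k"] / CELERON["ms_384k"]
cycle_ratio = pm_cycles / cel_cycles

print(f"Pentium-M: {pm_cycles / 1e6:.1f} M cycles per iteration")
print(f"Celeron:   {cel_cycles / 1e6:.1f} M cycles per iteration")
print(f"wall-time ratio: {time_ratio:.2f}, cycle ratio: {cycle_ratio:.2f}")
```

So even after allowing for the lower clock, the Pentium-M is using roughly 1.55 times as many cycles as the Celeron on this FFT length, which is why the result looks wrong rather than just slow.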
I've just looked at Intel's cpuid document. It looks like the Centrino has a 64-byte L2 cache line NOT SECTORED. If so, a prefetch only reads 64 bytes, not 128. I'll need to double the number of prefetch instructions - ugh.
|
What are cache lines and what do they do?
|
I didn't know the Centrino processor has SSE2 :oops: :oops:
Not a word about its SSE2 support on Intel's own consumer site. |
[quote]I didn't know the Centrino processor has SSE2 :oops: :oops:
Not a word about its SSE2 support on Intel's own consumer site.[/quote]
From [url]http://www.intel.com/products/mobiletechnology/performance.htm[/url]
[quote]Intel Centrino mobile technology also features advanced instruction prediction to eliminate CPU process replication, and second-generation Streaming SIMD Extensions (Streaming SIMD Extensions 2) with instructions integrated into the software to enhance performance.[/quote] |
[quote]What are cache lines and what do they do?[/quote]
They are part of the explanation of the L2 cache layout. In a P4, when you read a byte from main memory, 128 bytes are read in, because if you read one byte you'll often be reading nearby bytes too. This 128-byte block is a cache line, and 128 bytes is the cache line size. In the Pentium-M they've changed the L2 cache line size to 64 bytes. Most programs couldn't care less about the cache line size, but prime95 is the exception because it tries so hard to manage cache usage. Prime95's SSE2 code prefetches every 128th byte, thus reading in every cache line it will use ahead of time. On the Pentium-M, since the prefetch instruction reads in only 64 bytes, the other 64 bytes are not prefetched and will incur a roughly 300 clock penalty when accessed. This does not adequately explain the wretched performance of the Pentium-M though. |
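To make the stride-versus-line-size point concrete, here is a toy model of which cache lines a "prefetch every Nth byte" loop actually touches. It is a sketch only: it assumes a buffer that starts on a line boundary and is a whole number of lines long, and the function name `uncovered_lines` is invented for this example. It is not how prime95's assembly is written.

```python
def uncovered_lines(buffer_bytes: int, prefetch_stride: int, line_size: int) -> int:
    """Count cache lines in the buffer that no prefetch touches.

    Assumes the buffer starts at a line boundary and buffer_bytes is a
    multiple of line_size (illustrative simplification).
    """
    total_lines = buffer_bytes // line_size
    # Each prefetch at byte offset `addr` pulls in the one line containing it.
    prefetched = {addr // line_size for addr in range(0, buffer_bytes, prefetch_stride)}
    return total_lines - len(prefetched)


BUFFER = 1024  # bytes, arbitrary small example

# P4-style: 128-byte lines, prefetch every 128th byte -> every line is covered.
print(uncovered_lines(BUFFER, prefetch_stride=128, line_size=128))

# Pentium-M: 64-byte lines, same 128-byte stride -> every other line is skipped.
print(uncovered_lines(BUFFER, prefetch_stride=128, line_size=64))

# Pentium-M with twice as many prefetches (64-byte stride) -> covered again.
print(uncovered_lines(BUFFER, prefetch_stride=64, line_size=64))
```

With 64-byte lines and the old 128-byte stride, half of the lines are never prefetched, and each of those is a candidate for the roughly 300 clock miss mentioned above. Halving the stride closes the gap at the cost of twice as many prefetch instructions.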
Does this help any?
http://www.anandtech.com/mobile/showdoc.html?i=1800&p=6 |
More links...
ftp://download.intel.com/design/mobile/datashts/25261201.pdf
http://developer.intel.com/design/mobile/specupdt/25266501.pdf |
[quote] http://www.mersenneforum.org/attachments/pdfs/pap.pap313.pdf[/quote]
Immediately striking is this erratum on page 11 of the above document:
[quote][b]Y3. RDTSC Instruction May Report the Wrong Time Stamp Counter Value[/b]
Problem: The Time Stamp Counter is a 64-bit counter that is read in two 32-bit chunks. The counter incorrectly advances and therefore the two chunks may go out of synchronization causing the Read Time Stamp Counter (RDTSC) instruction to report the wrong time stamp counter value.
Implication: This erratum may cause software to see the wrong representation of processor time and may result in unpredictable software operation.
Workaround: It is possible for BIOS to contain a workaround for this erratum.[/quote] |
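As a generic illustration of why two out-of-sync 32-bit halves matter for timing, the sketch below combines a low half read just before a carry with a high half read just after it. This is an assumed scenario chosen to show the arithmetic. It is not a model of the actual hardware fault described in the erratum, and the starting counter value is arbitrary.

```python
MASK32 = 0xFFFFFFFF
CLOCK_HZ = 1295.75e6  # CPU speed reported in the Pentium-M benchmark listing


def combine(high: int, low: int) -> int:
    """Rebuild a 64-bit counter value from its two 32-bit halves."""
    return (high << 32) | low


# Counter value at the moment the low half is sampled (arbitrary example).
before = 0x00000001_FFFFFFFF
low = before & MASK32

# Assumed for illustration: the counter ticks once before the high half is
# sampled, so the carry into the upper 32 bits has already happened.
after = before + 1
high = after >> 32

torn = combine(high, low)          # mismatched halves
error_ticks = torn - after         # how far off the reported value is
error_seconds = error_ticks / CLOCK_HZ

print(f"true value:     {after:#018x}")
print(f"reported value: {torn:#018x}")
print(f"error: {error_ticks} ticks, about {error_seconds:.2f} s at {CLOCK_HZ / 1e6:.2f} MHz")
```

In this scenario a single mismatched read is off by nearly 2^32 ticks, which is seconds rather than milliseconds at this clock speed, so any timing code built on RDTSC deltas would be suspect if the halves really can disagree.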
Mike King, email me. I have a hacked-up prime95 with more prefetches that I'd like you to benchmark.
|