|
|
#12 |
|
Aug 2002
2×3×53 Posts |
Will v23.1 help any machines other than the Celeron and P4?
Does it only help with LL testing? Does it only help with certain FFT lengths? |
|
|
|
|
|
#13 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
1D66₁₆ Posts |
Quote:
P4 Celeron - 640K to 2M FFTs
P4 Northwood - 1280K to 4M FFTs

Why would any other situations be faster? Well, remember in the IRC chat I mentioned the weird problem where the debug version was faster than the non-debug version? It turns out this is because the debug version filled memory with 0xCD. Doing so walked through the pages linearly, which makes it more likely that the VM manager will allocate them linearly in physical memory. Prime95's FFT assembly code is optimized for this situation.

This is especially important in the FFTs where close to all of the L2 cache was being utilized. If pages are not in contiguous physical memory, then some page reads will force other pages to be kicked out of the cache. What I did was add a call to memset right after allocating memory so that the non-debug version behaves just like the debug version.

Which FFTs were "close to all of the L2 cache being utilized" and thus seeing the biggest benefit? The P4 Celeron - 896K and 1M. The P4 Willamette - 1792K and 2M. There should be some improvement in many FFT sizes, but it will not be very noticeable for most.

I've no data on Athlon, P3, etc. as to whether the memset fix will make a big impact on any of their FFTs. We know that timings vary from run to run, so it may be hard to pinpoint. Please post any CPU/FFT combinations where you consistently see an improvement of 3% or more in v23 - others will be interested. |
|
|
|
|
|
|
#14 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
2·53·71 Posts |
OK, try the latest v23.1 to see if it detects the L2 cache size correctly.
|
|
|
|
|
|
#15 |
|
Aug 2002
2×3×53 Posts |
23.1 helps my P4s do their 33Ms.
About 3%. |
|
|
|
|
|
#16 |
|
Jan 2003
2×3 Posts |
Here are your results:
Intel(R) Celeron(R) CPU 2.00GHz
CPU speed: 2425.00 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: unknown
L1 cache line size: 64 bytes
L2 cache line size: unknown
TLBS: 64
Prime95 version 22.13, RdtscTiming=1
Best time for 256K FFT length: 13.283 ms.
Best time for 320K FFT length: 20.281 ms.
Best time for 384K FFT length: 23.058 ms.
Best time for 448K FFT length: 28.685 ms.
Best time for 512K FFT length: 35.766 ms.
Best time for 640K FFT length: 68.557 ms.
Best time for 768K FFT length: 103.835 ms.
Best time for 896K FFT length: 126.564 ms.
Best time for 1024K FFT length: 148.212 ms.
Best time for 1280K FFT length: 231.566 ms.
Best time for 1536K FFT length: 279.798 ms.
Best time for 1792K FFT length: 342.215 ms.

[Thu Jan 30 16:33:01 2003]
Intel(R) Celeron(R) CPU 2.00GHz
CPU speed: 2425.14 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 128 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.1, RdtscTiming=1
Best time for 384K FFT length: 20.535 ms.
Best time for 448K FFT length: 25.709 ms.
Best time for 512K FFT length: 33.254 ms.
Best time for 640K FFT length: 42.347 ms.
Best time for 768K FFT length: 55.776 ms.
Best time for 896K FFT length: 70.974 ms.
Best time for 1024K FFT length: 83.636 ms.
Best time for 1280K FFT length: 96.240 ms.
Best time for 1536K FFT length: 120.434 ms.
Best time for 1792K FFT length: 164.635 ms.
Best time for 2048K FFT length: 178.784 ms.

[Thu Jan 30 16:35:09 2003]
Intel(R) Celeron(R) CPU 2.00GHz
CPU speed: 2017.71 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 128 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.1, RdtscTiming=1
Best time for 384K FFT length: 24.646 ms.
Best time for 448K FFT length: 30.928 ms.
Best time for 512K FFT length: 39.963 ms.
Best time for 640K FFT length: 50.842 ms.
Best time for 768K FFT length: 67.195 ms.
Best time for 896K FFT length: 85.514 ms.
Best time for 1024K FFT length: 100.128 ms.
Best time for 1280K FFT length: 115.591 ms.
Best time for 1536K FFT length: 146.711 ms.
Best time for 1792K FFT length: 196.979 ms.
Best time for 2048K FFT length: 215.330 ms.

[Thu Jan 30 16:41:09 2003]
Intel(R) Celeron(R) CPU 2.00GHz
CPU speed: 2018.10 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: unknown
L1 cache line size: 64 bytes
L2 cache line size: unknown
TLBS: 64
Prime95 version 22.13, RdtscTiming=1
Best time for 256K FFT length: 15.998 ms.
Best time for 320K FFT length: 24.348 ms.
Best time for 384K FFT length: 27.829 ms.
Best time for 448K FFT length: 34.437 ms.
Best time for 512K FFT length: 43.061 ms.
Best time for 640K FFT length: 82.321 ms.
Best time for 768K FFT length: 124.978 ms.
Best time for 896K FFT length: 152.649 ms.
Best time for 1024K FFT length: 178.291 ms.
Best time for 1280K FFT length: 277.927 ms.
Best time for 1536K FFT length: 336.477 ms.
Best time for 1792K FFT length: 410.994 ms. |
|
|
|
|
|
#17 |
|
Aug 2002
223 Posts |
Intel(R) Pentium(R) 4 Mobile CPU 1.60GHz
CPU speed: 1199.02 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 22.12, RdtscTiming=1
Best time for 256K FFT length: 15.023 ms.
Best time for 320K FFT length: 19.940 ms.
Best time for 384K FFT length: 24.177 ms.
Best time for 448K FFT length: 28.870 ms.
Best time for 512K FFT length: 32.792 ms.
Best time for 640K FFT length: 42.319 ms.
Best time for 768K FFT length: 51.536 ms.
Best time for 896K FFT length: 63.530 ms.
Best time for 1024K FFT length: 69.550 ms.
Best time for 1280K FFT length: 99.515 ms.
Best time for 1536K FFT length: 123.621 ms.
Best time for 1792K FFT length: 153.544 ms.

Intel(R) Pentium(R) 4 Mobile CPU 1.60GHz
CPU speed: 1199.07 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.1, RdtscTiming=1
Best time for 384K FFT length: 24.114 ms.
Best time for 448K FFT length: 28.829 ms.
Best time for 512K FFT length: 32.651 ms.
Best time for 640K FFT length: 42.160 ms.
Best time for 768K FFT length: 51.535 ms.
Best time for 896K FFT length: 63.071 ms.
Best time for 1024K FFT length: 68.382 ms.
Best time for 1280K FFT length: 91.864 ms.
Best time for 1536K FFT length: 111.431 ms.
Best time for 1792K FFT length: 138.291 ms.
Best time for 2048K FFT length: 148.697 ms. |
|
|
|
|
|
#18 |
|
Aug 2002
223 Posts |
Intel(R) Xeon(TM) CPU 2.80GHz
CPU speed: 2787.36 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 22.12, RdtscTiming=1
Best time for 256K FFT length: 9.437 ms.
Best time for 320K FFT length: 12.028 ms.
Best time for 384K FFT length: 14.538 ms.
Best time for 448K FFT length: 17.354 ms.
Best time for 512K FFT length: 19.651 ms.
Best time for 640K FFT length: 25.286 ms.
Best time for 768K FFT length: 30.692 ms.
Best time for 896K FFT length: 37.598 ms.
Best time for 1024K FFT length: 40.882 ms.
Best time for 1280K FFT length: 57.051 ms.
Best time for 1536K FFT length: 72.264 ms.
Best time for 1792K FFT length: 89.022 ms.

Intel(R) Xeon(TM) CPU 2.80GHz
CPU speed: 2787.37 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.1, RdtscTiming=1
Best time for 384K FFT length: 14.415 ms.
Best time for 448K FFT length: 17.315 ms.
Best time for 512K FFT length: 19.521 ms.
Best time for 640K FFT length: 25.115 ms.
Best time for 768K FFT length: 30.626 ms.
Best time for 896K FFT length: 37.338 ms.
Best time for 1024K FFT length: 40.576 ms.
Best time for 1280K FFT length: 53.926 ms.
Best time for 1536K FFT length: 65.628 ms.
Best time for 1792K FFT length: 81.447 ms.
Best time for 2048K FFT length: 87.222 ms. |
|
|
|