mersenneforum.org  

Old 2012-02-17, 18:01   #34
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·311 Posts

Quote:
Originally Posted by Ralf Recker View Post
What is your mainboard type / BIOS version?
Asus P9X79 PRO, BIOS v0802

Quote:
Originally Posted by Ralf Recker View Post
If you post your CPUID (you can look it up with CPU-Z or a similar tool) George might be able to use the information to improve the CPU detection.
CPU-Z screenshot attached.

Quote:
Originally Posted by Zero View Post
JH, it would be interesting to see your results with HT disabled.
Faster:
Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
CPU speed: 4425.82 MHz, 6 cores
CPU features: Prefetch, MMX, SSE, SSE2, SSE4, AVX
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 12 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 27.3, RdtscTiming=1
Best time for 768K FFT length: 3.489 ms., avg: 3.518 ms.
Best time for 896K FFT length: 4.176 ms., avg: 4.653 ms.
Best time for 1024K FFT length: 4.750 ms., avg: 4.867 ms.
Best time for 1280K FFT length: 6.127 ms., avg: 6.862 ms.
Best time for 1536K FFT length: 7.589 ms., avg: 8.131 ms.
Best time for 1792K FFT length: 9.277 ms., avg: 9.828 ms.
Best time for 2048K FFT length: 10.422 ms., avg: 10.728 ms.
Best time for 2560K FFT length: 13.317 ms., avg: 14.179 ms.
Best time for 3072K FFT length: 16.633 ms., avg: 17.080 ms.
Best time for 3584K FFT length: 20.201 ms., avg: 20.543 ms.
Best time for 4096K FFT length: 22.846 ms., avg: 23.072 ms.
Best time for 5120K FFT length: 30.047 ms., avg: 31.448 ms.
Best time for 6144K FFT length: 36.142 ms., avg: 38.056 ms.
Best time for 7168K FFT length: 43.865 ms., avg: 44.438 ms.
Best time for 8192K FFT length: 50.945 ms., avg: 51.268 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 1.964 ms., avg: 2.017 ms.
Best time for 896K FFT length: 2.268 ms., avg: 2.521 ms.
Best time for 1024K FFT length: 2.570 ms., avg: 2.840 ms.
Best time for 1280K FFT length: 3.346 ms., avg: 3.861 ms.
Best time for 1536K FFT length: 4.126 ms., avg: 4.513 ms.
Best time for 1792K FFT length: 4.986 ms., avg: 5.269 ms.
Best time for 2048K FFT length: 5.590 ms., avg: 5.960 ms.
Best time for 2560K FFT length: 7.147 ms., avg: 7.247 ms.
Best time for 3072K FFT length: 8.904 ms., avg: 10.948 ms.
Best time for 3584K FFT length: 10.938 ms., avg: 11.665 ms.
Best time for 4096K FFT length: 12.223 ms., avg: 12.343 ms.
Best time for 5120K FFT length: 15.972 ms., avg: 17.223 ms.
Best time for 6144K FFT length: 19.084 ms., avg: 19.621 ms.
Best time for 7168K FFT length: 23.257 ms., avg: 23.906 ms.
Best time for 8192K FFT length: 27.104 ms., avg: 27.206 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 1.356 ms., avg: 1.397 ms.
Best time for 896K FFT length: 1.585 ms., avg: 1.637 ms.
Best time for 1024K FFT length: 1.781 ms., avg: 1.871 ms.
Best time for 1280K FFT length: 2.300 ms., avg: 2.585 ms.
Best time for 1536K FFT length: 2.898 ms., avg: 3.237 ms.
Best time for 1792K FFT length: 3.532 ms., avg: 4.807 ms.
Best time for 2048K FFT length: 3.944 ms., avg: 5.175 ms.
Best time for 2560K FFT length: 4.980 ms., avg: 8.678 ms.
Best time for 3072K FFT length: 6.210 ms., avg: 7.212 ms.
Best time for 3584K FFT length: 7.864 ms., avg: 8.434 ms.
Best time for 4096K FFT length: 8.591 ms., avg: 9.051 ms.
Best time for 5120K FFT length: 11.160 ms., avg: 12.002 ms.
Best time for 6144K FFT length: 13.168 ms., avg: 14.839 ms.
Best time for 7168K FFT length: 15.991 ms., avg: 16.532 ms.
Best time for 8192K FFT length: 18.811 ms., avg: 20.220 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 1.289 ms., avg: 1.306 ms.
Best time for 896K FFT length: 1.517 ms., avg: 1.603 ms.
Best time for 1024K FFT length: 1.697 ms., avg: 1.795 ms.
Best time for 1280K FFT length: 2.191 ms., avg: 2.825 ms.
Best time for 1536K FFT length: 2.660 ms., avg: 3.394 ms.
Best time for 1792K FFT length: 3.196 ms., avg: 3.266 ms.
Best time for 2048K FFT length: 3.531 ms., avg: 3.884 ms.
Best time for 2560K FFT length: 4.483 ms., avg: 4.571 ms.
Best time for 3072K FFT length: 5.494 ms., avg: 6.034 ms.
Best time for 3584K FFT length: 6.751 ms., avg: 7.105 ms.
Best time for 4096K FFT length: 7.544 ms., avg: 9.167 ms.
Best time for 5120K FFT length: 8.947 ms., avg: 9.108 ms.
Best time for 6144K FFT length: 10.688 ms., avg: 12.479 ms.
Best time for 7168K FFT length: 12.787 ms., avg: 13.099 ms.
Best time for 8192K FFT length: 16.064 ms., avg: 18.325 ms.
Timing FFTs using 5 threads.
Best time for 768K FFT length: 1.201 ms., avg: 1.218 ms.
Best time for 896K FFT length: 1.457 ms., avg: 1.708 ms.
Best time for 1024K FFT length: 1.615 ms., avg: 1.642 ms.
Best time for 1280K FFT length: 2.070 ms., avg: 2.183 ms.
Best time for 1536K FFT length: 2.499 ms., avg: 2.537 ms.
Best time for 1792K FFT length: 3.035 ms., avg: 3.092 ms.
Best time for 2048K FFT length: 3.354 ms., avg: 3.414 ms.
Best time for 2560K FFT length: 4.285 ms., avg: 4.558 ms.
Best time for 3072K FFT length: 5.231 ms., avg: 5.284 ms.
Best time for 3584K FFT length: 6.474 ms., avg: 6.568 ms.
Best time for 4096K FFT length: 7.195 ms., avg: 7.287 ms.
Best time for 5120K FFT length: 8.379 ms., avg: 10.145 ms.
Best time for 6144K FFT length: 9.583 ms., avg: 10.677 ms.
Best time for 7168K FFT length: 11.155 ms., avg: 12.152 ms.
Best time for 8192K FFT length: 15.262 ms., avg: 17.920 ms.
Timing FFTs using 6 threads.
Best time for 768K FFT length: 1.145 ms., avg: 1.161 ms.
Best time for 896K FFT length: 1.383 ms., avg: 1.403 ms.
Best time for 1024K FFT length: 1.538 ms., avg: 1.562 ms.
Best time for 1280K FFT length: 1.965 ms., avg: 1.998 ms.
Best time for 1536K FFT length: 2.390 ms., avg: 2.445 ms.
Best time for 1792K FFT length: 2.918 ms., avg: 3.389 ms.
Best time for 2048K FFT length: 3.234 ms., avg: 3.581 ms.
Best time for 2560K FFT length: 4.163 ms., avg: 4.245 ms.
Best time for 3072K FFT length: 5.088 ms., avg: 5.615 ms.
Best time for 3584K FFT length: 6.297 ms., avg: 6.394 ms.
Best time for 4096K FFT length: 6.991 ms., avg: 7.393 ms.
Best time for 5120K FFT length: 8.213 ms., avg: 8.400 ms.
Best time for 6144K FFT length: 9.420 ms., avg: 10.439 ms.
Best time for 7168K FFT length: 10.742 ms., avg: 11.658 ms.
Best time for 8192K FFT length: 14.644 ms., avg: 17.097 ms.
Best time for 61 bit trial factors: 1.716 ms.
Best time for 62 bit trial factors: 1.718 ms.
Best time for 63 bit trial factors: 1.958 ms.
Best time for 64 bit trial factors: 2.031 ms.
Best time for 65 bit trial factors: 2.371 ms.
Best time for 66 bit trial factors: 2.808 ms.
Best time for 67 bit trial factors: 2.777 ms.
Best time for 75 bit trial factors: 2.701 ms.
Best time for 76 bit trial factors: 2.701 ms.
Best time for 77 bit trial factors: 2.707 ms.
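One way to read the thread-scaling numbers above is as speedup over the single-thread time. Here is a minimal Python sketch using the best times for the 768K and 8192K FFT lengths copied from the output above. The helper name `speedup` is only illustrative and is not part of Prime95.

```python
# Best times in ms copied from the benchmark output above (HT disabled).
# Inner keys are thread counts; only two FFT lengths are shown for brevity.
best_ms = {
    "768K":  {1: 3.489, 2: 1.964, 3: 1.356, 4: 1.289, 5: 1.201, 6: 1.145},
    "8192K": {1: 50.945, 2: 27.104, 3: 18.811, 4: 16.064, 5: 15.262, 6: 14.644},
}

def speedup(times, threads):
    """Speedup of `threads` threads relative to one thread (illustrative helper)."""
    return times[1] / times[threads]

for fft, times in best_ms.items():
    row = ", ".join(f"{n}t: {speedup(times, n):.2f}x" for n in sorted(times))
    print(f"{fft}: {row}")
```

By this measure six threads give roughly a 3.0x to 3.5x speedup on these two lengths, and the returns diminish after about three threads.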
Old 2012-02-17, 18:15   #35
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11×311 Posts

Quote:
Originally Posted by Prime95 View Post
That is correct. James, try adding "TimingOutput=4" to prime.txt. Restart and run just one worker. Note the per-iteration times. Now start the second worker, note times, etc. Do the workers slow down a lot?

On my machine (all workers running 2400K FFTs), I get times of 1 worker - 13.7ms, 2 workers - 13.9ms, 3 workers - 14.5ms, 4 workers - 16.6ms.
With hyperthreading disabled:
Quote:
Resuming primality test of M44789989 using AVX Core2 type-3 FFT length 2400K, Pass1=384, Pass2=6400
1: 12.8ms
2: 13.2ms
3: 13.6ms
4: 14.8ms
5: 16.7ms
6: ~19ms (ranges from 18.3 to 21.1 in different workers)
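Per-worker slowdown is only half the picture, since total work done still rises as workers are added. Here is a rough Python sketch of aggregate throughput using the times listed above. The 6-worker value uses the approximate 19 ms figure, and `aggregate_throughput` is a hypothetical helper name.

```python
# Per-iteration times (ms) by number of running workers, from the post above.
# The 6-worker figure is the poster's approximate value (~19 ms).
per_iter_ms = {1: 12.8, 2: 13.2, 3: 13.6, 4: 14.8, 5: 16.7, 6: 19.0}

def aggregate_throughput(workers, ms):
    """Total iterations per second across all workers (illustrative helper)."""
    return workers * 1000.0 / ms

base = aggregate_throughput(1, per_iter_ms[1])
for n, ms in per_iter_ms.items():
    total = aggregate_throughput(n, ms)
    print(f"{n} workers: {total:6.1f} iter/s total, {total / base:.2f}x one worker")
```

With these numbers, six workers deliver about 4x the throughput of one worker rather than 6x.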
Old 2012-02-17, 18:22   #36
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Prime95 View Post
Grrrr. Does Options/CPU identify the chip as supporting AVX?

If not, can you add the line "CpuSupportsAVX=1" to local.ini and let me know if your benchmarks indicate prime95 runs faster with AVX vs. v26 using SSE2? Thanks.
I have a dual Opteron 6272 (Bulldozer) system. It doesn't work. Here is what happens with the benchmark:

Code:
 
[Feb 17 11:17] Worker starting
[Feb 17 11:17] Your timings will be written to the results.txt file.
[Feb 17 11:17] Compare your results to other computers at http://www.mersenne.org/report_benchmarks
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 2 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 3 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 4 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 5 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 6 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 7 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 8 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 9 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 10 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 11 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 12 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 13 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 14 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 15 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 16 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 17 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 18 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 19 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 20 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 21 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 22 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 23 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 24 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 25 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 26 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 27 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 28 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 29 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 30 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 31 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing FFTs using 32 threads.
[Feb 17 11:17] Cannot initialize FFT code, errcode=1002
[Feb 17 11:17] Number sent to gwsetup is too large for the FFTs to handle.
[Feb 17 11:17] Timing trial factoring of M35000011 with 61 bit length factors.  Best time: 6.404 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 62 bit length factors.  Best time: 6.428 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 63 bit length factors.  Best time: 12.623 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 64 bit length factors.  Best time: 12.646 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 65 bit length factors.  Best time: 10.620 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 66 bit length factors.  Best time: 10.568 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 67 bit length factors.  Best time: 10.608 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 75 bit length factors.  Best time: 10.851 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 76 bit length factors.  Best time: 10.865 ms.
[Feb 17 11:17] Timing trial factoring of M35000011 with 77 bit length factors.  Best time: 10.838 ms.
[Feb 17 11:17] Benchmark complete.
[Feb 17 11:17] Worker stopped.
The attached screenshot shows what it detected at startup. Note that it sees AVX, but the L3 cache size should be 16MB. Also, the CPU cores run @ 2.1GHz each.
Attached thumbnail: 273.JPG (36.9 KB)

Last fiddled with by flashjh on 2012-02-17 at 18:23
Old 2012-02-17, 18:39   #37
Robert_47
 
Mar 2009

2·11 Posts

Quote:
Originally Posted by Prime95 View Post
Grrrr. Does Options/CPU identify the chip as supporting AVX?

If not, can you add the line "CpuSupportsAVX=1" to local.ini and let me know if your benchmarks indicate prime95 runs faster with AVX vs. v26 using SSE2? Thanks.
It does indicate AVX support, and adding CpuSupportsAVX=1 does nothing. Adding CpuArchitecture=5 does work, with the following results.

Code:
[Fri Feb 17 11:10:00 2012]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
AMD FX(tm)-4100 Quad-Core Processor            
CPU speed: 7145.40 MHz, 4 cores
CPU features: Prefetch, MMX, SSE, SSE2, SSE4, AVX
L1 cache size: 16 KB
L2 cache size: 2 MB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 1024
Prime95 64-bit version 27.3, RdtscTiming=1
Best time for 768K FFT length: 16.035 ms., avg: 16.180 ms.
Best time for 896K FFT length: 18.598 ms., avg: 18.746 ms.
Best time for 1024K FFT length: 21.362 ms., avg: 21.603 ms.
Best time for 1280K FFT length: 27.315 ms., avg: 27.488 ms.
Best time for 1536K FFT length: 34.157 ms., avg: 34.386 ms.
Best time for 1792K FFT length: 38.610 ms., avg: 39.915 ms.
Best time for 2048K FFT length: 45.505 ms., avg: 45.838 ms.
Best time for 2560K FFT length: 58.154 ms., avg: 58.291 ms.
Best time for 3072K FFT length: 72.854 ms., avg: 73.033 ms.
Best time for 3584K FFT length: 87.523 ms., avg: 88.185 ms.
Best time for 4096K FFT length: 101.059 ms., avg: 101.255 ms.
Best time for 5120K FFT length: 126.274 ms., avg: 126.459 ms.
Best time for 6144K FFT length: 154.792 ms., avg: 155.017 ms.
Best time for 7168K FFT length: 180.377 ms., avg: 180.638 ms.
Best time for 8192K FFT length: 228.028 ms., avg: 229.531 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 10.198 ms., avg: 10.450 ms.
Best time for 896K FFT length: 12.109 ms., avg: 12.195 ms.
Best time for 1024K FFT length: 13.813 ms., avg: 14.111 ms.
Best time for 1280K FFT length: 17.648 ms., avg: 17.932 ms.
Best time for 1536K FFT length: 22.396 ms., avg: 22.549 ms.
Best time for 1792K FFT length: 26.009 ms., avg: 26.172 ms.
Best time for 2048K FFT length: 29.854 ms., avg: 30.128 ms.
Best time for 2560K FFT length: 38.316 ms., avg: 38.867 ms.
Best time for 3072K FFT length: 47.161 ms., avg: 47.609 ms.
Best time for 3584K FFT length: 56.724 ms., avg: 56.919 ms.
Best time for 4096K FFT length: 64.791 ms., avg: 65.295 ms.
Best time for 5120K FFT length: 81.092 ms., avg: 81.646 ms.
Best time for 6144K FFT length: 101.216 ms., avg: 102.067 ms.
Best time for 7168K FFT length: 119.703 ms., avg: 120.206 ms.
Best time for 8192K FFT length: 147.991 ms., avg: 148.265 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 7.077 ms., avg: 7.567 ms.
Best time for 896K FFT length: 8.158 ms., avg: 8.368 ms.
Best time for 1024K FFT length: 9.247 ms., avg: 9.802 ms.
Best time for 1280K FFT length: 12.015 ms., avg: 12.205 ms.
Best time for 1536K FFT length: 14.970 ms., avg: 15.260 ms.
Best time for 1792K FFT length: 17.418 ms., avg: 17.661 ms.
Best time for 2048K FFT length: 20.051 ms., avg: 20.346 ms.
Best time for 2560K FFT length: 25.626 ms., avg: 26.019 ms.
Best time for 3072K FFT length: 31.602 ms., avg: 32.133 ms.
Best time for 3584K FFT length: 37.741 ms., avg: 38.158 ms.
Best time for 4096K FFT length: 43.382 ms., avg: 43.971 ms.
Best time for 5120K FFT length: 54.130 ms., avg: 54.678 ms.
Best time for 6144K FFT length: 67.366 ms., avg: 68.239 ms.
Best time for 7168K FFT length: 78.552 ms., avg: 79.195 ms.
Best time for 8192K FFT length: 97.640 ms., avg: 98.646 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 5.554 ms., avg: 5.650 ms.
Best time for 896K FFT length: 6.334 ms., avg: 6.482 ms.
Best time for 1024K FFT length: 7.213 ms., avg: 7.749 ms.
Best time for 1280K FFT length: 9.412 ms., avg: 9.538 ms.
Best time for 1536K FFT length: 11.761 ms., avg: 12.036 ms.
Best time for 1792K FFT length: 13.605 ms., avg: 13.865 ms.
Best time for 2048K FFT length: 15.838 ms., avg: 16.180 ms.
Best time for 2560K FFT length: 20.355 ms., avg: 20.613 ms.
Best time for 3072K FFT length: 24.890 ms., avg: 25.225 ms.
Best time for 3584K FFT length: 29.366 ms., avg: 29.775 ms.
Best time for 4096K FFT length: 34.192 ms., avg: 34.332 ms.
Best time for 5120K FFT length: 42.279 ms., avg: 42.504 ms.
Best time for 6144K FFT length: 53.200 ms., avg: 53.802 ms.
[Fri Feb 17 11:15:03 2012]
Best time for 7168K FFT length: 61.309 ms., avg: 61.941 ms.
Best time for 8192K FFT length: 77.664 ms., avg: 78.375 ms.
Best time for 61 bit trial factors: 2.898 ms.
Best time for 62 bit trial factors: 2.935 ms.
Best time for 63 bit trial factors: 3.331 ms.
Best time for 64 bit trial factors: 3.821 ms.
Best time for 65 bit trial factors: 4.922 ms.
Best time for 66 bit trial factors: 7.235 ms.
Best time for 67 bit trial factors: 7.068 ms.
Best time for 75 bit trial factors: 5.679 ms.
Best time for 76 bit trial factors: 5.657 ms.
Best time for 77 bit trial factors: 5.664 ms.
The 26.6 benchmarks:

Code:
[Fri Feb 17 11:16:31 2012]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
AMD FX(tm)-4100 Quad-Core Processor            
CPU speed: 7145.37 MHz, 4 cores
CPU features: Prefetch, MMX, SSE, SSE2, SSE4, AVX
L1 cache size: 16 KB
L2 cache size: 2 MB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 1024
Prime95 64-bit version 26.6, RdtscTiming=1
Best time for 768K FFT length: 12.699 ms., avg: 13.011 ms.
Best time for 896K FFT length: 14.730 ms., avg: 15.182 ms.
Best time for 1024K FFT length: 16.543 ms., avg: 16.850 ms.
Best time for 1280K FFT length: 21.618 ms., avg: 22.041 ms.
Best time for 1536K FFT length: 27.136 ms., avg: 27.453 ms.
Best time for 1792K FFT length: 32.042 ms., avg: 32.217 ms.
Best time for 2048K FFT length: 35.837 ms., avg: 36.206 ms.
Best time for 2560K FFT length: 44.669 ms., avg: 45.025 ms.
Best time for 3072K FFT length: 55.582 ms., avg: 55.973 ms.
Best time for 3584K FFT length: 70.664 ms., avg: 70.783 ms.
Best time for 4096K FFT length: 72.125 ms., avg: 73.739 ms.
Best time for 5120K FFT length: 95.987 ms., avg: 96.159 ms.
Best time for 6144K FFT length: 118.114 ms., avg: 118.218 ms.
Best time for 7168K FFT length: 145.150 ms., avg: 145.459 ms.
Best time for 8192K FFT length: 160.341 ms., avg: 160.421 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 8.115 ms., avg: 8.359 ms.
Best time for 896K FFT length: 9.243 ms., avg: 9.315 ms.
Best time for 1024K FFT length: 10.619 ms., avg: 10.959 ms.
Best time for 1280K FFT length: 13.965 ms., avg: 14.182 ms.
Best time for 1536K FFT length: 17.407 ms., avg: 17.531 ms.
Best time for 1792K FFT length: 19.832 ms., avg: 20.011 ms.
Best time for 2048K FFT length: 23.235 ms., avg: 23.496 ms.
Best time for 2560K FFT length: 28.494 ms., avg: 28.726 ms.
Best time for 3072K FFT length: 35.235 ms., avg: 35.437 ms.
Best time for 3584K FFT length: 47.298 ms., avg: 47.841 ms.
Best time for 4096K FFT length: 47.045 ms., avg: 47.491 ms.
Best time for 5120K FFT length: 60.316 ms., avg: 60.747 ms.
Best time for 6144K FFT length: 74.983 ms., avg: 75.628 ms.
Best time for 7168K FFT length: 91.288 ms., avg: 91.763 ms.
Best time for 8192K FFT length: 105.784 ms., avg: 106.304 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 5.506 ms., avg: 5.592 ms.
Best time for 896K FFT length: 6.490 ms., avg: 6.588 ms.
Best time for 1024K FFT length: 7.328 ms., avg: 7.850 ms.
Best time for 1280K FFT length: 9.430 ms., avg: 9.531 ms.
Best time for 1536K FFT length: 11.618 ms., avg: 11.750 ms.
Best time for 1792K FFT length: 13.882 ms., avg: 14.021 ms.
Best time for 2048K FFT length: 15.552 ms., avg: 15.774 ms.
Best time for 2560K FFT length: 19.449 ms., avg: 19.657 ms.
Best time for 3072K FFT length: 23.973 ms., avg: 24.064 ms.
Best time for 3584K FFT length: 32.700 ms., avg: 33.082 ms.
Best time for 4096K FFT length: 31.916 ms., avg: 31.972 ms.
Best time for 5120K FFT length: 41.108 ms., avg: 41.458 ms.
Best time for 6144K FFT length: 50.510 ms., avg: 51.064 ms.
Best time for 7168K FFT length: 61.613 ms., avg: 62.130 ms.
Best time for 8192K FFT length: 69.090 ms., avg: 69.676 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 4.364 ms., avg: 4.425 ms.
Best time for 896K FFT length: 5.044 ms., avg: 5.093 ms.
Best time for 1024K FFT length: 5.716 ms., avg: 6.188 ms.
Best time for 1280K FFT length: 7.491 ms., avg: 7.534 ms.
Best time for 1536K FFT length: 9.245 ms., avg: 9.304 ms.
Best time for 1792K FFT length: 10.880 ms., avg: 11.000 ms.
Best time for 2048K FFT length: 12.416 ms., avg: 12.567 ms.
Best time for 2560K FFT length: 15.492 ms., avg: 15.610 ms.
Best time for 3072K FFT length: 19.008 ms., avg: 19.299 ms.
Best time for 3584K FFT length: 25.681 ms., avg: 25.914 ms.
Best time for 4096K FFT length: 25.362 ms., avg: 25.521 ms.
Best time for 5120K FFT length: 32.789 ms., avg: 32.869 ms.
Best time for 6144K FFT length: 40.238 ms., avg: 40.991 ms.
Best time for 7168K FFT length: 48.517 ms., avg: 48.870 ms.
Best time for 8192K FFT length: 56.353 ms., avg: 57.335 ms.
Best time for 61 bit trial factors: 2.898 ms.
Best time for 62 bit trial factors: 2.935 ms.
Best time for 63 bit trial factors: 3.327 ms.
Best time for 64 bit trial factors: 3.822 ms.
Best time for 65 bit trial factors: 4.925 ms.
Best time for 66 bit trial factors: 5.845 ms.
Best time for 67 bit trial factors: 5.814 ms.
Best time for 75 bit trial factors: 5.674 ms.
Best time for 76 bit trial factors: 5.656 ms.
Best time for 77 bit trial factors: 5.659 ms.
Both benchmarks were 64-bit. The actual CPU speed was 3800MHz.
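To put the two runs above side by side, here is a small Python sketch that takes the single-thread best times for four FFT lengths from each listing and computes how much longer version 27.3 took than version 26.6. The variable and function names are illustrative only.

```python
# Single-thread best times (ms) on the FX-4100, copied from the two runs above.
v27_3 = {"768K": 16.035, "2048K": 45.505, "4096K": 101.059, "8192K": 228.028}
v26_6 = {"768K": 12.699, "2048K": 35.837, "4096K": 72.125, "8192K": 160.341}

def slowdown(new_ms, old_ms):
    """How many times longer the new run takes than the old one."""
    return new_ms / old_ms

ratios = {fft: slowdown(v27_3[fft], v26_6[fft]) for fft in v27_3}
for fft, r in ratios.items():
    print(f"{fft}: v27.3 takes {r:.2f}x the v26.6 time")
```

On these four lengths, 27.3 with CpuArchitecture=5 takes roughly 26% to 42% longer than 26.6.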
Old 2012-02-17, 19:30   #38
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7534₁₀ Posts

Quote:
Originally Posted by James Heinrich View Post
With hyperthreading disabled:
1: 12.8ms
2: 13.2ms
3: 13.6ms
4: 14.8ms
5: 16.7ms
6: ~19ms (ranges from 18.3 to 21.1 in different workers)
Interesting - worse than I would have expected (I assume all 4 memory channels are populated). Perhaps contention for the L3 cache is the culprit. If I understand correctly, your L3 cache has to feed all 6 cores.
Old 2012-02-17, 19:34   #39
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3,767 Posts

Quote:
Originally Posted by Robert_47 View Post
It does indicate AVX support, and adding CpuSupportsAVX=1 does nothing. Adding CpuArchitecture=5 does work, with the following results.
Pretty grim.

Until I write a version that supports FMA, it looks like I need to steer Bulldozer down the SSE2 path. The way AMD implemented AVX on Bulldozer, SSE2 and AVX have the same theoretical throughput unless you use FMA.
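The throughput argument can be sketched as simple arithmetic. The following Python snippet rests on an assumption that is not stated in this thread: that each Bulldozer module has two 128-bit floating-point pipes and that a 256-bit AVX operation occupies both of them. Treat it as a back-of-the-envelope model, not a description of the actual hardware scheduler.

```python
# Assumption (not stated in the thread): each Bulldozer module has two
# 128-bit floating-point pipes, and a 256-bit AVX operation occupies both.
PIPES = 2
PIPE_BITS = 128
DOUBLE_BITS = 64

def doubles_per_cycle(vector_bits):
    """Peak double-precision lanes processed per cycle per module."""
    pipes_per_op = max(1, vector_bits // PIPE_BITS)  # pipes one op occupies
    ops_per_cycle = PIPES // pipes_per_op            # ops issued per cycle
    return ops_per_cycle * (vector_bits // DOUBLE_BITS)

sse2 = doubles_per_cycle(128)  # two 128-bit ops per cycle
avx = doubles_per_cycle(256)   # one 256-bit op per cycle
fma_flops = avx * 2            # FMA does a multiply and an add per lane

print(sse2, avx, fma_flops)
```

Under that model SSE2 and AVX both peak at the same number of lanes per cycle, and only FMA raises the floating-point operations per cycle.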
Old 2012-02-17, 19:41   #40
Jwb52z
 
Jwb52z's Avatar
 
Sep 2002

17·47 Posts

How many more interim versions do you think there will be before non-Sandy Bridge CPUs can utilize version 27?
Old 2012-02-17, 19:46   #41
Robert_47
 
Mar 2009

22₁₀ Posts

Quote:
Originally Posted by Lennart View Post
What software did you use ?


Lennart
Prime95, and Rebirther's newest llr and pfgw. All three have the same problems for both 32 bit and 64 bit.
Old 2012-02-17, 19:50   #42
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

11×311 Posts

Quote:
Originally Posted by Prime95 View Post
Interesting - worse than I would have expected (I assume all 4 memory channels are populated). Perhaps contention for the L3 cache is the culprit. If I understand correctly, your L3 cache has to feed all 6 cores.
All 4 channels are populated, but the RAM is running at 1333MHz, so performance may be better with faster RAM.
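For a sense of scale, here is a rough Python estimate of theoretical peak memory bandwidth. It assumes, beyond what the post says, that "1333MHz" means DDR3-1333 (1333 million transfers per second) on 64-bit channels. The 1600 entry is a hypothetical faster speed included only for comparison.

```python
# Assumptions (not from the thread): "1333MHz" means DDR3-1333, i.e. 1333
# million transfers per second, over 64-bit (8-byte) wide channels.
CHANNELS = 4
BYTES_PER_TRANSFER = 8

def peak_bandwidth_gb_s(mega_transfers_per_s, channels=CHANNELS):
    """Theoretical peak memory bandwidth in GB/s (10**9 bytes per second)."""
    return channels * BYTES_PER_TRANSFER * mega_transfers_per_s * 1e6 / 1e9

for rate in (1333, 1600):  # 1600 is a hypothetical faster setting
    print(f"DDR3-{rate}: {peak_bandwidth_gb_s(rate):.1f} GB/s peak")
```

These are upper bounds only. Measured bandwidth is lower, and whether Prime95 is actually limited by it on this machine is not established here.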
Old 2012-02-17, 20:11   #43
aketilander
 
aketilander's Avatar
 
"Åke Tilander"
Apr 2011
Sandviken, Sweden

1066₈ Posts
Thumbs up Intel Core i7-3930K @ 3.20GHz

Well, when I changed from version 26.6 to version 27.3 on my Intel Core i7-3930K @ 3.20GHz, the per-iteration time changed from 0.033 to 0.021 on all six cores under Windows Ultimate 64-bit. All six cores report that they are using AVX.

So there seems to be a VERY large improvement!
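As a quick check on the size of that improvement, here is a short Python calculation using the two per-iteration times quoted above. The units are left as given in the post, and the variable names are illustrative.

```python
# Per-iteration times reported in the post above (units as given there).
old_time = 0.033  # version 26.6
new_time = 0.021  # version 27.3

speedup = old_time / new_time           # throughput ratio
time_saved = 1.0 - new_time / old_time  # fraction of time saved per iteration

print(f"{speedup:.2f}x faster, {time_saved:.0%} less time per iteration")
```

That works out to about 1.57x the throughput, or roughly 36% less time per iteration.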
Old 2012-02-17, 20:12   #44
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

2·3,767 Posts

Quote:
Originally Posted by Jwb52z View Post
How many more interim versions do you think there will be before non-Sandy Bridge CPUs can utilize version 27?
I don't know when we will have a final v27 release. This version will work on a non-Sandy Bridge CPU, but it won't be any faster than v26.6.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.