#78 |
P90 years forever!
Aug 2002
Yeehaw, FL
3×11×223 Posts |
#79 | |
Nov 2002
Anchorage, AK
3×7×17 Posts |
Quote:
Thanks |
#80 |
Oct 2008
n00bville
5²×29 Posts |
What percent speed impact does enabling Round off checking and SUM(INPUTS) error checking (both, or only one of them) have?
#81 |
P90 years forever!
Aug 2002
Yeehaw, FL
3×11×223 Posts |
#82 |
Oct 2008
n00bville
5²·29 Posts |
#83 |
May 2010
3²·7 Posts |
I ran the benchmark option of a freshly loaded p95v262.zip and noticed a couple of anomalies on my Core i7-920 @ 3.71 GHz system.

In the worker window, the benchmark spams a bit in v26.2:
Code:
[Sep 26 19:22] Timing 10 iterations at 5120K FFT length. Best time: 20.389 ms., avg time: 20.724 ms.
[Sep 26 19:22] Setting affinity to run helper thread 1 on logical CPU #1
[Sep 26 19:22] Setting affinity to run helper thread 3 on logical CPU #3
[Sep 26 19:22] Setting affinity to run helper thread 2 on logical CPU #2
[Sep 26 19:22] Setting affinity to run helper thread 1 on logical CPU #1
[Sep 26 19:22] Setting affinity to run helper thread 2 on logical CPU #2
[Sep 26 19:22] Setting affinity to run helper thread 3 on logical CPU #3
[Sep 26 19:22] Setting affinity to run helper thread 1 on logical CPU #1
[Sep 26 19:22] Setting affinity to run helper thread 3 on logical CPU #3
[Sep 26 19:22] Setting affinity to run helper thread 2 on logical CPU #2
[Sep 26 19:22] Setting affinity to run helper thread 1 on logical CPU #1
[Sep 26 19:22] Setting affinity to run helper thread 2 on logical CPU #2
[Sep 26 19:22] Setting affinity to run helper thread 3 on logical CPU #3
[Sep 26 19:22] Timing 10 iterations at 6144K FFT length. Best time: 25.928 ms., avg time: 26.193 ms.

Also, the Trial Factoring benchmark isn't very believable. Note the sudden jump between 62 & 63 bits.
Code:
[Sep 26 19:22] Timing trial factoring of M35000011 with 58 bit length factors. Best time: 2.620 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 59 bit length factors. Best time: 2.623 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 60 bit length factors. Best time: 2.630 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 61 bit length factors. Best time: 2.621 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 62 bit length factors. Best time: 2.642 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 63 bit length factors. Best time: 4.423 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 64 bit length factors. Best time: 4.414 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 65 bit length factors. Best time: 4.049 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 66 bit length factors. Best time: 4.023 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 67 bit length factors. Best time: 4.019 ms.

I love the ~15% speed up in 2560K FFTs compared to v25.11 (32 ms vs 38 ms).

I find it hard to believe that trial factoring is actually slower in 26.2, but that's what the benchmark says (up to 10%, depending on the factor size).

Here are my result files from v25.11 and v26.2 benchmarking, plus a copy of Worker window 1 from v26: Results & Worker Window.zip |
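For anyone who wants to check these numbers rather than eyeball them, here is a minimal Python sketch that pulls the trial factoring times out of the worker-window text above and finds the largest step between consecutive bit lengths. The regular expression and the function names (`parse_tf_times`, `step_ratios`, `time_reduction_percent`) are my own assumptions, inferred from the log lines in this post and not from any Prime95 documentation.

```python
import re

# Trial factoring lines copied from the v26.2 worker window in this post.
LOG = """\
[Sep 26 19:22] Timing trial factoring of M35000011 with 58 bit length factors. Best time: 2.620 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 59 bit length factors. Best time: 2.623 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 60 bit length factors. Best time: 2.630 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 61 bit length factors. Best time: 2.621 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 62 bit length factors. Best time: 2.642 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 63 bit length factors. Best time: 4.423 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 64 bit length factors. Best time: 4.414 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 65 bit length factors. Best time: 4.049 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 66 bit length factors. Best time: 4.023 ms.
[Sep 26 19:22] Timing trial factoring of M35000011 with 67 bit length factors. Best time: 4.019 ms.
"""

# Assumed line layout: "... with <bits> bit length factors. Best time: <ms> ms."
TF_LINE = re.compile(r"with (\d+) bit length factors\. Best time: ([\d.]+) ms")


def parse_tf_times(log):
    """Return {bit_length: best_time_ms} for every trial factoring line."""
    return {int(bits): float(ms) for bits, ms in TF_LINE.findall(log)}


def step_ratios(times):
    """Map each bit length to its time divided by the previous bit length's time."""
    bits = sorted(times)
    return {b: times[b] / times[a] for a, b in zip(bits, bits[1:]) if b == a + 1}


def time_reduction_percent(old_ms, new_ms):
    """Percentage by which the per-iteration time dropped from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms * 100


times = parse_tf_times(LOG)
ratios = step_ratios(times)
biggest = max(ratios, key=ratios.get)
print(f"largest step: {biggest - 1} -> {biggest} bits, x{ratios[biggest]:.2f}")
print(f"2560K FFT time reduction: {time_reduction_percent(38, 32):.1f}%")
```

On the figures quoted here, the largest step is the 62 to 63 bit one, at roughly 1.67 times the previous time, and the 38 ms to 32 ms change works out to a time reduction of about 15.8%.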
#84 | |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
107138 Posts |
Quote:
#85 | |
Dec 2003
2·7 Posts |
Quote:
Trial factoring timing is quite different between the 32-bit and 64-bit versions; the 64-bit version is much faster than the 32-bit one. Please benchmark Prime95 64-bit version 26.2, compare it with Prime95 64-bit version 25.11, and check the results. |
#86 | |
Dec 2003
2×7 Posts |
Quote:
Windows 64-bit: ftp://mersenne.org/gimps/p64v262.zip |
#87 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2·41·71 Posts |
This is a gwnum problem, not a cllr one. For some reason cllr is using a Pentium 4 FFT method on my Q6600. Why not the Core 2 one? How does it select which to use?
Code:
Starting Lucas Lehmer Riesel prime test of 595*2^910447-1
Using Pentium4 type-3 FFT length 64K, Pass1=256, Pass2=256 |
#88 | |
P90 years forever!
Aug 2002
Yeehaw, FL
3·11·223 Posts |
Quote:
1) You are probably using a 32-bit executable. The difference between Pentium 4 optimized building blocks and Core 2 optimized building blocks is minimal -- no extra registers available.
2) The Pentium 4 prefetch instruction loads 128 bytes, a Core 2 prefetches 64 bytes. Thus a Core 2 optimized FFT has twice as many prefetch instructions.
3) A Core 2 chip has lots of L2 cache. A 64K FFT probably keeps most of its data in cache, making prefetch instructions of little to no value. Thus, a Pentium4 optimized FFT might be a little faster because it wastes less time executing useless prefetch instructions.

Anyhow, the FFT that is selected came from me doing actual timings of Pentium4 and Core2 optimized FFTs. The Pentium4 FFT was a hair faster.

Perhaps I should change the FFT description to "Using Pentium4-optimized-even-though-this-is-a-Core2-CPU type-3 FFT"

Last fiddled with by Prime95 on 2010-09-27 at 19:38 |
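To put rough numbers on points 2) and 3), here is a back-of-the-envelope Python sketch. It rests on two assumptions that are not stated in the post: that each FFT element is stored as an 8-byte double, and that one prefetch instruction is issued per prefetched block in a single sweep over the data. The helper names are hypothetical, and the real gwnum code may organize its prefetching quite differently.

```python
# Assumption: each FFT element is an 8-byte double-precision value.
BYTES_PER_ELEMENT = 8


def fft_data_bytes(fft_len_k):
    """Size in bytes of the FFT data for a length given in K (1K = 1024 elements)."""
    return fft_len_k * 1024 * BYTES_PER_ELEMENT


def prefetches_per_sweep(fft_len_k, prefetch_bytes):
    """Prefetch instructions needed to touch all the data once,
    assuming one instruction per prefetch_bytes-sized block."""
    return fft_data_bytes(fft_len_k) // prefetch_bytes


size = fft_data_bytes(64)                       # the 64K FFT from the log above
p4_count = prefetches_per_sweep(64, 128)        # Pentium 4: 128 bytes per prefetch
core2_count = prefetches_per_sweep(64, 64)      # Core 2: 64 bytes per prefetch

print(f"64K FFT data: {size // 1024} KiB")
print(f"Pentium 4 style prefetches per sweep: {p4_count}")
print(f"Core 2 style prefetches per sweep:    {core2_count}")
```

Under those assumptions the 64K FFT data is 512 KiB, and the Core 2 style code issues 8192 prefetches per sweep against 4096 for the Pentium 4 style code. If most of that data is already sitting in L2, the extra 4096 instructions per sweep buy nothing, which fits the timing result described above.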