Prime95 2015-04-05 17:29

Prime95 version 28.6 / 28.7 (28.7 now available!)
 
Prime95 version 28.7 build 1 is available. See [URL]http://www.mersenneforum.org/showpost.php?p=368540&postcount=2[/URL] or the whatsnew.txt file for a list of new features. See next post for bugs found in this release.

[B]Important:[/B] The bug that caused the N-1 primality test of 1024*3^1877301+1 to fail in PFGW and LLR was a serious one. It is triggered by a lengthy carry propagation. [B]I cannot prove that Mersenne LL tests are immune to this bug,[/B] although I've rerun dozens of Mersenne prime LL tests and every one passed. The bug affects AVX and FMA computers (more details in a later post). The bug has been present since version 27.1. To be safe, I recommend all users with Sandy Bridge or later CPUs upgrade to this version.


Download links:
Windows 64-bit: [URL]ftp://mersenne.org/gimps/p95v287.win64.zip[/URL]
Linux 64-bit: [URL]ftp://mersenne.org/gimps/p95v287.linux64.tar.gz[/URL]
Mac OS X: [URL]ftp://mersenne.org/gimps/p95v287.MacOSX.zip[/URL]
FreeBSD 10 64-bit: [URL]ftp://mersenne.org/gimps/p95v287.FreeBSD10-64.tar.gz[/URL]
Windows 32-bit: [URL]ftp://mersenne.org/gimps/p95v287.win32.zip[/URL]
Linux 32-bit: [URL]ftp://mersenne.org/gimps/p95v287.linux32.tar.gz[/URL]
Source: [URL]ftp://mersenne.org/gimps/p95v287.source.zip[/URL]

Prime95 2015-04-05 17:29

1) In the rarely used undoc.txt feature, TitleOutputFrequency cannot be less than ScaledOutputFrequency. This worked in some 28.5 versions. Fixed in 28.7.
2) Roundoff errors PRPing 55459*2^159718+1 in 32-bit mode. For version 28.7, the maximum exponent that can be tested with a 12K SSE2 FFT has been reduced by 0.5%.
3) Base 2 PRP test of (48591^11329-1)/48590 using AVX FFTs failed. Fixed in version 28.7.
4) Some have complained about the number of roundoff error warnings (and retries) generated. Prime95 will now warn for roundoffs > 27/64 if near the FFT limit and 26/64 for FFTs not near the limit. Fixed in 28.7.
5) Some Intel CPUs not recognized. Added support for CPUID results from several Atom, Haswell, Broadwell, and Skylake CPUs. Fixed in 28.7.
6) A rare bug in generating the final 64-bit residue occurred when the shift count was more than exponent - 64. Fixed in 28.8.
7) LLR failed testing 13126*39^85217-1 using AVX FFTs. This is another example of the carry propagation bug in the add and subtract that was "fixed" in 28.6. A hopefully better fix is now in place for version 28.8.
8) PRP of 10223*2^29588045-1 generated roundoff errors using 2400K to 2688K FFT. Very rare bug computing number of bits to stuff in each FFT word -- computation required more than 53 bits of precision. Fixed in 28.8.
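
Item 8 attributes the failure to a computation that needed more than 53 bits of precision, which is the significand width of an IEEE 754 double. The short Python sketch below only illustrates that limit. It does not reproduce Prime95's actual bits-per-word computation, and `survives_double` is a made-up helper name.

```python
def survives_double(n: int) -> bool:
    """True if the integer n round-trips through an IEEE 754 double unchanged."""
    return int(float(n)) == n

# 2**53 is still exact, but the next integer needs a 54th bit and gets rounded.
print(survives_double(2**53))      # True
print(survives_double(2**53 + 1))  # False
```

Any intermediate value that needs that 54th bit is silently rounded, which is the kind of loss the item describes.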

Batalov 2015-04-05 17:35

Thank you!

I will build LLR with gwnum 28.6 and rerun multiple random tests against a large collection of results that I have (e.g. from CRUS); if any RES64 mismatches come up, I will share the statistics.

[COLOR=Blue]EDIT: no RES64 mismatches so far in 14,000 random double-check tests (from 0.25 to 4 hours run time range).
7 known primes re-confirmed.[/COLOR]

Dubslow 2015-04-05 17:38

Head post link needs fixing

"http//www.mersenneforum.org/showpost.php?p=368540&postcount=2"

Prime95 2015-04-05 18:13

More on the serious carry propagation bug. The bug was in the adding and subtracting code -- not in the main FFT code. The error only occurs if a single carry propagates over more than 8 FFT words (and only if the carry chain starts at one particular FFT word). Assuming there are about 18 bits per FFT word, the chance that random data would trigger this bug is about 1 in 2^(18*8) = 2^144 -- i.e. pretty darn low.
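
To put a number on that estimate, here is a minimal Python sketch. The 18 bits per word and the 8-word chain come from the paragraph above. The function name is invented for illustration, and the model (every bit independently random) is the same simplifying assumption the estimate itself makes.

```python
from fractions import Fraction

BITS_PER_WORD = 18   # rough figure quoted in the post
WORDS_IN_CHAIN = 8   # a carry must ripple across this many words

def carry_chain_probability(bits_per_word: int, words: int) -> Fraction:
    """Chance that `words` consecutive random words are all in the one
    state that lets a single carry ripple through every one of them."""
    return Fraction(1, 2 ** (bits_per_word * words))

p = carry_chain_probability(BITS_PER_WORD, WORDS_IN_CHAIN)
print(p.denominator == 2 ** 144)
print(float(p))
```

The exact fraction avoids any rounding. The float is only there to give a feel for the magnitude.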

If this bug is so rare, how was it found? Well, when testing a prime number the data in the last one or two iterations is decidedly non-random and a single carry propagating over more than eight FFT words can happen.


Prime95 uses the addition and subtraction code only when it is doing a "square carefully" operation.

Further investigation reveals that during an LL test the only time a square careful operation takes place is when a roundoff error > 0.4 is detected and the iteration is redone. Thus to miss a Mersenne prime, you would need to get a roundoff error in the last one or two non-random iterations and trigger the bug in carefully redo-ing the iteration.

For PRP tests, the first 30 and last 30 iterations are always done with a square carefully operation. Thus, a PRP test is much more susceptible to the bug, though at this time there are no known cases of a PRP test failing due to it.

P.S. Any tests using one-pass FFTs or zero-padded FFTs were unaffected by this bug.

ewmayer 2015-04-06 07:09

George: Does your code always check that any carryouts = 0?

I similarly allow for carries propagating at most 8 words, but if this is violated, it trips an error message and assertion-exit. Not the most elegant solution, but far better than 'silent failure' which allows the run to continue.

One quick way to test the relevant logic is to deliberately use an overlong FFT, say 2x as long as the default value. With < 10 bits per word, one is almost guaranteed to get the kinds of longer-than-expected carry propagations you describe.
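
A toy Python sketch of that "fail loudly" approach, under stated assumptions: words are plain non-negative integers, stored little-endian, and `propagate_or_fail` is a hypothetical name rather than a function from any of the programs discussed here. It also shows why smaller words stretch the carry chain.

```python
def propagate_or_fail(words, start, carry, bits, max_words=8):
    """Add `carry` at words[start] and normalize at most `max_words` words.

    Raises AssertionError instead of silently dropping a carry that is
    still pending after the limit.
    """
    base = 1 << bits
    out = list(words)
    for i in range(start, min(start + max_words, len(out))):
        total = out[i] + carry
        out[i], carry = total % base, total // base
        if carry == 0:
            return out
    raise AssertionError(f"carry still pending after {max_words} words")

zeros = [0] * 16
big_carry = 1 << 100

# With 18-bit words the carry is absorbed within the 8-word limit.
print(propagate_or_fail(zeros, 0, big_carry, bits=18)[:8])

# With 9-bit words the same carry outlives the limit and trips the check.
try:
    propagate_or_fail(zeros, 0, big_carry, bits=9)
except AssertionError as err:
    print("caught:", err)
```

The same injected carry passes with 18-bit words and fails with 9-bit words, which is the effect of deliberately using an overlong FFT.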

henryzz 2015-04-06 08:17

Am I correct in thinking that with different shift counts this wouldn't happen in both runs? If it did, it would be a separate random occurrence, and the chance of the residues being affected in the same way is ridiculously unlikely.

Prime95 2015-04-06 13:54

[QUOTE=henryzz;399432]Am I correct in thinking that with different shift counts this wouldn't happen in both runs? If it did, it would be a separate random occurrence, and the chance of the residues being affected in the same way is ridiculously unlikely.[/QUOTE]

Correct. The different shift counts result in different FFT data. The bug would not affect both LL runs.
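
The effect of a shift count can be checked on small exponents. This is a minimal Python sketch using big-integer arithmetic and the standard Lucas-Lehmer recurrence. `ll_residue` is an illustrative name and nothing here is taken from Prime95's source.

```python
def ll_residue(p: int, shift: int = 0) -> int:
    """Lucas-Lehmer residue of 2**p - 1, computed on data rotated by `shift` bits.

    Invariant: s == S * 2**shift (mod 2**p - 1), where S is the unshifted value.
    """
    m = (1 << p) - 1
    s = (4 << shift) % m
    for _ in range(p - 2):
        shift = (2 * shift) % p            # squaring doubles the shift; 2**p == 1 (mod m)
        s = (s * s - (2 << shift)) % m     # the constant 2 must be shifted as well
    return (s << ((p - shift) % p)) % m    # rotate back to recover the true residue

for p in (11, 13, 127):
    residues = {ll_residue(p, shift) for shift in range(p)}
    print(p, "all shifts agree:", len(residues) == 1, "prime:", residues == {0})
```

The intermediate values differ from one shift to the next, yet every run ends on the same residue once the shift is undone. That is why a data-dependent glitch in one run would not be expected to repeat in a double-check that uses a different shift.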

Prime95 2015-04-06 13:58

[QUOTE=ewmayer;399430]George: Does your code always check that any carryouts = 0?[/QUOTE]

No. The fix is that if the carry is still non-zero after 7 carry propagations, it is simply added to the 8th word.
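
A toy Python model of that rule, under stated assumptions: integer words of 18 bits stored little-endian, and hypothetical helper names. The real code works on floating-point FFT words with SIMD instructions, so this only shows why the represented value is preserved.

```python
BITS = 18            # assumed bits per word, as in the earlier estimate
BASE = 1 << BITS

def value(words):
    """Integer represented by little-endian words (words may be unnormalized)."""
    return sum(w << (BITS * i) for i, w in enumerate(words))

def add_with_capped_carry(words, start, carry, max_props=7):
    """Add `carry` at words[start], normalizing at most `max_props` words.

    A carry that is still pending afterwards is added to the next word
    without normalizing it. Assumes start + max_props < len(words).
    """
    out = list(words)
    i = start
    for _ in range(max_props):
        total = out[i] + carry
        out[i], carry = total % BASE, total // BASE
        if carry == 0:
            return out
        i += 1
    out[i] += carry   # the 8th word absorbs the leftover and may exceed BASE - 1
    return out

# Worst case: a run of all-ones words, so a single +1 ripples through all of them.
words = [BASE - 1] * 10 + [0]
result = add_with_capped_carry(words, start=0, carry=1)
print(value(result) == value(words) + 1)
print(result[7] == BASE)   # the 8th word is left one step out of range
```

The 8th word ends up slightly out of range, but the total is correct, so nothing is lost as long as a later normalization step can cope with the oversized word.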

Batalov 2015-04-06 15:35

I can see now why this was happening (that's the carry effect in two runs with different FFT sizes, one correct and one "under-carried"):
[QUOTE=Batalov;390400]It doesn't fully get rid of errors.
...The interim files for the last 500 bits are different: 23 bytes out of the whole file differ, and this situation lingers for the whole stretch of the last 500 bits. Inconceivable {one would expect that a different bit-state in one iteration, even by a few bits, should scramble the whole number over just a few more iterations}, but true:
[CODE]> cmp -l ../BBa1/z3083805.2975460 z3083805.2975460
372225 100 200
372226 10 304
372227 30 307
372228 3 23
372241 130 230
372242 10 304
372243 30 307
372244 3 23
372253 300 24
372254 310 213
372255 220 201
372257 160 260
372258 265 161
372259 35 315
372260 3 23
744180 326 163
744181 12 351
744182 357 74
744183 315 317
744195 114 314
744196 66 53
744197 55 234
744198 324 336

> cmp -l ../BBa1/z3083805.2975460 z3083805.2975460 | wc -l
23
[/CODE]I don't understand the bit-state enough...
[/QUOTE]
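
For anyone reading the listing above: `cmp -l` prints the 1-based byte offset in decimal, followed by the two differing byte values in octal. A minimal Python equivalent is sketched below. `cmp_l` is an illustrative name, and the sample bytes are copied from the first four lines of the listing.

```python
def cmp_l(a: bytes, b: bytes):
    """Return (offset, byte_a, byte_b) for each differing byte, like `cmp -l`.

    Offsets are 1-based and byte values are formatted in octal.
    Only the common prefix of the two inputs is compared.
    """
    return [
        (i + 1, format(x, "o"), format(y, "o"))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]

old = bytes([0o100, 0o010, 0o030, 0o003])
new = bytes([0o200, 0o304, 0o307, 0o023])
for offset, x, y in cmp_l(old, new):
    print(offset, x, y)
```

Read as octal, the first pair (100 versus 200) is 0x40 versus 0x80, i.e. a single set bit sitting one position higher.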

pepi37 2015-04-06 21:44

I found a bug (or maybe it is a feature?) :)

If worktodo.txt looks like
[Worker #1]
PRP=95,10,466002,-1
[Worker #2]
PRP=95,10,466002,-1

then the first window shows this:
[Apr 6 23:42:04] Worker starting
[Apr 6 23:42:04] Setting affinity to run worker on any logical CPU.
[Apr 6 23:42:04] Setting affinity to run helper thread 1 on any logical CPU.
[Apr 6 23:42:04] Starting PRP test of 95*10^466002-1 using AVX FFT length 112K, Pass1=448, Pass2=256, 2 threads

and the second window shows this:
[Apr 6 23:42:04] Waiting 5 seconds to stagger worker starts.
[Apr 6 23:42:10] Worker starting
[Apr 6 23:42:10] Setting affinity to run worker on any logical CPU.
[Apr 6 23:42:10] Setting affinity to run helper thread 1 on any logical CPU.
[Apr 6 23:42:10] [B]Resuming PRP test of 95*10^466002-1 using AVX FFT length 112K, Pass1=448, Pass2=256, 2 threads[/B]
[Apr 6 23:42:12] [B]Iteration: 2 / 1548031 [0.00%].[/B]

[U]But since the PRP test starts from zero, "Resuming" is incorrect[/U]. Also, no calculation or backup file is written for the second window...


Also, in 28.6 I cannot get this to work:

[COLOR=Red]TitleOutputFrequency=5000[/COLOR]
ClassicOutput=1
TimingOutput=1

In 28.5 these settings work perfectly, and at any moment I can hover the mouse over the icon and get the current status. But in 28.6 this does not work :(


All times are UTC.