[QUOTE=pepi37;406249]One question for the maker of Prime95 (or anybody else who knows the answer):
Let's say that I start Prime95 with 1 thread and three logical cores for a PRP test. Then I shut down Prime95, change from three to four cores (and vice versa) and resume computing: will that in any case make the result unreliable?[/QUOTE] Not a problem.
Thanks for the fast answer!
Can you compile the 28.7 version?
Bump!
[QUOTE=Prime95;399389]Prime95 version 28.6 build 1 is available. ... Important: ... The bug affects AVX and FMA computers (more details in a later post). The bug has been present since version 27.1. To be safe, I recommend all users with Sandy Bridge or later CPUs upgrade to this version. ...[/QUOTE][QUOTE=TObject;401449]Which one is the recommended release as of today, May 05, 2015?[/QUOTE][QUOTE=Prime95;401457]28.6. We're testing some gwnum changes in 28.7 with LLR.[/QUOTE]The version of the software on the main download page is still 28.5. And on this forum the thread for that version [url=http://www.mersenneforum.org/showthread.php?t=19182]Sticky: Prime95 version 28.5 (deprecated, use 28.7)[/url] refers to version 28.7 which, apart from the source files, is not available yet. The folders accessible through FTP could also do with some maintenance (move old and deprecated versions of the software to the mersenne.org/gimps_archive/archived_executables/ and mersenne.org/gimps_archive/archived_sources/ folders).

Jacob
So, use 28.6 build 1 :smile:
[QUOTE=S485122;406512]Bump !
The version of the software on the main download page is still 28.5. And on this forum the thread for that version [url=http://www.mersenneforum.org/showthread.php?t=19182]Sticky: Prime95 version 28.5 (deprecated, use 28.7)[/url] refers to version 28.7 which, apart from the source files, is not available yet. Then the folders accessible through FTP could do with some maintenance (move old and deprecated versions of the software to the mersenne.org/gimps_archive/archived_executables/ and mersenne.org/gimps_archive/archived_sources/ folders.)[/QUOTE] I think George was still testing out some changes in 28.6 before making it official. I'd have to refresh my memory, but I believe the problem that was found didn't necessarily affect Prime95 28.5.
[QUOTE=Madpoo;406551]I think, maybe, George was still testing out some changes in 28.6 before making it official? [/QUOTE]
I was. I've since convinced myself that the scary bug that affected that one PRP test has virtually no chance of affecting a Mersenne test. Thus there is no urgency to make 28.6 (or 28.7) the official release. Since then I've been distracted (I know that is a pathetic excuse) and have not gotten around to building the 28.7 release.
Reliability of CPUs
The current range of second-tier double checks is at a crossover point between two FFT sizes. I think that at least for some CPU architectures (in this case AVX2, FMA) the soft crossover is a bit high. I chose to use the prime.txt setting "SoftCrossoverAdjust=-0.012", both because retries make you lose time and because I am under the impression that there is a bug in detecting the "Result is reproducible" situation (see the extracts from the prime.log and results.txt files below.)
[code][Thu Jul 30 12:54:57 2015] Trying 1000 iterations for exponent 34772027 using 1792K FFT.
If average roundoff error is above 0.2424, then a larger FFT will be used.
Final average roundoff error is 0.23739, using 1792K FFT for exponent 34772027.
[Sat Aug 01 20:50:31 2015] Iteration: 20289756/34772027, Possible error: round off (0.408292146) > 0.40
Continuing from last save file.
[Sat Aug 01 21:14:43 2015] Iteration: 20289756/34772027, Possible error: round off (0.408292146) > 0.40
Continuing from last save file.
[Sat Aug 01 21:29:33 2015] Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
[Mon Aug 03 12:19:37 2015] UID: S485122/5820K, M34772027 is not prime. Res64: C6C1EFD9E2792EA4. We4: F075176C,16354045,05000600, AID: 0583EF661F2D4F47F49B38D286C19C99

Prime.log
[Mon Aug 03 12:19:37 2015 - ver 28.6] Sending result to server: UID: S485122/5820K, M34772027 is not prime. Res64: C6C1EFD9E2792EA4. We4: F075176C,16354045,05000600, AID: 0583EF661F2D4F47F49B38D286C19C99
PrimeNet success code with additional info:
Computer hardware check recommended. Possible hardware errors occured during LL test of M34772027
LL test successfully completes double-check of M34772027
-----------
results.txt
[Mon Aug 03 14:33:06 2015] Trying 1000 iterations for exponent 34782841 using 1792K FFT.
If average roundoff error is above 0.2424, then a larger FFT will be used.
Final average roundoff error is 0.23813, using 1792K FFT for exponent 34782841.
[Thu Aug 06 06:18:51 2015] Iteration: 23828249/34782841, Possible error: round off (0.4079129255) > 0.40
Continuing from last save file.
[Thu Aug 06 06:53:42 2015] Iteration: 23828249/34782841, Possible error: round off (0.4079129255) > 0.40
Continuing from last save file.
[Thu Aug 06 07:07:56 2015] Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
[Fri Aug 07 12:08:20 2015] UID: S485122/5820K, M34782841 is not prime. Res64: 1E4706824A53B02A. We4: 1FF0A89C,14662417,01000200, AID: B1E3718D8D736FAB0056EF866DB5CA80

Prime.log
[Fri Aug 07 12:08:20 2015 - ver 28.6] Sending result to server: UID: S485122/5820K, M34782841 is not prime. Res64: 1E4706824A53B02A. We4: 1FF0A89C,14662417,01000200, AID: B1E3718D8D736FAB0056EF866DB5CA80
PrimeNet success code with additional info:
Computer hardware check recommended. Possible hardware errors occured during LL test of M34782841
LL test successfully completes double-check of M34782841
CPU credit is 41.1597 GHz-days.[/code]The double-checks were successful; the round-offs that were too high occurred at the same iteration and had the same value (within the limits of what is displayed.) The problem is that the first occurrence of that high round-off was not considered reproducible. A consequence is a drop in the computer's reliability rating, which in turn affects the assignments received.

Jacob
[QUOTE=S485122;407446]The current range of second tier of double checks is at a crossover point between two FFT sizes.[/QUOTE]
I've noticed this myself when doing checks in the 34M+ range. I don't remember where the fuzzy area is, but quite a few of the tests involve that initial test to see what the round off looks like. Even then, once it's made its FFT choice, about 10% of the time a test will still hit one of those round off > 0.4 errors, resulting in a retry. At least on my machines it is *always* reproducible, so then it does it again using the safer method, etc.

All in all, that means probably a 1 hour delay in getting to the end of the test... does that sound right? An extra half hour or so to get the reproducible error, then another half hour or so using the safer method to redo it? Of course if the test had picked the next higher FFT, the overall test might have taken more than an extra hour to run. I guess that depends on how many cores you're throwing at it. On a 10-core worker I can do a 34M exponent in about 13 hours, so losing an hour because of roundoff errors is kind of a big deal, and choosing the next higher FFT from the start may have only added another 30 minutes? I'm guessing there.

Anyway, I don't worry about it too much. Yes, it makes the resulting error code non-zero, but if the error is reproducible the error code reflects that, and it's not really too bad. In all of my recent tests, I've had only *one* instance where it was unable to reproduce the error, but I think it occurred in the last few iterations, so it never had the chance. My residue matched one of the other runs anyway, so it was all good; it was a pretty rare thing, so again, I didn't let it bug me. :smile:
[QUOTE=Madpoo;407572]once it's made it's FFT choice, about 10% of the time a test will still involve one of those round off > 0.4 errors, resulting in it retrying. At least on my machines it is *always* reproducible, so then it does it again using the safer method, etc.[/QUOTE]
I'll change the code. It presently raises a warning & retries for roundoffs > 26/64. The new code will warn for roundoffs > 27/64 if near the FFT limit and 26/64 for FFTs not near the limit. BTW, there is a remote chance that a roundoff error will not be reproducible. Carry propagation takes a "shortcut" that does not guarantee the FFT data is in a strictly balanced representation. Reading data from the intermediate file does store the number in a strictly balanced representation. |
[QUOTE=Batalov;403453]There is some blanket rule that kicks in at n>10000 for FFT choice -- and only generic reduction AVX FFT is used for all n's.
(For n<10000, all-complex AVX FFT is used most of the time; I can send you a lightly sieved set of n, or else for debugging tests you can use any n.) If you use n=351111 for example, the error will be well-controlled for all-complex AVX FFT, yet it will not be chosen. I think an appropriately sized all-complex AVX FFT for this form can always be used, but even with forced FFT2=NNN I cannot force it because for some ranges of n, even the forced FFT2 does not force all-complex, but a zero-padded FFT instead.[/QUOTE]
I do not feel comfortable making this change. I tried changing this code:
[CODE]/* If the bits in an output word is less than the maximum allowed (or the user is trying to force us */
/* to use this FFT), then we can probably use this FFT length -- though we need to do a few more tests. */

	if (weighted_bits_per_output_word <= max_weighted_bits_per_output_word ||
	    gwdata->specific_fftlen) {
		double	carries_spread_over;[/CODE]
to this:
[CODE]/* If the bits in an output word is less than the maximum allowed (or the user is trying to force us */
/* to use this FFT), then we can probably use this FFT length -- though we need to do a few more tests. */

	if (bits_per_output_word <= max_bits_per_output_word ||
	    gwdata->specific_fftlen) {
//	if (weighted_bits_per_output_word <= max_weighted_bits_per_output_word || gwdata->specific_fftlen) {
		double	carries_spread_over;[/CODE]
but was not happy with the results. You may wish to try your own tweaks to see if you can come up with a criterion that works over all k*b^n+c values.