|
|
#45 |
|
Sep 2003
2×5×7×37 Posts |
Then GMP should be OK too, since it's dual-licensed under GNU LGPL v3 and GNU GPL v2, so you can elect to use the former. But MPIR is probably easier to use with Windows.
|
|
|
|
|
|
#46 |
|
"Mihai Preda"
Apr 2015
1452₁₀ Posts |
[experimental]
I just added the Jacobi check to gpuOwL ( https://github.com/preda/gpuowl ). It runs the check on each "persist" checkpoint save, which by default happens every 10M iterations (hence the performance impact of only about 0.2%). It uses GMP; to compile with the check enabled, define JACOBI, either with "export JACOBI=1" followed by "make", or with "make JACOBI=1". |
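For readers who want to see what such a check amounts to, here is a minimal Python sketch. It is not gpuOwL's code: it uses a plain binary Jacobi routine instead of GMP, and the function names and the `check_every` parameter are made up for illustration. It relies on the fact that with s_0 = 4, s_i + 2 = s_{i-1}^2 is a square for i ≥ 1, so (s_i - 2 | M_p) stays at -1 for every i ≥ 1 (assuming no s_i shares a factor with M_p).

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a|n) for odd n > 0, via the binary algorithm."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n  # Python's % is non-negative, so negative a is handled here
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2|n) = -1 iff n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign iff both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_with_jacobi_check(p: int, check_every: int = 4) -> int:
    """Lucas-Lehmer iteration for M_p = 2^p - 1 with a periodic Jacobi check.

    check_every is a hypothetical stand-in for the checkpoint interval.
    Returns the final residue s_{p-2} (0 means M_p is prime).
    """
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):
        s = (s * s - 2) % m
        if i % check_every == 0 or i == p - 2:
            # Invariant for i >= 1: (s_i - 2 | M_p) == -1.
            if jacobi(s - 2, m) != -1:
                raise ArithmeticError(f"Jacobi check failed at iteration {i}")
    return s
```

For example, `ll_with_jacobi_check(13)` runs all checks without raising. A corrupted residue passes the check about half the time, so a failure is certain evidence of an error, while a pass proves nothing.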
|
|
|
|
|
#47 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴·3·163 Posts |
Quote:
If the Jacobi check, applied occasionally, could catch 40% of the few percent of bad LL test residues early at minimal percentage cost and stop them in their tracks, triggering a redo of the calculation from the last believed-good save file, that would be exciting news. It would apply not only to prime95 but also to CUDALucas and perhaps a whole fleet of codes, raising productive output a couple more percent on the same hardware and electrical budget, independently of other potential optimizations. I don't recall seeing sum-of-inputs checks in the CUDALucas or CUDAPm1 source. CUDAPm1 doesn't have the illegal-residue check yet. I saw the repeating-zero-residue problem show up in stage two of a CUDAPm1 run just today (2.5 day run on ~250M, ouch). I don't know enough about the underlying math to know whether the Jacobi or sum-of-inputs checks are applicable to P-1. What checks are built into the prime95 P-1 code? (Haven't even tried to read that.) |
|
|
|
|
|
|
#48 |
|
"Forget I exist"
Jul 2009
Dartmouth NS
10000100001101₂ Posts |
|
|
|
|
|
|
#49 | |
|
Sep 2003
2·5·7·37 Posts |
Quote:
The interim checks benefit only the user in question by saving some wasted computing cycles; the check at the end on the other hand gives GIMPS a 50% chance of identifying a bad result with certainty. If this final check fails, and no recovery from an interim savefile is possible or desired, then the 64-bit residue is garbage and rather than reporting a meaningless wrong value it should be set to some special marker value like 00000000DEADBEEF. |
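A rough sketch of the reporting rule suggested here, in Python. The function name and the `jacobi_ok` flag are hypothetical; the flag stands for the outcome of a final Jacobi check computed elsewhere, and the marker value is the one proposed in the post.

```python
# Marker proposed in the post for a final residue known to be bad.
BAD_RESIDUE_MARKER = 0x00000000DEADBEEF


def residue_to_report(final_residue: int, jacobi_ok: bool) -> str:
    """Return the 64-bit residue to submit, as 16 hex digits.

    jacobi_ok is assumed to be the result of the final Jacobi check;
    when it is False the residue is known to be garbage, so the marker
    is reported instead of a meaningless value.
    """
    if jacobi_ok:
        res64 = final_residue & 0xFFFFFFFFFFFFFFFF  # low 64 bits
    else:
        res64 = BAD_RESIDUE_MARKER
    return f"{res64:016X}"
```

With this rule a failed final check yields `00000000DEADBEEF`, which a server could recognize and treat as a known-bad result rather than as a mismatching residue.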
|
|
|
|
|
|
#50 |
|
"Mihai Preda"
Apr 2015
2²×3×11² Posts |
Yes, it makes sense to [also] do the check at the end before submitting the final residue.
It's not clear to me what the right behavior of the software is when the check fails: should it roll back to the most recent "good" point and re-attempt from there, or should it report the hardware as hopelessly broken and give up? The situation is that, when the Jacobi check detects the first "sure" error, there is a high probability that there were also undetected errors before. Thus rolling back to a recent point only fixes the visible error, while preserving the hidden errors. Not good. |
|
|
|
|
|
#51 |
|
"Mihai Preda"
Apr 2015
2²·3·11² Posts |
BTW, here is a paper discussing an efficient algorithm for computing the Jacobi symbol, in case anybody's interested [in attempting to improve GMP's implementation for huge numbers]:
https://maths-people.anu.edu.au/~bre..._ACCMCC_10.pdf |
|
|
|
|
|
#52 | |
|
Sep 2003
2·5·7·37 Posts |
Quote:
"Hidden errors" are not a huge problem... they have always existed, and until now we have only been able to discover them when a double check is done, often years later. This Jacobi check offers a shortcut in some cases; it isn't meant to be a guaranteed filter. The worst-case scenario is just the current status quo. In the end, it's up to the user to decide if they want to run LL testing software on their crappy computer. If the software gives up, that doesn't necessarily mean the user will give up. They might have set it up so that the software just keeps running and fetching new exponents. |
|
|
|
|
|
|
#53 |
|
"Mihai Preda"
Apr 2015
2²×3×11² Posts |
Yes, I agree that the point is not to fight the user.
But assuming the user is well-intentioned and wants to produce high-quality results, upon a Jacobi error should we revert to "the most recent point with no visible error" or revert to the start? |
|
|
|
|
|
#54 | |
|
Sep 2014
23 Posts |
Quote:
But I did not understand why it is a problem if r < 0. Since (a*b|n) = (a|n)(b|n) and (-1|n) = (-1)^((n-1)/2), isn't (-r|n) = (r|n)(-1)^((n-1)/2), so that the negative value can be dealt with? |
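Spelled out for the Mersenne case, as a worked step under the assumption that n = 2^p - 1 with p ≥ 2 (so n ≡ 3 mod 4 and (n-1)/2 is odd), the identity reduces to a plain sign flip:

```latex
\left(\frac{-r}{n}\right)
  = \left(\frac{-1}{n}\right)\left(\frac{r}{n}\right)
  = (-1)^{(n-1)/2}\left(\frac{r}{n}\right)
  = -\left(\frac{r}{n}\right),
  \qquad n = 2^p - 1,\ p \ge 2 .
```

So if an implementation only accepts a non-negative argument, computing (|r| | n) and negating the result would give the symbol for a negative r.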
|
|
|
|
|
|
#55 | |
|
Sep 2014
23 Posts |
Quote:
If single errors are observed only during a few complete tests and there are many (at least apparently) good tests in between, it is probably safe to start from the last (apparently) good iteration. If most tests are interrupted with errors, one should probably start from scratch, and then proceed as long as the permanent checkpoints agree. The last such checkpoint can then be marked as a good continuation point. The test would be carried on until another error is spotted (or till the end). If the run completes, it might be a good idea to make another run from the last good continuation point (perhaps depending on the number of errors seen during the run). (That would result in a self-verified doublecheck.) |
|
|
|
|