View Single Post
 2019-02-07, 14:36 #268 kriesel     "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 116318 Posts Something I've thought about from time to time and it comes up as questions by others, is beginning at iteration small x>0 with a precomputed interim term. For LL, seed of 4, iteration 1 14, iteration 2 194, etc, and it almost doubles in number of bits per iteration, so iteration 3, 37634, 16 bits, still fits in a single word and is far shorter than the mod n for any reasonable sized exponent. So begin with 37634 and the next iteration is called 4. There's an equivalent in PRP3. Saving a few iterations is usually dismissed as not worth the bother, at current exponents. Yes, as a fraction of the work it's tiny, only ~3/100M saved, 30ppb. There are around 400,000 exponents to primality test within 10% of 100M, times two each; 800,000. It adds up to about 0.024 primality tests saved over that span. Step and repeat over additional exponent spans, and the tiny savings add up overall, theoretically, ignoring erasure of some by people quitting, and the savings being distributed over the many participants. Seems like a tiny but fast and straightforward tweak to implement. But it does not result in somewhere an additional primality test being completed, because the savings are divided up among too many systems and workers. (Maybe it results in additional factoring, but even that seems unlikely.) Results are reached a tiny bit sooner. Over a year's time, 30ppb is ~1 second. I just spent centuries worth of the savings estimating and describing it. (If one was doing 64 bit unsigned int math, it could go to iteration 5. In a variable length representation, the initial iterations are only a single word long so the net effect is much much less.) This is also why the first 5 iterations' res64 values are independent of exponent>64. In hex: seed 4 1 E 2 C2 3 9302 4 546B 4C02 5 1BD6 96D9 F03D 3002 So, it seems, in the existing software, one could consider adding this test. After the first 5 iterations, for any sizable exponent and any shift, generate the res64, and test for a match. If not a match, there was excessive round-off or some other error, perhaps an initialization error or memory. I wonder if any existing software does that, or if the code wizards have determined it would detect an error so rarely it is a waste of time. It could perhaps save some novice user from a long wasted run with too small an fft length, but only if roundoff or other error may have a detectable effect this early. Probably if there was something to be gained George and Ernst would be doing it already. I'd find interesting an answer as to why it's not worthwhile. It might be useful in the gpu arena, such as CUDALucas, where some configurations with some gpu models produce wrong residues from the start. Last fiddled with by kriesel on 2019-02-07 at 14:53