 2017-12-10, 12:30 #1 axn     Jun 2003 7·23·29 Posts
Pminus1 - Bug report and feature request

The following is a bug in calc_exp:

Code:
    if (len >= 50) {
        giant x;
        calc_exp (pm1data, k, b, n, c, g, B1, p, lower, lower + (len >> 1));
        x = allocgiant (len);
        calc_exp (pm1data, k, b, n, c, x, B1, p, lower + (len >> 1), upper);
        mulg (x, g);
        free (x);
        return;
    }
    itog (2*n, g);
    for ( ; *p <= B1 && (unsigned long) g->sign < len; *p = sieve (pm1data->sieve_info)) {
        uint64_t val, max;
        val = *p;
        max = B1 / *p;
        while (val <= max) val *= *p;
        ullmulg (val, g);
    }

g is initialized to 2*n. The problem is that this is done in every leaf call of the recursion in calc_exp, instead of just the first one, so we end up with B/50 (or more) factors of 2*n in the product. This just makes stage 0 longer than needed (probably on the order of 5% or so).

What is needed:

Code:
    if (lower == 0) itog (2*n, g);
    else setone (g);

Now, the feature request: make stage 0 bigger. Right now it is a paltry 1M (stage_0_limit = (pm1data.B > 1000000) ? 1000000 : pm1data.B;), which is probably fine for regular GIMPS work, but at the mid-to-low end we routinely deal with much bigger B1. So instead, make it as big as possible (100M is not too bad, since we're looking at < 20MB memory use). If calc_exp takes a lot of time to do the actual calculation, make it as big as can reasonably be done by a modern processor in, say, 60 seconds.

Last fiddled with by axn on 2017-12-10 at 12:31
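To make the effect of the bug concrete, here is a minimal toy model in C. It is a sketch under loud assumptions, not the prime95 source: `calc_exp_model`, `max_prime_power` and `SMALL_PRIMES` are hypothetical names, the recursion splits on an index range over a fixed prime table rather than on a size in words, and `uint64_t` stands in for the giants type. The `fixed` flag switches between "every leaf starts at 2*n" and the proposed "only the leaf with lower == 0 starts at 2*n".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prime table standing in for the sieve. */
static const uint64_t SMALL_PRIMES[4] = {2, 3, 5, 7};

/* Largest power of p that does not exceed B1 (same loop as in the post). */
static uint64_t max_prime_power(uint64_t p, uint64_t B1)
{
    uint64_t val = p, max = B1 / p;
    while (val <= max) val *= p;
    return val;
}

/* Toy model of the recursion: product of the prime powers for primes[lo..hi).
   fixed == 0: every leaf starts from 2*n (the reported behaviour).
   fixed != 0: only the leaf with lo == 0 starts from 2*n, the rest from 1. */
static uint64_t calc_exp_model(const uint64_t *primes, int lo, int hi,
                               uint64_t B1, uint64_t n, int fixed)
{
    assert(lo < hi);
    if (hi - lo >= 2) {                 /* split, like the len >= 50 branch */
        int mid = lo + (hi - lo) / 2;
        return calc_exp_model(primes, lo, mid, B1, n, fixed)
             * calc_exp_model(primes, mid, hi, B1, n, fixed);
    }
    uint64_t g = (!fixed || lo == 0) ? 2 * n : 1;
    return g * max_prime_power(primes[lo], B1);
}
```

With n = 29 and B1 = 10 the fixed variant gives 2*29*2520 = 146160, while the unfixed variant carries one extra factor of 58 for each of the three other leaves. Both are valid P-1 exponents; the second is just longer than it needs to be.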
 2017-12-10, 17:18 #2 Prime95 P90 years forever!     Aug 2002 Yeehaw, FL 2⁴·439 Posts
Dang, the code used to be right and a bug fix broke it. I'll work up a re-fix. Interestingly, I also saw the stage 0 optimization opportunity a month ago. The only hard part of implementing this is that it breaks existing save files that paused during stage 0. Not insurmountable: I need to bump the version number and either make calc_exp support both formulas based on the save file version number OR simply toss old save files that are in stage 0. I'm feeling lazy today -- tossing sounds better.
Quote:
 Originally Posted by axn Now, the feature request: make stage 0 bigger.
I've got this coded, raising B1 to 13.3M (about 20M bits). Do you think it should be more? I replaced giants code with GMP code so calc_exp should be faster. Do you want a 29.4b7 to try it out? If so, Windows or Linux?

As I noted earlier, resuming from an older P-1 save file in stage 1 will restart the calculation.

Quote:
 Originally Posted by Prime95 I've got this coded, raising B1 to 13.3M (about 20M bits). Do you think it should be more? I replaced giants code with GMP code so calc_exp should be faster. Do you want a 29.4b7 to try it out? If so, Windows or Linux?
I am currently doing some low range P-1 with B1=30M. It would be really cool if this limit could be bumped up further (to about 100M bits).
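For anyone wanting to relate the B1 figures to the bit counts quoted here, a rough sketch in C follows. It assumes the stage 0 exponent is essentially the product of the largest prime powers up to B1, whose bit length is approximately B1 / ln 2, or about 1.4427 * B1 (a prime number theorem estimate, ignoring the small 2*n factor). The function names are hypothetical and the result is an approximation, not what prime95 computes.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate bit length of the product of all maximal prime powers <= B1,
   using bits ~ 1.4427 * B1. Integer arithmetic only. */
static uint64_t stage0_bits_estimate(uint64_t B1)
{
    assert(B1 <= UINT64_MAX / 14427);   /* guard against overflow */
    return B1 * 14427 / 10000;
}

/* Bytes needed to hold that many bits. */
static uint64_t bits_to_bytes(uint64_t bits)
{
    return (bits + 7) / 8;
}

/* Inverse estimate: roughly the largest B1 whose exponent fits in `bits`. */
static uint64_t b1_for_bits(uint64_t bits)
{
    assert(bits <= UINT64_MAX / 10000); /* guard against overflow */
    return bits * 10000 / 14427;
}
```

Under this estimate, B1 = 13.3M comes out at a little over 19M bits, B1 = 100M needs about 18MB, and a budget of 100M bits corresponds to a B1 of roughly 69M.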

Quote:
 Originally Posted by axn I am currently doing some low range P-1 with B1=30M. It would be really cool if this limit could be bumped up further (to about 100M bits).

George, is this something that you can spend time on?

 2020-05-25, 19:23 #6 kruoli     "Oliver" Sep 2017 Porta Westfalica, DE 2³×31 Posts
Since I'm also doing some P-1 with big B1 (and B2), I would be interested in this, too (following axn's reasoning). Can somebody elaborate on what stage 0 is? Googling for it only led me to some biological stuff... @George: If it helps, I'd take a 29.4b7 for testing, Windows.
Quote:
 Originally Posted by axn George, is this something that you can spend time on?

I might can look at it in June. Nag me later. Right now I cannot build a Windows version. With the new laptop comes an upgrade from MSVC 2005 to MSVC 2019.

If the change requires a change to the save file format, that could complicate matters.

Quote:
 Originally Posted by Prime95 I might can look at it in June. Nag me later.
Sure, will do.

FTR, I am on Linux.

Quote:
 Originally Posted by kruoli Can somebody elaborate what stage 0 is?
Stage 1 of P-1 is implemented in two steps, a fast(er) Stage 0 and a slow(er) regular Stage 1.

The current stage 0 limit is about 13.3M. If your B1 is < 13.3M, you already run the entire Stage 1 using the faster stage 0 logic. If your B1 is > 13.3M, then all the small primes up to 13.3M are handled using stage 0 logic and everything above 13.3M is handled using the slower stage 1 logic. One side effect of this is that the % completed will jump by different amounts during status updates. Initially, it will advance by x% every status update, and then it will suddenly drop to x/1.5% once stage 0 is done.
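A toy sketch in C of the two ways of reaching the same Stage 1 residue may help here. It is an illustration under assumptions, not the prime95 implementation: the function names are hypothetical, the modulus must be below 2^32 so products fit in 64 bits, the exponent must fit in a `uint64_t`, and the sketch does not model why one path is faster than the other. "Stage 0 style" builds the whole exponent first and does one exponentiation; "stage 1 style" raises the running residue to each prime power in turn.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prime table standing in for the sieve. */
static const uint64_t SMALL_PRIMES[4] = {2, 3, 5, 7};

/* (a * b) mod m, valid for m < 2^32 so the product fits in 64 bits. */
static uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t m)
{
    assert(m > 0 && m < (1ULL << 32));
    return (a % m) * (b % m) % m;
}

/* base^e mod m by binary exponentiation. */
static uint64_t pow_mod(uint64_t base, uint64_t e, uint64_t m)
{
    uint64_t r = 1 % m;
    base %= m;
    while (e) {
        if (e & 1) r = mul_mod(r, base, m);
        base = mul_mod(base, base, m);
        e >>= 1;
    }
    return r;
}

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* Largest power of p that does not exceed B1. */
static uint64_t max_prime_power(uint64_t p, uint64_t B1)
{
    uint64_t val = p, max = B1 / p;
    while (val <= max) val *= p;
    return val;
}

/* Stage 0 style: precompute E = 2*n * prod(p^k), then one exponentiation. */
static uint64_t stage0_style(const uint64_t *primes, int count,
                             uint64_t B1, uint64_t n, uint64_t N)
{
    uint64_t E = 2 * n;                 /* must not overflow in this toy */
    for (int i = 0; i < count; i++) E *= max_prime_power(primes[i], B1);
    return pow_mod(3, E, N);
}

/* Stage 1 style: apply one prime power at a time to the running residue. */
static uint64_t stage1_style(const uint64_t *primes, int count,
                             uint64_t B1, uint64_t n, uint64_t N)
{
    uint64_t x = pow_mod(3, 2 * n, N);
    for (int i = 0; i < count; i++)
        x = pow_mod(x, max_prime_power(primes[i], B1), N);
    return x;
}
```

For example, with N = 2^29 - 1 = 233 * 1103 * 2089, n = 29 and B1 = 10, both functions return the same residue x, and gcd_u64(x - 1, N) is divisible by 233 and by 2089, since 232 and 2088 both divide 2*29*2520.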

 2020-05-26, 22:19 #10 kruoli     "Oliver" Sep 2017 Porta Westfalica, DE F8₁₆ Posts
May we add an entry to the Wiki for future reference? Or any other kind of Google-searchability?
Quote:
 Originally Posted by axn I am on Linux.
If that's a barrier, I can run it on Linux, too. I'm on dual-boot on nearly all machines.

