2020-06-08, 08:55   #23
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

110010110₂ Posts

After removing the save files, I ran the same test again, this time with B1=200,000,000 while keeping MaxStage0Prime=100000000. Stage 0 ran about 10 % slower! Is this because 1277 is such a small exponent?

2020-06-08, 09:22   #24
axn

Jun 2003

2·3²·269 Posts

Quote:
 Originally Posted by kruoli After removing the save files, I ran the same test again, this time with B1=200,000,000 while keeping MaxStage0Prime=100000000. Stage 0 ran about 10 % slower! Is this because 1277 is such a small exponent?
How much time did it take for the 0-100M part of the run (i.e., reaching 50% of stage 1), and how much time did it take to get from 100M to 200M? We expect the first half to take x seconds and the second half to take about 1.5x seconds.

Or do you mean that when you ran only B1=100M, stage 1 completed in x seconds, and when you ran B1=200M, stage 0 (0-100M) ran in x+10% seconds? That is kind of expected (though I'm not sure how much it should be). This is because we include all primes and prime powers < B1. Primes between sqrt(B1) and B1 are included only once. For B1=100M, that means primes between 10^4 and 100M. For B1=200M, that means primes between 2*10^4 and 100M (since stage 0 is capped at 100M). Therefore the primes between 10^4 and 2*10^4 are included one extra time in stage 0 in the B1=200M case. Similarly, some smaller primes get included more times in the B1=200M case than in the B1=100M case, so stage 0 will be slightly bigger, even though your max prime limit is the same.

EDIT:- Just did some checking. Whatever I said above shouldn't make any difference at all. It is all very negligible.

EDIT2:- sqrt(2e8) is 1.4e4, not 2e4 :-(
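
A rough way to check the "negligible" claim is to count the bits of the stage 0 exponent directly. The sketch below is not Prime95's code: it assumes stage 0 multiplies together, for every prime p up to MaxStage0Prime, the largest power of p that does not exceed B1. It also uses scaled-down bounds (a cap of 10^6, with B1 of 10^6 versus 2·10^6) so that it runs quickly. The names `primes_up_to` and `stage0_bits` are made up for illustration.

```python
from math import isqrt, log2


def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            # Cross out multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def stage0_bits(b1, cap):
    """Approximate bit length of the assumed stage 0 exponent.

    For each prime p <= min(b1, cap), take the largest power p^k <= b1
    and add log2(p^k) to the total.
    """
    total = 0.0
    for p in primes_up_to(min(b1, cap)):
        pk = p
        while pk * p <= b1:
            pk *= p
        total += log2(pk)
    return total


# Scaled-down stand-ins for MaxStage0Prime=100M with B1=100M vs B1=200M.
CAP = 1_000_000
small = stage0_bits(1_000_000, CAP)
large = stage0_bits(2_000_000, CAP)
print(f"bits with B1=cap:   {small:.0f}")
print(f"bits with B1=2*cap: {large:.0f}")
print(f"relative growth:    {(large - small) / small:.5%}")
```

With these scaled-down bounds the growth should come out well below 0.1%, which is in line with the first EDIT above: the extra prime powers cannot explain a 10% difference.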

Last fiddled with by axn on 2020-06-08 at 09:47

2020-06-08, 09:45   #25
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

406₁₀ Posts

Hopefully the full log is somewhat self-explanatory:

Code:
[Work thread Jun 8 11:41] P-1 on M1277 with B1=200000000
[Work thread Jun 8 11:41] Using AVX FFT length 64
[Work thread Jun 8 11:41] M1277 stage 1 is 3.46% complete. Time: 4.331 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 6.93% complete. Time: 4.350 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 10.39% complete. Time: 4.351 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 13.86% complete. Time: 4.360 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 17.32% complete. Time: 4.351 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 20.79% complete. Time: 4.343 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 24.25% complete. Time: 4.366 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 27.72% complete. Time: 4.342 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 31.19% complete. Time: 4.346 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 34.65% complete. Time: 4.346 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 38.12% complete. Time: 4.349 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 41.58% complete. Time: 4.352 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 45.05% complete. Time: 4.349 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 48.51% complete. Time: 4.344 sec.
[Work thread Jun 8 11:42] M1277 stage 1 is 51.98% complete. Time: 4.348 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 54.63% complete. Time: 4.129 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 56.97% complete. Time: 4.041 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 59.28% complete. Time: 4.040 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 61.62% complete. Time: 4.047 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 63.92% complete. Time: 4.048 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 66.20% complete. Time: 4.035 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 68.56% complete. Time: 4.035 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 70.95% complete. Time: 4.025 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 73.32% complete. Time: 4.026 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 75.65% complete. Time: 4.018 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 78.03% complete. Time: 4.015 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 80.36% complete. Time: 4.014 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 82.68% complete. Time: 4.036 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 85.01% complete. Time: 4.028 sec.
[Work thread Jun 8 11:43] M1277 stage 1 is 87.38% complete. Time: 4.010 sec.
[Work thread Jun 8 11:44] M1277 stage 1 is 89.72% complete. Time: 4.082 sec.
[Work thread Jun 8 11:44] M1277 stage 1 is 92.02% complete. Time: 4.064 sec.
[Work thread Jun 8 11:44] M1277 stage 1 is 94.35% complete. Time: 4.068 sec.
[Work thread Jun 8 11:44] M1277 stage 1 is 96.66% complete. Time: 4.089 sec.
[Work thread Jun 8 11:44] M1277 stage 1 is 98.96% complete. Time: 4.098 sec.
[Work thread Jun 8 11:44] M1277 stage 1 complete. 709217288 transforms. Time: 156.362 sec.
[Work thread Jun 8 11:44] Stage 1 GCD complete. Time: 0.000 sec.
[Work thread Jun 8 11:44] M1277 completed P-1, B1=200000000, Wh8: 17C7840E
[Work thread Jun 8 11:44] No work to do at the present time. Waiting.

@George: Could you add a printout of how long the initialization of stage 0 took? It doesn't seem to report that currently.

2020-06-08, 09:52   #26
axn

Jun 2003

11352₈ Posts

It looks like stage 0 took about 63 seconds, stage 1 took about 85 seconds and init took about 8 seconds.

EDIT:- What happens if you change the stage 0 max prime to 200M and rerun the B1=200M? Will it complete faster than 156 seconds?

Last fiddled with by axn on 2020-06-08 at 09:57
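
For anyone who wants to redo that arithmetic, here is a small sketch that splits the logged interval times at the point where stage 0 ends. The (percent, seconds) pairs and the 156.362 s total are transcribed from the log in post #25. Two things are assumptions, not something the log states: that stage 0 ends at exactly 50% (MaxStage0Prime/B1 = 100M/200M), and that the unreported stretch from 98.96% to 100% ran at the same rate as the last reported interval. The function name `split_times` is made up for illustration.

```python
# (percent complete, seconds for that interval), transcribed from post #25.
SAMPLES = [
    (3.46, 4.331), (6.93, 4.350), (10.39, 4.351), (13.86, 4.360),
    (17.32, 4.351), (20.79, 4.343), (24.25, 4.366), (27.72, 4.342),
    (31.19, 4.346), (34.65, 4.346), (38.12, 4.349), (41.58, 4.352),
    (45.05, 4.349), (48.51, 4.344), (51.98, 4.348), (54.63, 4.129),
    (56.97, 4.041), (59.28, 4.040), (61.62, 4.047), (63.92, 4.048),
    (66.20, 4.035), (68.56, 4.035), (70.95, 4.025), (73.32, 4.026),
    (75.65, 4.018), (78.03, 4.015), (80.36, 4.014), (82.68, 4.036),
    (85.01, 4.028), (87.38, 4.010), (89.72, 4.082), (92.02, 4.064),
    (94.35, 4.068), (96.66, 4.089), (98.96, 4.098),
]
TOTAL_SECONDS = 156.362  # "stage 1 complete ... Time: 156.362 sec."


def split_times(samples, total, split_pct):
    """Return (stage0, rest_of_stage1, unreported) times in seconds.

    Intervals entirely below split_pct count as stage 0, intervals entirely
    above it count as the rest of stage 1, and the one interval that
    straddles split_pct is divided linearly.
    """
    stage0 = rest = 0.0
    prev_pct = 0.0
    for pct, secs in samples:
        if pct <= split_pct:
            stage0 += secs
        elif prev_pct >= split_pct:
            rest += secs
        else:
            frac = (split_pct - prev_pct) / (pct - prev_pct)
            stage0 += frac * secs
            rest += (1.0 - frac) * secs
        prev_pct = pct

    # Assumption: the tail after the last report ran at the last interval's rate.
    last_pct, last_secs = samples[-1]
    step = last_pct - samples[-2][0]
    rest += (100.0 - last_pct) * last_secs / step

    # Whatever the total contains beyond the reported work (e.g. init).
    unreported = total - stage0 - rest
    return stage0, rest, unreported


stage0, rest, unreported = split_times(SAMPLES, TOTAL_SECONDS, 50.0)
print(f"stage 0: {stage0:.1f} s, rest of stage 1: {rest:.1f} s, "
      f"unreported (init etc.): {unreported:.1f} s")
```

Under those assumptions the three numbers land close to the 63 s / 85 s / 8 s quoted above. The split point is a parameter because the log's step size only changes in the interval ending at 54.63%, so the real switch may sit a little later than 50%.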
2020-06-08, 10:07   #27
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2·7·29 Posts

Yes, now it's 144 s. I forgot that stage 0 prints out in larger steps (in my example about 3.5 percentage points) than stage 1 (about 2.3 percentage points). At first, I only looked at the times. Sorry for the trouble.

2020-06-08, 10:10   #28
axn

Jun 2003

11352₈ Posts

Quote:
 Originally Posted by kruoli Yes, now it's 144 s. I forgot that stage 0 prints out in larger steps (in my example about 3.5 percentage points) than stage 1 (about 2.3 percentage points). At first, I only looked at the times. Sorry for the trouble.
Cool. I think this is a must-have feature for anyone doing super large P-1s.

2020-06-08, 15:49   #29
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2·7·29 Posts

Quote:
 Originally Posted by kruoli Same for me here, Windows 10, 64-bit. The process is visible for a second or two in Task Manager and then vanishes.
Windows log says:
Faulty module: libhwloc-15.dll
Error code: 0xc0000005
Error offset: 0x0000000000008d4e

Interesting...

2020-06-08, 15:53   #30
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2·7·29 Posts

New workaround: Use the hwloc from 29.8b6, then 29.8b8 works!

Edit: But it crashes when accessing libgmp now.

@George: Have you built all of the DLLs yourself, and could they be using CPU instructions that are not supported on my processor (AVX512 etc.)? I ran the test on an i5-2400.

Last fiddled with by kruoli on 2020-06-08 at 15:57 Reason: Workaround doesn't fully work.

2020-06-08, 18:41   #31
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

67·109 Posts

I'm supposed to build a DLL that works on all machines. I may have screwed up.

2020-06-08, 20:28   #32
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

2,351 Posts

Quote:
 Originally Posted by ixfd64 Dumb question: what does P-1 stage 0 do?
After reading some posts more closely, it seems "stage 0" in this context refers to what is normally called stage 1 elsewhere.

Last fiddled with by ixfd64 on 2020-06-08 at 20:29

2020-06-08, 20:37   #33
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

67·109 Posts

Quote:
 Originally Posted by ixfd64 After reading some posts more closely, it seems "stage 0" in this context refers to what is normally called stage 1 elsewhere.
Yes, stage 0 is part of stage 1. Maybe I should have called them stage 1a and stage 1b.
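
To make the "stage 1a / stage 1b" picture concrete, here is a toy P-1 stage 1 that is split at a cap. It is a sketch under stated assumptions and not Prime95's implementation: it assumes the first part folds every prime power for primes up to the cap into one big exponent and does a single modular exponentiation, while the second part raises the running residue to each remaining prime power one at a time. The function names, the `cap` parameter and the choice of M41 as a tiny test number are illustrative only.

```python
from math import gcd, isqrt


def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def largest_power(p, b1):
    """Largest power of p that does not exceed b1."""
    pk = p
    while pk * p <= b1:
        pk *= p
    return pk


def pminus1_stage1(n, b1, cap, base=3):
    """Toy P-1 stage 1 on n, split at `cap` (hypothetical parameter).

    Returns (residue, gcd(residue - 1, n)).
    """
    primes = primes_up_to(b1)

    # "Stage 1a": one precomputed exponent for all primes <= cap.
    exponent = 1
    for p in primes:
        if p > cap:
            break
        exponent *= largest_power(p, b1)
    x = pow(base, exponent, n)

    # "Stage 1b": remaining primes, one exponentiation per prime.
    for p in primes:
        if p > cap:
            x = pow(x, largest_power(p, b1), n)

    return x, gcd(x - 1, n)


M41 = 2**41 - 1  # 13367 * 164511353, and 13366 = 2 * 41 * 163
unsplit = pminus1_stage1(M41, b1=200, cap=200)
split = pminus1_stage1(M41, b1=200, cap=100)
print("same residue either way:", unsplit[0] == split[0])
print("gcd found:", split[1])
```

Wherever the cap is placed, both variants raise the base to the same overall exponent, so the final residue and the GCD are identical. Only the way the work is organized differs, which is why the two parts are reported together as stage 1.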

