2021-10-31, 16:51  #1 
Oct 2021
U.S. / Maine
2^{2}×3×11 Posts 
Default tests_saved value for FTC PRP and standalone P-1
I've become curious about why PRP assignments (for FTCs with no P-1 done) and standalone P-1 lines from PrimeNet still have the P-1 tests_saved parameter set to 2.

Surely with GEC + proof as the only current FTC option, the fraction of found P-1 factors that would actually save two full-length primality tests has to be in the hundredths of a percent, if not even lower. Is it stupid of me to then think that P-1 might actually be a net loss of cycles at present, if the amount of effort to spend on it is computed on the assumption that more effort could be saved on the other end than actually can be?

This question would be a purely academic exercise if the people running standalone P-1 were easily keeping up with the FTC wavefront, but that does not seem to be the case: the grand majority of FTCs in Cat 2 or above go out with no P-1 done. So it seems to me that any reasonable way of increasing P-1 throughput would be worthwhile, especially since P-1 is more likely to be done well by someone who runs it standalone than by an FTC user who only runs it incidentally.

I was surprised to find almost no existing discussion of this beyond a few people mentioning in passing that they change the parameter to 1 manually. (That is unfortunately not practical for me, because I am not at my computer frequently enough and do not know enough programming to write a script that could do it.)
2021-11-01, 03:31  #2  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×5×13×47 Posts 
Quote:
That would handle prime95/mprime, but not gpuowl, which needs explicit bounds given. I picked ~107M as representative and checked PRP and LL results.

107M-107.1M PRP at 10/31/2021 ~20:45 UTC (https://www.mersenne.org/report_prp/...date=1&B1=#bad): 37 unverified PRP, 1150 verified PRP, none bad. The unverified range from 2019-12-13 to today; about half are more than a week old. Almost all the verified were by proof, one by DC.

107M-107.1M LL at 10/31/2021 ~20:51 UTC: 8 verified results (4 exponents & DC) ranging 2016-02-21 to 2021-04-09; 29 unverified ranging 2016-03-05 to 2021-05-09; none bad.

So the ~107M mix is 1148 PRP proof, 1 PRP DC, 37 PRP TBD (probably 18 more PRP DC & 19 PRP proof); 33 LL & DC eventually; & probably 1 LL TC. Projected: 19 PRP DC & 33 LL DC = 52 DC vs. 1148 + 19 = 1167 PRP proof; the DC fraction is 52/(52+1167) = 4.27%.

Computing a little more carefully, including approximate proof generation and certification time:
PRP w/o proof: 20 * 2 = 40
LL & DC & TC etc.: 33 * 2.0408 = 67.35
PRP with proof: 1167 * ~1.005 = 1172.84
Sum of effort = 1278.19 over 20 + 33 + 1167 = 1220 exponents; tests-saved equivalent per exponent ~ 1.0477 on average.

See also https://www.mersenneforum.org/showpo...9&postcount=20

2021-11-04, 13:13  #3 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
17DE_{16} Posts 
Approx. 1-test-saved bounds appear optimal
I ran a comparison timing with prime95 of 1 versus 2 or 3 tests-saved wavefront P-1 assignments, on an i7-1165G7, 16 GiB single-DIMM RAM, prime95 v30.6b4 on Windows 10 x64 Home 21H1:

Code:
PFactor=aid,1,2,107181343,-1,76,1
PFactor=aid,1,2,107181367,-1,76,2
PFactor=aid,1,2,107191871,-1,76,3

(Since the exponents differ by only ~0.01%, the run times for the same number of tests saved would probably differ by only ~0.021%.) For reference, PRP=aid,1,2,107184463,-1,76,2 has a program-estimated run time of 15d 9:45.

Bounds determined by prime95:
1 test saved: B1=403000, B2=15491000
2 tests saved: B1=851000, B2=37318000
3 tests saved: B1=1284000, B2=61709000

Run times:
1 saved: est. 6:09; 10/31/21 15:16 start s1, 16:52 s1 GCD done, 20:17 s2 & GCD done; actual 5:01 elapsed, 7.2853 GHD, NF
2 saved: est. 13:59; 10/31/21 20:17 start s1, 23:44 s1 GCD done, 10:06 s2 & GCD done; actual 13:49 elapsed (13.817 hours), 16.8766 GHD, NF
3 saved: est. 23:30; 11/1/21 10:06 start s1, 15:37 s1 GCD done, 11/2 12:07 s2 & GCD done; actual 26:01 elapsed, 27.2143 GHD

Odds of a factor found:
1-test-saved bounds: 3.36%
2-tests-saved bounds: 4.37%
3-tests-saved bounds: 5.00%

Run time of a 107M PRP (M107184463) is estimated as 15d 9:45, or 369.75 hours, on the same system. Adjusting that by p^2.1 to 107181367, PRP run time is estimated as 369.73 hours.

1-test-saved bounds cost 5 hr 1 min, with a 3.36% chance of saving ~15d 9:44:
1 test actually saved (an LL DC, or PRP DC without proof): 5.017 - 369.73*.0336 = -7.406 hours (7.4 hours likely saved)
1.005 tests actually saved (1 PRP/GEC/proof & cert): 5.017 - 369.73*.0336*1.005 = -7.468 hours
2 tests actually saved (2 PRP/GEC without proof): 5.017 - 369.73*.0336*2 = -19.829 hours
2.0408 tests actually saved (2 LL & occasionally a third, etc.): 5.017 - 369.73*.0336*2.0408 = -20.336 hours

2-tests-saved bounds cost 13:49, with a 4.37% chance: 13.817 - .0437*369.73*n = -2.341 hours if only 1 test is actually saved, -18.498 if 2.

Skipping P-1 entirely scores 0. That's the worst-case P-1 choice, barring extremely high bounds that cost more than they save.
We want the lowest number possible in the following (greatest amount of time saved by doing P-1, i.e., largest negative magnitude). At 1.0477 tests saved on average:

1-saved bounds: 5.017 - 369.73*.0336*1.0477 = -7.999 hours (saves 2.15% of a PRP/GEC/proof & cert)
2-saved bounds: 13.817 - 369.73*.0437*1.0477 = -3.111 hours (saves 0.84% of a PRP/GEC/proof & cert)
Ratio: 7.999/3.111 = 2.57. The difference is -4.888 hours, ~1.32% of the estimated PRP run time, in favor of 1-test-saved bounds.

If this holds similarly for other hardware too, we could speed up GIMPS project progress by ~1.32% * (369.73/(369.73+13.817)) ~ 1.27% by switching the server from issuing 2-tests-saved P-1 assignments to 1-test-saved (probably a little less in actuality, because of gpuowl's bounds selection).

3-saved bounds: 26.017 - 369.73*.050*1.0477 = +6.649 hours (a net cost of several hours per exponent on average). Running excessively high P-1 bounds can be worse than skipping P-1 entirely.

All of the above is for P-1 computation that is not integrated with the PRP computation for the same exponent, as in prime95/mprime, CUDAPm1, Mlucas, or gpuowl 6.x. Gpuowl 7.x uses powers of 3 computed during the initial portion of the PRP test computation to help obtain the power needed for P-1 stage 1, reducing the combined cost. Since that cuts P-1 stage 1 cost to about 10% of the independent computation, it would reduce the total additional cost of P-1 considerably, favoring larger bounds, especially for stage 1.

Is there any reason to believe that this result is not representative for other exponents or other processor types? I don't know of any. Perhaps the authors of P-1- and/or PRP-capable software know of some.

Last fiddled with by kriesel on 2021-11-04 at 13:46 
2021-11-04, 18:19  #4  
P90 years forever!
Aug 2002
Yeehaw, FL
5^{2}·311 Posts 
Quote:
However, is a factor worth more than a PRP proof? That's a purely subjective question. Mihai and I agree that a factor is more valuable. A factor can prove a Mersenne number composite in milliseconds. A proof and cert cannot be reverified in the future, especially since proof files are discarded due to their large size. To a future researcher, our current proof methods amount to "trust us, we proved number x composite". 

2021-11-04, 20:36  #5 
Oct 2021
U.S. / Maine
2^{2}·3·11 Posts 
I absolutely agree myself, but I question how relevant this is to the tests_saved value.

GIMPS has a dedicated contingent of people factoring already-tested exponents, using TF deeper than even GPU72 would have given out and P-1 bounds greater than even tests_saved=2 would have set. I see no reason to believe that many of them will lose interest, and their effort only gets cheaper as hardware evolves, so it should almost surely reach the current FTC wavefront and beyond given enough time.

I think it is also necessary to consider the practical angle. Good factoring hardware looks different from good FTC hardware, let alone from FTC users' actual hardware, and again, standalone P-1 testing is not keeping up with the FTC wavefront. So an attempt to do better P-1 will certainly get better results within whatever throughput standalone P-1 users can manage, but may still get worse results overall, as high-category FTCs' P-1 is run at the mercy of whatever the PRP tester has available (if they even bothered to change the default resource limits). You do not have to look very hard to find a recently completed exponent with PRP and P-1 done by the same user and poor P-1 bounds or no P-1 stage 2 at all.

The short version of my viewpoint is that an attempt to both save PRP tests and find factors for factors' sake may end up doing neither well.

Last fiddled with by techn1ciaN on 2021-11-04 at 20:41 Reason: Sentence flow 
2021-11-04, 20:57  #6 
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
10100000000110_{2} Posts 
Why did you not try 1.1 tests saved? That takes a little more effort than a 1.0 value, but not much, and it leans a little in favor of finding a factor. From my testing, anything in the hundredths column gets ignored, so the gradations are 1.0, 1.1, 1.2, etc.

2021-11-05, 00:17  #7  
Oct 2021
U.S. / Maine
2^{2}·3·11 Posts 
Quote:
I have unfortunately lost the specific post but can recall seeing Madpoo say that he recently personally verified the LL for every exponent below 5 million. 

2021-11-05, 01:12  #8  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·5·13·47 Posts 
Quote:
For anyone still using CUDAPm1 (and I am, a little, yet), I think it's limited to integer values of tests_saved, from 1 to 9. (CUDAPm1 parse.c source fragment:)

Code:
else if (count_numerical_field == 6) {
    if (proposed_exponent > 0)
        assignment->ll_saved = (int)proposed_exponent;

There's an implicit assumption in the ERROR_RATE handling that two primality tests will occur, not PRP & proof gen & cert:

Code:
/* Balance P-1 against 1 or 2 LL/PRP tests (actually more since we get a */
/* corrupt result reported some of the time). */
g.ll_testing_cost = (tests_saved + 2 * ERROR_RATE) * n;

There's rounding of B1 and B2 upward to the next 1000 in the costing code, so sufficiently small changes in tests_saved would produce the same bounds:

Code:
/* Round up B1 and B2 to nearest 1000 - just to look pretty */
best[1].B1 = round_up_to_multiple_of (best[1].B1, 1000);
best[1].B2 = round_up_to_multiple_of (best[1].B2, 1000);

Code:
sprintf (buf, "Assuming no factors below 2^%.2g and %.2g primality test%s saved if a factor is found.\n",
         w->sieve_depth, w->tests_saved, w->tests_saved == 1.0 ? "" : "s");

Last fiddled with by kriesel on 2021-11-05 at 01:45 

2021-11-05, 01:18  #9 
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
10100000000110_{2} Posts 
IIRC it truncates. 1.15 was treated as 1.1

2021-11-05, 03:59  #10  
Jun 2003
2×3^{2}×293 Posts 
Quote:
I can suggest an actual framework to reason about this. Leave tests_saved at 1 (or 1.1 or whatever, based on modeling the CERT overhead, abandoned tests, etc.). Then:

1. Compute optimal B1/B2 as usual.
2. Calculate the savings (in number of squarings).
3. Keep going to higher B1s and calculate the savings at each.
4. Stop when the savings falls below (say) 90% of that of the optimal B1/B2.
5. Use this higher B1/B2 for actual work.

This gives us increased factor probability without compromising GIMPS throughput "too much". The 90% value could be something else, but at least this gives a way to reason systematically about the cost/savings rather than pulling some indirect "2 tests saved" out of nowhere.

PS: Does P95 implement PRP + stage 1? If not, is that in the pipeline? I feel that would give a substantial reduction in P-1 cost. In fact, it might even make sense to do stage-1-only P-1 with that (assuming sufficient RAM is available).

Last fiddled with by axn on 2021-11-05 at 04:00 

2021-11-05, 06:13  #11  
P90 years forever!
Aug 2002
Yeehaw, FL
5^{2}·311 Posts 
I cannot disagree. I only offer this as an explanation as to why I've not been in any hurry to adjust the 2 downward.
Quote:
1) Prime95 would require a lot of memory during the PRP + stage 1 part of the PRP test, so it is not an option that can be turned on in a default install. A prime95 default install is not supposed to interfere with normal work.

2) There is some overhead involved in PRP + stage 1. I need to revisit the process to quantify it; it's been a long while since I last looked at it. IIRC, with 128 temporaries you get a theoretical 7x increase in stage 1 performance; I don't recall what I expected the overhead to knock that 7x down to.

That said, it is worth implementing. The last year I've been working on needed gwnum library improvements as well as improving P-1 (and P+1/ECM) stage 2 in preparation for implementing this. 30.7 is close to a finished product, but Pavel and I are working on yet another stage 2 improvement! 
