2022-01-04, 03:12   #1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
1010010011110_{2} Posts 
Let's Optimize P-1 for low exponents. TL;DR in post #1. More in posts 60 and 61.
This project is made possible (or dare I say necessary?) by the speed of P-1 factoring under Prime95 v30.8.
I cannot take credit for this idea. It has been suggested by others, who shall remain nameless ... mostly because I'm sure that if I try to name them I will miss someone.
In short, we want to use v30.8+ to factor lower exponents (<20M?) to the more optimal levels this version affords us. We encourage members who are interested and who have PCs with a decent amount of RAM to participate. It has been suggested that 16GB+ is preferred, but in my experience even 8GB is adequate.
==============================================================
TL;DR
0. Be sure you have upgraded mprime or Prime95 to the latest build of v30.8. See here.
1. Choose a range not already listed below and inform me to add it. Preferably at least 0.1M or 1.0M in size.
2. Calculate the recommended minimum B1 to use (feel free to round this value).
Code:
2.2^LOG(20,000,000/<exponent>,2)*1,000,000
Code:
sqrt( 16 / your GB RAM) × proposed B1
5. Generate Pminus1 statements for the exponents you deem worth redoing P-1 on to the recommended bounds. Popular opinion is that we want the new B1 to be at least 10x the current one unless the current B2 is VERY large. If the current P-1 was done prior to v30.8, B2 will be about 20x B1, whereas with v30.8 it will be hundreds or thousands of times larger. If some of you know how to use Pfactor to get similar bounds, let me know.
Code:
Pminus1=N/A,1,2,<exponent>,-1,<B1Bound>,0,<TF level>
If you are ambitious you can run individual threads for Stage 1, then adjust the Worker Windows to use all Workers and as much RAM as you can spare for Stage 2.
======================================================
Code:
Known assignments (If no activity in the last month I assume it is no longer active):
.03 nordi - COMPLETE
.06 George - COMPLETE
.07 George - COMPLETE
.08 George - COMPLETE
.09 George - COMPLETE
0.10-0.15 Kruoli
0.2 petrw1 - COMPLETE (35)
0.300-0.325 Kruoli - COMPLETE (24)
0.6 masser - COMPLETE (51)
0.7 Lycorn
0.8 Lycorn
0.9 axn - COMPLETE (94)
1.0 axn & Lord Julius - COMPLETE (73)
1.2 petrw1 - COMPLETE (67)
1.6 Naegi Makoto - COMPLETE (81)
1.9 RichD0
2.1 chris
2.41 Denial140
2.5 Chris
2.6 kurly
2.8 mikr
2.9 mikr
3.0 petrw1 - IN PROGRESS
3.1 petrw1 - IN PROGRESS
3.2 petrw1 - STARTING February
3.6 Jocelyn Larouche - COMPLETE
3.7 mikr
3.8 DrobinsonPE - COMPLETE (64)
3.94 Denial140
4.0 DrobinsonPE - COMPLETE (129)
4.1 DrobinsonPE - COMPLETE (72)
4.2 DrobinsonPE - COMPLETE (164)
4.3 DrobinsonPE - COMPLETE (165)
4.4 xss - COMPLETE (30)
4.4 petrw1 (cleanup in aisle 4.4 :D )
4.5 petrw1 - IN PROGRESS
4.6 petrw1 - IN PROGRESS
4.7 petrw1 - IN PROGRESS
4.8 petrw1 - STARTING SOON
4.905 petrw1 - IN PROGRESS
4.96 congsz - IN PROGRESS
4.979 prism019 - COMPLETE (40)
5.0 petrw1 - COMPLETE (100)
5.1 petrw1 - COMPLETE (184)
5.2 petrw1 - COMPLETE (45)
5.3 petrw1 - COMPLETE (146)
5.4 petrw1 - COMPLETE (73)
5.5 petrw1 - COMPLETE (108)
5.6 petrw1 - COMPLETE (157)
5.7 petrw1 - COMPLETE (149)
5.8 petrw1 - COMPLETE (144)
5.9 petrw1 - COMPLETE (67)
6.5 takahashi
7.0 DrobinsonPE - COMPLETE (63)
7.1 DrobinsonPE
7.2 yoyorocks1
7.3 DrobinsonPE
7.4 DrobinsonPE
7.5 DrobinsonPE
7.6 DrobinsonPE
7.8 yoyorocks1
8.0 linament
8.1 Flauktorist - COMPLETE (102)
8.2 linament - COMPLETE (83)
8.4 yoyorocks1
8.5 Tha
8.5 Tha
8.7 linament - COMPLETE (71)
8.8 linament - COMPLETE (161)
8.9 congsz
9.0 Tha
9.1 Tha
9.2 Tha
9.3 Tha
9.4 Tha
9.5 Tha - COMPLETE
9.6 Tha
9.7 Tha
9.8 Tha - COMPLETE
9.9 Tha - COMPLETE (141)
10.2 mikr
12.5 Naegi Makoto
13.0 - 13.9 nordi
17.1 Alpertron - COMPLETE (101)
Code:
Already Factored Exponents
0.02 nordi
4.0-4.9 harlee
14.0-14.9 nordi
Last fiddled with by petrw1 on 2023-02-07 at 15:33 Reason: Updating reserved ranges
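The TL;DR arithmetic in this post can be sketched in Python. This is only an illustration, not part of Prime95 or mprime: the function names are made up, rounding to two significant figures is just one way to "round this value", and the worktodo line assumes the fifth field is -1 (the c in k·b^n+c for a Mersenne number 2^p-1), with the B2 field left at 0 as in the template.

```python
import math

def recommended_b1(exponent, ram_gb=16.0):
    """Minimum B1 from the TL;DR: 2.2^log2(20M / exponent) * 1M,
    then scaled by sqrt(16 / RAM in GB)."""
    base_b1 = 2.2 ** math.log2(20_000_000 / exponent) * 1_000_000
    return math.sqrt(16 / ram_gb) * base_b1

def round_b1(b1, digits=2):
    """Round to a few significant figures (illustrative choice, not a project rule)."""
    magnitude = 10 ** (math.floor(math.log10(b1)) - digits + 1)
    return int(round(b1 / magnitude) * magnitude)

def pminus1_line(exponent, b1, tf_bits):
    """Build a worktodo entry in the shape of the template from this post."""
    return f"Pminus1=N/A,1,2,{exponent},-1,{b1},0,{tf_bits}"

if __name__ == "__main__":
    # Placeholder exponents and TF level: not checked for primality or factoring status.
    for p in (19_000_001, 9_100_003, 4_400_009):
        b1 = round_b1(recommended_b1(p, ram_gb=8))
        print(pminus1_line(p, b1, 68))
```

With the default of 16GB the RAM factor is 1, so the first function reduces to the plain exponent formula; less RAM pushes the suggested B1 up.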
2022-01-05, 12:54   #2
Dec 2002
1101100011_{2} Posts 
Now that we are close to the target set originally (to get under 20,000 unfactored candidates per one million range, or 2,000 per one hundred thousand), and now that 30.8 has changed the landscape, I think we should set a new target.
Rather than setting the same standard for each range, I was thinking of each unfactored exponent having had a reasonable amount of P-1 work done on it. I currently work in the 9.1M range with B1 set to 2,000,000, using a machine with 64 GiB of RAM, which sets B2 at 3,500,000,000. That amounts to about 90 GHz-days each.
Two things would now be worthwhile:
A. How to get P-1 done on machines with heavy RAM resources?
B. How to set a meaningful standard as the next target?
Last fiddled with by petrw1 on 2022-01-10 at 15:05 Reason: Fixed ...
2022-01-05, 14:53   #3
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2×7×13×29 Posts 
Quote:
I intend to continue with some kind of P-1. What you suggest is worth considering. Thanks

2022-01-07, 22:49   #4
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2·7·13·29 Posts 
So giving this more thought....
Quote:
A. Is there a minimum/preferred RAM target? Or do lower-RAM PCs just need a manually specified Bound2 and take longer in order to achieve B?
B. I'll start: how about a minimum of B1=1M / B2=200M for exponents under 20M? Or should the minimum B1 be a multiple of the exponent?
Last fiddled with by petrw1 on 2022-01-07 at 22:51 Reason: More ...

2022-01-07, 23:45   #5
P90 years forever!
Aug 2002
Yeehaw, FL
2^{3}·1,021 Posts 
Quote:
Maybe you can target the same elapsed-time effort for each exponent, attacking exponents with bounds significantly below your new recommended bounds. The goal bounds should be chosen such that the project is difficult but can be completed in a reasonable timeframe (I think that's one reason why the 20M project caught fire).
B1 should be a multiple (rounded to a nice even number) of either the exponent or the FFT length. B2 should be something a 16GB(?) machine can produce comfortably.
My data point to help you come up with your formula: 6 or 7 year old quad core, 8GB mem, ~3GHz, exponent = 80K, FFTlen=4K, stage 2 FFTlen=5K:
B1 = 1B takes 16500 sec. for 4 exponents, or 4125 seconds each.
B2 = 20.5T takes 5460 sec. using 6.7GB.
Once you decide on your recommended bounds, people with lots of RAM should exceed your minimum B2 bounds.

2022-01-08, 04:38   #6
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
12236_{8} Posts 
Quote:
If I use your "data point" as a reference:
B1 = 1B / 80K = 12,500 × exponent
B2 = 20.5T = 20,500 × B1
If I extend that B1/exponent ratio of 12,500, then for exponent 20M I'd need B1=250B. If I let Prime95 choose B2, I know it wouldn't be 20,500x; maybe closer to 300x. This is not workable.
It is not obvious to me what FFT to expect, so I would not be able to choose a B1 as a multiple of FFT. So possibly a diminishing multiple of the exponent? Or, as you suggested, striving for a standard time duration per exponent? We could aim for a duration of 1 or 2 hours on a "typical" current PC.
What I've noticed across my 5 PCs, with available RAM varying from 5G to 24.5G, is that less RAM decreases B2 so that both stages' run times are close. I can suggest a minimum RAM allocation to do this project justice, but I won't stop anyone who wants to contribute. I could further suggest that if someone's available RAM is less than x GB, they instead specify B2 as a certain multiple of B1, accepting that the Stage 2 run time will be relatively longer than Stage 1.
My 7820x with 24.5GB takes about 12 minutes for B1=1M in the 20M exponent range. It chooses B2=328xB1 for exponents in this range, and Stage 2 takes about 10 minutes.
I tried 80177 on my 7820x with 24.5GB RAM. I used B1=100M. Prime95 chose B2=122,881xB1.
Stage 1: FFT=4.5K / 19 minutes
Stage 2: FFT=5K / 17 minutes
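The ratio arithmetic in this post can be checked with a few lines of Python. This is a rough sketch: the names are illustrative, and the runtime estimate assumes stage 1 time grows linearly with B1 from the roughly 12 minutes quoted for B1=1M near 20M.

```python
# Data point quoted in the thread (exponent around 80K)
REF_EXPONENT = 80_000
REF_B1 = 1_000_000_000            # 1B
REF_B2 = 20_500_000_000_000       # 20.5T

B1_PER_EXPONENT = REF_B1 // REF_EXPONENT   # B1 as a multiple of the exponent
B2_PER_B1 = REF_B2 // REF_B1               # B2 as a multiple of B1

def linear_b1(exponent):
    """B1 if the 80K ratio were applied unchanged to another exponent."""
    return B1_PER_EXPONENT * exponent

def stage1_minutes(b1, minutes_at_1m=12.0):
    """Assumption: stage 1 time scales linearly with B1 from ~12 min at B1=1M."""
    return minutes_at_1m * b1 / 1_000_000

if __name__ == "__main__":
    b1 = linear_b1(20_000_000)
    years = stage1_minutes(b1) / (60 * 24 * 365)
    print(f"B1 at 20M: {b1:,}  (stage 1 alone: about {years:.1f} years per exponent)")
```

Under those assumptions a single exponent near 20M would need years of stage 1 time, which is why a fixed multiple of the exponent does not scale.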

2022-01-08, 07:08   #7
Dec 2002
3×17^{2} Posts 
Quote:
So maybe set B2 accordingly and have machines with lesser RAM do more cycles. 

2022-01-08, 15:06   #8
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2×7×13×29 Posts 
Quote:
I'm thinking that once I'm below 20M I'll set B1 at a minimum of 1M and choose a decent v30.8 B2.
Thinking about exponents between 10M and 19M: do you think a minimum of 300xB1 is reasonable? Or 500x or 1000x? Maybe a bigger ratio for 10M to 14M? Under 10M, even my machine with 3.5GB available will choose about 300x or more.
Last fiddled with by petrw1 on 2022-01-08 at 15:12 Reason: First 2 sentences

2022-01-08, 16:14   #9
Jun 2003
2·2,719 Posts 
Quote:
There can be three possible shapes for B1 as a function of exponent:
1. B1 increases with exponent (along the lines of how wavefront P-1 works).
2. B1 constant.
3. B1 decreases with exponent (along the lines of how the <20M project typically works).
Probably #3 is the best option. #1 would be increasingly costly at higher ranges. #2 might undershoot lower ranges or overshoot higher ranges (or both).
We can then select B1 as C1 / exp (for some const C1) or C2 / FFT (for another const C2), where FFT is the stage 1 FFT size used by P95.
EDIT: Finally, we'll finesse B1 up or down based on RAM. If you have low RAM, you'll do a higher than normal B1 to compensate.
Last fiddled with by axn on 2022-01-08 at 16:15
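The third shape, B1 = C1 / exp, can be sketched as below. The constant is an assumption chosen only so that B1 comes out at 1M for a 20M exponent (the minimum floated earlier in the thread). Nobody in the thread has fixed C1, and the RAM adjustment is left out.

```python
# Hypothetical constant: makes B1 = 1,000,000 at exponent 20,000,000.
C1 = 1_000_000 * 20_000_000

def b1_inverse(exponent, c1=C1):
    """Option 3: B1 falls off as the exponent grows, B1 = C1 / exponent."""
    return c1 / exponent

if __name__ == "__main__":
    for p in (1_000_000, 5_000_000, 10_000_000, 20_000_000):
        print(f"exponent {p:>10,}: B1 = {b1_inverse(p):,.0f}")
```

With this shape, halving the exponent exactly doubles B1. A C2 / FFT variant would behave similarly but move in steps, since FFT sizes are discrete.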

2022-01-08, 20:49   #10
P90 years forever!
Aug 2002
Yeehaw, FL
2^{3}·1,021 Posts 
Quote:
My data point gives you an idea of the bounds you might expect in the 80K range. Quote:
FFT timings also go up a little more than linearly (double the FFT size, and the time it takes to do a multiplication goes up by about a factor of 2.1). Stage 1 runtime is linearly linked to B1 (yeah! double B1 doubles the runtime). One can probably put all that together into a formula that generates a rough suggested B1 value. Something along the lines of: every time you double the exponent, the suggested B1 drops by a factor of 2.2?
Quote:
For B2, the project could recommend minimum B2 values, or let the user choose the optimal B2 value for their situation, or "require" low RAM machines to run longer stage 2 times to reach the recommended B2 value, or encourage low RAM machines to do stage 1 only and partner up with a high RAM machine which does nothing but stage 2. Tough call. If you double the available RAM, how much of a B2 boost do you get? I wouldn't be surprised if it was 5x or more.
Last fiddled with by Prime95 on 2022-01-08 at 20:50
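The scaling argument in this post can be put into a rough model. A sketch under stated assumptions: FFT size is taken to double whenever the exponent doubles (real FFT lengths move in discrete steps), stage 1 time is taken as B1 times the per-multiplication time, and the reference point of B1=1M at a 20M exponent is borrowed from earlier in the thread. The function names are illustrative.

```python
import math

FFT_COST_PER_DOUBLING = 2.1   # per-multiplication time growth when FFT size doubles
B1_DROP_PER_DOUBLING = 2.2    # suggested B1 reduction when the exponent doubles

REF_EXPONENT = 20_000_000     # assumed reference point
REF_B1 = 1_000_000

def suggested_b1(exponent):
    """B1 that drops by 2.2x for every doubling of the exponent."""
    return REF_B1 * B1_DROP_PER_DOUBLING ** math.log2(REF_EXPONENT / exponent)

def relative_mult_time(exponent):
    """Per-multiplication time relative to the reference exponent."""
    return FFT_COST_PER_DOUBLING ** math.log2(exponent / REF_EXPONENT)

def relative_stage1_time(exponent):
    """Stage 1 time relative to the reference: linear in B1 times multiplication cost."""
    return (suggested_b1(exponent) / REF_B1) * relative_mult_time(exponent)

if __name__ == "__main__":
    for p in (2_500_000, 5_000_000, 10_000_000, 20_000_000):
        print(f"{p:>10,}: B1 = {suggested_b1(p):>12,.0f}, "
              f"relative stage 1 time = {relative_stage1_time(p):.3f}")
```

Because 2.2 is only slightly larger than 2.1, the modelled stage 1 time per exponent stays nearly flat across ranges, creeping up by a factor of 2.2/2.1 each time the exponent halves.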

2022-01-08, 22:12   #11
Jul 2003
Behind BB
2^{2}·17·29 Posts 
Wayne, have you looked at James' P-1 Effort page recently? Screenshot below.
I don't have a nice, pithy way to describe this, but a good effort might be to start at the leftmost nonzero column (0.06) and move each exponent in that column up two orders of magnitude in GHz-days, to say 8 GHz-days. When that's complete, move to the next column (0.12) and move those exponents up to the 16 GHz-days column, and so on, ad nauseam. Like time-spent-per-exponent, focusing on GHz-days effort would have the effect that larger exponents would naturally get lower bounds and smaller exponents would get much higher bounds.
We should also consider again where the optimal point for switching to ECM/P+1 might be, given the radically larger B2 now available for P-1.
Last fiddled with by masser on 2022-01-08 at 22:16