mersenneforum.org > Data Let's Optimize P-1 for low exponents. TL;DR in post #1. More in posts 60 and 61.

2022-01-09, 01:50   #12
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

41·193 Posts

Quote:
 Originally Posted by masser Like time-spent-per-exponent, focusing on GhzD effort would have the effect that larger exponents would naturally get lower bounds and smaller exponents would get much higher bounds. We should also consider again where the optimal point for switching to ECM/P+1 might be, given the new radically larger B2 available for P-1.
One issue is that 30.8 has "broken" the GHzD formula. That is, 30.8's huge B2 values mean GHzD no longer correlates with time spent on an exponent.

Longer term, P+1 and to a lesser degree ECM should see huge B2 increases too.

2022-01-09, 02:48   #13
axn

Jun 2003

19×283 Posts

Quote:
 Originally Posted by Prime95 For B2, the project could recommend minimum B2 values or let the user choose the optimal B2 value for their situation or "require" low RAM machines to run longer stage 2 times to reach the recommended B2 value or encourage low RAM machines to do stage 1 only and partner up with a high RAM machine which does nothing but stage 2. Tough call. If you double the available RAM, how much of a B2 boost do you get. I wouldn't be surprised if it was 5x or more.
Quote:
 Originally Posted by axn EDIT:- Finally, we'll finesse B1 up or down based on RAM. If you have low RAM, you'll do higher than normal B1 to compensate.
I am beginning to think that this final point is a bit of genius.

1. Mathematically, the optimal thing to do is to let the program calculate the optimal B2 (for the given B1 & RAM).
2. Therefore forcing low-RAM machines to do the same B2 as high-RAM machines is counterproductive.
3. Given #1 & #2, we should force low-RAM machines to do a higher-than-normal B1, and let the program calculate the optimal B2.

TL;DR - Recommend B1 based on exponent/FFT and RAM. Let program calculate B2 - don't give any direct recommendation for it.

2022-01-09, 03:52   #14
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2·3²·17² Posts
Sorry/Please. This is still a discussion point ...

DISCLAIMER:
Per the title we are still very early in the speculating, planning phases of a potential follow-up project.
Nothing that follows is to be taken as suggested or recommended.
But opinions, thoughts, etc. are more than welcome.
My focus still is completing the original Under-2K project; I estimate it will take about half of 2022.

Quote:
 Originally Posted by Prime95 Understood. My initial suggestion was poorly presented. I'm suggesting a project along the lines of all unfactored exponents below 20M are given x hours of P-1 effort where this is based on some reference setup (a 3 yr old quad core with 16GB memory?) My data point gives you an idea of the bounds you might expect in the 80K range. FFT size is a little more than linearly correlated with exponent (double the exponent will a little more than double the FFT size). FFT timings also go up a little more than linearly (double the FFT size, the time it takes to do a multiplication goes up by about a factor of 2.1) Stage 1 runtime is linearly linked to B1 (yeah! double B1 doubles the runtime). One can probably put all that together for a formula that generates a rough suggested B1 value. Something along the lines of every time you double the exponent suggested B1 drops by a factor of 2.2? Yes, this is easy to easy for B1. For B2, the project could recommend minimum B2 values or let the user choose the optimal B2 value for their situation or "require" low RAM machines to run longer stage 2 times to reach the recommended B2 value or encourage low RAM machines to do stage 1 only and partner up with a high RAM machine which does nothing but stage 2. Tough call. If you double the available RAM, how much of a B2 boost do you get. I wouldn't be surprised if it was 5x or more.
I have a 9 year old quad i5-3570 with 16GB; 14GB for P-1.
It can do a low 20M P-1 with 700K/164M in about 70 minutes.
...so maybe 100 minutes for 1M/234M.

and a 3 year old 8-core 7820x with 32GB; 24.5GB for P-1.
It can do a low 20M P-1 with 1M/328M in about 24 minutes.

Using the "double the exponent, B1 drops by a factor of 2.2" rule gives a table like this, with 2 different starting values for 20M
Code:
Exponent	 B1 (from 1M)	 B1 (from 2M)
7813	 548,758,735 	 1,097,517,471
15625	 249,435,789 	 498,871,578
31250	 113,379,904 	 226,759,808
62500	 51,536,320 	 103,072,640
125000	 23,425,600 	 46,851,200
250000	 10,648,000 	 21,296,000
500000	 4,840,000 	 9,680,000
1000000	 2,200,000 	 4,400,000
2000000	 1,000,000 	 2,000,000
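The table can be regenerated by applying the rule of thumb repeatedly. A minimal Python sketch, assuming the table is anchored at its last row (exponent 2,000,000 with B1 = 1,000,000 or 2,000,000) and that B1 is multiplied by 2.2 each time the exponent is halved. The function name `b1_table` and the round-half-up convention are illustrative choices, not anything Prime95 does.

```python
def b1_table(start_exponent, start_b1, factor=2.2, halvings=8):
    """Return (exponent, B1) rows: halve the exponent and scale B1 by `factor` each step."""
    rows = []
    exponent, b1 = float(start_exponent), float(start_b1)
    for _ in range(halvings + 1):
        # int(x + 0.5) rounds half up, so 7812.5 is shown as 7813
        rows.append((int(exponent + 0.5), int(b1 + 0.5)))
        exponent /= 2
        b1 *= factor
    return rows


if __name__ == "__main__":
    low = b1_table(2_000_000, 1_000_000)
    high = b1_table(2_000_000, 2_000_000)
    # Print smallest exponent first, like the table
    for (p, b1_low), (_, b1_high) in zip(reversed(low), reversed(high)):
        print(f"{p:>9}  {b1_low:>15,}  {b1_high:>15,}")
```

Changing `factor` or the starting row is enough to try other rules, for example a gentler drop per doubling.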
If we are going to recommend B2 we would need a similar table.
If we choose a "benchmark / standard" PC we need to know what B2-multiplier it will choose based on the exponent.

For example my 7820X (24.5 GB) has these values:
I chose a B1 value a little higher than the current B1.
I chose TF-bits of 74 for all exponents; not the actual level...just in case it affects the B2 calculation.
This table shows B2 (as a multiple of B1) kind-of doubles as the exponent halves....
The first multiplier is the one actually used; the second is the one initially chosen, as in the following excerpt:

Code:
[Work thread Jan 8 19:14] Inversion of stage 1 result complete. 5 transforms 1 modular inverse. Time: 0.071 sec.
[Work thread Jan 8 19:14] With trial factoring done to 2^74 optimal B2 is 13030*B1 = 84695000000.
[Work thread Jan 8 19:14] If no prior P-1 chance of a new factor is 10.3%
[Work thread Jan 8 19:14] Switching to AVX-512 FFT length 40K Pass1=128 Pass2=320 clm=1 8 threads
...
[Work thread Jan 8 19:14] With trial factoring done to 2^74 optimal B2 is 9585*B1 = 62302500000.
[Work thread Jan 8 19:14] If no prior P-1 chance of a new factor is 9.99%
[Work thread Jan 8 19:14] Using 24837MB of memory.  D: 150150, 14400x63835 polynomial multiplication.
Code:
Exponent	B1		B2	Multiplier used	Multiplier initially chosen
625057		6500000	62306500000	9586	13030
1256201		1500000	5449500000	3633	6065
2502391		1500000	3001500000	2001	3044
5001049		3700000	5561100000	1503	1660
10022263	1500000	1081500000	721	851
20852933	800000	260668980	326	366
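The multiplier is just B2 divided by B1. A small Python sketch that recomputes it from the B1 and B2 columns and shows how it changes from one row to the next. The rows are copied from the table above; the helper name `b2_multiplier` is illustrative.

```python
# (exponent, B1, B2) copied from the table above
ROWS = [
    (625057, 6_500_000, 62_306_500_000),
    (1256201, 1_500_000, 5_449_500_000),
    (2502391, 1_500_000, 3_001_500_000),
    (5001049, 3_700_000, 5_561_100_000),
    (10022263, 1_500_000, 1_081_500_000),
    (20852933, 800_000, 260_668_980),
]


def b2_multiplier(b1, b2):
    """B2 expressed as a multiple of B1."""
    return b2 / b1


if __name__ == "__main__":
    previous = None
    for exponent, b1, b2 in ROWS:
        mult = b2_multiplier(b1, b2)
        # Ratio to the previous (smaller) exponent's multiplier
        note = "" if previous is None else f"  previous/this = {previous / mult:.2f}"
        print(f"{exponent:>9}  B2 = {mult:7.0f} x B1{note}")
        previous = mult
```

A ratio near 2 between neighbouring rows is what "kind-of doubles as the exponent halves" would look like; the B1 values differ between rows, so this is only a rough comparison.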
Drum roll ... NOTE: This is just one sample opinion; by no means assumed gospel.
Hmm, this is a lot of Blah Blah Blah ... but I can for example use the tables above to suggest:

If you have at least 16(?) GB of RAM available for P-1, let Prime95 choose B2:
Pminus1=N/A,1,2,10022263,-1,2200000,0,73

If you have less RAM I would suggest: ... but accept your iron/your choice.
(This is 500x ... a little less than the chart above ... but the PC used for that chart had 24.5GB RAM for P-1)
Pminus1=N/A,1,2,10022263,-1,2200000,1100000000
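The two worktodo lines above differ only in the B2 field and the trailing trial-factoring depth. A short Python sketch that builds lines of the same shape. The field order (k, b, n, c, B1, B2, optional TF bits) is read off the two examples in this post rather than taken from the Prime95 documentation, and `pminus1_line` is a hypothetical helper.

```python
def pminus1_line(exponent, b1, b2=0, tf_bits=None):
    """Format a Pminus1 worktodo entry for 2^exponent - 1.

    b2=0 mirrors the first example above, where the program is left to pick B2.
    tf_bits is appended only when given, as in that same example.
    """
    fields = ["N/A", 1, 2, exponent, -1, b1, b2]
    if tf_bits is not None:
        fields.append(tf_bits)
    return "Pminus1=" + ",".join(str(f) for f in fields)


if __name__ == "__main__":
    b1 = 2_200_000
    # Enough RAM: let the program choose B2
    print(pminus1_line(10022263, b1, tf_bits=73))
    # Less RAM: fix B2 at 500 x B1
    print(pminus1_line(10022263, b1, b2=500 * b1))
```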

'Nuff said for now.
Wayne

2022-01-09, 04:03   #15
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2·3²·17² Posts

Quote:
 Originally Posted by masser Wayne, have you looked at James' P-1 Effort page recently? Screenshot below. I don't have a nice, pithy way to describe this, but a good effort might be to start at the left non-zero column (0.06) and move each exponent in that column up two orders of magnitude in GhzD to say, 8 GhzD. When that's complete, move to the next column (0.12) and move those exponents up to the 16 GhzD column, and so on, ad nauseam. Like time-spent-per-exponent, focusing on GhzD effort would have the effect that larger exponents would naturally get lower bounds and smaller exponents would get much higher bounds. We should also consider again where the optimal point for switching to ECM/P+1 might be, given the new radically larger B2 available for P-1.
No, I haven't noticed this report before. But it is worth looking at. Thanks

Maybe I misunderstand but I don't think we want to process a given exponent more than once.

GhzD is certainly an option to consider along with time.
Though I have noticed that 30.8 gives very high GhzD for very low exponents and very large B2;
that suggests that if we randomly chose 25 GhzD that would allow for decent bounds for a larger exponent but not large enough bounds for the very small.

Haven't I read somewhere on this forum that the infamous RDS showed that P-1 is always(?) more efficient at finding factors per GhzD than ECM???

2022-01-09, 12:09   #16
firejuggler

"Vincent"
Apr 2010
Over the rainbow

B1D₁₆ Posts

Some data: on an i7 8700, using 9 GB of RAM and 3 cores
Code:
Sending result to server: UID: firejuggler/Maison, M8524427 completed P-1, B1=1000000, B2=333779160, Wi4: CF5F1AA7, AID: 955....
PrimeNet success code with additional info:
CPU credit is 8.0907 GHz-days.
about 26 minutes
Code:
PrimeNet success code with additional info:
CPU credit is 9.9472 GHz-days.
Sending result to server: UID: firejuggler/Maison, M8561477 completed P-1, B1=1200000, B2=410879040, Wi4: C5F098E4, AID: DA3...
about 30-35 min
Code:
PrimeNet success code with additional info:
CPU credit is 13.3236 GHz-days.
Sending result to server: UID: firejuggler/Maison, M8577563 completed P-1, B1=1560000, B2=551166330, Wi4: FD07A4FD, AID: 2A9ABA...
40 to 45 min

2022-01-09, 12:17   #17
lycorn

"GIMFS"
Sep 2002
Oeiras, Portugal

2²×3²×43 Posts

The third trial took nearly twice as long as the first one, for an increase in the probability of success from 6,8% to 7,7%. Not really exciting...

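The trade-off in the three runs above can be put in numbers. A rough Python sketch using the reported B1, GHz-day credit and run times. The midpoints 32.5 and 42.5 minutes stand in for the quoted ranges and are an assumption, as is the helper name `relative_to_first`.

```python
# (B1, credit in GHz-days, approximate minutes) from the three results above
RUNS = [
    (1_000_000, 8.0907, 26.0),
    (1_200_000, 9.9472, 32.5),   # "about 30-35 min", midpoint assumed
    (1_560_000, 13.3236, 42.5),  # "40 to 45 min", midpoint assumed
]


def relative_to_first(runs):
    """Return (B1 ratio, credit ratio, time ratio) of each run against the first."""
    b1_0, credit_0, minutes_0 = runs[0]
    return [
        (b1 / b1_0, credit / credit_0, minutes / minutes_0)
        for b1, credit, minutes in runs
    ]


if __name__ == "__main__":
    for b1_ratio, credit_ratio, time_ratio in relative_to_first(RUNS):
        print(f"B1 x{b1_ratio:.2f}  credit x{credit_ratio:.2f}  time x{time_ratio:.2f}")
```

With these midpoints the third run takes roughly 1.6 times as long as the first, for a gain of about 0.9 percentage points in the quoted success probability.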
2022-01-09, 12:22   #18
S485122

"Jacob"
Sep 2006
Brussels, Belgium

11100011111₂ Posts

Quote:
 Originally Posted by Prime95 One issue is that 30.8 has "broken" the GHzD formula. That is, 30.8's huge B2 values mean GHzD no longer correlates with time spent on an exponent. ...
If the GHzDays formula is reprogrammed, it could be the time to add another correction: deduct the credit earned by previous P-1 attempts from the credit given (it would prevent abuse of the credit system, for what it is worth ;-)

2022-01-09, 13:05   #19
firejuggler

"Vincent"
Apr 2010
Over the rainbow

B1D₁₆ Posts

Quote:
 Originally Posted by petrw1 ..snip... GhzD is certainly an option to consider along with time. Though I have noticed that 30.8 gives very high GhzD for very low exponents and very large B2; that suggests that if we randomly chose 25 GhzD that would allow for decent bounds for a larger exponent but not large enough bounds for the very small. ...snip...

Maybe, we need something like a chart, depending on the amount of ECM already done?

For the very low exponents, the payout in GhzD for a relatively short run is frightening: I got 86k GHz-days for a 10-hour job (and a 0.2xxxx% chance to find a factor due to ECM, which in retrospect wasn't needed at all).

2022-01-09, 17:32   #20
tha

Dec 2002

2×5²×17 Posts

Quote:
 Originally Posted by petrw1 Haven't I read somewhere on this forum that the infamous RDS showed that P-1 is always(?) more efficient at finding factors per GhzD than ECM???
I also recall him saying that redoing P-1 only makes sense when you choose B1 >= 10 x (B1 value during the previous run on that exponent).

So maybe we should view this in terms of how much time (in years) we want between a P-1 run on an exponent and the next run on it with better hardware (or new software based on different math).

2022-01-09, 18:27   #21
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2·3²·17² Posts

Quote:
 Originally Posted by tha I also recall him saying that redoing P-1 only makes sense when you choose B1 >= 10 x (B1 value during the previous run on that exponent). So maybe we should view this in terms of how much time (in years) we want between a P-1 run on an exponent and the next run on it with better hardware (or new software based on different math).
I assume RDS made that 10X B1 comment when the optimal B2 was 30xB1. Now that it is 300x or 3000x does his rule of thumb still apply?

2022-01-09, 22:34   #22
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

1452₁₆ Posts

Quote:
 Originally Posted by axn TL;DR - Recommend B1 based on exponent/FFT and RAM. Let program calculate B2 - don't give any direct recommendation for it.
I like this idea
