|
|
#133 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
10000001010111₂ Posts |
Quote:
In the example, our memory budget limits us to 523 temps. If we use D=2310, we'd use 240 temps for the first poly and only 283 temps for the second poly. This would let us evaluate the first poly at 283-2*240+1 points -- clearly won't work. |
|
|
|
|
|
|
#134 |
|
Jun 2003
2³·683 Posts |
Suggestion for a minor (but essentially free) optimization.
You would be doing something like: B2 start = Requested B2 / q, where q is the smallest prime not dividing D. But then actual B2 will be B2 start + k * batch size, k selected such that actual B2 >= Requested B2. But since actual B2 >= Requested B2, we could adjust B2 start upwards by doing B2 start = Actual B2 / q, which would then bump Actual B2, which would then be fed back to a higher B2 start, etc., until it converges. Simply, if x = (Actual B2 - Requested B2), then we can adjust B2 start and Actual B2 upwards by x/(q-1).

Concrete example:
D: 8190, 864x3390 polynomial multiplication (batch size = 13619970)
B1=2M, Requested B2=400M
q=11
B2 start = (400M/11) rounded down to the nearest multiple of D = 36363600
num batches = (400M - B2 start)/batch size, rounded up = 27
Actual B2 = B2 start + batch size * 27 = 404102790
The actual B2 matches what P95 is outputting.

Now, we calculate the excess x = 404102790 - 400M = 4102790
Adjustment = x/(q-1) = 4102790/10 = 410279, rounded down to nearest multiple of D = 409500
New B2 start = 36363600 + 409500 = 36773100
New actual B2 = 404102790 + 409500 = 404512290
Sanity check = 404512290/11 rounded down is 36773100 !!

In this case, the change is small since Actual B2 was very close to Requested B2 (due to smaller batch size). In cases where batch size is larger, the corresponding change also will be larger. Most importantly, this is free.
Last fiddled with by axn on 2021-12-09 at 08:53 |
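The one-step adjustment from the concrete example can be sketched in Python. This is only an illustration of the arithmetic in the post, not Prime95's actual code: the function name `adjust_b2` and its parameter names are hypothetical, and it applies the x/(q-1) correction once rather than iterating to convergence.

```python
def adjust_b2(requested_b2, d, batch_size, q):
    """Return (b2_start, actual_b2, new_b2_start, new_actual_b2).

    Hypothetical helper mirroring the worked example in the post:
    d is the D value, q the smallest prime not dividing D.
    """
    # B2 start: Requested B2 / q, rounded down to a multiple of D.
    b2_start = (requested_b2 // q) // d * d
    # Number of batches, rounded up (ceiling division).
    num_batches = -(-(requested_b2 - b2_start) // batch_size)
    actual_b2 = b2_start + num_batches * batch_size
    # Excess over the requested bound, then the x/(q-1) adjustment,
    # rounded down to a multiple of D.
    excess = actual_b2 - requested_b2
    adjustment = (excess // (q - 1)) // d * d
    return b2_start, actual_b2, b2_start + adjustment, actual_b2 + adjustment


# Numbers from the post: D=8190, batch size 13619970, Requested B2=400M, q=11.
print(adjust_b2(400_000_000, 8190, 13_619_970, 11))
# -> (36363600, 404102790, 36773100, 404512290)
```

The batch size is passed in rather than derived, since the post states it directly for the 864x3390 polynomial multiplication.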
|
|
|
|
|
#135 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
11001101001₂ Posts |
Quote:
Let the new step = 11*1050 = 11550; then eulerphi(step) = 10*240 = 2400. Partition the reduced residue system modulo 11550 into 10 equal-size sets such that whenever c is in a given set, 11550-c is in the same set. You can do this because 2400 is divisible by 10 and because c and 11550-c are coprime to 11550 at the same time. Any such partition would be good: say, whenever you put c in a set, also put 11550-c in it, and stop when the set reaches size 240. At the end, all sets will be symmetric about the midpoint 11550/2.

Call your code for each set/polynomial separately. Now the step is 11 times larger than before, but you have eliminated all P divisible by 11, so you have ten problems/polynomials, and you can process each polynomial 11 times faster than before. Each polynomial has degree = 240/2 = 120, and with polymult you evaluate at 403-2*120+1 = 164 points at once, "eliminating" 1050*164 = 172200 basepoints coprime to 11550.

If t is the init time and T is the total time (which includes init time) for your original run, then my suggestion will use the same memory and run in 10*t + 10/11*(T-t) time. This is smaller than T iff t < T/100 (the inequality rearranges to 100/11*t < T/11), which will definitely be true if memory is fixed and B2 -> inf, since here t is constant (fixed). |
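One way to build such a symmetric partition can be sketched in Python. This is an illustrative construction only, not code from Prime95 or from the quoted post: the function name `symmetric_partition` is hypothetical, and chunking consecutive residues is just one of the many valid partitions the quote allows.

```python
from math import gcd


def symmetric_partition(step, num_sets):
    """Split the residues coprime to `step` into `num_sets` equal sets,
    each containing step - c whenever it contains c."""
    # Residues below the midpoint; each one pairs with step - c above it.
    half = [c for c in range(1, step // 2 + 1) if gcd(c, step) == 1]
    if len(half) % num_sets != 0:
        raise ValueError("residue pairs do not split evenly into the sets")
    pairs_per_set = len(half) // num_sets
    sets = []
    for i in range(num_sets):
        chunk = half[i * pairs_per_set:(i + 1) * pairs_per_set]
        # Add the mirror image of every residue in the chunk.
        sets.append(sorted(chunk + [step - c for c in chunk]))
    return sets


sets = symmetric_partition(11550, 10)
print(len(sets), len(sets[0]))  # -> 10 240
```

Each resulting set has 240 residues (120 mirrored pairs), matching the degree-120 polynomials described in the quote.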
|
|
|
|
|
|
#136 |
|
Dec 2016
127 Posts |
My 30.8b4 got stuck while starting stage 2. In worktodo.txt I had
Code:
Pminus1=N/A,1,2,20479,-1,5000000000,5000000000000,123,"2948977,180092327,710908007,26014268911,686893590107252399,6294660677916349147081,310138432659875910149871768953,12124683172430190710498405470618889"
mprime's last words were
Code:
[Work thread Dec 9 21:29] M20479 stage 1 complete. 21057674982 transforms. Total time: 22107.028 sec.
[Work thread Dec 9 21:29] Inversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.001 sec.
[Work thread Dec 9 21:29] Switching to FMA3 FFT length 1280
[Work thread Dec 9 21:29] Using 59400MB of memory. D: 7657650, 691200x5283889 polynomial multiplication.
[Work thread Dec 9 21:29] Setting affinity to run polymult helper thread on logical CPUs 2 (zero-based)
[Work thread Dec 9 21:29] Setting affinity to run polymult helper thread on logical CPUs 4 (zero-based)
[Work thread Dec 9 21:29] Setting affinity to run polymult helper thread on logical CPUs 6 (zero-based)
According to strace, the process is waiting for some lock:
Code:
# strace -p 27536 -q
futex(0x7f92d21f69d0, FUTEX_WAIT, 27572, NULL |
|
|
|
|
|
#137 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17×487 Posts |
|
|
|
|
|
|
#138 |
|
Dec 2016
127 Posts |
|
|
|
|
|
|
#139 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
20127₈ Posts |
|
|
|
|
|
|
#140 |
|
Dec 2002
881 Posts |
GHz-days for P-1 needs to be re-adjusted.
A couple of parameters on mersenne.ca as well. |
|
|
|
|
|
#141 | |
|
"Lisander Viaene"
Oct 2020
Belgium
2⁴×7 Posts |
Quote:
If (but I assume when) the GHzDs calculation is adjusted, I'd definitely like to request that my own results produced with 30.8 be recalculated! |
|
|
|
|
|
|
#142 |
|
Aug 2020
79*6581e-4;3*2539e-3
1011011010₂ Posts |
I gave it a try for exponents in the 60.5M range, and the increase in speed is incredible:
Code:
i9-10900K, 16GB RAM, Ubuntu
60514973 NF-PM1 2021-12-10 13:44 B1=2000000, B2=174051780 41.6883
60507347 NF-PM1 2021-12-10 09:00 B1=2000000, B2=122000000 27.5025
60507143 NF-PM1 2021-12-09 23:29 B1=2000000, B2=122000000 27.5025
60507119 NF-PM1 2021-12-09 14:06 B1=2000000, B2=122000000 27.5025
60505651 NF-PM1 2021-12-09 04:42 B1=2000000, B2=122000000 27.5025
60505337 NF-PM1 2021-12-08 19:14 B1=2000000, B2=122000000 27.5025
That range is just off cat1 DC, and that should definitely impact the validation rate.

Who came up with this optimization? Is it a new idea or based on something published elsewhere? And, just out of curiosity, why wasn't it used before? Is it specific to Mersenne numbers, or could it be used more generally, for Proth numbers or even any integer?

And, a last question (for now): could it be applied to P+1 as well?
Last fiddled with by bur on 2021-12-10 at 14:05 |
|
|
|
|
|
#143 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7,823 Posts |
Quote:
Last fiddled with by kriesel on 2021-12-10 at 14:19 |
|
|
|
|