#89 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts |
Quote:
We create a polynomial with 120 coefficients that must be evaluated at multiples of D. Montgomery/Silverman/Kruppa show how to evaluate the polynomial at multiple points using polynomial multiplication. The 403 is the number of polynomial coefficients I can allocate for the second polynomial. FFT size and available memory dictate this number. A single polynomial multiply evaluates the first polynomial at 403-2*120+1 = 164 points, thus advancing toward B2 in steps of 1050 * 164 = 172200. |
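The arithmetic in the quote can be checked with a few lines of Python. This is only a sketch of the bookkeeping, not of the polymult itself. The function names are made up for illustration, and D = 1050 is inferred from the "1050 * 164" in the quote rather than stated outright.

```python
# Numbers taken from the quoted post. D = 1050 is an assumption inferred
# from the product "1050 * 164" in the quote.
POLY1_COEFFS = 120   # coefficients in the first polynomial
POLY2_COEFFS = 403   # coefficients that memory allows for the second polynomial
D = 1050             # assumed spacing between evaluation points


def points_per_polymult(n1: int, n2: int) -> int:
    """Points evaluated by one polynomial multiply, using the quoted formula n2 - 2*n1 + 1."""
    return n2 - 2 * n1 + 1


def b2_step(n1: int, n2: int, d: int) -> int:
    """Distance advanced toward B2 by one polynomial multiply."""
    return d * points_per_polymult(n1, n2)


points = points_per_polymult(POLY1_COEFFS, POLY2_COEFFS)
step = b2_step(POLY1_COEFFS, POLY2_COEFFS, D)
print(points, step)  # 164 172200
```

With more memory the second polynomial can hold more coefficients, so each multiply covers more points and the step toward B2 grows accordingly.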
|
|
|
|
|
|
#90 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
2057₁₆ Posts |
The number of transforms is only part of the stage 2 cost. The other significant cost is the polynomial multiplies. At present, there is no data output on the number of polymults or how expensive they were.
|
|
|
|
|
|
#91 | |
|
Jun 2003
2³·683 Posts |
Quote:
Gotcha. |
|
|
|
|
|
|
#92 |
|
"Seth"
Apr 2019
2×3×83 Posts |
With `MaxHighMemoryWorkers=1` 30.8v2 will resume two high memory workers at the same time.
Code:
$ cat worktodo.txt
[Worker #1]
Pminus1=1,2,50111,-1,3000000,1000000000
[Worker #2]
Pminus1=1,2,50227,-1,6000000,10000000000
[Worker #3]
Pminus1=1,2,50263,-1,9000000,100000000000
Code:
five:~/Downloads/GIMPS/p95$ ./mprimev308b2 -m -d
[Main thread Dec 4 03:39] Mersenne number primality test program version 30.8
[Main thread Dec 4 03:39] Optimizing for CPU architecture: AMD Zen, L2 cache size: 12x512 KB, L3 cache size: 4x16 MB
Your choice: 4
Worker to start, 0=all (0): 0
Your choice: [Main thread Dec 4 03:39] Starting workers.
[Worker #2 Dec 4 03:39] Waiting 5 seconds to stagger worker starts.
[Worker #3 Dec 4 03:39] Waiting 10 seconds to stagger worker starts.
[Worker #1 Dec 4 03:39] P-1 on M50111 with B1=3000000, B2=1000000000
[Worker #2 Dec 4 03:39] P-1 on M50227 with B1=6000000, B2=10000000000
[Worker #3 Dec 4 03:39] P-1 on M50263 with B1=9000000, B2=100000000000
[Worker #1 Dec 4 03:39] M50111 stage 1 complete. 8656318 transforms. Total time: 22.501 sec.
[Worker #1 Dec 4 03:39] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.002 sec.
[Worker #1 Dec 4 03:39] Available memory is 7916MB.
[Worker #1 Dec 4 03:39] Using 7916MB of memory. D: 510510, 46080x279844 polynomial multiplication.
...
[Worker #2 Dec 4 03:40] M50227 stage 1 complete. 17311478 transforms. Total time: 45.504 sec.
[Worker #2 Dec 4 03:40] Exceeded limit on number of workers that can use lots of memory.
[Worker #2 Dec 4 03:40] Looking for work that uses less memory.
[Worker #2 Dec 4 03:40] No work to do at the present time. Waiting.
...
[Worker #3 Dec 4 03:40] M50263 stage 1 complete. 25971112 transforms. Total time: 68.424 sec.
[Worker #3 Dec 4 03:40] Exceeded limit on number of workers that can use lots of memory.
[Worker #3 Dec 4 03:40] Looking for work that uses less memory.
[Worker #3 Dec 4 03:40] No work to do at the present time. Waiting.
...
[Worker #1 Dec 4 03:41] Stage 2 GCD complete. Time: 0.001 sec.
[Worker #1 Dec 4 03:41] M50111 completed P-1, B1=3000000, B2=95867651880, Wi8: 53020C14
[Worker #1 Dec 4 03:41] No work to do at the present time. Waiting.
[Worker #2 Dec 4 03:41] Restarting worker with new memory settings.
[Worker #3 Dec 4 03:41] Restarting worker with new memory settings.
[Worker #2 Dec 4 03:41] Resuming.
[Worker #3 Dec 4 03:41] Resuming.
...
[Worker #2 Dec 4 03:41] P-1 on M50227 with B1=6000000, B2=10000000000
[Worker #3 Dec 4 03:41] P-1 on M50263 with B1=9000000, B2=100000000000
Segmentation fault (core dumped) |
|
|
|
|
|
#93 | |
|
Oct 2021
U. S. / New York, NY
2·3·5² Posts |
Quote:
By the logic of your suggestion, we might recompute the TF credit formula, since the current one is still from when TF was done by CPU even though today's TF is run on GPUs with vastly greater throughput. While superficially reasonable, this probably doesn't make sense because we can see that having "inflated" credit on offer incentivizes GPU owners to run the more efficient TF and not the less efficient primality testing.
Last fiddled with by techn1ciaN on 2021-12-05 at 02:30 Reason: Clarifying adjective |
|
|
|
|
|
|
#94 |
|
Aug 2002
Buenos Aires, Argentina
1523₁₀ Posts |
It appears that 30.8 runs faster than previous versions on P-1 not only when there are large amounts of RAM, but also on small exponents.
In my case (using 8GB of RAM on an i5-3470), Prime95 required 5 days to get the following:
Code:
processing: P-1 no-factor for M9325159 (B1=50,000,000, B2=50,001,265,860)
CPU credit is 1312.7590 GHz-days.
The difference between 1 hour and 5 days (to get half the credit) cannot be explained only by the amount of RAM in the system. |
|
|
|
|
|
#95 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
8279₁₀ Posts |
|
|
|
|
|
|
#96 | |
|
Jun 2003
1558₁₆ Posts |
Quote:
Anyway, whenever you release build 3(?) (with this and other bug fixes), I'll switch over from build 1, which so far seems to be working fine for my use case. |
|
|
|
|
|
|
#97 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
8279₁₀ Posts |
This version adds SSE2, FMA, and AVX-512 support, plus non-power-of-two FFTs in polymult. Stage 2 now takes advantage of an FFT's ability to do circular convolution. The upshot is that stage 2 is now faster.
Fixed some bugs.
The Linux version required an upgrade to GCC 8 for AVX-512 support. This could pose GCC library issues for some users.
To address the over-aggressive B2 calculations, I added the option Pm1CostFudge=n to prime.txt. The default value is 2.5. This option says to multiply the stage 2 cost estimate by n. It may disappear when I get around to writing a more accurate costing function.
Added Stage2ExtraThreads=n to prime.txt. Hyperthreading might help polymult; this gives polymult more threads to chew on. Untested.
Highest priority next is save files, interruptibility, and some status reporting. And major bug fixes.
Should you wish to try 30.8, the same warnings as before apply. Links are below.
Windows 64-bit: https://mersenne.org/ftp_root/gimps/p95v308b3.win64.zip
Linux 64-bit: https://mersenne.org/ftp_root/gimps/...linux64.tar.gz
Last fiddled with by Prime95 on 2021-12-05 at 05:57 |
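For anyone who would rather script the two new prime.txt options than edit the file by hand, here is a rough Python sketch. It is not part of Prime95. The helper name `set_option`, the sample file contents, and the `SomeExistingOption` key are hypothetical, and the sketch assumes prime.txt is a plain list of key=value lines, possibly followed by bracketed sections such as `[Worker #1]`, with global options placed before the first section.

```python
def set_option(text: str, key: str, value) -> str:
    """Return prime.txt-style text with key=value set (hypothetical helper).

    Assumption: options are plain "key=value" lines, and a new global
    option belongs before the first "[...]" section header, if any.
    """
    lines = text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        # Replace the first existing line that sets this key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        # Key not present: insert before the first section header,
        # or append at the end when there are no sections.
        insert_at = next(
            (i for i, line in enumerate(lines) if line.startswith("[")),
            len(lines),
        )
        lines.insert(insert_at, new_line)
    return "\n".join(lines) + "\n"


# Hypothetical file contents, for illustration only.
sample = "SomeExistingOption=1\n[Worker #1]\nAnotherOption=2\n"
updated = set_option(sample, "Pm1CostFudge", 2.5)     # default value per the post
updated = set_option(updated, "Stage2ExtraThreads", 2)  # 2 is an arbitrary example
print(updated)
```

Stop the program before changing prime.txt and check the result by eye, since the layout assumed here may not match your file.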
|
|
|
|
|
#98 |
|
Jun 2003
2³·683 Posts |
Wow! 330s -> 212s
|
|
|
|
|
|
#99 | |
|
"Florian"
Oct 2021
Germany
CE₁₆ Posts |
Quote:
With this setting enabled, stage 2 went from 745s to 725s on a 25.6M exponent (B1/B2 = 700k and 450M). Stage 2 init went from ~90s to ~60s though.
Last fiddled with by Luminescence on 2021-12-05 at 07:58 Reason: Last line |
|
|
|
|