mersenneforum.org

Prime95 30.8 (big P-1 changes, see post #551) (https://www.mersenneforum.org/showthread.php?t=27366)

nordi 2021-11-28 23:43

I also benchmarked four Zen2 cores (= 1 core complex) working on [M]11977759[/M] (stage 2 FFT length 768K) with B2=50,000,000 (which mprime adjusted slightly) and different RAM settings. The timings are for stage 2 init and stage 2 itself, plus the total time.

[code]
RAM     init + stage 2 = total                B2 used
8.5 GB  10.8 + 315.6 = [B]326.4 seconds[/B]  B2=51,228,870
17 GB   24.6 + 165.7 = [B]190.3 seconds[/B]  B2=51,278,370
34 GB   43.4 + 138.3 = [B]181.7 seconds[/B]  B2=[B]72[/B],162,090
[/code]
Doubling RAM from 8.5 to 17 GB gave 72% more throughput.
Doubling RAM again to 34 GB gave 5% more throughput at a much higher B2.
Even with 96 GB available, mprime still used 'only' 34 GB, so I have no further benchmark results. Still, this version wants LOTS of RAM and puts it to excellent use.
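For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic. It assumes throughput is simply the inverse of total wall time (init plus stage 2); the function names are made up for illustration. Note that the 34 GB run also covered a larger B2, so its gain understates the real improvement.

```python
# Timings copied from the benchmark above: (RAM in GB, stage 2 init s, stage 2 s).
runs = [
    (8.5, 10.8, 315.6),
    (17.0, 24.6, 165.7),
    (34.0, 43.4, 138.3),
]


def total_seconds(init_s, stage2_s):
    """Total stage 2 wall time: init plus the stage 2 run itself."""
    return init_s + stage2_s


def throughput_gain_percent(old_total, new_total):
    """Gain in jobs per unit time when the same job takes new_total
    instead of old_total seconds (assumes throughput = 1 / time)."""
    return (old_total / new_total - 1.0) * 100.0


totals = [total_seconds(init_s, stage2_s) for _, init_s, stage2_s in runs]

# Compare each RAM setting with the next larger one.
gains = [throughput_gain_percent(a, b) for a, b in zip(totals, totals[1:])]

for (ram_gb, _, _), total in zip(runs, totals):
    print(f"{ram_gb:>5} GB: {total:.1f} s")
for (ram_gb, _, _), gain in zip(runs[1:], gains):
    print(f"doubling to {ram_gb} GB: {gain:.1f}% more throughput")
```

The two gains come out just under 72% and just under 5%, matching the rounded figures quoted in the post.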

Prime95 2021-11-29 00:36

[QUOTE=nordi;594108]The automatically chosen B2 was too aggressive![/QUOTE]

That will be a problem for a while. Optimal B2 uses a cost function which I have not worked on much. There's little point working on the cost function while the stage 2 code is still being optimized.

I noticed the same thing here on exponents around 80K. B1 of 300 million (2 hours) is getting a B2 of 12 trillion (4 hours).

Zhangrc 2021-11-29 04:54

B2=90M for wavefront P-1(108M)
 
[code]
[Nov 29 12:46] Setting affinity to run worker on CPU core #2
[Nov 29 12:46] Optimal P-1 factoring of M108390077 using up to 11571MB of memory.
[Nov 29 12:46] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Nov 29 12:46] Optimal bounds are B1=956000, [B]B2=89586000[/B]
[Nov 29 12:46] Chance of finding a factor is an estimated 4.7%
[Nov 29 12:46]
[Nov 29 12:46] Using FMA3 FFT length 5760K, Pass1=768, Pass2=7680, clm=4, 4 threads
[/code]
Impressive.

Glenn 2021-11-30 18:57

Prime95 30.8 (pre-beta) (FOR P-1 USERS ONLY; SMALL EXPONENTS ONLY)
 
Looks like 30.8 builds are now available. I just downloaded build 2. This should be made a Sticky as soon as possible.

Uncwilly 2021-11-30 19:04

30.8 is pre-beta. It should not be stickied yet.
See here for the current issues: [url]https://www.mersenneforum.org/showpost.php?p=594097&postcount=988[/url]

Prime95 2021-11-30 19:24

30.8 is [B]not ready for prime-time[/B]!

I made this version available much earlier than normal because it has significant improvements for P-1 stage 2 on "smaller" exponents. This version is only for P-1 users.

Glenn 2021-11-30 20:26

Understood. I won’t start using it yet. Hopefully later builds will fix things.

I couldn’t download the stable version of 30.7, only build 9, which I’m currently using.

techn1ciaN 2021-11-30 20:54

[QUOTE=Glenn;594225]I couldn’t download the stable version of 30.7, only build 9, which I’m currently using.[/QUOTE]

That is the stable version. James Heinrich said in the 30.7 thread that the problem with the mersenne.org download should already have been fixed, unless you were experiencing a different one.

lisanderke 2021-11-30 22:53

Perhaps the title of this post could be edited to make clear at first glance that it is not ready for all users, at least until that version comes out of pre-beta (something like: "Prime95 30.8 (ONLY FOR P-1 USERS)").
I think it might be nice to move the discussion and bug reports from the sub two k thread to here, in the software category, since quite a lot of the posts there are mostly about this release/pre-beta version.


Just a suggestion, of course, and thanks for all the continued hard work on this software!

axn 2021-12-01 07:26

Build 2 is bad with multithreading:
[CODE]P-1 on M5401951 with B1=8000000, B2=8000000000
Setting affinity to run helper thread 1 on CPU core #2
Setting affinity to run helper thread 3 on CPU core #4
Setting affinity to run helper thread 4 on CPU core #5
Setting affinity to run helper thread 2 on CPU core #3
Using FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2, 6 threads
Setting affinity to run helper thread 5 on CPU core #6
Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 1.024 sec.
Setting affinity to run helper thread 1 on CPU core #2
Setting affinity to run helper thread 3 on CPU core #4
Switching to FMA3 FFT length 336K, Pass1=448, Pass2=768, clm=1, 6 threads
Setting affinity to run helper thread 4 on CPU core #5
Setting affinity to run helper thread 2 on CPU core #3
Setting affinity to run helper thread 5 on CPU core #6
Using 56770MB of memory. D: 43890, 4320x16961 polynomial multiplication.
Round off: 0, poly_size: 2, EB: 1.67728, SM: 3.33496
Round off: 0, poly_size: 4
Round off: 0, poly_size: 8
Round off: 0, poly_size: 16
Round off: 0, poly_size: 32
Round off: 0, poly_size: 64
Round off: 0, poly_size: 128
Round off: 0, poly_size: 256
Round off: 0, poly_size: 512
Round off: 0, poly_size: 1024
Round off: 0, poly_size: 2048
Round off: 0, poly_size: 4096
Round off: 0, poly_size: 8192
Stage 2 init complete. 148272 transforms. Time: 158.998 sec.
Round off: 0
M5401951 stage 2 is 0.00% complete.
M5401951 stage 2 complete. 2128051 transforms. Total time: 2374.162 sec.
Stage 2 GCD complete. Time: 0.652 sec.
M5401951 completed P-1, B1=8000000, B2=8285685870[/CODE]
Compare to build 1:
[CODE]P-1 on M5401993 with B1=8000000, B2=8000000000
Using FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2, 6 threads
Setting affinity to run helper thread 3 on CPU core #4
Setting affinity to run helper thread 2 on CPU core #3
Setting affinity to run helper thread 1 on CPU core #2
Setting affinity to run helper thread 5 on CPU core #6
Setting affinity to run helper thread 4 on CPU core #5
Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 1.021 sec.
Setting affinity to run helper thread 1 on CPU core #2
Switching to FMA3 FFT length 336K, Pass1=448, Pass2=768, clm=1, 6 threads
Setting affinity to run helper thread 3 on CPU core #4
Setting affinity to run helper thread 2 on CPU core #3
Setting affinity to run helper thread 4 on CPU core #5
Setting affinity to run helper thread 5 on CPU core #6
Using 56770MB of memory. D: 43890, 4320x16961 polynomial multiplication.
Setting affinity to run polymult helper thread on CPU core #2
Setting affinity to run polymult helper thread on CPU core #3
Setting affinity to run polymult helper thread on CPU core #4
Setting affinity to run polymult helper thread on CPU core #5
Setting affinity to run polymult helper thread on CPU core #6
Stage 2 init complete. 148272 transforms. Time: 112.924 sec.
M5401993 stage 2 is 0.00% complete.
M5401993 stage 2 complete. 2128051 transforms. Total time: 942.714 sec.
Stage 2 GCD complete. Time: 0.663 sec.
M5401993 completed P-1, B1=8000000, B2=8285685870[/CODE]

2374 s vs 942 s. top shows build 2 using about 200% CPU (with occasional spikes to 500+%), whereas build 1 is consistently pegged at ~600%.
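To put a number on the regression, here is a small sketch that pulls the transform count and total time out of the two "stage 2 complete" lines above and compares transforms per second. The regular expression is only matched against the log wording shown in this post, and the helper name is hypothetical. The two runs are on different exponents, but they use the same FFT lengths and the same number of transforms, so the comparison seems fair.

```python
import re

# Lines copied verbatim from the two logs above.
build2_line = "M5401951 stage 2 complete. 2128051 transforms. Total time: 2374.162 sec."
build1_line = "M5401993 stage 2 complete. 2128051 transforms. Total time: 942.714 sec."

# Assumed pattern: it matches the wording of these two lines only.
STAGE2_DONE = re.compile(
    r"stage 2 complete\. (\d+) transforms\. Total time: ([\d.]+) sec\."
)


def transforms_per_second(line):
    """Return the stage 2 transform rate reported by one log line."""
    match = STAGE2_DONE.search(line)
    if match is None:
        raise ValueError(f"not a stage 2 completion line: {line!r}")
    transforms = int(match.group(1))
    seconds = float(match.group(2))
    return transforms / seconds


rate_build1 = transforms_per_second(build1_line)
rate_build2 = transforms_per_second(build2_line)
slowdown = rate_build1 / rate_build2

print(f"build 1: {rate_build1:.0f} transforms/s")
print(f"build 2: {rate_build2:.0f} transforms/s")
print(f"build 2 is {slowdown:.2f}x slower")
```

A slowdown of roughly 2.5x is in the same ballpark as the CPU usage seen in top (about 200% versus about 600%), which would be consistent with the polymult helper threads not running in build 2.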

kruoli 2021-12-01 15:46

[QUOTE=kruoli;594103]My test case was two workers. The first had a known factor. The second had some other work:
[CODE][Worker #1]
Pminus1=N/A,1,2,22463209,-1,1000000,324000000,75
[Worker #2]
Pminus1=N/A,1,2,21362113,-1,1000000,32400000,75
Pminus1=N/A,1,2,21362903,-1,1000000,32400000,75[/CODE]
It started normally, but did not state which B2 it wanted to use. I had a stage 1 file which it used successfully. While stage 2 in worker #1 was running (using 110-115% of the memory I had allowed it), stage 1 of the first assignment in worker #2 completed and the second assignment was started. After the factor was found, the worktodo entry in worker #1 was removed. It then crashed with error code 0xc0000005 at 0x000000000208b09a.

I tried to start the program again. When entering the worker #2 start (it now tried to start stage 2 of the first assignment of worker #2), it gave a B2 value this time, but crashed again. So I ran it in the debugger and got an error at 0x00007FF7093CB09A in prime95.exe: 0xC0000005: access violation exception reading 0xFFFFFFFFFFFFFFE4.[/QUOTE]

George, do you need a save file for that? I tested some more (stage 1 done by 30.8b2) and got this again with another exponent, but some exponents are fine. I omitted the system details… This was on a 1950X.
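In case it helps anyone reproduce this, here is a rough sketch that splits the worktodo lines from the quoted test case into named fields. It assumes the field order is assignment ID, k, b, n, c, B1, B2, then how far the exponent has been trial factored (in bits), for a number of the form k*b^n+c. That ordering is my reading of the entries, and the class and function names are purely illustrative, not anything taken from Prime95 itself.

```python
from dataclasses import dataclass


@dataclass
class Pminus1Entry:
    """One Pminus1 worktodo line (field meanings are assumed, see above)."""
    assignment_id: str
    k: int
    b: int
    n: int
    c: int
    b1: int
    b2: int
    tf_bits: int


def parse_pminus1(line):
    """Parse a line such as 'Pminus1=N/A,1,2,22463209,-1,1000000,324000000,75'."""
    key, _, value = line.partition("=")
    if key.strip() != "Pminus1":
        raise ValueError(f"not a Pminus1 entry: {line!r}")
    fields = [field.strip() for field in value.split(",")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    k, b, n, c, b1, b2, tf_bits = (int(field) for field in fields[1:])
    return Pminus1Entry(fields[0], k, b, n, c, b1, b2, tf_bits)


# The three entries from the quoted test case.
worker1 = parse_pminus1("Pminus1=N/A,1,2,22463209,-1,1000000,324000000,75")
worker2 = [
    parse_pminus1("Pminus1=N/A,1,2,21362113,-1,1000000,32400000,75"),
    parse_pminus1("Pminus1=N/A,1,2,21362903,-1,1000000,32400000,75"),
]

for entry in [worker1, *worker2]:
    print(f"M{entry.n}: B1={entry.b1}, B2={entry.b2}, B2/B1={entry.b2 // entry.b1}")
```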


All times are UTC.
