mersenneforum.org  

Old 2020-12-12, 10:16   #1
ZFR
 
Feb 2008
Bray, Ireland

149 Posts
Assigning too much memory slows down P-1 stage 2?

Using mprime. I have a machine with 32GB memory. I'm running PRPs/DC on 3 workers, but thought I'd assign one worker for P-1.

I assigned 4GB memory initially. For the first exponent, when it reached stage 2, the bounds were as follows:
Quote:
[Worker #2 Dec 10 09:50] Optimal P-1 factoring of M107608939 using up to 4000MB of memory.
[Worker #2 Dec 10 09:50] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Worker #2 Dec 10 09:50] Optimal bounds are B1=740000, B2=20837000
[Worker #2 Dec 10 09:50] Chance of finding a factor is an estimated 3.51%
And then the times were as follows:

Quote:
[Worker #3 Dec 10 10:06] M108745423 stage 2 is 1.59% complete. Time: 950.778 sec.
[Worker #2 Dec 10 10:06] M107608939 stage 2 is 2.16% complete. Time: 951.162 sec.
[Worker #2 Dec 10 10:22] M107608939 stage 2 is 4.31% complete. Time: 944.688 sec.
Then I thought, why not give it 24GB of memory? I only run mprime when the machine is idle anyway.

So for the next exponent, I got this at stage 2:

Quote:
[Worker #2 Dec 11 11:58] Optimal P-1 factoring of M107610803 using up to 24000MB of memory.
[Worker #2 Dec 11 11:58] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Worker #2 Dec 11 11:58] Optimal bounds are B1=744000, B2=21803000
[Worker #2 Dec 11 11:58] Chance of finding a factor is an estimated 3.55%
and these timings:
Quote:
[Worker #2 Dec 11 12:32] M107610803 stage 2 is 0.06% complete.
[Worker #2 Dec 11 12:49] M107610803 stage 2 is 0.93% complete. Time: 1031.830 sec.
[Worker #2 Dec 11 13:06] M107610803 stage 2 is 1.80% complete. Time: 1043.622 sec.
or, later in the same run:

Quote:
[Worker #2 Dec 11 15:45] M107610803 stage 2 is 3.40% complete. Time: 961.250 sec.
[Worker #2 Dec 11 16:01] M107610803 stage 2 is 4.28% complete. Time: 959.943 sec.
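After increasing memory from 4GB to 24GB, the time per percent of stage 2 more than doubled (about 0.87% per ~1000-second interval, versus about 2.15% per ~950-second interval before). But the bounds and the chance of finding a factor barely increased. Is this really optimal?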
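
For anyone who wants to put a number on the slowdown, here is a minimal Python sketch that turns the progress lines quoted above into seconds per percent of stage 2. The function name and the regular expression are my own, not part of mprime, and it assumes the "Time:" value on each line is the time elapsed since the previous progress line. Note that "percent complete" is relative to each run's own bounds, so this compares progress rate only, not the amount of work done.

```python
import re

# Matches progress lines such as:
# [Worker #2 Dec 10 10:22] M107608939 stage 2 is 4.31% complete. Time: 944.688 sec.
PROGRESS = re.compile(r"stage 2 is ([\d.]+)% complete\. Time: ([\d.]+) sec\.")

def seconds_per_percent(lines):
    """Seconds spent per percent gained between the first and last timed lines."""
    samples = [(float(m.group(1)), float(m.group(2)))
               for m in map(PROGRESS.search, lines) if m]
    gained = samples[-1][0] - samples[0][0]
    # Each "Time:" covers the interval ending at that line, so skip the first one.
    elapsed = sum(secs for _, secs in samples[1:])
    return elapsed / gained

BEFORE = [
    "[Worker #2 Dec 10 10:06] M107608939 stage 2 is 2.16% complete. Time: 951.162 sec.",
    "[Worker #2 Dec 10 10:22] M107608939 stage 2 is 4.31% complete. Time: 944.688 sec.",
]
AFTER = [
    "[Worker #2 Dec 11 12:49] M107610803 stage 2 is 0.93% complete. Time: 1031.830 sec.",
    "[Worker #2 Dec 11 13:06] M107610803 stage 2 is 1.80% complete. Time: 1043.622 sec.",
]

print(f"before: {seconds_per_percent(BEFORE):.0f} s per percent")
print(f"after:  {seconds_per_percent(AFTER):.0f} s per percent")
```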

Last fiddled with by ZFR on 2020-12-12 at 10:16
Old 2020-12-12, 15:39   #2
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

11253₈ Posts

Never seen that happen; it looks like memory access has slowed for the larger memory footprint. Is this a dual-socket / NUMA box? What OS?
Old 2020-12-12, 16:00   #3
ZFR
 
Feb 2008
Bray, Ireland

149 Posts

Quote:
Originally Posted by VBCurtis
Never seen that happen; it looks like memory access has slowed for the larger memory footprint. Is this a dual-socket / NUMA box? What OS?
Linux Mint 19.

Quote:
Machine:
Type: Desktop Mobo: Gigabyte model: Z97X-Gaming GT v: x.x serial: <filter>
BIOS: American Megatrends v: F7 date: 09/19/2015

CPU:
Topology: Quad Core model: Intel Core i7-4790K bits: 64 type: MT MCP
arch: Haswell rev: 3 L2 cache: 8192 KiB
flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 64003
Speed: 1682 MHz min/max: 800/4000 MHz Core speeds (MHz): 1: 1625 2: 1355
3: 1515 4: 1469 5: 1867 6: 1494 7: 1342 8: 1886

Memory:
RAM: total: 31.22 GiB
Array-1: capacity: 32 GiB slots: 4 EC: None
Device-1: ChannelA-DIMM0 size: 8 GiB speed: 1333 MT/s
Device-2: ChannelA-DIMM1 size: 8 GiB speed: 1333 MT/s
Device-3: ChannelB-DIMM0 size: 8 GiB speed: 1333 MT/s
Device-4: ChannelB-DIMM1 size: 8 GiB speed: 1333 MT/s

Old 2020-12-12, 16:14   #4
axn
 
Jun 2003

2·3·827 Posts

Something funky is going on. This is not supposed to happen. However...

Your "before" data shows two Stage 2 running in parallel. Is that right?

How many stage 2 are running in parallel in the "after" scenario?

Also, you're not showing how much memory is actually being used and how many relative primes are being processed. Can you provide those details as well (preferably for both before & after)?

Finally, are you seeing any hard disk activity indicating potential thrashing of memory?
Old 2020-12-12, 16:32   #5
ZFR
 
Feb 2008
Bray, Ireland

95₁₆ Posts

Quote:
Originally Posted by axn
Something funky is going on. This is not supposed to happen. However...

Your "before" data shows two Stage 2 running in parallel. Is that right?

How many stage 2 are running in parallel in the "after" scenario?
Yes, in the before scenario, one of the PRP threads was doing a P-1 on its exponent too.

In the "after" it's only this one that's doing P-1.



Quote:
Originally Posted by axn
Also, you're not showing how much memory is actually being used and how many relative primes are being processed. Can you provide those details as well (preferably for both before & after)?
Sure. I've got the output log.

Here is the full output of the "before". I've removed the other workers.
Quote:
[Worker #2 Dec 10 09:50] Setting affinity to run worker on CPU core #2
[Worker #2 Dec 10 09:50] Optimal P-1 factoring of M107608939 using up to 4000MB of memory.
[Worker #2 Dec 10 09:50] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Worker #2 Dec 10 09:50] Optimal bounds are B1=740000, B2=20837000
[Worker #2 Dec 10 09:50] Chance of finding a factor is an estimated 3.51%

[Worker #2 Dec 10 09:50] Using FMA3 FFT length 5760K, Pass1=1280, Pass2=4608, clm=4

[Worker #2 Dec 10 09:50] Ignoring suggested B1 value, using B1=610000 from the save file
[Worker #2 Dec 10 09:50] Ignoring suggested B2 value, using B2=7022000 from the save file

[Worker #2 Dec 10 09:50] Available memory is 1925MB.
[Worker #2 Dec 10 09:50] Using 1909MB of memory. Processing 37 relative primes (0 of 480 already processed).

[Worker #2 Dec 10 09:50] M107608939 stage 2 is 0.00% complete.

[Worker #2 Dec 10 10:06] M107608939 stage 2 is 2.16% complete. Time: 951.162 sec.

[Worker #2 Dec 10 10:22] M107608939 stage 2 is 4.31% complete. Time: 944.688 sec.
And the one from "after":

Quote:
[Worker #2 Dec 11 11:12] Optimal P-1 factoring of M107610803 using up to 24000MB of memory.
[Worker #2 Dec 11 11:12] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Worker #2 Dec 11 11:12] Optimal bounds are B1=744000, B2=21803000
[Worker #2 Dec 11 11:12] Chance of finding a factor is an estimated 3.55%
[Worker #2 Dec 11 11:12] Using FMA3 FFT length 5760K, Pass1=1280, Pass2=4608, clm=4
[Worker #2 Dec 11 11:12] M107610803 stage 1 is 99.85% complete.

[Worker #2 Dec 11 11:14] M107610803 stage 1 complete. 4336 transforms. Time: 113.176 sec.
[Worker #2 Dec 11 11:14] Starting stage 1 GCD - please be patient.
[Worker #2 Dec 11 11:15] Stage 1 GCD complete. Time: 48.280 sec.
[Worker #2 Dec 11 11:15] Available memory is 23742MB.
[Worker #2 Dec 11 11:15] Using 22960MB of memory. Processing 480 relative primes (0 of 480 already processed).

[Worker #2 Dec 11 11:44] M107610803 stage 2 is 0.00% complete. Time: 1771.636 sec.

[Worker #2 Dec 11 11:46] Worker stopped.

[Worker #2 Dec 11 11:58] Waiting 5 seconds to stagger worker starts.

[Worker #2 Dec 11 11:58] Worker starting
[Worker #2 Dec 11 11:58] Setting affinity to run worker on CPU core #2
[Worker #2 Dec 11 11:58] Optimal P-1 factoring of M107610803 using up to 24000MB of memory.
[Worker #2 Dec 11 11:58] Assuming no factors below 2^77 and 2 primality tests saved if a factor is found.
[Worker #2 Dec 11 11:58] Optimal bounds are B1=744000, B2=21803000
[Worker #2 Dec 11 11:58] Chance of finding a factor is an estimated 3.55%

[Worker #2 Dec 11 11:58] Using FMA3 FFT length 5760K, Pass1=1280, Pass2=4608, clm=4

[Worker #2 Dec 11 11:58] Available memory is 23742MB.
[Worker #2 Dec 11 11:58] Using 22960MB of memory. Processing 480 relative primes (0 of 480 already processed).

[Worker #2 Dec 11 12:32] M107610803 stage 2 is 0.06% complete.

[Worker #2 Dec 11 12:49] M107610803 stage 2 is 0.93% complete. Time: 1031.830 sec.

[Worker #2 Dec 11 13:06] M107610803 stage 2 is 1.80% complete. Time: 1043.622 sec.
I don't think there is any thrashing going on.


As I said, I keep the full output logs. So if you need me to post more, let me know.

Last fiddled with by ZFR on 2020-12-12 at 16:37
Old 2020-12-12, 17:43   #6
ZFR
 
Feb 2008
Bray, Ireland

149 Posts

Is it normal that whenever I start mprime I get "Processing 480 relative primes (0 of 480 already processed)"?
Old 2020-12-12, 18:40   #7
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

11033₈ Posts

Quote:
Originally Posted by ZFR
Is it normal that whenever I start mprime I get "Processing 480 relative primes (0 of 480 already processed)"?
The number in "Processing 480 relative primes" is a function of how much RAM you allocate.
In my experience, the bigger this number, the faster stage 2 completes.
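
To make the link between memory and that number concrete, here is a rough Python sketch using only the figures from the logs posted earlier in the thread (37 of 480 relative primes in 1909MB, and 480 of 480 in 22960MB). It assumes stage 2 works through the 480 relative primes in batches of whatever fits in memory; the function names are made up for illustration and are not mprime's.

```python
import math

TOTAL_RELATIVE_PRIMES = 480  # from the log: "(0 of 480 already processed)"

def stage2_passes(per_pass, total=TOTAL_RELATIVE_PRIMES):
    """Number of batches needed if each batch handles `per_pass` relative primes."""
    return math.ceil(total / per_pass)

def mb_per_relative_prime(memory_mb, per_pass):
    """Average memory used per relative prime held at once."""
    return memory_mb / per_pass

# Figures taken from the two logs in this thread.
for label, memory_mb, per_pass in [("1909MB run", 1909, 37), ("22960MB run", 22960, 480)]:
    print(label,
          "passes:", stage2_passes(per_pass),
          "MB per relative prime:", round(mb_per_relative_prime(memory_mb, per_pass), 1))
```

Both runs come out at roughly 50MB per relative prime, which fits the idea that the count simply scales with the memory available.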
Old 2020-12-12, 19:20   #8
ZFR
 
Feb 2008
Bray, Ireland

10010101₂ Posts

Is there a way to restart stage 2 only, while keeping the stage 1 result? I want to test different memory values to see what it looks like.

I have

m107610803
m107610803.bu
m107610803.bu2
m107610803.write

files. If I remove them, it starts from P-1 stage 1.
Old 2020-12-12, 20:30   #9
ZFR
 
Feb 2008
Bray, Ireland

149 Posts

By the way, not sure if it matters, but P-1 stage 1 was stopped midway before doing stage 2.

Basically, initially I only had around 300MB assigned.

Did stage 1 of exponent1
Started stage 1 of exponent2
(I wondered why stage 2 of exponent1 didn't start, checked the logs and saw too little memory)
Increased to 4GB, stage 2 of exponent1 starts. Normal speed.
At the same time stage 2 of one of the PRPs starts. Normal speed.
Stage 2 of exponent1 finishes.
Stage 1 of exponent2 completes.
Stage 2 of exponent2 starts. Slow speed.

Anyway, I've removed those m107610803 files and am restarting this. Let me see how it goes.
Old 2020-12-12, 22:35   #10
ZFR
 
Feb 2008
Bray, Ireland

149 Posts

OK, that looks like the reason.

The first one used the original bounds for its stage 2; the ones that were used at the end of its stage 1.

The second one switched bounds in the middle of its stage 1.

So while the second one is slower, it has much bigger bounds.

I'm not 100% familiar with the 2 stage P-1 algorithm. Are you allowed to change bounds in the middle of stage 1? When I change memory allocation while in stage 1, the bounds and speed change accordingly. But when I change memory allocation while in stage 2, I get the message "Ignoring optimal bounds, using bounds from save file".

Is this correct?
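
On the question of changing B1 partway through: in the textbook P-1 method, stage 1 computes base^E mod N, where E is the product of all prime powers up to B1. A residue saved at a smaller B1 can therefore be carried forward to a larger B1 by raising it to the prime powers that are still missing. The toy Python sketch below shows only that property. It is not mprime's code, the function names are hypothetical, stage 2 is not modelled, and whether mprime handles a mid-stage bound change this way is not something this thread confirms.

```python
from math import gcd

def primes_up_to(n):
    """Simple sieve of Eratosthenes; returns all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = 0
    return [i for i in range(2, n + 1) if sieve[i]]

def largest_power(p, bound):
    """Largest power of p that is <= bound (1 if p itself exceeds bound)."""
    pk = 1
    while pk * p <= bound:
        pk *= p
    return pk

def stage1(n, b1, base=3):
    """Textbook P-1 stage 1: base^(product of prime powers <= b1) mod n."""
    x = base
    for p in primes_up_to(b1):
        x = pow(x, largest_power(p, b1), n)
    return x

def extend_stage1(x, n, old_b1, new_b1):
    """Carry a stage 1 residue computed for old_b1 forward to new_b1."""
    for p in primes_up_to(new_b1):
        missing = largest_power(p, new_b1) // largest_power(p, old_b1)
        x = pow(x, missing, n)
    return x

N = 2**67 - 1  # small composite Mersenne number, used here only as a toy input
direct = stage1(N, 3000)
resumed = extend_stage1(stage1(N, 1000), N, 1000, 3000)
print("same residue:", direct == resumed)
print("gcd(residue - 1, N):", gcd(direct - 1, N))
```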
Old 2020-12-13, 02:38   #11
axn
 
Jun 2003

2×3×827 Posts

Code:
[Worker #2 Dec 10 09:50] Ignoring suggested B1 value, using B1=610000 from the save file
[Worker #2 Dec 10 09:50] Ignoring suggested B2 value, using B2=7022000 from the save file
Mystery solved. The first one was not doing the B1/B2 it claimed to be doing: it used the smaller bounds from the save file.
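
A small Python sketch of how one might pull the bounds actually in effect out of a log, rather than reading only the "Optimal bounds" line. The regular expressions follow the message wording quoted in this thread; the function names are hypothetical. The B2 minus B1 span is printed only as a rough proxy for how much range stage 2 has to cover, not as an exact measure of work.

```python
import re

OPTIMAL = re.compile(r"Optimal bounds are B1=(\d+), B2=(\d+)")
OVERRIDE = re.compile(r"Ignoring suggested (B1|B2) value, using \1=(\d+) from the save file")

def effective_bounds(lines):
    """Return the bounds in effect after any save-file overrides are applied."""
    bounds = {}
    for line in lines:
        m = OPTIMAL.search(line)
        if m:
            bounds = {"B1": int(m.group(1)), "B2": int(m.group(2))}
            continue
        m = OVERRIDE.search(line)
        if m:
            bounds[m.group(1)] = int(m.group(2))
    return bounds

def stage2_span(bounds):
    """Width of the interval between B1 and B2."""
    return bounds["B2"] - bounds["B1"]

BEFORE = [
    "[Worker #2 Dec 10 09:50] Optimal bounds are B1=740000, B2=20837000",
    "[Worker #2 Dec 10 09:50] Ignoring suggested B1 value, using B1=610000 from the save file",
    "[Worker #2 Dec 10 09:50] Ignoring suggested B2 value, using B2=7022000 from the save file",
]
AFTER = [
    "[Worker #2 Dec 11 11:58] Optimal bounds are B1=744000, B2=21803000",
]

before, after = effective_bounds(BEFORE), effective_bounds(AFTER)
print("before:", before)
print("after: ", after)
print("span ratio:", round(stage2_span(after) / stage2_span(before), 2))
```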

Last fiddled with by axn on 2020-12-13 at 02:38

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.