mersenneforum.org  

Old 2009-01-21, 15:43   #111
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,537 Posts

Quote:
Originally Posted by Andi47
Does P95 check for day/nighttime memory settings when trying to get p-1 or ecm assignments?
The server does. According to http://www.mersenne.org/thresholds/ both day and night memory must exceed 200MB to get P-1 assignments.

Have you looked at the CPU's web page to see if the server knows the current day and night memory settings?
Old 2009-01-22, 03:54   #112
Andi47
 
Oct 2004
Austria

2×17×73 Posts

Quote:
Originally Posted by Prime95
The server does. According to http://www.mersenne.org/thresholds/ both day and night memory must exceed 200MB to get P-1 assignments.

Have you looked at the CPU's web page to see if the server knows the current day and night memory settings?
The server sees what I specified under Options / CPU, and oops, it seems I specified 128 MB instead of 256 MB. I will stop and restart the client when I'm back in the office today to see whether it then recognizes the line in prime.txt, and I will change the daytime setting to at least 200 MB.
Old 2009-01-23, 00:07   #113
Xyzzy
 
"Mike"
Aug 2002

202A₁₆ Posts

The documentation suggests that if you have a quad core box with a lot of memory and want each worker thread to use 1GiB each you would have something like this in local.txt:

[Worker #1]
Memory=1024
[Worker #2]
Memory=1024
[Worker #3]
Memory=1024
[Worker #4]
Memory=1024


But, when we try this it does not work. Mprime acts like the global setting is 4GiB. Some worker threads get a lot of memory and some get very little.
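
For anyone who wants to reproduce this with a different number of workers, here is a minimal sketch that writes out the same per-worker layout. The function name is made up for illustration; the section and key names are exactly the ones quoted above, and given the behaviour reported here there is no guarantee that mprime honours them.

```python
def worker_memory_sections(num_workers, memory_mb):
    """Build local.txt text giving each worker its own Memory= line."""
    lines = []
    for n in range(1, num_workers + 1):
        lines.append(f"[Worker #{n}]")       # section header, as in the post above
        lines.append(f"Memory={memory_mb}")  # per-worker memory in MB
    return "\n".join(lines) + "\n"


# Four workers at 1024 MB each, matching the quad core example.
print(worker_memory_sections(4, 1024), end="")
```

The output is meant to be pasted into local.txt by hand; the sketch does not touch any existing settings in that file.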
Old 2009-01-23, 00:34   #114
garo
 
Aug 2002
Termonfeckin, IE

2²·691 Posts

Running four P-1s is probably not a good idea anyway. I think two is best, with one of the other workers on TF and the last on TF or LL. I can't answer your question, though.
Old 2009-01-23, 01:38   #115
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7537₁₀ Posts

Quote:
Originally Posted by Xyzzy
The documentation suggests that if you have a quad core box with a lot of memory and want each worker thread to use 1GiB each you would have something like this in local.txt:

[Worker #1]
Memory=1024
[Worker #2]
Memory=1024
[Worker #3]
Memory=1024
[Worker #4]
Memory=1024


But, when we try this it does not work. Mprime acts like the global setting is 4GiB. Some worker threads get a lot of memory and some get very little.

I'll add this to the bug list to investigate after vacation. Sorry, not a high priority.
Old 2009-01-23, 10:51   #116
Andi47
 
Oct 2004
Austria

100110110010₂ Posts

Quote:
Originally Posted by Prime95
I'll add this to the bug list to investigate after vacation. Sorry, not a high priority.
This bug might prevent people from setting P-1 as default work type for multiple workers, and thus reduce the amount of P-1 work done. If the overall amount of P-1 work done is still insufficient (as stated in the opening post of this thread), fixing this bug should not be given too low a priority.
Old 2009-01-23, 12:00   #117
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

6535₈ Posts

Quote:
Originally Posted by Andi47
This bug might prevent people from setting P-1 as default work type for multiple workers
I don't think it would prevent anyone from setting P-1 as a default worktype, it would only affect the ability to customize distribution of RAM to the individual workers.
Old 2009-01-23, 13:20   #118
Andi47
 
Oct 2004
Austria

100110110010₂ Posts

Quote:
Originally Posted by James Heinrich
I don't think it would prevent anyone from setting P-1 as a default worktype, it would only affect the ability to customize distribution of RAM to the individual workers.
If I understand the symptoms of the bug described by Xyzzy correctly (please correct me if I'm wrong!), it could happen that, for example, two workers are doing stage 2 at the same time and using nearly all the memory (with very high B2), while a third worker that wants to start stage 2 can't do so (or might use a tiny B2 and thus produce a candidate for the "poorly P-1'd exponents" thread), slowing down P-1 factorization. A user seeing such behaviour might conclude that it's not a good idea to set more than one or two threads to P-1.
Old 2009-01-23, 14:01   #119
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·311 Posts

What I expect would happen, based on how I've seen P-1 work behave, is that bounds are set by the maximum available memory (in this example 4GB). The worker that gets only a small amount of memory (say 200MB) simply has to do its stage 2 a little less efficiently (more passes with fewer relative primes per pass), while the RAM-hogging worker works more efficiently than expected (fewer passes with more relative primes per pass). In the end I don't know that one method is really better than the other. Autoallocation of RAM would probably still have better throughput, since some of the 4 workers are using all 4GB of RAM between them at any given time, whereas with a forced limit of 1GB per worker I'd expect to see only 2GB used most of the time and 3GB used some of the time (while 1 or 2 workers are doing stage 1).
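
To put rough numbers on the "more passes with fewer relative primes per pass" trade-off, here is a back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the function name, the total of 480 relative primes, and the flat 20 MB of buffer per relative prime. It ignores fixed overhead and is not how Prime95 actually plans stage 2.

```python
import math


def stage2_passes(total_relprimes, memory_mb, mb_per_relprime):
    """Return (relative primes per pass, number of passes) for a memory budget."""
    # How many relative-prime buffers fit in the memory this worker was given.
    per_pass = min(total_relprimes, int(memory_mb // mb_per_relprime))
    if per_pass < 1:
        raise ValueError("not enough memory for even one relative prime")
    # Every relative prime must be processed once, so round the pass count up.
    return per_pass, math.ceil(total_relprimes / per_pass)


# Hypothetical figures: 480 relative primes, 20 MB of buffer for each.
for mb in (200, 1024, 4096):
    per_pass, passes = stage2_passes(480, mb, 20)
    print(f"{mb} MB: {per_pass} relative primes per pass, {passes} passes")
```

Under these made-up figures the starved worker needs many more passes than the one holding most of the RAM, which is the inefficiency described above.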
Old 2009-01-23, 17:51   #120
Mr. P-1
 
Jun 2003

10010010001₂ Posts

I do almost exclusively P-1 work. My way of working is as described above. I set MaxHighMemWorkers to 1 until I have built up a backlog of stage2 work to do, then clear it with MaxHighMemWorkers set to 2. Rinse and repeat.

I'm still using V25.7 so it's possible that the memory allocation algorithms have changed, but what I see is that with MaxHighMemWorkers set to 1, sometimes the last pass uses only a small amount of memory. What I'd like it to do in this situation is start stage2 work on the other worker, temporarily ignoring the MaxHighMemWorkers setting, and continue with low memory work on the first, once it completes.

With MaxHighMemWorkers set to 2, what happens is that the memory is divided between the two workers more or less equally initially, but then one reaches its final pass and takes a smaller share of the memory. Then the other grabs the rest, locking the first into its low memory allocation, until the second reaches its final pass. However if there is no memory available, for example if a worker has only just finished stage 1 and the other has all the memory, then the latter is restarted.

Here's my wishlist:

I'd like to see an end to the restarts in stage 2, which I understand incur a substantial overhead. Instead make the worker wait until another thread finishes its pass, doing low memory work in the meantime.

I'd like a local.txt setting which sets a minimum number of relative primes a worker can process in a pass. If there is insufficient memory then the worker should wait. The exception to this would be on the final pass where fewer relative primes might be needed.

I'd like workers on their final pass not to count toward the MaxHighMemWorkers limit.

Finally I'd like to see an option to suppress restarts when the total available memory is reduced, even if that means that the program remains over the limit for a while. I like to manually control the available memory, according to what else I'm doing, and I'd like to be able to say to the program "Reduce your memory usage soon" rather than "now". If I need more memory "now" then I can always restart the threads myself.
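
Two of the wishlist items (a minimum number of relative primes per pass, and final-pass workers not counting toward MaxHighMemWorkers) can be sketched as a single admission check. This is only an illustration of the requested behaviour, not code from Prime95: the Worker class, the function name and all parameters other than the MaxHighMemWorkers idea are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    in_stage2: bool              # currently holding stage 2 memory
    on_final_pass: bool = False  # True while running its last stage 2 pass


def can_start_pass(others, max_high_mem_workers, free_mb,
                   mb_per_relprime, min_relprimes, relprimes_left):
    """Decide whether a worker may start a stage 2 pass now or should wait."""
    # Wishlist item: workers on their final pass do not count toward the limit.
    counted = sum(1 for w in others if w.in_stage2 and not w.on_final_pass)
    if counted >= max_high_mem_workers:
        return False
    # Wishlist item: insist on a minimum number of relative primes per pass,
    # except on the final pass, where fewer may be all that is left.
    needed = min(min_relprimes, relprimes_left)
    return free_mb >= needed * mb_per_relprime
```

A worker for which this returns False would wait and do low memory work in the meantime, rather than grabbing whatever is free or forcing another worker to restart.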
Old 2009-01-24, 16:03   #121
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2²×3×17×23 Posts

Quote:
Originally Posted by petrw1
On my Q9550 I have 3 workers on LL and 1 on P-1 with 1200MB.
I find Stage 1 takes 21 hours and Stage 2 takes 37 hours.

However, the BAD news is that when I compare P-1 alone against P-1 with 3 LL workers, the P-1 stage 2 runs about 60% slower with the 3 LLs running. I didn't test stage 1 yet.
Is anyone running P-1 on a PIV in the 3.0 to 3.4 GHz range?
Can you tell me your memory allocations and run time for a 50M exponent?

Because of my slowdown I am wondering whether running P-1 on a dedicated 3.4 GHz PIV and running another LL on the Quad will give me more throughput overall without losing my P-1 contribution.

Today:
A 29M LL on the 3.4 GHz PIV takes about 18 days. P-1 on 1 core of the Quad takes 57 hours with 1200MB.
If I reverse the assignments and the 29M LL drops to 15 days (3 days less) while the P-1 goes to 81 hours (1 day more), I am ahead overall. The PIV has 1GB RAM, so I'd have to drop the P-1 RAM to about 700MB?

That is, unless adding a 4th LL also slows the other 3 LLs by a day each; then I LOSE overall.

Last fiddled with by petrw1 on 2009-01-24 at 16:06 Reason: Last line


All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.