#111
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts
Quote:
Have you looked at the CPU's web page to see if the server knows the current day and night memory settings?
#112
Oct 2004
Austria
2482₁₀ Posts
Quote:
#113
"Mike"
Aug 2002
2·23·179 Posts
The documentation suggests that if you have a quad-core box with a lot of memory and want each worker thread to use 1GiB, you would have something like this in local.txt:

[Worker #1]
Memory=1024
[Worker #2]
Memory=1024
[Worker #3]
Memory=1024
[Worker #4]
Memory=1024

But when we try this it does not work. Mprime acts as if the global setting were 4GiB: some worker threads get a lot of memory and some get very little.
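For illustration, the intended semantics of those per-worker settings can be sketched as a minimal INI-style parse, where each [Worker #n] section carries its own Memory cap rather than the values being pooled into one global limit. This is a sketch of the intent, not mprime's actual parsing code:

```python
# Minimal sketch (NOT mprime's real parser): read per-worker Memory
# settings from a local.txt-style snippet. The bug described above is
# that mprime behaves as if only the pooled total (4096 MiB) existed.

def parse_worker_memory(text):
    limits = {}      # section name -> Memory cap in MiB
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('[') and line.endswith(']'):
            section = line[1:-1]
        elif line.startswith('Memory=') and section:
            limits[section] = int(line.split('=', 1)[1])
    return limits

local_txt = """
[Worker #1]
Memory=1024
[Worker #2]
Memory=1024
[Worker #3]
Memory=1024
[Worker #4]
Memory=1024
"""

limits = parse_worker_memory(local_txt)
print(limits)                 # each worker individually capped at 1024 MiB
print(sum(limits.values()))   # 4096 -- the pooled total the bug acts on
```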
#114
Aug 2002
Termonfeckin, IE
ACC₁₆ Posts
Four simultaneous P-1 tasks is probably not a good idea anyway. I think running two is best, with the other workers doing one TF and one TF or LL. Can't answer your question, though.
#115
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts
Quote:
I'll add this to the bug list to investigate after vacation. Sorry, not a high priority.
#116
Oct 2004
Austria
4662₈ Posts
This bug might prevent people from setting P-1 as the default work type for multiple workers, and thus reduce the amount of P-1 work done. Since the overall amount of P-1 work being done is still insufficient (as noted in the first post of this thread), the priority of fixing this bug should not be too low.
#117
"James Heinrich"
May 2004
ex-Northern Ontario
11×311 Posts
#118
Oct 2004
Austria
100110110010₂ Posts
Quote:
#119
"James Heinrich"
May 2004
ex-Northern Ontario
11·311 Posts
What I expect would happen, based on how I've seen P-1 work behave, is that bounds are set by the maximum available memory (in this example 4GB), and the worker that gets only a small amount of memory (say 200MB) simply has to work a little less efficiently (more passes with fewer relative primes per pass), while the RAM-hogging worker works more efficiently than expected (more relative primes per pass, so fewer passes). In the end I don't know that one method is really better than the other. Auto-allocation of RAM would probably still have better throughput, since some of the 4 workers are using all 4GB of RAM at any given time, whereas with a forced limit of 1GB/worker I'd expect to see only 2GB used most of the time and 3GB some of the time (while 1 or 2 workers are doing stage 1).
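The pass-count tradeoff described above can be sketched with invented numbers. The relative-prime count, per-prime memory cost, and fixed overhead below are assumptions for illustration only, not Prime95's actual figures; the point is just that halving the memory roughly doubles the passes:

```python
import math

# Illustration of the tradeoff described above: stage 2 must process a
# fixed set of relative primes; less memory means fewer relative primes
# held per pass, hence more passes. All numbers here are invented.

REL_PRIMES = 480          # assumed total relative primes to process
MB_PER_PRIME = 4          # assumed memory cost per relative prime (MB)
FIXED_OVERHEAD_MB = 40    # assumed fixed stage 2 working memory (MB)

def passes_needed(memory_mb):
    per_pass = (memory_mb - FIXED_OVERHEAD_MB) // MB_PER_PRIME
    return math.ceil(REL_PRIMES / per_pass)

print(passes_needed(200))    # 12 -- the low-memory worker makes many passes
print(passes_needed(1024))   # 2  -- the 1GB worker
print(passes_needed(4096))   # 1  -- a worker that grabbed most of the 4GB
```

Both workers cover the same relative primes either way; the low-memory one simply pays the per-pass overhead more often, which is the inefficiency the post refers to.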
#120
Jun 2003
2221₈ Posts
I do almost exclusively P-1 work. My way of working is as described above: I set MaxHighMemWorkers to 1 until I have built up a backlog of stage 2 work to do, then clear it with MaxHighMemWorkers set to 2. Rinse and repeat.

I'm still using v25.7, so it's possible that the memory allocation algorithms have changed, but what I see is that with MaxHighMemWorkers set to 1, sometimes the last pass uses only a small amount of memory. What I'd like it to do in this situation is start stage 2 work on the other worker, temporarily ignoring the MaxHighMemWorkers setting, and continue with low-memory work on the first once it completes. With MaxHighMemWorkers set to 2, the memory is initially divided between the two workers more or less equally, but then one reaches its final pass and takes a smaller share of the memory. The other then grabs the rest, locking the first into its low memory allocation until the second reaches its final pass. However, if there is no memory available, for example if one worker has only just finished stage 1 and the other has all the memory, then the latter is restarted.

Here's my wishlist:
- An end to the restarts in stage 2, which I understand incur a substantial overhead. Instead, make the worker wait until another thread finishes its pass, doing low-memory work in the meantime.
- A local.txt setting for the minimum number of relative primes a worker can process in a pass. If there is insufficient memory, the worker should wait. The exception would be the final pass, where fewer relative primes might be needed.
- Workers on their final pass should not count toward the MaxHighMemWorkers limit.
- An option to suppress restarts when the total available memory is reduced, even if that means the program remains over the limit for a while. I like to manually control the available memory according to what else I'm doing, and I'd like to be able to tell the program "reduce your memory usage soon" rather than "now". If I need more memory now, I can always restart the threads myself.
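The lock-in described above (one worker's final pass needs little memory, so the other grabs the rest of the pool) can be sketched with a toy greedy allocator. The pool size and requests are invented, and this is only a model of the behaviour described in the post, not Prime95's allocator:

```python
# Toy model of the two-worker memory split described above: requests are
# granted greedily from a shared pool, so a worker on its final (small)
# pass frees memory that the other worker immediately claims, locking the
# first into its small allocation. All numbers are invented.

TOTAL_MB = 4096

def split(worker_a_need, worker_b_need):
    """Grant A's request first, then give B whatever remains of the pool."""
    a = min(worker_a_need, TOTAL_MB)
    b = min(worker_b_need, TOTAL_MB - a)
    return a, b

# Both workers mid-stage-2: a roughly equal split.
print(split(2048, 2048))   # (2048, 2048)
# Worker A on its final pass needs only 300MB; B grabs everything else,
# and A cannot grow again until B finishes its own passes.
print(split(300, 4096))    # (300, 3796)
```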
#121
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2²×3×17×23 Posts
Quote:
Can you tell me your memory allocations and run time for a 50M exponent? Because of my slowdown I am wondering if running P-1 on a dedicated 3.4GHz PIV and running another LL on the quad will give me more overall throughput without losing my P-1 contribution.

Today: a 29M LL on the 3.4GHz PIV takes about 18 days; P-1 on one quad core takes 57 hours with 1200MB.

If I reverse the assignments and get the 29M LL down to 15 days (3 days less) while the P-1 goes to 81 hours (1 day more), I am ahead overall. The PIV has 1GB of RAM, so I'd have to drop the P-1 RAM to about 700MB? That is, unless adding a 4th LL slows each of the other 3 LLs by a day too, in which case I LOSE overall.

Last fiddled with by petrw1 on 2009-01-24 at 16:06 Reason: Last line
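As a quick check, the swap's net effect can be worked out from the run times quoted in the post (ignoring, as the post's last line warns, any slowdown to the other three LL workers from adding a fourth):

```python
# Check of the tradeoff above, using only the times quoted in the post:
# swapping assignments saves 3 days on the LL (18 -> 15 days) and costs
# 24 hours on the P-1 (57 h -> 81 h).

ll_saved_days = 18 - 15          # LL: 18 days on the PIV -> 15 on the quad
p1_lost_days = (81 - 57) / 24    # P-1: 57 h on the quad -> 81 h on the PIV
net_gain_days = ll_saved_days - p1_lost_days
print(net_gain_days)             # 2.0 -- ahead by two days per LL/P-1 pair
```

If a fourth LL also costs the other three workers a day each, the 3 extra days wipe out that 2-day gain, which is exactly the break-even concern in the post.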