[QUOTE=Andi47;159716]
Does P95 check for day/nighttime memory settings when trying to get P-1 or ECM assignments?[/QUOTE] The server does. According to [url]http://www.mersenne.org/thresholds/[/url] both day and night memory must exceed 200MB to get P-1 assignments. Have you looked at the CPU's web page to see if the server knows the current day and night memory settings?
[QUOTE=Prime95;159720]The server does. According to [url]http://www.mersenne.org/thresholds/[/url] both day and night memory must exceed 200MB to get P-1 assignments.
Have you looked at the CPU's web page to see if the server knows the current day and night memory settings?[/QUOTE] The server sees what I have specified in Options / CPU - and oops, it seems that I specified 128 instead of 256 MB. I will stop/restart the client when I'm back in the office today to see if it recognizes the line in prime.txt then - and I will change the daytime setting to at least 200 MB.
The documentation suggests that if you have a quad core box with a lot of memory and want each worker thread to use 1GiB each you would have something like this in local.txt:
[FONT=Courier New][Worker #1]
Memory=1024

[Worker #2]
Memory=1024

[Worker #3]
Memory=1024

[Worker #4]
Memory=1024[/FONT]

But when we try this, it does not work. Mprime acts like the global setting is 4GiB: some worker threads get a lot of memory and some get very little.
Four P-1 workers is probably not a good idea anyway. I think running two is best, with one other worker on TF and the fourth on TF or LL. Can't answer your question though.:cry:
[QUOTE=Xyzzy;159919]The documentation suggests that if you have a quad core box with a lot of memory and want each worker thread to use 1GiB each you would have something like this in local.txt:
[FONT=Courier New][Worker #1]
Memory=1024

[Worker #2]
Memory=1024

[Worker #3]
Memory=1024

[Worker #4]
Memory=1024[/FONT]

But when we try this, it does not work. Mprime acts like the global setting is 4GiB: some worker threads get a lot of memory and some get very little.[/QUOTE] I'll add this to the bug list to investigate after vacation. Sorry, not a high priority.
[QUOTE=Prime95;159929]I'll add this to the bug list to investigate after vacation. Sorry, not a high priority.[/QUOTE]
This bug might prevent people from setting P-1 as the default work type for multiple workers, and thus reduce the amount of P-1 work done. If the overall amount of P-1 work being done is still insufficient (as stated in the first post of this thread), the priority of fixing this bug should not be too low.
[QUOTE=Andi47;159996]This bug might prevent people from setting P-1 as default work type for multiple workers[/QUOTE]I don't think it would prevent anyone from setting P-1 as a default work type; it would only affect the ability to customize the distribution of RAM to the individual workers.
[QUOTE=James Heinrich;160006]I don't think it would prevent anyone from setting P-1 as a default worktype, it would only affect the ability to customize distribution of RAM to the individual workers.[/QUOTE]
If I understand the symptoms of the bug (described by Xyzzy) correctly (please correct me if I'm wrong!), it could happen that, say, two workers are doing stage 2 at the same time and using nearly all the memory (with very high B2), while a third worker that wants to start stage 2 cannot do so (or might use a tiny B2 and thus produce a candidate for the [URL="http://www.mersenneforum.org/showthread.php?t=6261"]poorly P-1'd exponents[/URL] thread?), thus slowing down P-1 factorization. (A user seeing such behaviour might conclude that it's not a good idea to set more than one or two workers to P-1.)
What I expect would happen, based on how I've seen P-1 work behave, is that bounds are set by the maximum available memory (in this example 4GB), and it simply works out that the worker that gets only a small amount of memory (say 200MB) has to work a little less efficiently (more passes with fewer relative primes per pass), while the RAM-hogging worker works more efficiently than expected (more relative primes per pass over fewer passes).

In the end I don't know that one method is really better than the other. Auto-allocation of RAM would probably still give better throughput, since some of the 4 workers are using all 4GB of RAM at any given time, whereas with a forced limit of 1GB/worker I'd expect to see only 2GB used most of the time, and 3GB used some of the time (while 1 or 2 workers are doing stage 1).
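The passes-vs-memory tradeoff described above can be illustrated with a toy model. This is a sketch only: the per-relative-prime memory cost, the fixed overhead, and the total number of relative primes below are invented for illustration and are not Prime95's actual figures.

```python
import math

def stage2_passes(mem_mb, fixed_mb=40, temp_mb_per_rel_prime=8.0,
                  total_rel_primes=480):
    """Toy estimate of how many stage 2 passes a worker needs for a given
    memory budget. More memory means more relative primes held per pass,
    hence fewer passes over the data."""
    rel_primes_per_pass = max(1, int((mem_mb - fixed_mb) / temp_mb_per_rel_prime))
    return math.ceil(total_rel_primes / rel_primes_per_pass)

# A worker squeezed to 200MB does many more passes (less efficiently)
# than one that has grabbed nearly all of the 4GB:
print(stage2_passes(200))   # 24 passes
print(stage2_passes(3800))  # 2 passes
```

Under any such model, total memory in use stays near the global limit whichever worker holds it, which is why the auto-allocation case can still come out ahead on throughput.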
I do almost exclusively P-1 work. My way of working is as described above: I set [i]MaxHighMemWorkers[/i] to 1 until I have built up a backlog of stage 2 work to do, then clear it with [i]MaxHighMemWorkers[/i] set to 2. Rinse and repeat.
I'm still using v25.7, so it's possible that the memory allocation algorithms have changed, but what I see is that with [i]MaxHighMemWorkers[/i] set to 1, sometimes the last pass uses only a small amount of memory. What I'd like it to do in this situation is start stage 2 work on the other worker, temporarily ignoring the [i]MaxHighMemWorkers[/i] setting, and continue with low-memory work on the first worker once it completes.

With [i]MaxHighMemWorkers[/i] set to 2, what happens is that the memory is divided between the two workers more or less equally at first, but then one reaches its final pass and takes a smaller share of the memory. The other then grabs the rest, locking the first into its low memory allocation until the second reaches its final pass. However, if there is no memory available, for example if one worker has only just finished stage 1 and the other has all the memory, then the latter is restarted.

Here's my wishlist:
[LIST]
[*]An end to the restarts in stage 2, which I understand incur a substantial overhead. Instead, make the worker wait until another thread finishes its pass, doing low-memory work in the meantime.
[*]A local.txt setting for the minimum number of relative primes a worker can process in a pass. If there is insufficient memory, the worker should wait. The exception would be the final pass, where fewer relative primes might be needed.
[*]Workers on their final pass should not count toward the [i]MaxHighMemWorkers[/i] limit.
[*]An option to suppress restarts when the total available memory is reduced, even if that means the program remains over the limit for a while. I like to manually control the available memory according to what else I'm doing, and I'd like to be able to tell the program "reduce your memory usage soon" rather than "now". If I need more memory "now", I can always restart the threads myself.
[/LIST]
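For reference, the toggle in the workflow described above lives in local.txt. A minimal sketch (the Memory value here is just an example, not a recommendation):

```
Memory=2048
MaxHighMemWorkers=1
```

Once a stage 2 backlog has built up, change the line to MaxHighMemWorkers=2 and let the workers clear it, then switch back.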
[QUOTE=petrw1;158705]On my Q9550 I have 3 workers on LL and 1 on P-1 with 1200MB.
I find stage 1 takes 21 hours and stage 2 takes 37 hours. However, the BAD news is that when I compare P-1 alone against P-1 with 3 LL workers, P-1 stage 2 runs about 60% slower with the 3 LLs running. I haven't tested stage 1 yet.[/QUOTE] Is anyone running P-1 on a P4 in the 3.0 to 3.4 GHz range? Can you tell me your memory allocation and run time for a 50M exponent?

Because of my slowdown, I am wondering whether running P-1 on a dedicated 3.4 GHz P4 and running another LL on the quad would give me more throughput overall without losing my P-1 contribution.

Today: a 29M LL on the 3.4 GHz P4 takes about 18 days; P-1 on one quad core takes 57 hours with 1200MB.

If I reverse the assignments and get the 29M LL down to 15 days (3 days less) while the P-1 goes up to 81 hours (1 day more), I am ahead overall. The P4 has 1GB RAM, so I'd have to drop the P-1 RAM to about 700MB. That is, unless adding a 4th LL slows the other 3 LLs by a day each too, in which case I LOSE overall.