[QUOTE=Prime95;165030]To all sufferers of the unreserve bug:
By any chance was the client doing P-1 when this happened? P-1 sometimes reports percent complete over 100% - I haven't figured out that bug. A side effect, fixed in 25.9, is that estimated completion dates were not calculated properly leading to erroneous unreserves.[/QUOTE]
At the time, the affected PC was on (and still is on) 25.9 Build 3. I don't think it was doing P-1; I say that because this PC has never produced any P-1 results, unless it was doing P-1 and that was one of the assignments dropped at the time, before it completed the P-1. There are a lot of important bugs fixed in 25.9 (GOOD!); however, admittedly it is a few percent slower on PIV machines - 4 of my 7 - (NOT GOOD!). This makes the decision to upgrade somewhat cloudy. In my experience, V24 actually seems to do a better job of estimating completion dates. I had a machine offline for about a week recently; the PIV Equivalency dropped and all the estimated completion dates were pessimistic (by about 15%). I have a Quad where, due to some kind of contention, all 4 cores slow down a little when they are all running (maybe also 20%); however, the estimated completion times there are too optimistic in that they seem to be based on theoretical throughput. That said, it does get better over time, so maybe once these Equivalency Factors have had time to level off the estimates will be better. Thanks. |
[QUOTE=Prime95;165028]Done. Assignments are now to 2^64 for exponents between 100M and 400M, 2^65 to 800M, and 2^66 to 1B.
[/QUOTE]
Thanks. P.S. And welcome back ... any trip highlights? |
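For readers who want the new limits quoted above in a form they can check against an exponent, here is a minimal sketch. The function name `tf_bit_limit` is hypothetical, and treating each upper bound as inclusive is an assumption; the post only gives the ranges 100M to 400M, up to 800M, and up to 1B.

```python
def tf_bit_limit(exponent):
    """Return the power-of-two limit quoted for an exponent, or None if out of range.

    Assumption: upper bounds are inclusive; the post does not say either way.
    """
    if exponent < 100_000_000 or exponent > 1_000_000_000:
        return None  # the post only covers exponents between 100M and 1B
    if exponent <= 400_000_000:
        return 64
    if exponent <= 800_000_000:
        return 65
    return 66


if __name__ == "__main__":
    for p in (150_000_000, 500_000_000, 900_000_000):
        print(p, "-> 2^%d" % tf_bit_limit(p))
```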
[QUOTE=petrw1;165038]...any trip hilights?[/QUOTE]
All of Australia was a highlight, probably our best vacation to date. |
[QUOTE=petrw1;165037]At the time this affected PC was on 25.9 Build 3.
I don't think it was doing P-1; I say that because this PC never has produced any P-1 results; [/QUOTE] Thanks, that's what I needed to know. I'll install some more safety measures. |
Two more questions, were the unreserved exponents from one worker or both workers? In prime.log were the new exponents reserved seconds after unreserving or several minutes/hours later?
|
[QUOTE=Prime95;165042]Two more questions, were the unreserved exponents from one worker or both workers? In prime.log were the new exponents reserved seconds after unreserving or several minutes/hours later?[/QUOTE]
Worker 1 of a Duo ... I'll check the log tomorrow but I believe it was immediate. |
[QUOTE=Prime95;165042]Two more questions, were the unreserved exponents from one worker or both workers? In prime.log were the new exponents reserved seconds after unreserving or several minutes/hours later?[/QUOTE]
In my case: only one worker out of four on a quad was affected, and the new exponents were reserved seconds later. There was no P-1 running or lined up. The client is mprime 25.9 build 3, on Ubuntu 64-bit. I had just stopped the client, entered "./mprime -m", and it immediately unreserved & reserved.
At that stage I hadn't got around to disabling SpeedStep in the BIOS, if that might be a factor. |
[QUOTE=Prime95;165030]P-1 sometimes reports percent complete over 100% - I haven't figured out that bug.[/QUOTE]
A possibly related issue is that P-1 sometimes 'sticks' at 100%, so that if I set reporting to every 100 iterations, say, the percentages reported in successive outputs might be 99.2%, 99.5%, 99.8%, 100.0%, 100.0%, 100.0%,..., before eventually going to ... I suspect that the percentage reported is actually a little higher than the amount actually done. (Client version 25.7 build 3) |
[QUOTE=petrw1;165043]Worker 1 of a Duo ... I'll check the log tomorrow but I believe it was immediate.[/QUOTE]
See the log excerpt immediately following ... there is no time stamp between the Unreserving and the Getting Assignment, so can I assume it was immediate, that is, without a gap between them? This was not the regular "Send new completion dates" time, and it did not complete an assignment at this time and need to communicate for another one. There might have been a system restart at this time but we can't remember for sure ... could the trigger have been something on the server?
P.S. My time zone is 6 hours behind the server ... this would have happened at 14:16:31 on your end.
[QUOTE]
[Tue Feb 17 08:16:31 2009 - ver 25.9]
Unreserving M28650653
URL: [url]http://v5.mersenne.org/v5server/?v=0.95&px=GIMPS&t=au&g=ddaa0b55c505cc507d0e9650f885192c&k=07718BE28C7914674B7B4F514E2B6F8A&ss=5214&sh=5AE761189DABBD68F695B3E9D9CD1A5B[/url]
RESPONSE: pnErrorResult=0 pnErrorDetail=SUCCESS ==END==
------------ 13 more 'Unreserving' sections deleted --------------
Unreserving M28260691
URL: [url]http://v5.mersenne.org/v5server/?v=0.95&px=GIMPS&t=au&g=ddaa0b55c505cc507d0e9650f885192c&k=7353FCF8C1B021F5528DD87C1A87A9A2&ss=15987&sh=A0A017CF7D80A203E389D269E4DE7D8A[/url]
RESPONSE: pnErrorResult=0 pnErrorDetail=SUCCESS ==END==
Getting assignment from server
URL: [url]http://v5.mersenne.org/v5server/?v=0.95&px=GIMPS&t=ga&g=ddaa0b55c505cc507d0e9650f885192c&c=0&ss=14878&sh=A819EE00D5A646299F4DD66A9C5C62C9[/url]
RESPONSE: pnErrorResult=0 pnErrorDetail=Server assigned Lucas Lehmer primality test work. g=ddaa0b55c505cc507d0e9650f885192c k=8CCEA7271AA92BD8AC194D817A194FA4 A=1 b=2 n=28203107 c=-1 w=100 sf=68 p1=1 ==END==
PrimeNet success code with additional info: Server assigned Lucas Lehmer primality test work.
Got assignment 8CCEA7271AA92BD8AC194D817A194FA4: LL M28203107
Sending expected completion date for M28203107: Mar 25 2009
URL: [url]http://v5.mersenne.org/v5server/?v=0.95&px=GIMPS&t=ap&g=ddaa0b55c505cc507d0e9650f885192c&k=8CCEA7271AA92BD8AC194D817A194FA4&c=0&p=0.0000&d=86400&e=3108832&ss=11266&sh=86CB7086995B0BB2B78DAE36E5D39F56[/url]
RESPONSE: pnErrorResult=0 pnErrorDetail=SUCCESS ==END==
-------------- 5 more 'Getting Assignment' sections ----------
[/QUOTE] |
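For anyone else checking their own prime.log for this pattern, a small sketch like the following can tally the requests in an excerpt. It assumes the URLs are wrapped in `[url]...[/url]` as in the post above, and it reads the `t=` query parameter, which in this excerpt appears as `au` next to "Unreserving", `ga` next to "Getting assignment", and `ap` next to "Sending expected completion date". The function name `transaction_types` is hypothetical, and the meaning of each code is inferred only from those labels.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Matches URLs as they appear in the forum excerpt, wrapped in BBCode tags.
URL_RE = re.compile(r"\[url\](.*?)\[/url\]")


def transaction_types(log_text):
    """Count the t= parameter of every [url]...[/url] entry in log_text."""
    counts = Counter()
    for url in URL_RE.findall(log_text):
        params = parse_qs(urlsplit(url).query)
        for t in params.get("t", []):
            counts[t] += 1
    return counts


if __name__ == "__main__":
    # One entry copied from the excerpt above.
    sample = (
        "Unreserving M28260691 URL: [url]http://v5.mersenne.org/v5server/"
        "?v=0.95&px=GIMPS&t=au&g=ddaa0b55c505cc507d0e9650f885192c"
        "&k=7353FCF8C1B021F5528DD87C1A87A9A2&ss=15987"
        "&sh=A0A017CF7D80A203E389D269E4DE7D8A[/url]"
    )
    print(transaction_types(sample))
```

A run of `au` entries followed directly by `ga` entries under a single timestamp would match the unreserve-then-reserve sequence described in this thread.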
[QUOTE=Mr. P-1;165072]A possibly related issue is that P-1 sometimes 'sticks' at 100%, so that if I set reporting to every 100 iterations, say, the percentage reported during successive outputs might be 99.2%, 99.5%, 99.8%, 100.0%, 100.0%, 100.0%,...[/QUOTE]
This is the "over 100%" bug. The output code works around the bug by displaying 100% for anything over 100%. The root cause of how the percent complete goes over 100 remains unsolved. |
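To illustrate why a clamp in the output code would look like "sticking" at 100%, here is a minimal sketch. It is not the client's actual code; `format_percent` and the idea of a raw completion fraction are assumptions made for the example.

```python
def format_percent(raw_fraction):
    """Format a completion fraction for display, showing 100.0% for anything over 1.0.

    Hypothetical stand-in for the workaround described above: the underlying
    value may exceed 100%, but the displayed value never does.
    """
    percent = min(raw_fraction * 100.0, 100.0)
    return f"{percent:.1f}%"


if __name__ == "__main__":
    # Raw values keep growing past 1.0, but the display stays at 100.0%.
    for raw in (0.992, 0.995, 0.998, 1.001, 1.004, 1.007):
        print(format_percent(raw))
```

Under this model the repeated 100.0% lines are simply raw values above 100% being clamped, which is consistent with the suspicion that the reported percentage runs slightly ahead of the work actually done.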
[QUOTE=petrw1;165074]This was not the regular "Send new completion dates" time and it did not complete an assignment at this time and need to communicate for another one. There might have been a system restart at this time but we can't remember for sure ... could the trigger have been something on the server?[/QUOTE]
This is definitely not a server problem. Is this system also afflicted with SpeedStep? That could possibly cause unreserves, but it should not then immediately do reserves. Those timestamps are not output if less than 5 minutes have passed. I suppose it is possible that the client unreserved because of a slow CPU speed, the CPU speed changed, and new exponents were reserved, all within 5 minutes.
In summary, we have 2 or 3 causes of unreserves:
1) P-1 over 100% complete bug. "Fixed" in 25.9.
2) SpeedStep. You can work around this by setting CpuSpeed in local.txt or by making UnreserveDays really huge.
3) Some other undiagnosed cause. |
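The timestamp rule mentioned above explains why a missing timestamp in prime.log does not prove two events were simultaneous. A minimal sketch of such a rule follows; the class `LogWriter` is hypothetical, and measuring the 5 minutes from the last timestamp written (rather than from the last message) is an assumption, since the post does not say which.

```python
from datetime import datetime, timedelta

TIMESTAMP_GAP = timedelta(minutes=5)  # threshold stated in the post


class LogWriter:
    """Toy log that writes a timestamp line only when enough time has passed."""

    def __init__(self):
        self.last_stamp = None
        self.lines = []

    def write(self, message, now):
        # Emit a timestamp only if none has been written in the last 5 minutes.
        if self.last_stamp is None or now - self.last_stamp >= TIMESTAMP_GAP:
            self.lines.append(f"[{now:%a %b %d %H:%M:%S %Y}]")
            self.last_stamp = now
        self.lines.append(message)


if __name__ == "__main__":
    log = LogWriter()
    start = datetime(2009, 2, 17, 8, 16, 31)
    log.write("Unreserving M28650653", start)
    # Four minutes later: no new timestamp, so the gap is invisible in the log.
    log.write("Getting assignment from server", start + timedelta(minutes=4))
    print("\n".join(log.lines))
```

In this model an unreserve and a reserve up to just under 5 minutes apart would appear under the same timestamp, which is the scenario described for a CPU speed change.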