#34
"Mark"
Apr 2003
Between here and the
2⁴×397 Posts
Lennart, can you tell me what the client does with the workunits when this happens? Does it delete them or save them and try again?
#35
"Lennart"
Jun 2007
2⁵·5·7 Posts
Code:
[2009-08-03 16:44:08 GMT] Total Time: 2:12:11 Total Tests: 15 Total PRPs Found: 0
[2009-08-03 16:44:53 GMT] crus: Returning work to server nplb-gb1.no-ip.org at port 3000
[2009-08-03 16:47:10 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed
[2009-08-03 16:47:10 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed
[2009-08-03 16:47:10 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed
[2009-08-03 16:47:11 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed
[2009-08-03 16:47:11 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed
[2009-08-03 16:47:11 GMT] 27121: Getting work from server prpnet.primegrid.com at port 12006
[2009-08-03 17:49:36 GMT] 27121: 27*2^1543462+1 is not prime. Residue 2D44561896DD41CE
[2009-08-03 17:49:36 GMT] Total Time: 3:17:39 Total Tests: 16 Total PRPs Found: 0
[2009-08-03 17:49:36 GMT] 27121: Returning work to server prpnet.primegrid.com at port 12006
[2009-08-03 17:49:38 GMT] 27121: INFO: Test for candidate 27*2^1543462+1 accepted
[2009-08-03 17:49:38 GMT] 27121: INFO: All 1 test results were accepted
[2009-08-03 17:49:38 GMT] crus: Returning work to server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:43 GMT] crus: ERROR: Workunit 124221*6^148285+1 not found on server
[2009-08-03 17:49:43 GMT] crus: The client will delete this workunit
[2009-08-03 17:49:44 GMT] crus: INFO: Test for candidate 74612*6^148287+1 accepted
[2009-08-03 17:49:45 GMT] crus: INFO: Test for candidate 172257*6^148286+1 accepted
[2009-08-03 17:49:45 GMT] crus: INFO: 2 of 3 test results were accepted
[2009-08-03 17:49:46 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:47 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:48 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:49 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:50 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:51 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:52 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:53 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:54 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:55 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:56 GMT] crus: Getting work from server nplb-gb1.no-ip.org at port 3000
[2009-08-03 17:49:57 GMT] crus: INFO: No available candidates are left on this server.
[2009-08-03 17:49:57 GMT] 27121: Getting work from server prpnet.primegrid.com at port 12006
[2009-08-03 18:43:33 GMT] 27121: 27*2^1543856+1 is not prime. Residue DE6D07D9F6EA4450
[2009-08-03 18:43:33 GMT] Total Time: 4:11:36 Total Tests: 17 Total PRPs Found: 0
[2009-08-03 18:43:34 GMT] 27121: Returning work to server prpnet.primegrid.com at port 12006
[2009-08-03 18:43:37 GMT] 27121: INFO: Test for candidate 27*2^1543856+1 accepted

I have started setting all on debug=1
Lennart
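For anyone who wants to tally this kind of thing across a long client log, here is a minimal sketch in Python. It is not part of PRPNet; the function name `summarize`, the event labels, and the regular expression are all assumptions made for illustration, and the sample lines are copied from the log above. It only counts lines by the message text that appears in this log, so other message wording would need its own entry.

```python
import re
from collections import Counter

# Assumed line layout, based only on the log above: "[<timestamp> GMT] <message>"
LINE_RE = re.compile(r"^\[(?P<ts>[\d\- :]+) GMT\] (?P<msg>.*)$")

# Message fragments seen in the log above, mapped to hypothetical labels.
EVENTS = {
    "connect to socket failed": "connect_failed",
    "not found on server": "workunit_rejected",
    "INFO: Test for candidate": "result_accepted",
    "No available candidates": "server_empty",
}

def summarize(lines):
    """Count how often each known event appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        for fragment, label in EVENTS.items():
            if fragment in match.group("msg"):
                counts[label] += 1
                break
    return counts

SAMPLE = [
    "[2009-08-03 16:47:10 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed",
    "[2009-08-03 16:47:11 GMT] nplb-gb1.no-ip.org:3000 connect to socket failed",
    "[2009-08-03 17:49:43 GMT] crus: ERROR: Workunit 124221*6^148285+1 not found on server",
    "[2009-08-03 17:49:44 GMT] crus: INFO: Test for candidate 74612*6^148287+1 accepted",
    "[2009-08-03 17:49:45 GMT] crus: INFO: Test for candidate 172257*6^148286+1 accepted",
    "[2009-08-03 17:49:47 GMT] crus: INFO: No available candidates are left on this server.",
]

if __name__ == "__main__":
    for label, count in sorted(summarize(SAMPLE).items()):
        print(label, count)
```

To run it on a real file, something like `summarize(open("prpclient.log"))` should work, where the file name is again only a placeholder.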
#36
"Mark"
Apr 2003
Between here and the
2⁴×397 Posts
#37
"Lennart"
Jun 2007
2⁵·5·7 Posts
#38
May 2007
Kansas; USA
28A3₁₆ Posts
Quote:
Gotta have all interfaces (as well as the barfing) fixed, including grammar/spacing/clarity/etc., before we load n>150K, even if it requires a new PRPnet release. We'll keep rerunning n=140K-150K until we get a clean test.
The "n's remaining" of 4065 is misleading. It should say something like "pairs remaining" (assuming that is what it is referring to).
Thanks for getting that going. It's good to see the # of k's remaining is correct now.
Thanks,
Gary
#39
May 2007
Kansas; USA
24243₈ Posts
Could these problems with "barfing" be a result of my servers not being able to handle a very big load? That's quite a bit of crunching power on there by Lennart. (I have the equivalent of 2 cores on there, i.e. a 50-50 split with another effort on a full quad.)
We should know definitively whether load is a problem when port G5000 at NPLB gets rolling with the very teeny tests that will take only a few secs. each. That should be a big load even with just a few quads on it!
#40
May 2007
Kansas; USA
10403₁₀ Posts
Something new for this time around:
First, I believe this drive is being processed by n-value. Based on that, I see that at http://nplb-gb1.no-ip.org:3000/ k=124125 and 124221 have a min n of 140006 and 140005 respectively, even though most of this testing effort is at n>143K. Could it be because someone has received some pairs that haven't been returned to the server in a long time? I'm trying to determine if the "min n" is updating properly on all k's.
Second, the max n is showing as n=~148.7K for nearly all k's (~147.6K for a few k's, perhaps because they are lower weight?). It should be showing as n=~150K for all k's unless the full n=140K-150K range was not loaded. This looked correct last time. How come it doesn't look correct this time?
Final question: How frequently is the "min n" and "max n" by k page updated?
Thanks,
Gary
Last fiddled with by gd_barnes on 2009-08-04 at 04:48
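To make the "min n" question concrete, here is a small Python sketch of one way a per-k min/max could be computed. This is only an illustration of the hypothesis in the post, not how the PRPNet server actually does it: the function `n_range_by_k`, the `completed` flag, and the sample data are all hypothetical. The assumption is that a pair which was handed out but never returned still counts as not completed, so it keeps the min n for its k pinned low.

```python
def n_range_by_k(candidates):
    """Return {k: (min_n, max_n)} over candidates that are not yet completed.

    `candidates` is an iterable of (k, n, completed) tuples. This layout is
    an assumption for illustration, not the server's real schema.
    """
    ranges = {}
    for k, n, completed in candidates:
        if completed:
            continue  # finished tests no longer affect min/max n
        lo, hi = ranges.get(k, (n, n))
        ranges[k] = (min(lo, n), max(hi, n))
    return ranges

# Hypothetical data: one old, unreturned pair per k plus some newer pairs.
CANDIDATES = [
    (124125, 140006, False),  # handed out long ago, never returned
    (124125, 143500, True),   # tested and returned
    (124125, 148700, False),  # highest n loaded for this k
    (124221, 140005, False),  # handed out long ago, never returned
    (124221, 147600, False),
]

if __name__ == "__main__":
    for k, (lo, hi) in sorted(n_range_by_k(CANDIDATES).items()):
        print(f"k={k}: min n={lo}, max n={hi}")
```

Under that assumption, a single stale pair is enough to hold min n at 140005 or 140006 while the rest of the effort is above 143K, and max n can only be as high as the largest n actually loaded for that k.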
#41
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001101001₂ Posts
Quote:
Regarding the barfing possibly being related to server stress: no. I've talked with Mark, and he's confirmed that there's a server bug that needs to be fixed, and possibly a client bug as well, pending further investigation of the debug.log files. He's sent me a fix for the server side (which should definitely fix the barfing), but he asked me not to apply it to the server yet, so that Lennart's clients get a chance to capture their end of the barfing in their debug.log files. He can then examine those logs to see whether there's a bug in the client too.
#42
May 2007
Kansas; USA
101·103 Posts
OK, great. I'm glad to hear we've nailed down the "server barfing" problems.
Thanks for the explanation of the "absorbing of changes" to min/max n. It's a little clearer to me now.
It seems we still have a "min n" and "max n" problem though; mainly "max n". The "min n" issue could be a result of a few pairs not having been returned yet, although that seems a little suspect since Lennart and I are the main ones on there and our machines have remained connected (I think). The "max n" should be n=~150K for all k's. Can you look into that? Is the full n=140K-150K file loaded into the server?
Gary
Last fiddled with by gd_barnes on 2009-08-04 at 05:00
#43
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts
Quote:
BTW, Lennart, did you catch the barfing in one of your clients' debug.log files? I see a small bit of barfing on the server from around August 3, 21:15 GMT; here are the clients involved: _31, _206, _162, _127, _71, _31, and last but not least, humpford (one of Gary's finest). (Of course, since Gary doesn't have his clients set to debug logging, that last one is rather irrelevant; no big deal, there should be plenty of data from Lennart's logs.)
Interestingly enough, last night doesn't seem to have been a big one for barfing; I had to go all the way back to the time of the abovementioned barf to find any instance of it.
Last fiddled with by mdettweiler on 2009-08-04 at 13:07
#44
"Lennart"
Jun 2007
2140₈ Posts
Quote:
Lennart