#45
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts
Okay, I've switched the G3000 server over to v1.0.1. I've also attached my Linux build of the v1.0.1 client. Just drop it in over the old one and you should be good.
#46
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts
Hmm... I've just noticed something odd in the PRPnet server. Earlier today, I noticed a lot of abandoned candidates, likely because the way the client shuts down sometimes abandons some k/n pairs along the way. So I decreased the deadline for all candidates to 3 hours, and the server immediately began assigning older work, as expected.
However, when I looked through the prpserver.candidates file a few minutes ago, I was surprised to find that some of the lowest Sierp. base 6 numbers (they have the lowest k's of the bunch, so the server assigns work for those first when available) had two residuals collected! The residuals were identical, of course, but since I had not set the server to assign doublechecks, this was quite surprising. So I stopped the server and set the first-pass deadline back to 3 days, and it immediately went back to handing out first-pass Riesel base 3 work.
Rogue, do you know why it was handing out doublechecks even though I've got the doublecheck setting set to 0 in prpserver.ini?
Thanks,
Max
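To make the expected behavior concrete, here is a minimal C++ sketch of the reassignment rule this post takes for granted: shortening the deadline should only release candidates whose assigned test has timed out without a result, and a candidate that already has a completed result should only go out again when doublechecking is enabled. Every name in it (`CandidateState`, `needsAssignment`, the `doubleCheck` flag) is hypothetical and not taken from the PRPnet source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of one candidate on the server.
struct CandidateState {
    int          completedTests; // results already returned for this candidate
    bool         inProgress;     // a test is currently assigned to some client
    std::int64_t assignedAt;     // assignment time of that test (seconds since epoch)
};

// Returns true if the server should hand this candidate out again.
// deadlineSeconds: how long an assigned test may stay outstanding.
// doubleCheck: illustrative stand-in for the doublecheck setting (0 = off).
bool needsAssignment(const CandidateState &c, std::int64_t now,
                     std::int64_t deadlineSeconds, bool doubleCheck)
{
    // An outstanding test blocks reassignment until its deadline passes.
    if (c.inProgress && now - c.assignedAt < deadlineSeconds)
        return false;

    // Number of completed results wanted before the candidate is retired.
    const int wanted = doubleCheck ? 2 : 1;

    // Candidates with enough results stay retired regardless of the deadline.
    return c.completedTests < wanted;
}
```

Under this rule, dropping the deadline from 3 days to 3 hours would free the abandoned k/n pairs but leave candidates with a returned result alone, which is the behavior expected here with doublecheck set to 0.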
#47
"Mark"
Apr 2003
Between here and the
6352₁₀ Posts
Quote:
You could run the server with debugging on for a couple of days to see if there are any more odd results like this. One question I have: is the client shutting down normally, or is there an issue causing it to terminate without sending a message to the server?
#48
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts
Quote:
Then, after the tests had been abandoned for about 10-12 hours, I went in and changed the first-pass deadline to 3 hours. The server then immediately started handing out the Sierp. base 6 work again: *all* of it, from the beginning, regardless of whether a candidate had been abandoned or had already had a result returned. Thus there are many candidates with more than one result listed in prpserver.candidates. Once I noticed this, I changed the deadline back to 3 days (I have both the firstpass and doublecheck deadlines set to 3 days, even though doublecheck is set to 0; I presume that means it's turned off?). The server went back to handing out tests from the leading edge of the Riesel base 3 work that it had been handing out before I changed the deadline to 3 hours.
Now for why the clients keep abandoning all these candidates. What I've noticed is that if a client is stopped while it's on the first candidate of its batch, it shuts down immediately, leaving the G3000.save and phrot.chkpt files in place, as it should. When restarted, the client resumes from the in-progress test and continues normally. However, if there are results waiting in G3000.save when the client is exited, it stops, sends any completed results to the server, then *deletes the G3000.save file*, and exits. Deleting G3000.save removes the client's memory of the remaining untested candidates, and thus they are abandoned.
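The shutdown behavior described above suggests an alternative: after reporting completed results, write the untested candidates back to the save file and delete it only when nothing is left. The following C++ sketch illustrates that idea under stated assumptions. It is not the PRPnet client code; `WorkItem`, `pendingWork`, `saveOnShutdown`, and the one-name-per-line file layout are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one entry in the client's save file.
struct WorkItem {
    std::string name;      // e.g. "26375*6^125217+1"
    bool        completed; // true once a result is ready to report
};

// Keeps only the candidates that still have to be tested.
std::vector<WorkItem> pendingWork(const std::vector<WorkItem> &batch)
{
    std::vector<WorkItem> pending;
    for (const WorkItem &item : batch)
        if (!item.completed)
            pending.push_back(item);
    return pending;
}

// Called at shutdown, after completed results have been sent to the server.
// Rewrites the save file with the untested candidates and removes it only
// when nothing is left. Returns the number of candidates preserved.
std::size_t saveOnShutdown(const std::vector<WorkItem> &batch,
                           const std::string &saveFile)
{
    const std::vector<WorkItem> pending = pendingWork(batch);

    if (pending.empty()) {
        std::remove(saveFile.c_str()); // nothing left to remember
        return 0;
    }

    // Assumed layout: one candidate name per line.
    std::ofstream out(saveFile, std::ios::trunc);
    for (const WorkItem &item : pending)
        out << item.name << '\n';
    return pending.size();
}
```

Writing the remaining candidates back before exiting means the next start finds them in the save file instead of leaving them to time out on the server.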
#49
"Mark"
Apr 2003
Between here and the
14320₈ Posts
Quote:
I will modify the behavior where undone tests are lost. I presume that the client is stopped because you terminated the process, not because it detected an error and terminated itself. If I am wrong about that, I need to know.
Last fiddled with by rogue on 2008-12-31 at 04:22
#50
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts
Quote:
Quote:
#51
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts
Okay, I turned on debugging, set the deadline to 3 hours, and here's what I got:
Code:
[2008-12-31 06:04:32 GMT] socket 4 <<<< FROM bugmesticky@googlemail.com Core2Duo
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com connecting from 74.37.226.253
[2008-12-31 06:04:32 GMT] socket 4 <<<< GETWORK 1.0.0 10
[2008-12-31 06:04:32 GMT] socket 4 >>>> ServerVersion: 1.0.0
[2008-12-31 06:04:32 GMT] First check candidate 0, 26375*6^125217+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125217+1 1230703472 26375 6 125217 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125217+1
[2008-12-31 06:04:32 GMT] First check candidate 1, 26375*6^125221+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125221+1 1230703472 26375 6 125221 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125221+1
[2008-12-31 06:04:32 GMT] First check candidate 2, 26375*6^125301+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125301+1 1230703472 26375 6 125301 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125301+1
[2008-12-31 06:04:32 GMT] First check candidate 3, 26375*6^125325+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125325+1 1230703472 26375 6 125325 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125325+1
[2008-12-31 06:04:32 GMT] First check candidate 4, 26375*6^125341+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125341+1 1230703472 26375 6 125341 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125341+1
[2008-12-31 06:04:32 GMT] First check candidate 5, 26375*6^125397+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125397+1 1230703472 26375 6 125397 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125397+1
[2008-12-31 06:04:32 GMT] First check candidate 6, 26375*6^125545+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125545+1 1230703472 26375 6 125545 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125545+1
[2008-12-31 06:04:32 GMT] First check candidate 7, 26375*6^125565+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125565+1 1230703472 26375 6 125565 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125565+1
[2008-12-31 06:04:32 GMT] First check candidate 8, 26375*6^125629+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125629+1 1230703472 26375 6 125629 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125629+1
[2008-12-31 06:04:32 GMT] First check candidate 9, 26375*6^125637+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> WorkUnit: 26375*6^125637+1 1230703472 26375 6 125637 1
[2008-12-31 06:04:32 GMT] bugmesticky@googlemail.com (Core2Duo) at 74.37.226.253: Sent 26375*6^125637+1
[2008-12-31 06:04:32 GMT] socket 4 >>>> End of Message
[2008-12-31 06:04:33 GMT] socket 4 <<<< GETGREETING
[2008-12-31 06:04:33 GMT] socket 4 >>>> ############
[2008-12-31 06:04:33 GMT] socket 4 >>>> Welcome to the CRUS G3000 PRPnet beta test server! :-D
[2008-12-31 06:04:33 GMT] socket 4 >>>> Server is running PRPnet v1.0.1
[2008-12-31 06:04:33 GMT] socket 4 >>>> ############
[2008-12-31 06:04:33 GMT] socket 4 >>>> OK.
[2008-12-31 06:04:33 GMT] socket 4 <<<< QUIT
[2008-12-31 06:04:33 GMT] closing socket 4

Maybe the server is forgetting which candidates have already been returned when it is stopped and restarted? Methinks it might work a little better if the server marked the finished numbers as "inactive" and dumped them to a "finished" file, sort of like it does when you find a PRP with the sierpinskiriesel=1 option set. In fact, having it handle finished candidates like this might even make processing the results a *lot* easier.
Edit: Meanwhile, I've put the server back on the Riesel base 3 numbers so that we're not throwing our CPU time out the window on unnecessary triple-checks.
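For anyone scripting against the debug log above, the `WorkUnit:` lines appear to carry six space-separated fields. Here is a small C++ sketch that splits one such line. The field names and meanings are inferred from the log alone, not from the PRPnet source, so treat `WorkUnit`, `parseWorkUnit`, and the interpretation of each field as assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Fields of one "WorkUnit:" line as inferred from the debug log; the names
// are illustrative and do not come from the PRPnet source.
struct WorkUnit {
    std::string   name;   // candidate in text form, e.g. "26375*6^125217+1"
    std::uint64_t testID; // looks like the Unix time of the assignment
    std::uint64_t k;      // multiplier
    std::uint32_t b;      // base
    std::uint32_t n;      // exponent
    std::int32_t  c;      // +1 in the log (assumed to be -1 for Riesel forms)
};

// Parses "WorkUnit: <name> <testID> <k> <b> <n> <c>".
// Returns false if the keyword or any field is missing.
bool parseWorkUnit(const std::string &line, WorkUnit &wu)
{
    std::istringstream in(line);
    std::string keyword;

    if (!(in >> keyword) || keyword != "WorkUnit:")
        return false;

    return static_cast<bool>(in >> wu.name >> wu.testID >> wu.k
                                >> wu.b >> wu.n >> wu.c);
}
```

The test ID 1230703472, read as seconds since 1970-01-01, is 2008-12-31 06:04:32 GMT, which matches the timestamp on the log lines. That is why it is described here as the assignment time.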
Last fiddled with by mdettweiler on 2008-12-31 at 06:14
#52
May 2007
Kansas; USA
10100010100011₂ Posts
Max,
Thanks for all of the excellent testing here! It's good to get the small issues weeded out in the beta process. Since we're using "production" files from actual drives here, I'll leave it up to you to balance the k/n pair results returned by the server against the original sieve file. Also, spot-double-checking some of the residuals might be a good idea.
Gary
#53
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts
Quote:
Specifically, as for the Sierp. base 6 numbers, those are just going to be submitted back to the IB6 LLRnet server in turn, so they'll end up being balanced against the original sieve file anyway when I process the results from that server.
#54
"Mark"
Apr 2003
Between here and the
2⁴×397 Posts
Quote:
Code:
int32_t Candidate::AddTest(uint64_t testID, char *program, char *residue, char *emailID, char *machineID, uint32_t logTest)
{
   test_t *tPtr;
   char    theMessage[BUFFER_SIZE];
   Log    *prpLog;

   if (!m_Test)
   {
      m_Test = new test_t;
      tPtr = m_Test;
   }
   else
   {
      tPtr = m_Test;
      while (tPtr)
      {
         // We already know about this test, so ignore this result
         if (testID == tPtr->l_TestID &&
             !strcmp(emailID, tPtr->s_EmailID) &&
             !strcmp(machineID, tPtr->s_ClientID))
            return RC_FAILURE;

         // We have two tests with the same residue, thus no more double-checking is needed
         if (strcmp(tPtr->s_Residue, "inprogress") && !strcmp(residue, tPtr->s_Residue))
            b_NeedsDoubleCheck = 0;

         if (!tPtr->m_Next)
            break;

         tPtr = (test_t *) tPtr->m_Next;
      }

      tPtr->m_Next = new test_t;
      tPtr = (test_t *) tPtr->m_Next;
   }

   tPtr->l_TestID = testID;
   strcpy(tPtr->s_Program, program);
   strcpy(tPtr->s_Residue, residue);
   strcpy(tPtr->s_EmailID, emailID);
   strcpy(tPtr->s_ClientID, machineID);
   tPtr->m_Next = 0;

   if (strcmp(tPtr->s_Residue, "inprogress"))
      i_TestsPerformed++;

   if (!strcmp(tPtr->s_Residue, "PRP") || !strcmp(tPtr->s_Residue, "PRIME"))
   {
      b_IsPRP = 1;
      b_IsActive = 0;

      if (logTest)
      {
         if (!strcmp(tPtr->s_Residue, "PRP"))
            sprintf(theMessage, "%s: PRP returned by %s (%s) using %s!", s_Name, emailID, machineID, program);
         else
            sprintf(theMessage, "%s: Prime returned by %s (%s) using %s!", s_Name, emailID, machineID, program);

         prpLog = new Log(0, "PRP.log", 0, NULL);
         prpLog->LogMessage(theMessage);
         delete prpLog;
      }
   }

   return RC_SUCCESS;
}
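The duplicate and residue checks in `AddTest` can be tried out in isolation. The following is a minimal sketch, not the PRPnet code: it restates those two rules with standard containers so they can be tested without the rest of the server. `TestRecord`, `CandidateSketch`, and `addTest` are hypothetical names, and the PRP/prime logging branch is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one recorded test of a candidate.
struct TestRecord {
    std::uint64_t testID;
    std::string   residue;   // "inprogress" until a result is returned
    std::string   emailID;
    std::string   machineID;
};

struct CandidateSketch {
    std::vector<TestRecord> tests;
    bool needsDoubleCheck = true;
    int  testsPerformed   = 0;

    // Returns false if this exact test was already recorded, true otherwise.
    bool addTest(const TestRecord &t)
    {
        for (const TestRecord &known : tests) {
            // Same test ID from the same user and machine: ignore the repeat.
            if (known.testID == t.testID && known.emailID == t.emailID &&
                known.machineID == t.machineID)
                return false;

            // A completed earlier test with the same residue confirms the result.
            if (known.residue != "inprogress" && known.residue == t.residue)
                needsDoubleCheck = false;
        }

        tests.push_back(t);
        if (t.residue != "inprogress")
            ++testsPerformed;
        return true;
    }
};
```

Two results with identical residues from different test IDs clear `needsDoubleCheck`, while resubmitting the same test ID from the same user and machine is rejected without changing any counters.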
#55
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts
Quote:
I'll get the fix compiled and loaded into the server shortly. And then... back to a 3-hour deadline to finally clean out some of those pesky Sierp. base 6 numbers that we've got hanging around!
Last fiddled with by mdettweiler on 2008-12-31 at 17:29