mersenneforum.org  

Old 2010-01-22, 21:35   #12
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
6241₁₀ Posts

In other news, the server seems to be responding quite well to multiple simultaneous connections: I tried refreshing the main web page at the same time my client hit the server, and neither was impacted.

Old 2010-01-22, 21:41   #13
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001100001₂ Posts

Quote:
Originally Posted by Mini-Geek
Now, in response to the recent change:
I can now get and return work on both computers, but now on both I sometimes (pretty often, maybe every 5-10 times I communicate) get the "No available candidates" message. I know there are fewer candidates, but it's not THAT much less!
Yeah, same here:
Code:
[2010-01-22 21:34:47 GMT] G7465: Returning work to server nplb-gb1.no-ip.org at port 7465
[2010-01-22 21:34:48 GMT] G7465: INFO: Test for 2001*2^52481-1 was accepted
[2010-01-22 21:34:48 GMT] G7465: INFO: All 1 test results were accepted
[2010-01-22 21:34:48 GMT] G7465: Getting work from server nplb-gb1.no-ip.org at port 7465
[2010-01-22 21:34:51 GMT] G7465: INFO: No available candidates are left on this server.
It seems that while the timeout issue was fixed, these still pop up once in a while. I wonder what's causing them? They seem to be somewhat harmless as the client can generally grab a pair successfully on the next try, but nonetheless they are a problem. I wonder if this is because (say) two clients are trying to connect simultaneously and the DB can't give them both a new pair at the same time?
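To put a number on how often this happens (the estimate quoted above was every 5-10 communications), one could tally the client log. The following is only a minimal sketch: it assumes log lines look like the excerpt above, and both the function name `tally_no_candidates` and the choice to count "Getting work" lines as attempts are illustrative assumptions, not part of PRPNet.

```python
import re

# Matches client log lines such as:
# [2010-01-22 21:34:47 GMT] G7465: Getting work from server ...
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<server>\w+):\s+(?P<msg>.*)$")


def tally_no_candidates(log_lines):
    """Return {server_id: (work_requests, no_candidate_replies)}."""
    stats = {}
    for line in log_lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that does not look like a log entry
        requests, misses = stats.get(m["server"], (0, 0))
        msg = m["msg"]
        if msg.startswith("Getting work from server"):
            requests += 1
        elif "No available candidates are left" in msg:
            misses += 1
        stats[m["server"]] = (requests, misses)
    return stats


# The five lines from the log excerpt above.
sample = [
    "[2010-01-22 21:34:47 GMT] G7465: Returning work to server nplb-gb1.no-ip.org at port 7465",
    "[2010-01-22 21:34:48 GMT] G7465: INFO: Test for 2001*2^52481-1 was accepted",
    "[2010-01-22 21:34:48 GMT] G7465: INFO: All 1 test results were accepted",
    "[2010-01-22 21:34:48 GMT] G7465: Getting work from server nplb-gb1.no-ip.org at port 7465",
    "[2010-01-22 21:34:51 GMT] G7465: INFO: No available candidates are left on this server.",
]
print(tally_no_candidates(sample))
```

Run over a longer log, the ratio of the two counts per server would show whether the failures cluster on one server or at busy times.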

BTW, I eat my words about 3.1.2 clients being cross-compatible with 2.4.6 servers. After one of these "no available candidates" thingies, my client pulled a candidate from G2000, which is on 2.4.6. That worked all right (albeit with one or two little errors that didn't seem to impact anything), but now the client's driving G2000 nuts trying to return the result. It would appear that 2.4.6 doesn't take well to 3.1.2's trying to send it the test time, a new feature added in version 3.

Old 2010-01-22, 21:42   #14
gd_barnes
May 2007
Kansas; USA
10011111110011₂ Posts

Hum. Max, can you send me a care package for the newest server? I won't be able to dogpile on it until later this evening.

I'm seeing a couple of error messages coming across:

"Could not open file: Greeting.txt"

(Looks like no big deal. Do you need a greeting file there, Max?)

And of course our favorite, which should be a concern:

"Nothing was received on socket 5, therefore the socket was closed."

Better check into that last one.


Gary

Old 2010-01-22, 21:49   #15
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
1861₁₆ Posts

Quote:
Originally Posted by gd_barnes
Hum. Max, can you send me a care package for the newest server? I won't be able to dogpile on it until later this evening.
Sure. But, actually, you may want to hold off on the dogpile for a wee bit; I'd kind of like to nail down this problem with the "no candidates on server" message first.

Quote:
I'm seeing a couple of error messages coming across:

"Could not open file: Greeting.txt"

(Looks like no big deal. Do you need a greeting file there, Max?)
Correct, no big deal. I could put something there but there's no particular need for it.

Quote:
And of course our favorite, which should be a concern:

"Nothing was received on socket 5, therefore the socket was closed."

Better check into that last one.
I just took a look at the server (you probably noticed) and didn't see any of those, though surely they are out there. I'm not sure exactly what, if any, connection that has to the "no candidates on server" message we're seeing here, though it might be the server's end of that. I'll keep checking the server and see if I can spot it.

Old 2010-01-22, 22:04   #16
rogue
"Mark"
Apr 2003
Between here and the
5,953 Posts

Quote:
Originally Posted by Mini-Geek
Now, in response to the recent change:
I can now get and return work on both computers, but now on both I sometimes (pretty often, maybe every 5-10 times I communicate) get the "No available candidates" message. I know there are fewer candidates, but it's not THAT much less!
What OS are you using? 3.1.4 has a patch that is specific to some instances of *nix (notably Ubuntu) that peg the CPU when using select() on a socket.
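For anyone unfamiliar with this failure mode in general: a loop that calls select() with a zero timeout returns immediately whenever the socket is idle, so it spins a core, whereas a call with a real timeout lets the process sleep until data arrives. The sketch below illustrates only that general difference using Python's standard `select` module. It is not the 3.1.4 patch, whose details aren't given in this thread, and `wait_readable` is a hypothetical helper name.

```python
import select
import socket


def wait_readable(sock, timeout_s):
    """Block for up to timeout_s seconds until sock has data to read.

    Returns True if sock became readable, False on timeout. With a positive
    timeout the process sleeps in the kernel instead of spinning.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


# Demonstration on a local socket pair (no network needed).
a, b = socket.socketpair()
try:
    idle = wait_readable(a, 0.2)   # nothing sent yet, so this times out
    b.sendall(b"ping")
    ready = wait_readable(a, 0.2)  # data is pending, so this returns promptly
    print(idle, ready)
finally:
    a.close()
    b.close()
```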

Regarding test expiration, what do you have in prpserver.delay? It seems that tests are expiring too quickly.

I have rarely seen the "No available candidates" message, and not at all since I switched the database engine. Are you using the InnoDB database engine? I suggest turning on debugging (set debuglevel=3 in prpserver.ini) and sending the log to me when you see this happen again. I suspect it to be a database issue (and not a code issue), unless there is some fundamental misunderstanding I have regarding MySQL.

Another thing you could try is to add this index to the Candidate table.

alter table Candidate add index ix_test (HasPendingTest, CompletedTests, DoubleChecked, DecimalLength);

Maybe this will also address slowdowns in communications when a client gets work.
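A rough way to see what a composite index like this buys is to ask a query planner. The sketch below uses Python's built-in SQLite rather than MySQL, purely so that it runs without a server. The table is reduced to the four indexed columns plus a hypothetical `Name` column, and the SELECT is only a guess at the kind of query a candidate selector might run, not PRPNet's actual query. MySQL's planner may behave differently, so on the real server its own EXPLAIN output is what counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Candidate (
        Name           TEXT PRIMARY KEY,  -- hypothetical column for illustration
        HasPendingTest INTEGER NOT NULL,
        CompletedTests INTEGER NOT NULL,
        DoubleChecked  INTEGER NOT NULL,
        DecimalLength  INTEGER NOT NULL
    )
    """
)
# Same column order as the ALTER TABLE statement above.
conn.execute(
    "CREATE INDEX ix_test ON Candidate "
    "(HasPendingTest, CompletedTests, DoubleChecked, DecimalLength)"
)

# Assumed query shape: smallest untested candidate with nothing pending.
query = (
    "SELECT Name FROM Candidate "
    "WHERE HasPendingTest = 0 AND CompletedTests = 0 AND DoubleChecked = 0 "
    "ORDER BY DecimalLength LIMIT 1"
)
# The last column of each EXPLAIN QUERY PLAN row is the human-readable step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
uses_index = any("ix_test" in step for step in plan)
print(plan, uses_index)
conn.close()
```

If the plan mentions `ix_test`, the planner is seeking through the index instead of scanning the whole table, which is the effect the suggested index is aiming for.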

Old 2010-01-22, 22:27   #17
Mini-Geek
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts

Quote:
Originally Posted by rogue
What OS are you using? 3.1.4 has a patch that is specific to some instances of *nix (notably Ubuntu) that peg the CPU when using select() on a socket.
All of my clients and servers are Windows.
I don't know if it's related to using select() or what, but I've noticed that (with 3.1.3 on Windows, at least) the client basically pegs a core the whole time it's communicating with the server (which keeps the core pegged almost constantly, not just when it has work for its helper apps).
I think the GB servers are run on Linux.

Last fiddled with by Mini-Geek on 2010-01-22 at 22:57

Old 2010-01-22, 23:16   #18
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001100001₂ Posts

Quote:
Originally Posted by rogue
What OS are you using? 3.1.4 has a patch that is specific to some instances of *nix (notably Ubuntu) that peg the CPU when using select() on a socket.
As Mini-Geek said, the GB servers are all running on Linux, specifically Ubuntu in fact. (Unless this is just an issue with the client?)

Quote:
Regarding test expiration, what do you have in prpserver.delay? It seems that tests are expiring too quickly.
prpserver.delay is set to 2 days for all candidate sizes.

Quote:
I have rarely seen the "No available candidates" message, and not at all since I switched the database engine. Are you using the InnoDB database engine? I suggest turning on debugging (set debuglevel=3 in prpserver.ini) and sending the log to me when you see this happen again. I suspect it to be a database issue (and not a code issue), unless there is some fundamental misunderstanding I have regarding MySQL.
I don't think I'm using InnoDB, though admittedly I have absolutely no idea what InnoDB is, so if it's the default then I might be using it.

I currently have the debug level set to log socket communication only (level 2). I don't see a level 3 in the choices; I presume you mean level 1 (socket+database)? I've set the server to do that now and will send you the log if I see any more of those errors. It seems all the other clients except mine have dropped off, so if the problem occurs only when multiple clients are hitting the server simultaneously as I'm suspecting, it probably won't occur just yet. (Hey, if anyone else wants to chuck a core or two back on there for a little while, it would be greatly appreciated...shouldn't take too long to trigger one of those errors on somebody's client somewhere.)
Quote:
Another thing you could try is to add this index to the Candidate table.

alter table Candidate add index ix_test (HasPendingTest, CompletedTests, DoubleChecked, DecimalLength);

Maybe this will also address slowdowns in communications when a client gets work.
Okay, I've added the index. I didn't notice any speedup when I looked at the client immediately after applying the index, but it's not like it was too terribly bad with this smaller number of candidates anyway, so there may not be much improvement to be had at this point. Once we've nailed down the issue with the "no available candidates" I'll try loading the full gamut of tests into the server and see if the index helps with that.

Old 2010-01-22, 23:28   #19
rogue
"Mark"
Apr 2003
Between here and the
1011101000001₂ Posts

Quote:
Originally Posted by mdettweiler
I don't think I'm using InnoDB, though admittedly I have absolutely no idea what InnoDB is, so if it's the default then I might be using it.

I currently have the debug level set to log socket communication only (level 2). I don't see a level 3 in the choices; I presume you mean level 1 (socket+database)? I've set the server to do that now and will send you the log if I see any more of those errors. It seems all the other clients except mine have dropped off, so if the problem occurs only when multiple clients are hitting the server simultaneously as I'm suspecting, it probably won't occur just yet. (Hey, if anyone else wants to chuck a core or two back on there for a little while, it would be greatly appreciated...shouldn't take too long to trigger one of those errors on somebody's client somewhere.)

Okay, I've added the index. I didn't notice any speedup when I looked at the client immediately after applying the index, but it's not like it was too terribly bad with this smaller number of candidates anyway, so there may not be much improvement to be had at this point. Once we've nailed down the issue with the "no available candidates" I'll try loading the full gamut of tests into the server and see if the index helps with that.
I posted in the PRPNet announcements thread how to convert the tables to the InnoDB database engine. It's fairly straightforward. InnoDB is not the default. If you created the tables with the current script (instead of upgrading), they should be InnoDB. Use "show table status;" from the MySQL client to tell you which database engine is being used on the tables.

debuglevel=3 is not documented due to an oversight on my part. I'll post 3.1.4 later tonight. It will address the pegging CPU issue.

I need to modify ExpireTests() in prpserver.cpp to give more details behind tests expiring.

Old 2010-01-23, 02:21   #20
gd_barnes
May 2007
Kansas; USA
27F3₁₆ Posts

OK, just let me know when to dogpile on it.

Max, I saw the "socket was closed" message early on, not long after the server started, but I saw it again 5-10 mins. later, after candidates were coming through "normally".

Old 2010-01-23, 05:27   #21
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
79² Posts

Quote:
Originally Posted by rogue
I posted in the PRPNet announcements thread how to convert the tables to the InnoDB database engine. It's fairly straightforward. InnoDB is not the default. If you created the tables with the current script (instead of upgrading), they should be InnoDB. Use "show table status;" from the MySQL client to tell you which database engine is being used on the tables.

debuglevel=3 is not documented due to an oversight on my part. I'll post 3.1.4 later tonight. It will address the pegging CPU issue.

I need to modify ExpireTests() in prpserver.cpp to give more details behind tests expiring.
Ah, okay. In that case, then, yes, I'm using InnoDB, since I created the tables with the create_tables.sql script that came with 3.1.3.

BTW, what exactly does debuglevel=3 do?

Last fiddled with by mdettweiler on 2010-01-23 at 05:28

Old 2010-01-23, 14:49   #22
rogue
"Mark"
Apr 2003
Between here and the
5,953 Posts

Quote:
Originally Posted by mdettweiler
Ah, okay. In that case, then, yes, I'm using InnoDB, since I created the tables with the create_tables.sql script that came with 3.1.3.

BTW, what exactly does debuglevel=3 do?
It will log the candidates selected by the candidate selector (for handing out tests). Although it won't state why a candidate is not sent to a client, it can at least indicate if candidates with no pending tests or completed tests are getting selected by the cursor.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.