mersenneforum.org  

2010-05-03, 12:46   #122
rogue
"Mark"
Apr 2003
Between here and the

1011001001101₂ Posts

Quote:
Originally Posted by mdettweiler
During our recently completed rally, we ran a PRPnet server for the 5th Drive on port 9000 alongside the primary LLRnet server on port 3000. It performed quite nicely, and per discussion between Gary and me, we've decided to keep the server on as a permanent alternative for those who would like to use PRPnet on the drive that is to be our primary focus for the rest of the year.

Depending on how popular the server turns out to be, we'll keep it loaded with appropriately sized ranges from the 5th Drive; at the very least we'll plan to load 1K ranges to keep it going even if it doesn't get much business. The idea is to have an alternative available at all times to the LLRnet servers that the majority of the project runs on, for anyone who finds it more convenient to use PRPnet given their specific setup.
This is news to me. Is this akin to saying that I can remove the alpha tag from PRPNet?
2010-05-03, 15:58   #123
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

1100001101001₂ Posts

Quote:
Originally Posted by rogue
This is news to me. Is this akin to saying that I can remove the alpha tag from PRPNet?
Well, there's one lingering issue: a segfault that I ran into. I'm currently running the server under gdb so I can capture a stack trace and hopefully send you something useful. Besides that, though, it looks pretty good, so yes, once the segfault is cleared up I'd say it can probably be made non-alpha.
2010-05-03, 16:19   #124
rogue
"Mark"
Apr 2003
Between here and the

164D₁₆ Posts

Quote:
Originally Posted by mdettweiler
Well, there's one lingering issue: a segfault that I ran into. I'm currently running the server under gdb so I can capture a stack trace and hopefully send you something useful. Besides that, though, it looks pretty good, so yes, once the segfault is cleared up I'd say it can probably be made non-alpha.
I am not aware of any issues with 3.2.5. Could you possibly be using 3.2.4? I had fixed a segfault in that release.

Last fiddled with by rogue on 2010-05-03 at 16:23
2010-05-03, 17:24   #125
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by rogue
I am not aware of any issues with 3.2.5. Could you possibly be using 3.2.4? I had fixed a segfault in that release.
I got the binary from Lennart; I'm pretty sure it's 3.2.5. Here, let me check...

Confirmed, it's 3.2.5. Directly prior to the segfault I see a whole swarm of log messages saying "x: client connecting from [Lennart's IP]" where each successive message increments x by one, at a rate of a few per second. x started around 710 and went until about 1020 (come to think of it, the last one might have been 1023 or 1024, which might be significant). It almost appeared that it was opening more and more sockets in some kind of infinite loop, until it ran out of sockets to open (possibly Linux has a limit of 1024 sockets?)...tell you what, I won't try to explain it here, I'll just send you the log. I was originally going to wait until I had a stacktrace before sending you anything, but now that I think about it, it's possible this is a rather rare error that doesn't show up often enough for me to get another example anytime soon, so I'd better send you what I've got.
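
On the "limit of 1024" guess: Linux gives each process a soft limit on open file descriptors, and sockets count against it. 1024 is a common default for that soft limit, though it varies by system, so a counter stalling near 1023 or 1024 would at least be consistent with hitting it. Nothing in the log confirms that this is what happened. A minimal Python sketch for checking the limit on a Unix-like host follows; the helper name is made up for illustration and has nothing to do with PRPNet itself.

```python
import resource


def descriptor_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


if __name__ == "__main__":
    soft, hard = descriptor_limits()
    # resource.RLIM_INFINITY means no limit is enforced at that level.
    print(f"soft limit: {soft}, hard limit: {hard}")
```

Running `ulimit -n` in the shell that starts the server reports the same soft limit.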
2010-05-03, 18:11   #126
rogue
"Mark"
Apr 2003
Between here and the

3·11·173 Posts

Quote:
Originally Posted by mdettweiler
I got the binary from Lennart; I'm pretty sure it's 3.2.5. Here, let me check...

Confirmed, it's 3.2.5. Directly prior to the segfault I see a whole swarm of log messages saying "x: client connecting from [Lennart's IP]" where each successive message increments x by one, at a rate of a few per second. x started around 710 and went until about 1020 (come to think of it, the last one might have been 1023 or 1024, which might be significant). It almost appeared that it was opening more and more sockets in some kind of infinite loop, until it ran out of sockets to open (possibly Linux has a limit of 1024 sockets?)...tell you what, I won't try to explain it here, I'll just send you the log. I was originally going to wait until I had a stacktrace before sending you anything, but now that I think about it, it's possible this is a rather rare error that doesn't show up often enough for me to get another example anytime soon, so I'd better send you what I've got.
Got your e-mail. I think something funky happened on the client side that caused it to open up a large number of sockets in the server. The server had difficulty handling that many connections. The server was never designed to handle 1000 connecting clients concurrently. Do you know if Lennart had a script that runs the client? If so, maybe that script was stuck in a loop.

Note that there are other limits that my software cannot address, such as the limit of concurrent open sockets (a TCP/IP setting in the OS) or a limit on the number of threads (an OS setting or a limit based upon available memory). I could modify the server to tell additional connecting clients to wait, but that would require a lot of work.

I do agree that the server shouldn't crash under this scenario. I can look into it and have it reject new connections once it hits a specified limit. That should be fairly easy to do.
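
A rough sketch of the "reject once a limit is hit" idea, written in Python rather than taken from the PRPNet source. Every name here (`ConnectionLimiter`, `accept_loop`, `max_connections`) is invented for illustration, and the one-thread-per-client layout is only an assumption about how such a server might be structured.

```python
import threading


class ConnectionLimiter:
    """Counts active connections and refuses new ones beyond max_connections."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self._active = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Reserve a slot; return False if the server is already full."""
        with self._lock:
            if self._active >= self.max_connections:
                return False
            self._active += 1
            return True

    def release(self):
        """Give a slot back when a client disconnects."""
        with self._lock:
            if self._active > 0:
                self._active -= 1


def _serve(conn, addr, limiter, handle):
    # Always close the socket and free the slot, even if the handler raises.
    try:
        handle(conn, addr)
    finally:
        conn.close()
        limiter.release()


def accept_loop(listener, limiter, handle):
    """Accept clients forever; close over-limit connections immediately."""
    while True:
        conn, addr = listener.accept()
        if not limiter.try_acquire():
            conn.close()  # reject: the client is expected to retry later
            continue
        threading.Thread(
            target=_serve, args=(conn, addr, limiter, handle), daemon=True
        ).start()
```

The point of the counter is that the rejection happens before any per-client thread or buffer is allocated, so a runaway client costs the server one accept and one close per attempt.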
2010-05-03, 21:21   #127
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

14151₈ Posts

Quote:
Originally Posted by rogue
Got your e-mail. I think something funky happened on the client side that caused it to open up a large number of sockets in the server. The server had difficulty handling that many connections. The server was never designed to handle 1000 connecting clients concurrently. Do you know if Lennart had a script that runs the client? If so, maybe that script was stuck in a loop.

Note that there are other limits that my software cannot address, such as the limit of concurrent open sockets (a TCP/IP setting in the OS) or a limit on the number of threads (an OS setting or a limit based upon available memory). I could modify the server to tell additional connecting clients to wait, but that would require a lot of work.

I do agree that the server shouldn't crash under this scenario. I can look into it and have it reject new connections once it hits a specified limit. That should be fairly easy to do.
Okay, that sounds good. I don't expect that we (or anyone else, for that matter) will ever have to deal with >1000 clients connecting concurrently, so having the server flat-out reject clients over a limit should be quite fine. The clients already handle such situations quite well, and would just try again or, failing that, move on to the next configured server.
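
The "try again or, failing that, move on to the next configured server" behaviour can be sketched roughly as below. This is not the PRPNet client's actual logic; the function name, the `attempts_per_server` parameter, and the use of `OSError` to signal a rejected connection are all assumptions, and the sketch does not wait between attempts the way a real client would.

```python
def connect_with_failover(servers, connect, attempts_per_server=2):
    """Try each server in order; return (server, connection) for the first success.

    `connect` is any callable that takes a server entry and either returns a
    connection object or raises OSError (for example when the server rejects
    or drops the connection). Returns (None, None) if every server fails.
    """
    for server in servers:
        for _ in range(attempts_per_server):
            try:
                return server, connect(server)
            except OSError:
                continue  # retry this server, then fall through to the next one
    return None, None
```

With real sockets, `connect` could be something like `lambda addr: socket.create_connection(addr, timeout=5)`, with `servers` given as a list of (host, port) pairs.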
2010-05-04, 06:32   #128
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)

3²·17·37 Posts

Can I suggest that the retry time for clients either be changed from 10 minutes to something shorter or be made adjustable in the ini file?
Waiting 10 minutes is quite extreme a lot of the time.
2010-05-04, 06:46   #129
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by henryzz
Can I suggest that the retry time for clients either be changed from 10 minutes to something shorter or be made adjustable in the ini file?
Waiting 10 minutes is quite extreme a lot of the time.
10 minutes is the default that Mark ships in the prpclient.ini files in the "master copy" downloads from his website; PrimeGrid has apparently left this intact for their packages as well. For the NPLB client packages I've generally made sure to change it to 1 minute. However, since the NPLB client packages are currently outdated and the PRPnet thread therefore points downloaders to PrimeGrid's website instead, you'll end up with whatever they've got.

I'm really busy right now so I probably won't be able to get NPLB client packages posted any time within the next week or two; it's not really high on my priority list since PrimeGrid's are virtually identical and the retry time option is easy enough to change.
2010-05-04, 12:46   #130
rogue
"Mark"
Apr 2003
Between here and the

5709₁₀ Posts

Quote:
Originally Posted by henryzz
Can I suggest that the retry time for clients either be changed from 10 minutes to something shorter or be made adjustable in the ini file?
Waiting 10 minutes is quite extreme a lot of the time.
Change this line:

errortimeout=10

in prpclient.ini. This is the setting that affects that 10 minute delay.
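
For anyone who would rather script the change than edit by hand (say, across several client folders), here is a small Python sketch. It assumes the setting sits on its own line in the form errortimeout=<minutes>, as in the line quoted above; the function name is hypothetical, and nothing here is part of PRPNet.

```python
import re


def set_error_timeout(ini_text, minutes):
    """Return ini_text with the errortimeout= line set to the given number of minutes."""
    new_text, count = re.subn(
        r"(?m)^errortimeout=[^\r\n]*",  # the whole setting, up to the line ending
        f"errortimeout={minutes}",
        ini_text,
    )
    if count == 0:
        raise ValueError("no errortimeout= line found")
    return new_text
```

Read prpclient.ini into a string, pass it through `set_error_timeout(text, 1)` for a 1 minute retry, and write the result back. Keep a backup of the original file first.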
2010-05-04, 16:39   #131
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)

13035₈ Posts

Sorry for missing something that obvious.
I am more familiar with the server ini than the client one.
2010-05-04, 17:57   #132
rogue
"Mark"
Apr 2003
Between here and the

3×11×173 Posts

Quote:
Originally Posted by henryzz
Sorry for missing something that obvious.
I am more familiar with the server ini than the client one.
It is akin to looking for a long time at a piece of source code to figure out why it won't compile only to have someone else look at it for ten seconds and point out the obvious mistake.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.