mersenneforum.org  

2009-08-07, 12:46   #166
rogue
"Mark"
Apr 2003
Between here and the

2^4·397 Posts
PRPNet 2.2.4 Released

Here is a list of changes:
  • all: Set timeout setting for reading from the socket to 30 seconds instead of 10 seconds
  • prpserver: Corrected server_status.html page to show "Min N" in the appropriate column.
  • prpserver: Reject tests from the client if a residue is not provided.
  • prpserver: Added a required htmltitle= in the prpserver.ini file so that the server can provide a title for the generated HTML.
  • prpserver: Tell the browser to close the connection after the client closes the connection. This addresses the browser "stalling" when someone requests one of the web pages.
  • prpclient: If the server does not respond before the timeout while the client is returning workunits, the client will stop trying to send additional workunits so that it can remain in sync with the server.

You can d/l it from here.

2009-08-11, 12:30   #167
rogue
"Mark"
Apr 2003
Between here and the

2^4×397 Posts
PRPNet 2.2.5 Released

There is only one change, which is in the client:
  • Fixed two issues where the client was accessing memory not available to it. These would be triggered when the server rejects the test result or when the server no longer has a record of the test.

You can d/l it from here.

I recommend that users upgrade to this release as soon as they can.

I intend to multi-thread the server in the next release. I have never coded pthreads before, so I have a bit of learning to do. If anyone would like to volunteer some of their skill to that effort, it would be appreciated.

2009-08-12, 09:02   #168
gd_barnes
May 2007
Kansas; USA

10100010100011_2 Posts

I believe we have some serious problems with the general design of PRPnet and would like to see some of these issues addressed and discussed amongst many of the technical folks here at mersenneforum. Here is what I am seeing:

1. The main file is completely in memory to facilitate updating. What happens if you lose power or the machines hang or your program goes into a loop? It would be even worse if you change the save frequency to hours instead of minutes. Your window of vulnerability is much greater.

2. There is a bottlenecking problem that is a result of self-imposed deadlines on processing results. It would be much better to let the server program do all of its updating before allowing requests from clients to dictate things. The time savings from being able to use the new and improved LLR and PFGW greatly exceed the need to save nanoseconds for the clients.

3. All of the stats stuff needs to be taken out of the main program and put into its own separate program(s). It's probably not a good idea to have stats in an online program.


The first two are serious issues that should be looked into fairly soon. The third would make future modifications far easier. Every situation must be thought of ahead of time. Just because certain problems have not been encountered does not mean that there are not bugs. Think of it like this: If the worst happens, what will happen with PRPnet?

The worst things that I can think of in order of severity would be:

(1) A hard drive failure.
(2) A power outage.
(3) Too many clients overloading the system.
(4) Data coming up completely missing from the system.

I'm sure there are many others and all of them need to be coded and tested.


Thank you,
Gary

Last fiddled with by gd_barnes on 2009-08-12 at 09:12

2009-08-12, 12:43   #169
rogue
"Mark"
Apr 2003
Between here and the

2^4×397 Posts

Quote:
Originally Posted by gd_barnes
I believe we have some serious problems with the general design of PRPnet and would like to see some of these issues addressed and discussed amongst many of the technical folks here at mersenneforum. Here is what I am seeing:

1. The main file is completely in memory to facilitate updating. What happens if you lose power or the machines hang or your program goes into a loop? It would be even worse if you change the save frequency to hours instead of minutes. Your window of vulnerability is much greater.

Agreed. The more candidates (and clients) the server has to manage, the less stable it becomes. One of my main goals for the next major release is to support multiple concurrent client connections. That requires pthreads, in which I am no expert. Another potential goal, which would probably make multithreading easier, is to put a relational database behind the server. The choice of database software isn't important as long as it is SQL92 compliant.
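
A minimal sketch of the database idea, in Python with the standard sqlite3 module. SQLite only stands in for "some SQL database" here, and the `candidates` table, its columns, and the function names are all hypothetical, not PRPNet's. The point it illustrates: each result is committed in its own transaction, so a crash or power loss costs at most the update in flight rather than everything since the last save of an in-memory file.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) candidates table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS candidates ("
        " name TEXT PRIMARY KEY,"
        " residue TEXT,"
        " tested INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn

def add_candidate(conn, name):
    # "with conn" commits on success and rolls back if an exception is raised.
    # INSERT OR IGNORE is SQLite syntax, not SQL92.
    with conn:
        conn.execute("INSERT OR IGNORE INTO candidates (name) VALUES (?)", (name,))

def record_result(conn, name, residue):
    """Store a residue. Returns False if there is no record of the test."""
    with conn:
        cur = conn.execute(
            "UPDATE candidates SET residue = ?, tested = 1 WHERE name = ?",
            (residue, name),
        )
    return cur.rowcount == 1

def untested(conn):
    """List the names of candidates that still need a test."""
    rows = conn.execute(
        "SELECT name FROM candidates WHERE tested = 0 ORDER BY name"
    )
    return [row[0] for row in rows]
```

Passing a file path instead of ":memory:" to `open_db` is what would make the data survive a restart; the in-memory default is only convenient for trying the sketch out.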

Quote:
Originally Posted by gd_barnes
2. There is a bottlenecking problem that is a result of self-imposed deadlines on processing results. It would be much better to let the server program do all of its updating before allowing requests from clients to dictate things. The time savings from being able to use the new and improved LLR and PFGW greatly exceed the need to save nanoseconds for the clients.

??

Quote:
Originally Posted by gd_barnes
3. All of the stats stuff needs to be taken out of the main program and put into its own separate program(s). It's probably not a good idea to have stats in an online program.

If I put a database behind the server, then a separate web server could be written to handle on-line stats and it would be up to the server admin if they want to put those stats on-line.

The other major enhancement in all of this would be to have the client send a single message and have the server respond with a single message. That would require the client to send data in an XML format or something relatively easy for the server to parse.
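
To make the single-message idea concrete, here is a rough sketch in Python using the standard xml.etree.ElementTree module. The element and attribute names (`request`, `result`, `ack`, `status`) and both function names are made up for illustration; nothing here reflects an actual PRPNet message format. The client packs every completed test into one message, and the server parses it once and answers with one message.

```python
import xml.etree.ElementTree as ET

def build_request(client_id, results):
    """Pack every completed test into a single XML message (client side)."""
    root = ET.Element("request", {"client": client_id})
    for name, residue in results:
        ET.SubElement(root, "result", {"name": name, "residue": residue})
    return ET.tostring(root, encoding="unicode")

def handle_request(message, known_tests):
    """Parse the whole request once and build a single reply (server side)."""
    root = ET.fromstring(message)
    reply = ET.Element("response")
    for item in root.findall("result"):
        name = item.get("name")
        residue = item.get("residue")
        if not residue:
            status = "rejected"   # no residue was provided
        elif name not in known_tests:
            status = "unknown"    # server has no record of this test
        else:
            status = "accepted"
        ET.SubElement(reply, "ack", {"name": name, "status": status})
    return ET.tostring(reply, encoding="unicode")
```

With one request and one reply per connection, there is only one point at which either side can time out waiting for the other.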

Finally, although I could do all of this myself, this would happen much faster if I had assistance. Please e-mail me if you want to help with development. If I do get helpers, I will need to put PRPNet into SourceForge.

2009-08-13, 21:41   #170
gd_barnes
May 2007
Kansas; USA

101×103 Posts

Quote:
Originally Posted by rogue
??

Your question was related to the bottlenecking issue.

Please see this post: http://www.mersenneforum.org/showpos...1&postcount=86 as well as the posts directly above and below it for what we are referring to. It doesn't appear that the issue was fully fixed.

2009-08-13, 22:31   #171
rogue
"Mark"
Apr 2003
Between here and the

2^4×397 Posts

Quote:
Originally Posted by gd_barnes
Your question was related to the bottlenecking issue.

Please see this post: http://www.mersenneforum.org/showpos...1&postcount=86 as well as the posts directly above and below it for what we are referring to. It doesn't appear that the issue was fully fixed.

I have a couple of choices for the next release. The first is to change the messaging to use a single block of data between the client and server. This would eliminate much of the waiting. The second is to multi-thread the server. The first is easier to do than the second. I could possibly complete it within a week or two, but the second would take at least a month.

Putting the data into a single block won't address bottlenecks, but it should address some of the oddities in communications between the client and server.

Last fiddled with by rogue on 2009-08-13 at 22:32

2009-08-13, 22:47   #172
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

14151_8 Posts

Quote:
Originally Posted by rogue
I have a couple of choices for the next release. The first is to change the messaging to use a single block of data between the client and server. This would eliminate much of the waiting. The second is to multi-thread the server. The first is easier to do than the second. I could possibly complete it within a week or two, but the second would take at least a month.

Putting the data into a single block won't address bottlenecks, but it should address some of the oddities in communications between the client and server.

Personally, I'd go with option #1, both because it's easier and because even with multithreading it would be useful. Also, one other thing that I'd suggest: have the server make any would-be connections wait until it's done with whatever it's doing. I think that should take care of the barfing problems. Would this be reasonably easy to do?

2009-08-13, 23:45   #173
rogue
"Mark"
Apr 2003
Between here and the

2^4·397 Posts

Quote:
Originally Posted by mdettweiler
Personally, I'd go with option #1, both because it's easier and because even with multithreading it would be useful. Also, one other thing that I'd suggest: have the server make any would-be connections wait until it's done with whatever it's doing. I think that should take care of the barfing problems. Would this be reasonably easy to do?

IIUC, you are suggesting that a second connection is causing multi-threading and the code isn't handling it, thus causing memory leaks. That is certainly possible. I hadn't thought about that. I'll see what I can do.

2009-08-14, 01:24   #174
rogue
"Mark"
Apr 2003
Between here and the

2^4·397 Posts

Quote:
Originally Posted by rogue
IIUC, you are suggesting that a second connection is causing multi-threading and the code isn't handling it, thus causing memory leaks. That is certainly possible. I hadn't thought about that. I'll see what I can do.

I just ran a test to see what happens in this scenario. The server only handles one request at a time. If another client tries to connect, it will wait until the previous client closes the connection. That raises an interesting possibility with the server though. If a client happens to be the one that triggers writing back to the prpserver.candidates file, it would take longer. If there are enough candidates, it could cause the second client to time out. This can only be addressed with multi-threading, which will be much easier with a database.
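
As a rough illustration of what multi-threading would involve (in Python rather than pthreads, and with no real sockets): each client gets its own handler thread, and a lock around the shared candidate list keeps two handlers from handing out the same test. `CandidateStore`, `handle_client`, and `serve` are hypothetical names for this sketch only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class CandidateStore:
    """Shared server state; the lock serializes access from handler threads."""

    def __init__(self, names):
        self._lock = threading.Lock()
        self._pending = list(names)
        self._assigned = {}

    def next_test(self, client_id):
        """Hand out one untested candidate, or None when none are left."""
        with self._lock:
            if not self._pending:
                return None
            name = self._pending.pop(0)
            self._assigned[name] = client_id
            return name

def handle_client(store, client_id, wanted):
    """Stand-in for a per-connection handler running in its own thread."""
    got = []
    for _ in range(wanted):
        name = store.next_test(client_id)
        if name is None:
            break
        got.append(name)
    return got

def serve(store, clients, wanted):
    """Run one handler thread per client and collect what each received."""
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        futures = [pool.submit(handle_client, store, c, wanted) for c in clients]
        return {c: f.result() for c, f in zip(clients, futures)}
```

The lock is held only while a candidate is picked, so a handler that is slow for other reasons (a slow client, say) does not make the others wait in line.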

My recommendation is that the server allow clients to grab more tests so that they don't communicate very often, maybe only once per hour or so. This requires changes in both the client config and the server config, but would reduce the thrashing done by the server which will happen when lots of clients connect in succession.

On the positive side, I did find out what is causing the client to hang when the server no longer has the test. I will provide a patch tomorrow.

Last fiddled with by rogue on 2009-08-14 at 01:36

2009-08-14, 03:13   #175
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

6249_10 Posts

Quote:
Originally Posted by rogue
I just ran a test to see what happens in this scenario. The server only handles one request at a time. If another client tries to connect, it will wait until the previous client closes the connection. That raises an interesting possibility with the server though. If a client happens to be the one that triggers writing back to the prpserver.candidates file, it would take longer. If there are enough candidates, it could cause the second client to time out. This can only be addressed with multi-threading, which will be much easier with a database.

Ah, I see. Okay, sounds like it's behaving exactly how I was thinking it *should* behave--namely, any clients that show up while a communication is in progress (or while the server is busy updating the prpserver.candidates file) have to wait in line, which sometimes might cause a timeout (which shouldn't really hurt the client, assuming they have a backup server configured).

As you suggested over in the "server bugs and barfing" thread, what would be really helpful is to implement a checksum in the communications protocol--sort of like what Prime95 does to make sure that PrimeNet receives the result correctly. The server would then reject any result that doesn't have the proper checksum.
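
A small sketch of how such a checksum could work, in Python. CRC-32 from the standard zlib module is used purely as a stand-in; this is not how Prime95 or PRPNet actually compute a checksum, and the `payload|CRC` layout and the names `frame` and `unframe` are assumptions for the example. A CRC catches accidental corruption in transit, not deliberate tampering.

```python
import zlib

def frame(payload: str) -> str:
    """Append a CRC-32 of the payload so the receiver can detect corruption."""
    crc = zlib.crc32(payload.encode("utf-8"))
    return f"{payload}|{crc:08X}"

def unframe(message: str):
    """Return the payload if the checksum matches, otherwise None (reject)."""
    payload, sep, crc_text = message.rpartition("|")
    if not sep:
        return None               # no checksum field at all
    try:
        expected = int(crc_text, 16)
    except ValueError:
        return None               # checksum field is not valid hex
    if zlib.crc32(payload.encode("utf-8")) != expected:
        return None               # payload was altered in transit
    return payload
```

The server would call `unframe` on each incoming result and reject the test whenever it returns None.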

One other thing that would be useful is to make the server more robust in how it handles an unexpected termination of a connection. Right now, it just spews out "error sending x to localhost:port" errors after timing out on each message, which not only seems to confuse the server, but also ties it up for the timeout value times however many lines it needs to send. Both the client and the server should be set so that if they get a "nothing received", they won't keep trying to talk to the other end of the line, but will just continue going on with their business.
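
The "give up after the first failure" behaviour could look something like this in Python. `send_lines` is a hypothetical helper, not PRPNet code; it assumes the caller has already set a timeout on the socket (for example with `sock.settimeout(30)`).

```python
import socket

def send_lines(sock, lines):
    """Send lines in order and stop at the first failure.

    Returns the number of lines sent successfully. Stopping right away
    costs one timeout instead of one timeout per remaining line.
    """
    sent = 0
    for line in lines:
        try:
            sock.sendall(line.encode("utf-8") + b"\n")
        except OSError:
            # socket.timeout is a subclass of OSError, so a timeout,
            # a reset, or a closed connection all end the loop here.
            break
        sent += 1
    return sent
```

The caller can compare the return value with `len(lines)` to decide whether to log a single error and move on.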

2009-08-14, 12:12   #176
rogue
"Mark"
Apr 2003
Between here and the

18D0_16 Posts
PRPNet 2.2.6 Released

There is only one change, which is in the client:
  • Fixed an issue that causes the client to hang when the server does not respond when the client is returning work to the server.

You can d/l it from here.

I recommend that users upgrade to this release as soon as they can.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.