#144 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
-Total # of untested pairs remaining
-First untested pair remaining
-Last untested pair remaining
(the last two would both be ordered by whatever the server's sortoption= dictates--i.e., first/last pairs in the order they'll be handed out)
-A downloadable file containing a complete list of the pairs remaining (untested) in the server
The last one is a "might be nice to have", since right now we don't even have that set up for our LLRnet servers. |
|
|
|
|
|
|
#145 | |
|
"Mark"
Apr 2003
Between here and the
11×577 Posts |
Quote:
|
|
|
|
|
|
|
#146 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
|
|
|
|
|
|
|
#147 |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Yesterday I upgraded all the NPLB/CRUS PRPnet servers to version 3.3.5 (though they identify themselves as 3.3.4...it's a long story). With this release, all PRPnet applications now report times in local time (in the servers' case, CDT/GMT-5 during the summer and CST/GMT-6 during the winter). Therefore, G9000 is now (finally!) in sync with the rest of the servers in the stats, not 5 hours "ahead" of them all!
This should be quite handy in future rallies, since we'll only need one offset time for all servers in the DB.
Last fiddled with by mdettweiler on 2010-07-22 at 15:50 |
|
|
|
|
|
#148 |
|
May 2007
Kansas; USA
3³·5·7·11 Posts |
Max,
Lennart just came on port 9000 with quite a few cores. Can you load n=910K-912K first thing in the morning?
Thanks.
Gary |
|
|
|
|
|
#149 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts |
Quote:
Edit: I see now that he did a similar dump on the TPS port 12000 server shortly after it came up, thereby drying it out in very short order (there wasn't much left in it anyway). My revised theory is that he went on there, then fell back to port 9000 when that ran out. I'll go ahead and load 910K-912K after all since he may be back for more.
Edit 2: 910K-912K is now loaded.
Last fiddled with by mdettweiler on 2010-09-03 at 17:19 |
|
|
|
|
|
|
#150 |
|
"Mark"
Apr 2003
Between here and the
11×577 Posts |
I found the memory leak in the server. It was code that I had commented out some time ago because it was causing a crash. I don't know if the crash was due to a bug on my part or a bug in the MySQL ODBC driver. Anyways, if you uncomment this line:
Code:
SQLFreeHandle(SQL_HANDLE_DBC, sqlConnectionHandle);
in the Disconnect() function of DBInterface.cpp, the memory leak will go away. I have been able to verify this on Mac at home. I have yet to test it on Windows, which is where I suspect I ran into the crash originally. I will put out a patched release of the server in a couple of days. |
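For anyone wondering why one commented-out call adds up to a leak on every connection, here is a minimal sketch. It does not touch the real ODBC driver or the real DBInterface class: `fake_odbc`, `MiniDBInterface`, and `leakedAfter` are all hypothetical stand-ins that only count which connection handles are still allocated. The `freeOnDisconnect` flag plays the role of the SQLFreeHandle line being commented out or not.

```cpp
#include <cstddef>
#include <set>

// Hypothetical stand-in for the ODBC driver manager (NOT the real API):
// it only tracks which connection handles are still allocated.
namespace fake_odbc {
    inline std::set<int>& liveHandles() {
        static std::set<int> handles;
        return handles;
    }
    inline int allocHandle() {
        static int next = 1;
        int h = next++;
        liveHandles().insert(h);
        return h;
    }
    inline void freeHandle(int h) { liveHandles().erase(h); }
}

// Hypothetical miniature of a DB wrapper; not the real DBInterface class.
class MiniDBInterface {
public:
    explicit MiniDBInterface(bool freeOnDisconnect)
        : freeOnDisconnect_(freeOnDisconnect) {}

    ~MiniDBInterface() { Disconnect(); }

    void Connect() {
        if (handle_ == 0) handle_ = fake_odbc::allocHandle();
    }

    void Disconnect() {
        if (handle_ == 0) return;
        // With the free "commented out", the handle is forgotten but never
        // released, so every connect/disconnect cycle leaks one handle.
        if (freeOnDisconnect_) fake_odbc::freeHandle(handle_);
        handle_ = 0;
    }

private:
    bool freeOnDisconnect_;
    int handle_ = 0;
};

// Run `clients` connect/disconnect cycles and return how many handles
// allocated during those cycles are still live afterwards.
inline std::size_t leakedAfter(int clients, bool freeOnDisconnect) {
    std::size_t before = fake_odbc::liveHandles().size();
    for (int i = 0; i < clients; ++i) {
        MiniDBInterface db(freeOnDisconnect);
        db.Connect();
        db.Disconnect();
    }
    return fake_odbc::liveHandles().size() - before;
}
```

Calling `leakedAfter(1000, false)` leaves 1000 handles behind, while `leakedAfter(1000, true)` leaves none. In real ODBC code the usual teardown order is SQLDisconnect on the connection, then SQLFreeHandle with SQL_HANDLE_DBC, and finally SQLFreeHandle with SQL_HANDLE_ENV once the environment is no longer needed.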
|
|
|
|
|
#151 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
Only problem is, it doesn't build on the NPLB server machine. I've attached the console output from "make prpserver".
|
|
|
|
|
|
|
#152 | |
|
"Mark"
Apr 2003
Between here and the
11·577 Posts |
Quote:
Code:
HelperThread.cpp:61: error: ‘rc’ was not declared in this scope
Can you verify the version you are building (in defs.h)? What is on line 61 in HelperThread.cpp? |
|
|
|
|
|
|
#153 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts |
Quote:
Downloading a fresh copy of the source from your website, applying the fix to that, and building prpserver worked. All noprimeleftbehind.net servers are now running the patched version. |
|
|
|
|
|
|
#154 |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts |
As those of you who follow the PRPnet 4.0.x announcement thread in the Software forum will have noticed, v4.0.5 was released today with fixes for a rather major memory leak. As Mark, Lars (a.k.a. ltd) and I have discovered over the course of our investigations over the last week or so, this is actually the cause of the mysterious and extremely vexing "too many connections" bug that has plagued every NPLB rally since we added PRPnet to them (all except the last rally, in which we only managed to avoid crashes by restarting the PRPnet server every 12 hours).
It seems that whenever a client finishes talking to the server, the server neglects to properly release memory from its communication with the database--leading to a memory leak of a few MB for each connection that goes by. What made this leak hard to spot, however, is that most of the leak goes straight into virtual memory, rather than active memory. The VM allocation would keep building up until it reached 256 GB (!), which was apparently the tipping point, at which the server would just stop responding to any communications. Incoming connections would then build up until whatever the admin-specified limit was (in the case of the NPLB servers, 1000), and then the server would respond to all queries with "too many connections". The server would continue to be unreachable until someone manually restarted it.
During high-load periods like rallies, such crashes could occur quite frequently, sometimes on the order of once every day or two. The simplest workaround was to just restart the server every 12 hours as a preemptive measure (though we didn't know exactly why this worked at the time, just that it somehow prevented the crashes).
As it turned out, Lars was able to find the root of the problem and come up with a patch which, while not completely addressing the memory leak, cuts it down from a few MB per connection to a few KB. Now, we should be able to survive an entire rally (and then some) without any crashes, no server restarts needed.
The patch can be applied equally well to v3.3.6 or 4.0.4 (it's included in 4.0.5), so now all the noprimeleftbehind.net servers (NPLB, CRUS, and private) have been patched. Port 9000 is still running 3.3.6, and I've been meaning to upgrade it to 4.0.x sometime soon; the DB conversion takes a bit of time, however, so whether I will be able to get it upgraded before the upcoming rally will depend on how my schedule works out. Regardless, though, the memory leak is under sufficient control that it shouldn't give us any trouble in the rally.
@Gary: this memory leak, I discovered, is actually the cause of the mysterious sluggishness on jeepford. As much as I would like to blame the absolutely horrible Ubuntu 9.04 as per our original theory, it seems that the many GB of leaked memory from the multiple running prpservers caused the system to be tied up while it shuffled data from active memory into virtual memory (to make room for normal GUI activities) every time someone used the computer after it had sat for a while. With the memory leaks now under control, most of the sluggishness should be gone from here on out. (Not to mention that it will spare the hard drive quite a bit of beating...)
Last fiddled with by mdettweiler on 2010-12-24 at 21:05 |
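To put rough numbers on the "few MB" versus "few KB" difference described above, here is a back-of-the-envelope sketch. The figures are assumptions for illustration only: "a few" is taken as 3, units are binary (KiB/MiB/GiB), and `connectionsUntilCeiling` is a hypothetical helper, not anything in PRPnet.

```cpp
#include <cstdint>

constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t MiB = 1024 * KiB;
constexpr std::uint64_t GiB = 1024 * MiB;

// Hypothetical helper: how many connections can be served before the leaked
// virtual memory reaches the ceiling at which the server stopped responding.
constexpr std::uint64_t connectionsUntilCeiling(std::uint64_t ceilingBytes,
                                                std::uint64_t leakPerConnectionBytes) {
    // A zero leak never reaches the ceiling; report the largest value instead
    // of dividing by zero.
    return leakPerConnectionBytes == 0 ? UINT64_MAX
                                       : ceilingBytes / leakPerConnectionBytes;
}

// Assumed figures: 256 GiB ceiling, 3 MiB leaked per connection before the
// patch, 3 KiB per connection after it.
constexpr std::uint64_t beforePatch = connectionsUntilCeiling(256 * GiB, 3 * MiB);
constexpr std::uint64_t afterPatch  = connectionsUntilCeiling(256 * GiB, 3 * KiB);
```

Under those assumptions the unpatched server hits the ceiling after about 87 thousand connections, which a busy rally can plausibly reach in a day or two, while the patched one would need about 89 million, a factor of 1024 more.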
|
|
|