|
|
#12 |
|
May 2007
Kansas; USA
3³·5·7·11 Posts |
Just to clarify: It was likely the attempt to get into the machine through VNC that caused the reboot; it apparently was not a random event. Since we know the cause, that particular issue should hopefully not happen again.
|
|
|
|
|
|
#13 | |
|
"Mark"
Apr 2003
Between here and the
11·577 Posts |
Quote:
You can set the max size of the log file in prpserver.ini. When it reaches that size it will rename it to prpserver.log.old (after deleting the previous prpserver.log.old file). |
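For anyone who wants to see the rotation behavior spelled out, here is a minimal Python sketch of the logic described above. It is not the server's actual code; the function name `rotate_if_too_big` and its parameters are made up for illustration, and only the ".old" rename and the delete-first step come from the description.

```python
import os

def rotate_if_too_big(log_path, max_bytes):
    """Rename log_path to log_path + '.old' once it reaches max_bytes.

    Hypothetical helper: returns True if a rotation happened, False otherwise.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) < max_bytes:
        return False
    old_path = log_path + ".old"
    # Delete the previous .old file first, as described above.
    if os.path.exists(old_path):
        os.remove(old_path)
    os.rename(log_path, old_path)
    return True
```

Note that only one old generation survives under this scheme, which is why anything you want to keep has to be copied away before the next rotation.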
|
|
|
|
|
|
#14 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts |
Quote:
I should implement the max log file size feature as you suggest and set it to 500MB or so. I'll still try to rename them periodically anyway to prevent any content from being deleted, but that should at least provide a failsafe in case I forget. Also, something I've been meaning to do for a while but haven't yet gotten the chance is to write a script that automatically renames the logs and moves them to a log/ folder within the server directory--that would completely obviate the need for manual renaming.
As for this specific instance, I remember last time something like this happened the log file wasn't much help; still, I'll give it a look and see if I can come up with anything useful. |
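A rough sketch of what such a rename-and-move script could look like in Python. The function name `archive_log`, the timestamp format, and the default file name are assumptions for illustration. It also assumes the server recreates the log by name after the move, which I have not verified, so test it against a non-production server first.

```python
import shutil
import time
from pathlib import Path

def archive_log(server_dir, log_name="prpserver.log", now=None):
    """Move the current log into <server_dir>/log/ under a timestamped name.

    Hypothetical helper: returns the new path, or None if there is no log.
    """
    server_dir = Path(server_dir)
    src = server_dir / log_name
    if not src.exists():
        return None
    archive_dir = server_dir / "log"
    archive_dir.mkdir(exist_ok=True)
    # now=None means "current time"; a fixed value makes runs repeatable.
    stamp = time.strftime("%Y-%m-%d_%H%M%S", time.localtime(now))
    dest = archive_dir / f"{src.stem}_{stamp}{src.suffix}"
    shutil.move(str(src), str(dest))
    return dest
```

Run from cron (or any scheduler) twice a day, this would replace the manual renaming. Because every archive gets its own name, nothing is overwritten unless two runs land in the same second.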
|
|
|
|
|
|
#15 |
|
Jan 2005
Sydney, Australia
14F₁₆ Posts |
Where's MyDogBuster? With the rally running now his #2 rank is looking shaky.
Question for the mods: Would it help reduce the load on the server if we cached more than 5 tasks? |
|
|
|
|
|
#16 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
Also, note that this only applies to PRPnet port 9000. LLRnet (which is where I see your cores are) generates the same load on the server no matter what your cache size is; it is more accurately described as a queue system operating on a FIFO (first in, first out) model, with each pair being returned as soon as it completes and a new one being added to the end of the queue at that time. So no matter what the queue size, it communicates with the server just as often. |
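To make the FIFO point concrete, here is a toy Python model. It is only a sketch: `fifo_contacts` follows the queue behavior described above and ignores the initial fill of the queue, while `batch_contacts` is a hypothetical contrast in which work is fetched and returned a whole cache at a time, counted as one contact per batch. Neither function reflects actual LLRnet or PRPnet client code.

```python
from collections import deque

def fifo_contacts(queue_size, pairs_to_complete):
    """Steady-state server contacts under the FIFO model (initial fill ignored)."""
    queue = deque(range(queue_size))   # pairs already cached
    next_pair = queue_size
    contacts = 0
    for _ in range(pairs_to_complete):
        queue.popleft()                # oldest pair finishes first
        contacts += 1                  # return it and fetch a replacement
        queue.append(next_pair)        # replacement goes to the end of the queue
        next_pair += 1
    return contacts

def batch_contacts(cache_size, pairs_to_complete):
    """Hypothetical batch model: one contact per full cache of completed work."""
    return -(-pairs_to_complete // cache_size)   # ceiling division
```

In this model `fifo_contacts` returns the same count for any queue size, since every completed pair triggers exactly one exchange. Under the batch assumption the count drops as the cache grows.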
|
|
|
|
|
|
#17 |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
I think I just figured out why the server got stuck in that "too many connections" issue earlier--and it didn't have anything to do with the size of the log file. I just realized that I had entirely forgotten to specify the maxclients= line in prpserver.ini! The maxclients feature was introduced in v3.3.0, but I didn't make any changes to any of the servers' .ini files when I upgraded them from 3.2.5--hence, maxclients= was unspecified and it used the default value of 10.
10 concurrent connections is far from impossible during a heavy-load period like this--and indeed it looks like it did happen:
Code:
[2010-08-13 12:09:56 CDT] 7: client connecting from 91.149.36.77
[2010-08-13 12:09:57 CDT] 4: client connecting from 91.149.36.77
[2010-08-13 12:10:05 CDT] 6: client connecting from 77.21.236.127
[2010-08-13 12:10:07 CDT] 8: client connecting from 91.149.36.77
[2010-08-13 12:10:15 CDT] 9: client connecting from 77.21.236.127
[2010-08-13 12:10:17 CDT] 10: client connecting from 91.149.36.77
[2010-08-13 12:10:19 CDT] 11: client connecting from 91.149.36.77
[2010-08-13 12:10:27 CDT] 12: client connecting from 91.149.36.77
[2010-08-13 12:10:27 CDT] 13: client connecting from 91.149.36.77
[2010-08-13 12:10:28 CDT] 14: client connecting from 91.149.36.77
[2010-08-13 12:10:28 CDT] 14: sending [ERROR: Server cannot handle more connections]
[2010-08-13 12:10:28 CDT] Server has reached max connections of 10. Connection from 91.149.36.77 rejected
[2010-08-13 12:10:28 CDT] 14: closing socket
[2010-08-13 12:10:30 CDT] 14: client connecting from 91.149.36.77
[2010-08-13 12:10:30 CDT] 14: sending [ERROR: Server cannot handle more connections]
[2010-08-13 12:10:30 CDT] Server has reached max connections of 10. Connection from 91.149.36.77 rejected
[2010-08-13 12:10:30 CDT] 14: closing socket
[2010-08-13 12:10:30 CDT] 14: client connecting from 91.149.36.77
[2010-08-13 12:10:30 CDT] 14: sending [ERROR: Server cannot handle more connections]
[2010-08-13 12:10:30 CDT] Server has reached max connections of 10. Connection from 91.149.36.77 rejected
[2010-08-13 12:10:30 CDT] 14: closing socket
Mark, this looks like a bug--any idea why it's doing this?
@all: meanwhile, I've set the server's log file limit to 500 MB as previously discussed. It looks like it's not needed after all, but it shouldn't hurt. Also, I've set maxclients=1000, so we shouldn't run into any more of the above problem unless somebody's client really does go bonkers and get in some kind of loop.
Last fiddled with by mdettweiler on 2010-08-14 at 05:27 |
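If anyone wants to check their own log for the same pattern, a small Python sketch along these lines would tally connections and rejections per IP. The two regular expressions are inferred only from the log excerpt above, not from any documented log format, and `summarize` is a made-up name.

```python
import re
from collections import Counter

# Patterns inferred from the excerpt above; other server versions may differ.
CONNECT = re.compile(r"client connecting from (\d+\.\d+\.\d+\.\d+)")
REJECT = re.compile(r"Connection from (\d+\.\d+\.\d+\.\d+) rejected")

def summarize(lines):
    """Return (connects, rejects): per-IP counts of connections and rejections."""
    connects, rejects = Counter(), Counter()
    for line in lines:
        m = CONNECT.search(line)
        if m:
            connects[m.group(1)] += 1
        m = REJECT.search(line)
        if m:
            rejects[m.group(1)] += 1
    return connects, rejects
```

Usage would be something like `summarize(open("prpserver.log"))`, then `rejects.most_common(5)` to see which clients are hitting the limit hardest.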
|
|
|
|
|
#18 |
|
May 2007
Kansas; USA
10100010011011₂ Posts |
Good work Max.
On the logfile, I'm about ready to do my semi-daily renaming. One thing about your 500 MB limit: That won't help if that turns out to be an issue. It had reached 778 MB in right around a day after the rally started. [Or perhaps the logfile included a lot of logging from long before the rally started. I hadn't checked that.] Regardless, the twice daily renaming won't hurt. |
|
|
|
|
|
#19 |
|
Jan 2006
deep in a while-loop
2×7×47 Posts |
Have a look at:
$ man 8 logrotate
Max should have the server doing that for you in a jiffy.
|
|
|
|
|
|
#20 |
|
"Mark"
Apr 2003
Between here and the
11×577 Posts |
My best guess is that some sort of deadlock is occurring and the server isn't handling it correctly. Being a multi-threaded application, the server needs to lock certain resources so that only one thread can access them at a time. It is possible that two threads are deadlocking. IIRC, pthreads will tell you when a deadlock has occurred. Since I don't check for return codes from some pthread calls, it is possible that I'm not handling an error that needs to be handled. I need to investigate further.
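As a loose illustration of why the return codes matter, here is a Python analogue. This is not the server's code and does not use pthreads; `with_both_locks` is a hypothetical helper. In Python, `Lock.acquire(timeout=...)` returns False when the lock could not be taken, and ignoring that result is comparable to ignoring the return code of a lock call.

```python
import threading

def with_both_locks(first, second, action, timeout=1.0):
    """Run action() holding both locks; return False instead of hanging forever."""
    if not first.acquire(timeout=timeout):
        return False                   # could not get the first lock in time
    try:
        if not second.acquire(timeout=timeout):
            return False               # possible deadlock: report it to the caller
        try:
            action()
            return True
        finally:
            second.release()
    finally:
        first.release()                # always give back what we took
```

The caller can then log the failure or retry, rather than leaving a thread stuck while still counted as an open connection. Acquiring locks in the same order everywhere is the usual way to avoid the deadlock in the first place.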
|
|
|
|
|
|
#21 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
Thanks, I'll check it out. |
|
|
|
|
|
|
#22 |
|
Jan 2009
10100₂ Posts |
I'm a couple of days late (due to work) but I'm in the rally now!
|
|
|
|