#474 |
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts |
No, it's just going down every 15 minutes (when it prunes the joblist and knpairs files), but I'm manually restarting it every time as soon as I see it. (I've got a VNC window open in the background so I can monitor it.) I'm currently working on a workaround to make it automatically restart the server every time it goes down (to serve as sort of a band-aid fix).
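For anyone curious what a band-aid like this could look like, here is a minimal sketch in Python. It is not the actual workaround used on this server; the function name `keep_alive`, its parameters, and the `./llrserver` command in the usage note are all assumptions for illustration. It simply relaunches a command every time it exits.

```python
import subprocess
import time


def keep_alive(cmd, max_launches=None, delay=5.0):
    """Launch cmd, wait for it to exit, pause, then launch it again.

    cmd          -- argument list for the server (hypothetical, not from the thread)
    max_launches -- stop after this many launches; None means loop forever
    delay        -- seconds to wait before relaunching
    Returns the number of launches performed.
    """
    launches = 0
    while max_launches is None or launches < max_launches:
        subprocess.run(cmd)   # blocks until the process exits or crashes
        launches += 1
        time.sleep(delay)     # short pause so a crash loop does not spin
    return launches
```

Something like `keep_alive(["./llrserver"])` would then loop until interrupted, with `./llrserver` standing in for whatever command actually starts the server.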
#475 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001101001₂ Posts |
Quote:
#476 |
May 2007
Kansas; USA
10100010101110₂ Posts |
I'm typing from the machine right now. Is there anything I can do in the next 10 mins. to monitor it?
This has been, by far, my most stable machine. It's never been turned off and never run above 68 C (67 C right now) since I started running it in early May. Taking the cover off to swap out the hard drive was the first time I even had the cover off of it.
Last fiddled with by gd_barnes on 2008-12-11 at 02:48 |
#477 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts |
Quote:
As for what's causing this: I honestly don't know. I doubt it could have anything to do with swapping the hard drive; I'm thinking something more along the lines of a messed-up binary. But then again, I don't know--it could be anything. In the meantime, nobody should need to move any machines off G4000; the workaround should keep it running OK for now. |
#478 |
I ♥ BOINC!
Oct 2002
Glendale, AZ. (USA)
3×7×53 Posts |
Check the perms; "chmod 664 *.txt" should fix it.
Happens when a lot of folks are trying to get in, and it doesn't come up cleanly trying to lock onto the socket. When folks are hitting mine real heavy, and I take it down to give it some more pairs, it might take me 15 tries to get it to come back up. Sometimes I just let it sit for a minute or two, then try again too. |
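The "let it sit and try again" routine can be sketched in Python as well, assuming the failure really is the listening port still being held when the server comes back up (the thread does not confirm this). The names `port_is_free` and `wait_for_port` and the default values are hypothetical; the check just tries to bind the port and reports whether that worked.

```python
import socket
import time


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False   # something else still holds the port
        return True


def wait_for_port(port, host="0.0.0.0", interval=60.0, max_tries=15):
    """Poll until the port is free.

    Returns the number of tries it took, or None if the port
    never became free within max_tries.
    """
    for attempt in range(1, max_tries + 1):
        if port_is_free(port, host):
            return attempt
        time.sleep(interval)   # let it sit before trying again
    return None
```

Calling `wait_for_port` with the server's own port before restarting would replace the manual retries. The probe does not set SO_REUSEADDR, so it errs on the side of waiting a little longer.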
#479 |
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts |
When I run out of work on the 18th (possibly 19th), assuming the drive is still running and I haven't lined up new files yet, is IB400 probably the most stable and best for me to run on?
#480 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
As for it needing a few more minutes before it can be restarted: yeah, I've had that happen a few times, especially today with needing to restart the server so much. I've been doing the same thing you suggested--letting it sit for a few minutes and then trying again.
#481 |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Probably. If we time everything right, it will be the last 1st Drive server running at the end.
#482 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
Maybe a "chmod 777 *" would work better? I'll try that momentarily...
Edit: Okay, I've tried "chmod 777 *". We'll see if it still crashes after that. (Just to play it safe, though, I've enabled the while-loop thing once again, to ensure that the server isn't down for long periods of time if I'm not right at the computer when it crashes.)
Last fiddled with by mdettweiler on 2008-12-11 at 03:22 |
#483 | |
May 2007
Kansas; USA
10414₁₀ Posts |
Quote:
Thanks for your attention to detail on this, Max. Sounds like a mess. Any luck with stopping the crashing with this last attempt? |
#484 | |
May 2007
Kansas; USA
2·41·127 Posts |
Quote:
I'll add to what Max said here: Yes, IB400 is the most stable at this point. But the question will be: Where should you connect at that point? Likely it will be IB400, but it could be one of the others if they are not dried before IB400 is. Around the 16th or 17th, keep checking the threads. We'll post what needs to be finished off by that point. I'll attempt to balance my machines such that IB400 is the last remaining server with pairs for the 1st drive, but I can't guarantee it.
On another note: You know what the coolest thing about this is? Actually having a general idea of when we will complete something! How many projects are out there that can estimate when they will complete an entire effort that is being run by 10+ people?! This is great! It allows for excellent forward planning on future efforts.
Gary |