mersenneforum.org  

2009-11-16, 00:57   #1
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts
Planned downtime on GB servers

Hi all,

Due to some recent problems with "crunchford", the computer that the GB servers are currently running on, we are moving them all to "dumpford", another one of Gary's Linux quads. They should be able to stay on dumpford until the new server is ready (which will be soon--Gary's got it about halfway done and is working on it gradually).

This means that the servers will be offline while I'm moving them. I'm going to try to do them one at a time, only taking each one offline long enough to move it to dumpford, and then bringing it back online. This should keep downtimes brief enough that nobody should have to move their clients to other servers.

I'll keep everyone posted on the progress of this in this thread.

Max
2009-11-16, 01:04   #2
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts

Whoops, looks like that's not going to happen after all just yet. Gary, what happened to dumpford? I can't connect to it.

Last fiddled with by mdettweiler on 2009-11-16 at 01:05
2009-11-16, 02:58   #3
gd_barnes
May 2007
Kansas; USA
2×7²×103 Posts

OK, you can connect to dumpford now. But I have other issues. Here is how the last 4-5 days have gone for me:

1. I had a maintenance guy install 2 new circuits in my basement.

2. The mobo went out on humpford on Thurs. I got a new one from the local shop on Fri.

3. The new mobo was incorrect so I had to exchange it on Sat.

4. When installing the correct new mobo on humpford Sat., I accidentally disconnected the internet cord from dumpford, putting it offline as Max discovered.

5. The mobo went out on spitford late Sat. night. I didn't have time to get a new one. I'll do that tomorrow and get it running again.

6. I moved all 4 Intels over to one of the new circuits, which eliminated the need for a long extension cord across the room that would sometimes get a little warm. This means I had to shut them all down and restart them. (Yes, I did the Run VNC thing after turning them back on.)

7. Today I discovered that I cannot remotely connect to humpford to save my soul. That's the one with the new mobo. Max, you'll need to diagnose that one. Maybe something due to a different brand of motherboard?

As of 30 mins. ago, you could only connect to 1 of my 4 Intels. Now it's 2. Fortunately one of them is the most stable of the bunch...dumpford. One is down and one has a new mobo, but likely something about the different brand is keeping me from remotely connecting to it.


These are the first blown mobos I've had on my 4 Intels in the 18 months that I've had them. With the new one I just replaced, I rotated the fan 90 degrees and it now runs 5 degrees cooler. I'll be doing that on all of them in the near future.

There's no correlation between the new circuits and the blown mobos. I hadn't run those machines through the new circuits when they went down.

A lovely week for me.

...in case anyone is wondering what is delaying me getting the permanent server machine set up.

One last hassle. I'm going to be out of town from Tues. 11/17 to Wed. 11/25. That means that somehow I'm going to need to get Dumpford on the battery backup, which means physically moving it upstairs or moving Crunchford downstairs.

Max, my suggestion is to get all of the PRPnet servers moved over to Dumpford by noon Monday if you can. I'll then physically move the machine so that it is on battery backup while I'm away.

Man I get tired of this stuff. I didn't sign on for this techie junk when I started these projects. I got into them for the math aspect. I admit to a fair amount of frustration having to constantly deal with these techie problems, especially servers. But I'll keep chugging away and hopefully we'll get all the servers on my machines eventually with Max maintaining them logically. It'll still be up to me to keep them physically running though.


Gary

Last fiddled with by gd_barnes on 2009-11-16 at 03:04
2009-11-16, 06:17   #4
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts

Okay, thanks for the detailed explanation. Here's my plan of attack:

-Diagnose the problem with humpford. One quick question about that: did you need to reinstall the OS after getting the new mobo for that? If so, then I'll need to do the usual pre-setup before you can use VNC on it.

-Transfer all the servers to dumpford.

-Make dumpford the new "gateway machine", rather than crunchford as it is now.

After I've done the latter two, crunchford will be just as ordinary as any of your other AMD quads, and you can remove crunchford from the battery backup and place dumpford on it instead. I'm not sure exactly how you have things set up, but from what you were saying it sounds like crunchford is upstairs so it can be on the battery backup. If that's the case, then I'd recommend moving crunchford downstairs and moving dumpford back up in its place.
2009-11-16, 06:50   #5
gd_barnes
May 2007
Kansas; USA
2×7²×103 Posts

Quote:
Originally Posted by mdettweiler
Okay, thanks for the detailed explanation. Here's my plan of attack:

-Diagnose the problem with humpford. One quick question about that: did you need to reinstall the OS after getting the new mobo for that? If so, then I'll need to do the usual pre-setup before you can use VNC on it.

-Transfer all the servers to dumpford.

-Make dumpford the new "gateway machine", rather than crunchford as it is now.

After I've done the latter two, crunchford will be just as ordinary as any of your other AMD quads, and you can remove crunchford from the battery backup and place dumpford on it instead. I'm not sure exactly how you have things set up, but from what you were saying it sounds like crunchford is upstairs so it can be on the battery backup. If that's the case, then I'd recommend moving crunchford downstairs and moving dumpford back up in its place.

No, once I got the correct mobo in humpford, it worked just fine. Nothing more was needed. I've done the "Run VNC" thing 2 or 3 times now just to make sure. Here are some specifics on the new mobo: The brand is ASUS and the model is P5KPL-AM SE.

You are correct, I would expect to swap Dumpford and Crunchford places except that I was thinking that they'd both have to be on the battery backup while I'm away because I was assuming that you weren't going to move the LLRnet servers over yet.

If you can't have them all moved by noon (really ~2 PM CST), I should have a small amount of time after ~10:30 CST Monday night to swap the machines' places. I'll be leaving ~8:30 AM CST Tuesday morning. After 10:30, I'd have just enough time to get them swapped and crunching again. I'd leave it up to you to restart the servers after the move unless you left me detailed instructions on restarting them. I have the commands somewhere but since they are new(er) PRPnet servers, I wouldn't want to trust that those were all that I would have to do.

Sorry to rush you while appearing to dawdle myself. It always seems like stuff hits the fan with things like this right before I'm going to be away for a little while.


Gary
2009-11-16, 07:20   #6
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts

Quote:
Originally Posted by gd_barnes
No, once I got the correct mobo in humpford, it worked just fine. Nothing more was needed. I've done the "Run VNC" thing 2 or 3 times now just to make sure. Here are some specifics on the new mobo: The brand is ASUS and the model is P5KPL-AM SE.

Hmm...I don't know what's wrong with it then. I'll have to take a look at it tomorrow and see what's up.

Quote:
You are correct, I would expect to swap Dumpford and Crunchford places except that I was thinking that they'd both have to be on the battery backup while I'm away because I was assuming that you weren't going to move the LLRnet servers over yet.

If you can't have them all moved by noon (really ~2 PM CST), I should have a small amount of time after ~10:30 CST Monday night to swap the machines' places. I'll be leaving ~8:30 AM CST Tuesday morning. After 10:30, I'd have just enough time to get them swapped and crunching again. I'd leave it up to you to restart the servers after the move unless you left me detailed instructions on restarting them. I have the commands somewhere but since they are new(er) PRPnet servers, I wouldn't want to trust that those were all that I would have to do.

Sorry to rush you while appearing to dawdle myself. It always seems like stuff hits the fan with things like this right before I'm going to be away for a little while.


Gary

Actually, I'm way ahead of you on that: I've just finished moving over all the servers right now. The servers themselves now reside on dumpford, though the http://nplb-gb1.no-ip.org/ pages still come from crunchford and thus won't update until I can move them over. This will cause the DB to not update until I can do that, though that should be no problem as it will get caught up again once they're back online. I'll move those over tomorrow (too late tonight).

You can go ahead and switch crunchford and dumpford as soon as you get the chance. Here's what you have to do:
-Log on to dumpford via VNC. This can be done in the usual way.
-You should see all the servers open just like they were before on crunchford. For the LLRnet ones, close out the tab on the terminal window. For the PRPnet servers, stop them with Ctrl-C and then close their tab.
-Shut down dumpford.
-Shut down crunchford. (Nothing fancy needed here.)
-Switch the machines.
-Start up crunchford. (Again, nothing fancy here.)
-Start up dumpford.
-Log on to dumpford via VNC as before.
-Open a terminal window and use the File>Open Tab option to open additional tabs, one for each server. Use the Terminal>Set Title option to rename each tab to match its server (use the server code, like G2000 for example).
-For each server, use the "cd" command to change to its directory, and start it using the appropriate method. For LLRnet servers, run this command:
while ./llrnet llrserver.lua >> stdout.log ; do echo "Restarting server! ****" >> restart.log; done
For PRPnet servers, run this:
./prpserver244
Note that this isn't the command you'd normally use for the PRPnet servers, but right now I've got a special version of the server running for them and thus you need to use the "244" version as specified above.

That's all! BTW, make sure not to forget the personal servers as well.
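
For anyone wondering why the LLRnet command above is wrapped in a while loop: the shell reruns the server every time it exits with status 0, writing a line to restart.log each time, and stops looping the first time the server exits with a nonzero status. Here is a minimal sketch of that behavior using a stand-in function. Note that fake_server is hypothetical and only there for illustration; nothing here assumes which exit codes the real llrnet returns.

```shell
#!/bin/sh
# Stand-in for "./llrnet llrserver.lua" (hypothetical): it "exits" with
# status 0 on its first two runs and with a nonzero status on the third.
fake_server() {
    runs=$((runs + 1))
    [ "$runs" -le 2 ]
}

runs=0
restarts=0

# Same shape as the LLRnet command: the loop body (the "Restarting server!"
# log line in the real command) only runs after a zero exit status.
while fake_server; do
    restarts=$((restarts + 1))
done

echo "runs=$runs restarts=$restarts"
```

So if the real server ever quits with an error status, the loop ends and the server has to be started again by hand.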

Once you've gotten the machines switched, then I can finish the setup with dumpford while it's up and running on the UPS. Most likely there will be no power outage anyway, so crunchford should be online the entire time as well for me to pull old stuff from. Even if you did have an outage, though, the only thing that would be delayed is my progress in finishing up the transferral of stuff to dumpford; the actual servers should remain online on dumpford.

Max
2009-11-16, 08:12   #7
gd_barnes
May 2007
Kansas; USA
2·7²·103 Posts

Great work Max...your usual efficient self.

I'm with you there...too late tonight. I anticipate getting them moved and restarted sometime between 2 and 4 PM Monday afternoon. One thing I think I'll do to be extra safe, if I have the proper # of outlets, is let both machines reside upstairs plugged into the UPS backup while I'm gone. That way, there will be no possible issues. Then after you have all of the behind-the-scenes stuff moved over after I get back, I'll physically move crunchford back downstairs.

Another question that I'm not quite sure how to ask in a technical manner, so here it is in layman's terms: You made crunchford the "master remote machine", kind of like the mother machine over the others that allows you/me to access them all remotely. Will you be changing that over to Dumpford? I'm asking because we probably want to have that master machine on the UPS backup also. That's part of the reason that I want to allow you the extra time to make the full switchover in a safe manner by having both machines backed up for the 8 days that I'm gone.

One final question: Will you be creating a new PRPnet server for our 5th drive on Dumpford while I'm gone? David (Ironbits), if you're reading this, this finally begins the process: We will let IB4000 dry out and Max will be creating a new public PRPnet server that we'll load n=740K-750K into for our 5th drive. This will be our first "official" NPLB drive on a PRPnet server. Max, I think it will be near drying by the time I get back although that is a SWAG so you'll just want to keep an eye on it. I'll have to do so also so I can be prepared to move my clients over if it looks to dry before I get back. Once IB4000 is dry and everything verified, that's one less server that David will have to maintain on his end.


Gary
2009-11-16, 14:28   #8
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts

Quote:
Originally Posted by gd_barnes
Great work Max...your usual efficient self.

I'm with you there...too late tonight. I anticipate getting them moved and restarted sometime between 2 and 4 PM Monday afternoon. One thing I think I'll do to be extra safe, if I have the proper # of outlets, is let both machines reside upstairs plugged into the UPS backup while I'm gone. That way, there will be no possible issues. Then after you have all of the behind-the-scenes stuff moved over after I get back, I'll physically move crunchford back downstairs.

Sounds good. That will have the unfortunate effect of essentially halving the UPS's runtime in the case of an outage (i.e. if it would normally last an hour, it would now last 30 minutes), but it will probably still be OK. Usually the killer outages are really short anyway.
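
As a rough back-of-the-envelope version of that estimate: runtime scales as battery capacity divided by load, so doubling the load halves the runtime. The sketch below uses made-up numbers (300 Wh of battery, 300 W per machine), not figures from Gary's actual UPS. It is also a purely linear estimate, and real batteries usually do somewhat worse than this at higher load.

```shell
#!/bin/sh
# Linear runtime estimate: minutes = battery watt-hours * 60 / load in watts.
# The capacity and load figures below are hypothetical.
runtime_minutes() {
    capacity_wh=$1
    load_w=$2
    echo $(( capacity_wh * 60 / load_w ))
}

one=$(runtime_minutes 300 300)   # one machine on the UPS
two=$(runtime_minutes 300 600)   # two machines on the same UPS

echo "one machine: ${one} min, two machines: ${two} min"
```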

Quote:
Another question that I'm not quite sure how to ask in a technical manner, so here it is in layman's terms: You made crunchford the "master remote machine", kind of like the mother machine over the others that allows you/me to access them all remotely. Will you be changing that over to Dumpford? I'm asking because we probably want to have that master machine on the UPS backup also. That's part of the reason that I want to allow you the extra time to make the full switchover in a safe manner by having both machines backed up for the 8 days that I'm gone.

Yes, I will be changing that over to dumpford as well. Right now it's still crunchford, but I'll change that over after you've moved dumpford upstairs. Once I've changed it over, I'll send you a new copy of the Perl script that you use to VNC into the various machines; it will need a few minor tweaks due to the gateway change.
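
The thread doesn't show the Perl script or say how the gateway is actually wired up, so the following is only a guess at the general idea: one common way to reach VNC on an internal machine through a gateway is SSH port forwarding. This sketch just builds and prints such a command without running it. The function name tunnel_cmd, the local port 5901, and the assumption that VNC listens on the default port 5900 are all mine, not from the thread.

```shell
#!/bin/sh
# Build (but do not run) an ssh port-forward command that would expose a
# VNC server on an internal machine via a gateway host.
# Assumes the target's VNC server listens on the default port 5900.
tunnel_cmd() {
    gateway=$1
    target=$2
    local_port=$3
    echo "ssh -N -L ${local_port}:${target}:5900 ${gateway}"
}

# Changing the gateway from crunchford to dumpford is then a one-word change.
tunnel_cmd dumpford humpford 5901
```

With a tunnel like that open, a VNC viewer pointed at localhost:5901 would reach the target machine, which is presumably the sort of detail that needs tweaking when the gateway moves.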

Quote:
One final question: Will you be creating a new PRPnet server for our 5th drive on Dumpford while I'm gone? David (Ironbits), if you're reading this, this finally begins the process: We will let IB4000 dry out and Max will be creating a new public PRPnet server that we'll load n=740K-750K into for our 5th drive. This will be our first "official" NPLB drive on a PRPnet server. Max, I think it will be near drying by the time I get back although that is a SWAG so you'll just want to keep an eye on it. I'll have to do so also so I can be prepared to move my clients over if it looks to dry before I get back. Once IB4000 is dry and everything verified, that's one less server that David will have to maintain on his end.

Yes, I'll work on getting that rolling while you're away. Now that I'll finally be able to upload a new version of the PRPnet server to the server machine, I'll upgrade everything to the latest PRPnet 2.4.6, which in Lennart's testing seems to fix all the annoying problems on Linux that I was having with the earlier versions. Once I've got all the existing servers working reliably, I'll start up the one to replace IB4000.
2009-11-17, 01:43   #9
MyDogBuster
May 2008
Wilmington, DE
2·13·109 Posts

Any chance of getting the GB status page working again? Email notification also.
2009-11-17, 01:48   #10
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)
6249₁₀ Posts

Quote:
Originally Posted by MyDogBuster
Any chance of getting the GB status page working again? Email notification also.

I'll get that going sometime within the next day or two. Right now I'm waiting for Gary to give me the OK after he's moved dumpford upstairs.
2009-11-17, 01:52   #11
Lennart
"Lennart"
Jun 2007
2⁵×5×7 Posts

Quote:
Originally Posted by mdettweiler
I'll get that going sometime within the next day or two. Right now I'm waiting for Gary to give me the OK after he's moved dumpford upstairs.

Are all his computers a FORD?

He needs a Jeep

Lennart

Last fiddled with by Lennart on 2009-11-17 at 01:52

All times are UTC.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.