mersenneforum.org (https://www.mersenneforum.org/index.php)
-   No Prime Left Behind (https://www.mersenneforum.org/forumdisplay.php?f=82)
-   -   Server outrages (https://www.mersenneforum.org/showthread.php?t=13840)

mdettweiler 2010-09-02 16:37

Server outrages
 
I've created this thread so we don't keep cluttering up the News thread with various posts related to server outages. They don't exactly fit too well in the News thread, but until now that's been the best place to put them. I've also moved the conversations related to the last two outages here.

From now on, if the server goes offline, post about it here, NOT in the News thread.

[b]Edit (5/4/11): if for some reason the noprimeleftbehind.net email services go down (they're hosted on Google Apps separately from our other services, but a DNS problem could still potentially take them down), you won't be able to email me at my usual contact address, [email]max@noprimeleftbehind.net[/email]. If that happens, use [email]max@noprimeleftbehind.net.test-google-a.com[/email] instead--it will get to the same place and is completely independent of our DNS records.[/b]

Max :smile:

mdettweiler 2010-09-03 07:09

Server outages
 
Gary just sent me a text message informing me that there's a power outage over on his end right now, and the internet is out as well. (Since his router equipment is on UPS backup along with the server machine, most likely the cable lines are down as well as power.)

I got that message about an hour ago, so it still hasn't been fixed yet. Hopefully it will be fixed by morning.

gd_barnes 2010-09-03 07:39

Everything is back up and running.

I don't understand why every time that I have a power outage, the internet goes out also despite everything being plugged into a UPS backup with all of the router/modem still having lights flashing (although not the ones that are needed on the modem). Unfortunately, the main power lines are above ground on the main street near me (which is usually where the problem is), although the lines to the houses are underground. All of the cable lines, however, are underground. Something just doesn't seem quite right about what happens. It makes the UPS backup almost pointless other than as a convenience to avoid having to jump through all of the hoops of restarting the server machine.

But...in this case, the outage was 1-1/2 hours and my backup was exhausted at 1-1/4 hours so I was completely black for about 15 mins. So we would have had an outage regardless of the UPS this time around.

The outages seem to be virtually unrelated to weather. Severe storms seem to rarely knock it out. Although it rained most of the day here today, it was virtually perfectly clear when it went out near 1 AM CDT (6 AM GMT). It's the longest outage I've had in nearly 2 years so the UPS backup is sufficient a large majority of the time. The problem is the loss in internet connection, which I can't quite get a handle on right now.


Gary

henryzz 2010-09-03 07:49

[quote=gd_barnes;228244]I don't understand why every time that I have a power outage, the internet goes out [/quote]
It will be something that your ISP uses that is lacking in electricity. In England that would be the exchange or maybe the roadside box. It surprises me that they don't have some sort of UPS system.

mdettweiler 2010-09-03 17:06

[quote=henryzz;228246]It will be something that your ISP uses that is lacking in electricity. In England that would be the exchange or maybe the roadside box. It surprises me that they don't have some sort of UPS system.[/quote]
My thought exactly--it's pretty much the same here in the US. I was going to mention the possibility of a roadside box running off the same electric line to Gary last night but didn't want to pack too much into a text message. :smile:

Another possibility that came to mind at Gary's mention of the outages being seemingly unrelated to weather: possibly farther down the line, the cable lines go above ground, traveling along with the power lines. They could, therefore, be readily knocked out together.

Or a combination of the two could be the case: perhaps the power is often knocked out farther down (in this case, since it had been raining all day but the storm had moved on by the outage, maybe it hit lines farther down in Gary's circuit). That then could have taken out one of the cable company's boxes.

Gary, I'd suggest calling your ISP and asking them about the internet outages. Tell them that despite your having UPS protection on your router, modem, and computer (best not to complicate things too much for them :smile:), the internet goes out whenever there's a power outage. Perhaps they'll know right off the bat whether there's something further down that depends on the power lines. It could be a roadside box, or possibly even some kind of transfer box at the entry point to your house--in the latter case, you'll want to ask them "how can I put a UPS on that too?" :smile:

Flatlander 2010-09-05 22:32

G3000 is not responding.

kar_bon 2010-09-05 22:38

[QUOTE=Flatlander;228590]G3000 is not responding.[/QUOTE]

Seems all servers are offline. I've PMed Max 10min ago and he's offline now, hope he fixes it.

Just got an answer from Max: it seems to be a power outage, and he's trying to contact Gary.

mdettweiler 2010-09-05 22:44

[quote=kar_bon;228591]Seems all servers are offline. I've PMed Max 10min ago and he's offline now, hope he fixes it.

Just got an answer from Max: it seems to be a power outage, and he's trying to contact Gary.[/quote]
Yes, everything is down--probably another power outage. I've text messaged Gary about the problem and will report back here when I get a response.

Edit: just got a reply. It is another power outage.

kar_bon 2010-09-05 22:52

... and back again? Stats-pages online!

mdettweiler 2010-09-05 22:58

[quote=kar_bon;228596]... and back again? Stats-pages online![/quote]
Yes, everything looks online here as well. That was fast... :smile:

Well, at least the UPS was able to do us some good this time: the power outage was brief enough that when everything (including the internet) came back on, the server was able to pick up right where it left off without Gary having to restart all of the servers manually.

Oh, and BTW: I moved all posts related to this outage (and the previous one) to a new thread here. The News thread really was a bad place for those, since they're not exactly news items in the traditional sense.

Flatlander 2010-09-05 23:00

[QUOTE=Flatlander;228590]G3000 is not responding.[/QUOTE]

When the clients try to send results and the server is down they give up after a few minutes and close the clients. How do I extend that time?

[B]edit re. answers below[/B]
Okay I've changed op_connect to TRUE, also changed the cache from 5 to 30.

gd_barnes 2010-09-05 23:02

[quote=Flatlander;228600]When the clients try to send results and the server is down it gives up after a few minutes and close the client. How do I extend that time?[/quote]

As Max said, everything is back now. Yes, the outage was only 30 mins. this time so the UPS held its own and everything restarted automatically without the machine actually turning off. Now I just have to manually restart my other 11 machines. Ugh!

Chris, the clients should automatically keep trying once per minute forever until you hit Ctl-C.

mdettweiler 2010-09-05 23:03

[quote=Flatlander;228600]When the clients try to send results and the server is down they give up after a few minutes and close the client. How do I extend that time?[/quote]
Set op_connect=TRUE in do.bat. That will tell the client to keep retrying until the server comes back online. I'm actually not sure why that's set to FALSE by default...I suppose it could be useful if you want to have the computer fall back to some manual work (by stringing it up after the LLRnet client in a batch file) when the server goes down.

@Gary: that's how do.pl behaves. do.bat closes out after two failures (IIRC) with op_connect=FALSE, which is the default.
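
For illustration only, here is a minimal Python sketch of the two behaviors described above. It is not the actual do.bat or do.pl logic: the function name `send_with_retry`, its parameters, and the `send` callback are hypothetical names made up for this example. The once-per-minute interval and the give-up-after-two-failures count are taken from the descriptions in this thread (the latter is only "IIRC").

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], bool],
    keep_retrying: bool,
    max_failures: int = 2,          # assumed from "closes out after two failures (IIRC)"
    delay_seconds: float = 60.0,    # assumed from "once per minute"
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it succeeds or we decide to give up.

    keep_retrying=True  mimics the op_connect=TRUE behavior: retry forever.
    keep_retrying=False mimics the default: stop after max_failures failures.
    Returns True if send() eventually succeeded, False if we gave up.
    """
    failures = 0
    while True:
        if send():
            return True
        failures += 1
        if not keep_retrying and failures >= max_failures:
            return False
        # Wait before the next attempt so a down server is not hammered.
        sleep(delay_seconds)
```

With `keep_retrying=True` the loop only ends when the server answers (or you hit Ctrl-C); with `keep_retrying=False` control returns to the caller, which is what would let a batch file fall through to some other work.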

mdettweiler 2010-09-13 06:09

Another outage
 
Servers are down right now--probably another power outage. I am currently awaiting word from Gary to confirm this.

mdettweiler 2010-09-13 06:14

[QUOTE=mdettweiler;229535]Servers are down right now--probably another power outage. I am currently awaiting word from Gary to confirm this.[/QUOTE]
Just heard from Gary--it's not a power outage. Apparently, both the phone and internet went out simultaneously, which is rather odd since his internet is cable-based (so they're not on the same line or anything).

mdettweiler 2010-09-13 06:25

[quote=mdettweiler;229537]Just heard from Gary--it's not a power outage. Apparently, both the phone and internet went out simultaneously, which is rather odd since his internet is cable-based (so they're not on the same line or anything).[/quote]
Just heard something rather interesting from Gary...apparently the TV still works, which means that the cable line itself is functioning. Since power is on but telephone is not, that means that somehow, somewhere along the line, the cable internet is apparently dependent on the telephone connection! Strange but true.

Edit: Just realized something funny. Gary's cable company (Time Warner, which also services my area) has the usual cable phone service offering (which I don't believe Gary has) which they position as a direct competitor to standard phone company service. For the uninitiated, these are thinly disguised VoIP setups--they use the cable internet as a conduit for voice data. The funny thing is, though, if the cable internet goes out whenever the phone line goes out...that means that the cable phone service also goes out whenever the "real" phone line goes out. Go figure. :smile:

Edit 2: On second thought, scratch that...now that I think about it, I do remember Gary mentioning once that he does use the cable phone service. In that case, it would not be at all surprising that the phone and internet would be out but not the power or TV. Rather, the cable line itself would seem to be functioning, but somewhere along the line the internet is not working. Thus, the phone wouldn't work either since its conduit is gone.

gd_barnes 2010-09-13 16:17

Kind of being a Mr. drama queen there. lol The outage was < 30 minutes and wasn't even worth analyzing. It's entirely normal that the phone and internet are out but not the cable TV. I have Time Warner for all 3 and the phone and internet almost always go out together.

MyDogBuster 2010-09-14 04:54

I've got just the opposite problem in Delaware (Comcast).

My cable TV goes up and down like a yoyo, but my phone and internet are ALWAYS on. Same company for all three, and when I call them they never have a problem BUT the cable comes back in about a minute after I hang up.
Go figure. Always about a minute. I can set my watch by it. Sounds like a re-boot on a balky server to me.

Brucifer 2010-09-20 01:39

hmm.... appears that the servers are offline as I can't get a ping through on a new system I just brought online. :-( 6:40pm PST Tried both 3000 and 7000

mdettweiler 2010-09-20 02:52

[quote=Brucifer;230519]hmm.... appears that the servers are offline as I can't get a ping through on a new system I just brought online. :-( 6:40pm PST Tried both 3000 and 7000[/quote]
Strange...seems to be working now. And I have an open SSH session to one of Gary's other machines via the server (for GPU testing); that didn't get interrupted, so either there was an outage on your end or something else strange is keeping you from connecting.

Can you connect to the server now? If you can't get to the port 3000 and 7000 servers, can you load [URL]http://www.noprimeleftbehind.net/[/URL] in a browser?

Edit: Also, what is the IP address you're trying to connect from? This is just a hunch, but I figured I'd check just in case.
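
If a ping isn't getting through, a plain TCP connection test against the two ports is a quick way to narrow things down. Here is a minimal Python sketch; the helper name `port_is_open` and the 5-second timeout are assumptions for this example, and it only checks that something accepts a connection on the port, not that the LLRnet or PRPnet server behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects in one step;
        # the "with" block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_open("www.noprimeleftbehind.net", 3000)` and the same call with port `7000` would show whether each server port is reachable from the machine in question.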

Brucifer 2010-09-21 06:18

wasn't an ip address issue. Most likely if anything was a Slackware-64 related issue. The 64-suse box hooked up fine. But then I don't know cause the slack hooked up fine to other projects. ???? got me.

mdettweiler 2010-09-21 06:56

[QUOTE=Brucifer;230725]wasn't an ip address issue. Most likely if anything was a Slackware-64 related issue. The 64-suse box hooked up fine. But then I don't know cause the slack hooked up fine to other projects. ???? got me.[/QUOTE]
Very strange indeed. When you say that the Slackware box connected fine to other projects, are you referring to other projects' LLRnet servers, or another type of application entirely? If the latter, I'd try recompiling LLRnet from [URL="http://www.noprimeleftbehind.net/downloads/llrnet-sources.tgz"]source[/URL]. (Once compiled, swap your new llrnet binary into the LLRnet2010 folder as previously installed.)

mdettweiler 2010-12-07 07:42

Outage?
 
The noprimeleftbehind.net website and all services hosted on the same machine are [url=http://downforeveryoneorjustme.com/noprimeleftbehind.net]unreachable[/url] at this time. From the PRPnet logs on my quad, this appears to have been the case for at least the last 30 minutes.

At this point I can't say if it's a temporary outage or will be longer-term; Gary is out of town until Wednesday so if jeepford is down, it's probably not coming back up until then. (The most probable explanation would be a power outage; if that's the case, then whether or not service is restored in the next half-hour or so, i.e. approximately when the UPS battery runs out, will determine whether the noprimeleftbehind.net services come back online tonight or Wednesday.)

mdettweiler 2010-12-07 07:43

[QUOTE=mdettweiler;240463]The noprimeleftbehind.net website and all services hosted on the same machine are [url=http://downforeveryoneorjustme.com/noprimeleftbehind.net]unreachable[/url] at this time. From the PRPnet logs on my quad, this appears to have been the case for at least the last 30 minutes.

At this point I can't say if it's a temporary outage or will be longer-term; Gary is out of town until Wednesday so if jeepford is down, it's probably not coming back up until then. (The most probable explanation would be a power outage; if that's the case, then whether or not service is restored in the next half-hour or so, i.e. approximately when the UPS battery runs out, will determine whether the noprimeleftbehind.net services come back online tonight or Wednesday.)[/QUOTE]
And what do you know, all is functioning again. Whatever it was, it's fixed now. :smile:

AMDave 2010-12-08 13:07

The host has been up for 92 continuous days, although I can't vouch for the LLR ports.

@Max &/ Gary
there is an increasing number of patches and updates pending to the host.
Please run the updates and then the upgrades.

"IFF" (If and only if, heh heh) the update process says it is necessary, please schedule a restart.
About 25% of an hour should do it.
Best at xx:30 hrs and the hour being bang-howl (where bang is "!" and howl is midnight)
The reboot may not be required, but I suspect that the 92 days uptime includes the release date of at least one Sec.Patch that the upgrade process will tell you requires a reboot.
cheers

@Max
Your post above about the site being unreachable coincides with a similar post I made regarding another site that is hosted (physically and logically) not so far away from noprimeleftbehind.
It looks very much like there were many sites that experienced a major network slow down and even network outages at that time.
Ask the ISP if you need further info.

mdettweiler 2011-02-18 04:34

FYI @all: noprimeleftbehind.net seems to be down from my end, and [url]http://downforeveryoneorjustme.com/noprimeleftbehind.net[/url] confirms this as well. Gary's not home right now but I did get a hold of him via text message and he said he'll take a look at it when he gets home (probably sometime tonight, since I believe he got back from his latest business trip a few days ago).

gd_barnes 2011-02-18 06:17

Unfortunately the server machine is down completely. It won't come on after several attempts. My guess...yet another blown mobo. Although I can't visually see anything burned out, 80% of the time, that's what it's been on my machines; most of the rest of the time a bad power supply. My main sieving machine, a Windows I7, also went down while I was away on a business trip. My guess...a bad mobo.

I wonder if others have the problems with motherboards that I do. I get so sick of replacing them and the expense has become very high. In 3 years time, I've replaced an average of more than one motherboard for every machine that I have.

This is a terrible inconvenience for me at this point to have my 2 most important machines go down within a few days of each other. These 2 machines are supposed to have the best mobo for the money and they are just over a year old. I'll see about swapping the server machine's hard drive to another one of my Intel machines and hope to get it running in a few hours here.

The timing is so bad in my personal life that I'll probably just take the 2 machines into the shop, have them diagnose them, and fix them, at an expense of likely $100+ extra over figuring it out myself and replacing the bad part(s) myself and at a loss of probably several extra days of processing. I just don't have the time to continue dealing with these things.

Right now, I'm completely disgusted.

mdettweiler 2011-02-18 06:52

[QUOTE=gd_barnes;252890]I'll see about swapping the server machine's hard drive to another one of my Intel machines and hope to get it running in a few hours here.[/QUOTE]
Just a little note as to what you might expect when doing this: because your four original Intels are not [I]quite[/I] the same configuration as jeepford (they are Q6600's, and jeepford is a Q9something), it will probably boot up, but the graphics may not work. If that turns out to be the case, everything should still be OK--just log in on the command line and let me know, and I'll pop in remotely and start up the servers from there.

Worst case scenario, jeepford's hard drive won't boot up at all in your other Intels; I'd say this is an unlikely possibility, but nonetheless possible. In that case, I can set up some temporary servers on one of your other machines (or even on my own computer if for some strange reason that doesn't work out). We wouldn't have the website and DB (well, I might be able to get a temporary kludge set up for the CRUS web pages) but we'd at least have LLRnet and PRPnet servers.

mdettweiler 2011-02-18 08:05

All is working again now. :grin: Gary was able to get jeepford's hard drive switched into humpford, our previous server box. As it turned out, their graphics chipsets were similar enough that we didn't have any problems with getting that working. (Thankfully, Linux is much hardier than Windows with such things...Windows would probably have given a BSOD on bootup from the different hard disk controller on the new mobo.)

So, as far as the server stuff is concerned, nothing has really changed; humpford now thinks it is jeepford in every way. :smile: Now the trick will be to get the real jeepford working again...perhaps it will turn out to just be a blown power supply (which is easy and cheap to repair), though more likely it will require a mobo replacement. :ermm:

nuggetprime 2011-02-18 12:20

Just curious, what PSUs is Gary using on his machines?
Low-quality power supplies can both go bad fairly soon and damage or reduce the lifespan of your mobo/graphics card due to voltage spikes and stuff.
I would recommend getting a model from Corsair, Seasonic or OCZ. They are all excellent.

mdettweiler 2011-02-18 19:10

[QUOTE=nuggetprime;252914]Just curious, what PSUs is Gary using on his machines?
Low-quality power supplies can both go bad fairly soon and damage or reduce the lifespan of your mobo/graphics card due to voltage spikes and stuff.
I would recommend getting a model from Corsair, Seasonic or OCZ. They are all excellent.[/QUOTE]
I'm not sure what PSU he's using but he usually just gets cases from his local computer store and uses the PSUs in them. He has had a few blown power supplies in the past, but in each case he's observed blackening around the spot where the PSU's cable connects to the motherboard. (Amazingly, both the motherboard and CPU survived in all such cases. :shock:) This time, however, there was no visible blackening, which is why we're thinking the mobo is bad. (In the past whenever he's had a bad mobo, there's been visual evidence about 50% of the time.)

Good point, though, about the cheap power supplies reducing the life of the other components...that may have been a contributing factor in this case. Regardless of whether the PSU or mobo turned out to be the problem in this case, I should look into higher-grade server power supplies for jeepford--that way it will have better protection going forward.

nuggetprime 2011-02-18 19:34

Sorry, but preinstalled-in-case power supplies are about the lowest grade ones you can get :smile:
From personal experience, I would recommend an OCZ StealthXStream model. A 500W one can be had for just 70 dollars at NewEgg, and Gary certainly won't have to replace his mobo so often then.

Nugget

gd_barnes 2011-02-18 19:38

Max is incorrect. I've always bought boxes without power supplies and then bought separate power supplies. I would not trust a pre-installed one. I've usually looked for middle-of-the road power supplies; not the cheapest; not the most expensive. But for these 2 most important boxes, high-end ones would be better.

mdettweiler 2011-02-18 19:40

[QUOTE=gd_barnes;252954]Max is incorrect. I've always bought boxes without power supplies and then bought separate power supplies. I would not trust a pre-installed one. I've usually looked for middle-of-the road power supplies; not the cheapest; not the most expensive. But for these 2 most important boxes, high-end ones would be better.[/QUOTE]
Hmm, I see...I didn't know that. How much did you usually end up paying for the power supplies?

vaughan 2011-02-19 09:08

I'm buying the [url=http://www.corsair.com/power-supplies/modular-psus/professional-series-gold-2.html]Professional Gold series power supplies from Corsair[/url] in recent months. They are Gold certified (high efficiency) and they are modular so I don't have my crunchers filled with unnecessary cables. The 750W version is powerful enough except for multi GPU machines.

gd_barnes 2011-02-19 09:11

[QUOTE=mdettweiler;252955]Hmm, I see...I didn't know that. How much did you usually end up paying for the power supplies?[/QUOTE]

$25-35. They are generally 470-500 watts. The cheapies are $18-25 at the store that I frequent.

You should know. When you originally sent me the specs for my first 6 AMD machines, you had power supplies with them.

em99010pepe 2011-02-19 09:46

You should be spending like $100 per power supply.

mdettweiler 2011-02-19 20:53

[QUOTE=gd_barnes;253016]$25-35. They are generally 470-500 watts. The cheapies are $18-25 at the store that I frequent.

You should know. When you originally sent me the specs for my first 6 AMD machines, you had power supplies with them.[/QUOTE]
Yes, but I knew that for the Intels you bought the power supplies locally; I didn't know what grade they were (some computer stores might not even stock the totally bargain-basement ones).

Yeah, $25-35 is still pretty cheap. That kind of price tier is generally OK for less-crucial stuff like crunching boxes, but for a server we'll want something better. Thanks Nugget and Vaughan for the recommendations--I'll do some shopping around to find a better replacement for jeepford (and also for Gary's i7 while I'm at it). Even if both of their existing power supplies are actually OK, it will be good to have better ones in there going forward and he can always use the two leftover cheapies as spares or for future crunching boxes. :smile:

gd_barnes 2011-02-25 22:26

The server machine will be down for about 30 minutes while I run some updates and perform some maintenance.

gd_barnes 2011-02-25 22:54

The server is back up now. Everything went smoothly.

mdettweiler 2011-04-26 21:00

All noprimeleftbehind.net services seem to be down right now. No information thus far on the cause of the problem.

Oddball 2011-04-27 05:21

[QUOTE=mdettweiler;259675]All noprimeleftbehind.net services seem to be down right now. No information thus far on the cause of the problem.[/QUOTE]
Every outage is an outrage! :wink:

BTW, has there ever been any loss of data from these problems? It would suck to have a hard drive crash and lose all of the LLR residues.

AMDave 2011-04-27 06:45

Yes. Losing 25,862,597 residuals (to-date) would indeed suck.

Thus we have a DRP based on nightly off-server backups.
I have successfully tested the recovery from these backups.
I have even documented and tested a complete server s/w re-build from scratch.

RPO based on Maximum loss is 1 day plus 30 minutes (per scheduled backup propagation)

RTO is less than a week.

My target is 48 hours.

My tested recovery time was 5 hours from response time depending on hardware availability.
I made hardware available.
Hardware availability and config took longer than expected on the last test run, but the stats code has been upgraded to run on the newest versions of the middle-ware since that issue was identified so that problem has been eliminated.

Last DRP test run date was ... rummaging for the accurate date time ... 17-MAR-2011

Next DRP test run should be around September.

/edit - I checked the last off-site backup, it's getting a bit old. I'll archive a new one in the next 24 hours - edit/

AMDave 2011-04-27 07:06

BTW the outage appears to be over.
I logged in and checked the procs
Gary / Max has resumed the ports, both prpnet and llrnet
I also verified the scheduled stats tasks are running as scheduled.

"2011-04-27 02:03:48 and all is well, Oh Yae!" :smile:
/EDIT - that was server time, not post-time -EDIT/

AMDave 2011-04-27 07:58

The NPLB unplanned outRage was in the network.
Server uptime is currently 60 days 9 hours 24 mins which is how long ago the last planned outage was for software updates. (see previous posts in this thread)
For the parts that we have control of we are doing pretty well, IMO.

kar_bon 2011-04-27 08:40

[QUOTE=Oddball;259702]It would suck to have a hard drive crash and lose all of the LLR residues.[/QUOTE]

Those ~25 million residues are also held on different drives by Gary and me.

gd_barnes 2011-04-27 08:43

Sorry all. Strange thing happened. After losing internet access on my laptop, I went to inspect things. Apparently the cleaning lady bumped the router while cleaning it and the internet cord came partially out. I plugged it back in and the internet came back up on my laptop so I went on my merry way. Alas, after many hours I found out that the server machine still didn't have internet access. I then disconnected and reconnected the internet cord from the back and that did the trick.

Stupid cleaning ladies have messed me up more than once now. Sorry about the problems.

To all concerned: As David alluded to, we have a very detailed and frequent backup process in place.

AMDave 2011-04-27 09:13

[QUOTE=gd_barnes;259713]...bumped the router while cleaning it...[/QUOTE]
On the upside:
The router is now clean and won't have to be bumped for months :big grin:

henryzz 2011-04-27 16:05

[QUOTE=gd_barnes;259713]Sorry all. Strange thing happened. After losing internet access on my laptop, I went to inspect things. Apparently the cleaning lady bumped the router while cleaning it and the internet cord came partially out. I plugged it back in and the internet came back up on my laptop so I went on my merry way. Alas, after many hours I found out that the server machine still didn't have internet access. I then disconnected and reconnected the internet cord from the back and that did the trick.

Stupid cleaning ladies have messed me up more than once now. Sorry about the problems.

To all concerned: As David alluded to, we have a very detailed and frequent backup process in place.[/QUOTE]
If you ever need another person for backup of NPLB/CRUS/etc data then I would be happy to donate some disk space. I have unlimited downloads at home so I don't need to worry about downloading huge files.

kar_bon 2011-04-28 23:05

And again the servers are not reachable.

Max knows about it and I hope he can fix it as fast as possible.

AMDave 2011-04-28 23:18

I can tell you that the server is still up and running.
I am logged into it.
The web server is also still running.
So are the ports.

The DDNS link still works: [url]http://nplb-gb1.no-ip.org/stats/[/url]

It seems we have a problem with the DNS translation of [url]www.noprimeleftbehind.net[/url]

Over to Max and Gary.

AMDave 2011-04-28 23:24

ip traceroute shows that [URL="http://www.noprimeleftbehind.net"]www.noprimeleftbehind.net[/URL] is pointing to 74.62.165.120

that's wrong

it should be pointing to 173.197.48.155

Gary / Max will need to update the IP in the easyDNS manager
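
For anyone who wants to repeat this check, a minimal Python sketch follows. It resolves a hostname and compares the result with the address the server is known to have. The helper names `resolve_ipv4` and `dns_points_at` are made up for this example, and the `resolver` parameter exists only so the comparison can be tried without touching real DNS; the two IP addresses in the usage note are the ones quoted above.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def dns_points_at(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> bool:
    """Return True if hostname currently resolves to expected_ip."""
    try:
        return expected_ip in resolver(hostname)
    except socket.gaierror:
        # The name did not resolve at all.
        return False
```

Calling `dns_points_at("www.noprimeleftbehind.net", "173.197.48.155")` would return False while the record still holds the stale 74.62.165.120, and True once the record has been updated and the change has propagated.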

mdettweiler 2011-04-28 23:42

[QUOTE=AMDave;259873]ip traceroute shows that [url]www.noprimeleftbehind.net[/url] is pointing to 74.62.165.120

that's wrong

it should be pointing to 173.197.48.155

Gary / Max will need to update the IP in the easyDNS manager[/QUOTE]
Hmm...looks like Gary got a new dynamic IP address. We never did get the static IP thing set up--for some reason it never worked very well with his router, so we just left it on dynamic, which essentially never changed once he was on the business class service and its separate IP pool. However, it seems to have finally decided to change now, for reasons only the ISP knows. Hopefully this one will last a good long time like the last did...

To resolve the problem, I need to log into the control panel at EasyDNS (the domain registrar) and change the DNS record to point to the new IP. That done, the problem should be cleared up within an hour or so. However, I cannot for the life of me find Gary's EasyDNS password in my email archives; I could have sworn that he'd sent it to me at some point, but at any rate it's not there. Gary, could you send that to me as soon as you get the chance? Thanks.

Meanwhile, for all others: we should be able to get everything working again at the usual noprimeleftbehind.net sometime later today, but in the meantime, you can point your clients to nplb-gb1.no-ip.org (the old server address, which has continued to work all along and, because it's a dynamic instead of static DNS record, still works despite noprimeleftbehind.net being out of commission).

@Chris: IIRC, you had the IP address hardcoded into the PRPnet clients on your new i7 because it wasn't working with the regular domain name. (Or it could have been LLRnet clients...my memory fails me. :smile:) If you're still doing that, you'll need to change it to the new IP, 173.197.48.155, to get your clients back online. My apologies for any inconvenience this causes.
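
As a rough illustration of the "try the usual name, then fall back" idea, here is a small Python sketch. It is not something the LLRnet or PRPnet clients do themselves (as far as this thread says, you have to edit the server address by hand); `can_connect`, `first_reachable`, and the `probe` parameter are hypothetical names invented for the example.

```python
import socket
from typing import Callable, Iterable, Optional


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    hosts: Iterable[str],
    port: int,
    probe: Callable[[str, int], bool] = can_connect,
) -> Optional[str]:
    """Return the first host in the list that accepts a connection, or None."""
    for host in hosts:
        if probe(host, port):
            return host
    return None
```

For instance, `first_reachable(["www.noprimeleftbehind.net", "nplb-gb1.no-ip.org"], 3000)` would prefer the usual domain and only hand back the no-ip.org name when the first one cannot be reached.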

gd_barnes 2011-04-29 05:41

Max,

Your text was extremely misleading. I was out and had no idea that the servers were not accessible. When you said, "Oops, scratch that, it appears DNS related", I assumed that you just needed to fix some DNS stuff so that I could remotely access my machines. I had no idea that that was related to the accessibility of the servers. Please do not alert me and then make it seem unimportant with the next text. I can't even upload CRUS pages updates now. I was busy at the time but could have texted you back with the password if I thought it was any kind of big deal. Anyway, I just now Emailed the password. Hopefully it is correct.

Sheesh, what a friggin pain this stuff is.


Gary

gd_barnes 2011-04-29 06:05

OK, I just now got on the EasyDNS page. I confirmed that the password that I sent to Max is correct. After browsing it for 15 mins., for the life of me, I cannot figure out how to change the IP address to the correct one.

Dave, I've sent you my EasyDNS ID and password. If you get this before Max does, perhaps you can correct it. Thanks.

Batalov 2011-04-29 06:10

Ahem, I thought that the traditional name of the thread was "Severe outrages", wasn't it?

gd_barnes 2011-04-29 06:13

[QUOTE=Batalov;259896]Ahem, I thought that the traditional name of the thread was "Severe outrages", wasn't it?[/QUOTE]

Lol, I don't remember that. Regardless I'm having an outrage right now. :smile:

gd_barnes 2011-04-29 08:08

I'm even more enraged now. After Dave got on EasyDNS, he said it appeared that I hadn't paid the annual fee. But I paid that back in Feb. before the Mar. 1 due date but they aren't showing it on their site. I even have an Email receipt for the payment.

I'm at a loss and tired and going to bed now. I guess the servers will just have to be inaccessible until I can call them and get it straightened out. It'll probably be early afternoon Friday before I can see what is going on. Max, if you have any ideas, please let us know.

mdettweiler 2011-04-29 11:52

[QUOTE=gd_barnes;259902]I'm even more enraged now. After Dave got on EasyDNS, he said it appeared that I hadn't paid the annual fee. But I paid that back in Feb. before the Mar. 1 due date but they aren't showing it on their site. I even have an Email receipt for the payment.

I'm at a loss and tired and going to bed now. I guess the servers will just have to be inaccessible until I can call them and get it straightened out. It'll probably be early afternoon Friday before I can see what is going on. Max, if you have any ideas, please let us know.[/QUOTE]
What the heck?!? Okay, that's strange. I'm going to log into the EasyDNS website myself now to see if I can figure out what's up.

Meanwhile, sorry about the conflicting messages; I at first thought I already had the EasyDNS password, so I sent you the "all OK" message under the assumption that this would be an easy 2-minute fix I could apply myself. It wasn't until after that that I realized it was a little more complex.

[b]Edit: I tried logging into the EasyDNS website with the password you sent me and a variety of likely usernames (gbarnes017, <gbarnes017 at gmail dot com>, etc.) but it was rejected each time. :rant: Could you (or Dave, since it sounds like he got in--whichever sees this first) send me the username as well? Thanks.[/b]

mdettweiler 2011-04-29 17:10

Okay, I think I see what's up now. For some reason, the domain was never actually transferred to Gary's account at EasyDNS...it's still under IronBits's, just (presumably) with Gary listed as the billing contact. @Gary and Dave, I'll be sending you guys an email shortly with more on this.

@all others: In light of this, I'm thinking it may be a day or two before this is resolved. In the meantime, you can continue to use nplb-gb1.no-ip.org instead of noprimeleftbehind.net to access all NPLB services. Also, Gary's and my email addresses @noprimeleftbehind.net are not affected by this, as they are hosted through Google Apps and thus are independent of IP address changes on Gary's end.

Flatlander 2011-05-01 15:02

[QUOTE=mdettweiler;259946]...In the meantime, you can continue to use nplb-gb1.no-ip.org instead of noprimeleftbehind.net to access all NPLB services...[/QUOTE]

Aaargh. Somebody slap me. I [I]thought[/I] my PCs were quiet.

AMDave 2011-05-02 02:04

Slap. :smile:

mdettweiler 2011-05-02 02:43

[QUOTE=Flatlander;260124]Aaargh. Somebody slap me. I [I]thought[/I] my PCs were quiet.[/QUOTE]
FYI, PRPnet allows you to specify backup servers to be used in case the primaries are inaccessible or out of work. I have a few of PrimeGrid's servers in all of my clients set at 0% resource share to cover contingencies like this. (Incidentally, if you have a PrimeGrid BOINC account--which you can set up either via a BOINC client or their [url=http://www.primegrid.com/create_account_form.php]web form[/url]--they will manually give you BOINC credit on a periodic basis to represent your contributions to their PRPnet servers. Raiders of the Lost Primes is [URL="http://www.primegrid.com/team_display.php?teamid=2624"]set up[/URL] there, too, so any credit you earn for "downtime" work can count for the team as well. :smile:)

At any rate, it lets your computers continue to do useful work for [i]somebody[/i] when our servers are down, even though it's not for NPLB. :smile:
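For anyone curious how that kind of fallback behaves, here is a rough Python sketch of the idea. It is not PRPnet's actual code or configuration format; the `Server` tuple, the `pick_server` function, the `backup.example.org` host and its port are all made up for illustration. The only assumption taken from the post above is that a server listed at 0% share is used only when no server with a nonzero share can be reached.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Server(NamedTuple):
    host: str
    port: int
    share: int  # percent of work requested; 0 marks a backup-only server


def pick_server(servers: Sequence[Server],
                is_reachable: Callable[[Server], bool]) -> Optional[Server]:
    """Return the first reachable primary, else the first reachable backup.

    Hypothetical helper: real clients may weight primaries by share,
    which this sketch ignores for brevity.
    """
    primaries = [s for s in servers if s.share > 0]
    backups = [s for s in servers if s.share == 0]
    for candidate in primaries + backups:
        if is_reachable(candidate):
            return candidate
    return None  # nothing reachable, so the client would sit idle


# Example list: one NPLB server plus one placeholder backup at 0% share.
EXAMPLE_SERVERS = [
    Server("backup.example.org", 12000, 0),    # hypothetical backup host
    Server("noprimeleftbehind.net", 9000, 100),
]
```

The point of the sketch is the caveat that follows from it: if every entry in the list lives on the same machine or domain, one outage takes out the primaries and the backups together.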

Flatlander 2011-05-02 11:46

All my backup servers belonged to NPLB. :ouch1:

Flatlander 2011-05-02 13:45

[QUOTE=AMDave;260194]Slap. :smile:[/QUOTE]

And the other one.

gd_barnes 2011-05-02 19:57

I just sent David (IronBits) an Email and left a message at his number shown on the whois pages. We'll see how that works. Both the Email and phone # are still operational so hopefully they are still valid for him. If I don't hear from him by early Tuesday, I'll call the support number at EasyDNS. I'm sure they have situations like this come up where no one can get hold of a previous owner of a domain, so hopefully they can get it transferred to me. If not, we're stuck having to create a new domain name for all of the servers and pages.

What a mess.

gd_barnes 2011-05-03 21:02

OK. After Bok transferred the domain to my EasyDNS account, I made all of the changes to the IP address and the whois stuff. But it still doesn't seem to be "taking". Noprimeleftbehind.net is still unavailable and the whois page still shows Ironbits as the owner.

Perhaps it takes a little while to "take". Max or Dave, can you check my EasyDNS account and see if I've missed anything? Thanks.


Gary

Lennart 2011-05-03 21:25

[QUOTE=gd_barnes;260423]OK. After Bok transferred the domain to my EasyDNS account, I made all of the changes to the IP address and the whois stuff. But it still doesn't seem to be "taking". Noprimeleftbehind.net is still unavailable and the whois page still shows Ironbits as the owner.

Perhaps it takes a little while to "take". Max or Dave, can you check my EasyDNS account and see if I've missed anything? Thanks.


Gary[/QUOTE]

It usually takes some time, but it works for me now.

Lennart

Flatlander 2011-05-03 22:12

Whois says
[QUOTE] Domain status: clientTransferProhibited
clientUpdateProhibited[/QUOTE]

mdettweiler 2011-05-03 22:51

[QUOTE=gd_barnes;260423]OK. After Bok transferred the domain to my EasyDNS account, I made all of the changes to the IP address and the whois stuff. But it still doesn't seem to be "taking". Noprimeleftbehind.net is still unavailable and the whois page still shows Ironbits as the owner.

Perhaps it takes a little while to "take". Max or Dave, can you check my EasyDNS account and see if I've missed anything? Thanks.


Gary[/QUOTE]
It's working here now too (and the whois shows your information now). Yes, DNS changes take a little while to "take"--various DNS servers around the world have the old IP in their cache, and that entry has to expire (usually within a couple of hours) before they will fetch the new one.

Move along people, nothing to see here. :wink:

@Chris: huh, not sure what's up with that. It may be that those fields haven't been filled in by EasyDNS, or they take a little longer to be farmed out to various individual whois repositories (or however that works).
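For anyone who wants to check whether the change has reached their own resolver yet, a minimal Python sketch follows. It assumes the server is still at 173.197.48.155, the address given earlier in this thread; the function names `lookup_ipv4` and `has_propagated` are just illustrative. It only asks whatever resolver (and cache) your own machine uses, so a "not yet" answer says nothing about what other people see.

```python
import socket
from typing import Callable, List


def lookup_ipv4(hostname: str) -> List[str]:
    """Ask the local resolver (and any cache behind it) for IPv4 addresses."""
    try:
        infos = socket.getaddrinfo(hostname, None,
                                   family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # name did not resolve at all
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def has_propagated(hostname: str, expected_ip: str,
                   lookup: Callable[[str], List[str]] = lookup_ipv4) -> bool:
    """True once the resolver hands back the new address for hostname."""
    return expected_ip in lookup(hostname)
```

Calling `has_propagated("noprimeleftbehind.net", "173.197.48.155")` returns `False` for as long as your resolver is still serving the old cached record, and `True` once it has picked up the new one.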

gd_barnes 2011-05-04 05:12

Well, after the noprimeleftbehind.net pages worked here for about an hour, they have stopped working again. I cannot access anything.

Does anyone know what is going on?

gd_barnes 2011-05-04 05:21

Now I'm really baffled. The links in the first post of the "LLRnet servers for NPLB" do not work. I cannot access my machines remotely. Yet the links in the "Come join us" thread at CRUS work perfectly and I can readily upload my CRUS updates. Yet they all point to the exact same domain suffix.

So the status at this point is: Some stuff works and some stuff doesn't.

Good grief! :furious:

mdettweiler 2011-05-04 18:06

[QUOTE=gd_barnes;260455]Now I'm really baffled. The links in the first post of the "LLRnet servers for NPLB" do not work. I cannot access my machines remotely. Yet the links in the "Come join us" thread at CRUS work perfectly and I can readily upload my CRUS updates. Yet they all point to the exact same domain suffix.

So the status at this point is: Some stuff works and some stuff doesn't.

Good grief! :furious:[/QUOTE]
See the PM I just sent you--I included instructions for refreshing your computer's DNS cache, which I think may be the cause of many of the problems you mentioned.

@all: as it turned out, the noprimeleftbehind.net MX records got messed up during the transfer of "ownership", so @noprimeleftbehind.net email addresses had been inaccessible since yesterday evening. I've now fixed the problem; incoming messages may still get bounced for the next few hours or so (depending on how quickly your ISP's mail server gets updated with the new MX records), but within 12 hours or so everything should be fine.

Note that my old email address, <bugmesticky at googlemail dot com>, was unaffected by this; even though it was unable to forward mail to <max at noprimeleftbehind dot net> as usual, any mail sent to it wasn't totally lost. I only see one message (aside from forum notifications and the like) sent to me there during the downtime; it's from henryzz and contains some CRUS results. (@henryzz: I've now forwarded it to my NPLB address, so I have it on hand and will process your results as soon as I get the chance.) Any others, however, who sent messages directly to <max at noprimeleftbehind dot net> would have gotten bounced, so if you sent me anything you'll need to resend. Sorry for any inconvenience.

Edit: I've added a note to the first post of this thread detailing a workaround that you can use to contact me if we have a similar email outage in the future.

gd_barnes 2011-05-04 18:20

Everything looks great on my end now.

henryzz 2011-05-04 18:54

Do you prefer <max at noprimeleftbehind dot net> normally then? I suppose it doesn't matter as it is redirected normally. Do you have a preference, Gary (and if so, is it gary, gdbarnes or something else <at noprimeleftbehind dot net>)?

gd_barnes 2011-05-04 18:57

[QUOTE=henryzz;260509]Do you prefer <max at noprimeleftbehind dot net> normally then? I suppose it doesn't matter as it is redirected normally. Do you have a preference, Gary (and if so, is it gary, gdbarnes or something else <at noprimeleftbehind dot net>)?[/QUOTE]

It's Gary but I always give people my "regular" Email of:
gbarnes017 at gmail dot com.

I believe the <Gary at noprime(yada)> address should get forwarded to the above anyway so it doesn't really matter.

mdettweiler 2011-05-04 19:23

[QUOTE=henryzz;260509]Do you prefer <max at noprimeleftbehind dot net> normally then? I suppose it doesn't matter as it is redirected normally. Do you have a preference, Gary (and if so, is it gary, gdbarnes or something else @noprimeleftbehind.net)?[/QUOTE]
Yeah, it doesn't really matter...I use the NPLB address as the primary one but it all does go to the same place. In fact, anything (at) noprimeleftbehind.net will get to me because I have a catch-all set up to redirect any unconfigured addresses to the max box. (I have the <max at noprimeleftbehind dot net> account configured so that it can [I]send[/I] mail from any of a number of addresses; incidentally, this is why some email clients, most notably Microsoft Outlook, will show such messages as "From: Max Dettweiler <max (at) noprimeleftbehind.net> on behalf of Max Dettweiler <bugmesticky (at) googlemail.com>".)

Regarding Gary, yes, his gary at noprimeleftbehind dot net address is redirected to his regular Gmail address (the reverse of how mine's set up).

[SIZE=1]Interestingly enough, because of the catch-all, if you send email to <gd_barnes at noprimeleftbehind dot net>, it will come to me. :smile:

Confused yet? :wink:[/SIZE]

Lennart 2011-05-14 03:38

Is it only me that can't connect to noprimeleftbehind ??

Lennart

mdettweiler 2011-05-14 04:32

[QUOTE=Lennart;261377]Is it only me that can't connect to noprimeleftbehind ??

Lennart[/QUOTE]
No, it's not just you--I can't get through to the web pages, and my clients have fallen back to alternate servers. It looks like some kind of server or network outage on Gary's end. (I don't believe it's a DNS outage--the alternate [url]http://nplb-gb1.no-ip.org/[/url] address also doesn't work.)

I'll see if I can get a hold of Gary about it.

gd_barnes 2011-05-14 05:09

Sorry again guys. There seems to be a consistent issue on this end that I haven't been able to address. Frequently, whenever I have to recycle my router because one of my kids can't connect with their iPod when they come over, the server machine does not reconnect afterwards. I then have to recycle the server machine itself by disconnecting its cord and then reconnecting it.

I've just now figured out that that has been the cause of recent problems. So whenever I have to recycle the router in the future, I will immediately check the server machine so that the outage is no longer than a minute.

Everything is working now. Sorry about the problems.

mdettweiler 2011-10-12 01:14

The noprimeleftbehind.net server appears to be down for the moment--neither the web pages, the LLRnet/PRPnet servers, nor my administrative SSH interface is working. It doesn't appear to be a DNS issue (the alternate address at [url]http://nplb-gb1.no-ip.org/[/url] is also down); perhaps a power outage or the like. More information will likely be forthcoming when Gary notices the issue and comes online to tell us about it. :smile:
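The reasoning used here (try the regular domain, then the no-ip.org name) can be written down as a small Python sketch. This is only an illustration: `port_open` and `diagnose` are made-up names, port 80 is assumed because both addresses are plain web URLs, and a failed connection from one location can of course have other causes (a local network problem, for instance) that this does not distinguish.

```python
import socket
from typing import Callable


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Try a TCP connection; any resolution or connection error counts as down."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals and timeouts
        return False


def diagnose(primary: str, alternate: str, port: int,
             probe: Callable[[str, int], bool] = port_open) -> str:
    """Classify an outage from two hostnames that point at the same machine.

    Returns "ok" if the primary name works, "dns" if only the alternate
    works (the primary's DNS record is the likely culprit), and "server"
    if neither works (the machine or its connection is likely down).
    """
    if probe(primary, port):
        return "ok"
    if probe(alternate, port):
        return "dns"
    return "server"
```

For example, `diagnose("noprimeleftbehind.net", "nplb-gb1.no-ip.org", 80)` would come back as `"server"` in the situation described above, since both names fail.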

gd_barnes 2011-10-12 01:46

All is working now. I had to recycle both the router and modem. I'm not sure what caused it.

vaughan 2011-10-13 03:22

I wondered what happened; thought it was because I added a core to the project after a long absence :)

AMDave 2011-10-14 14:57

offsite backup copy of the nplb stats db and website in progress
will run for approx 4 hours from now
possible slow response from website during this time
although tests show very little difference
but if it happens, you know why :smile:

/me climbs into flame retardant suit
:explode:

AMDave 2011-10-14 18:38

ETF 30 minutes.
(I hope ... In the middle of a monster storm stretching from Rockhampton in QLD to Taree in NSW. Approx 1250km long north to south. 358 ground strikes in the last 5 mins. 2880 ground strikes in the last 45 minutes. This one is going for broke. Very noisy :smile: )

AMDave 2011-10-14 19:11

Finished.

Lennart 2011-11-15 14:57

PRPNet
 
Is it only me who can't reach any servers ?


Lennart

mdettweiler 2011-11-15 15:04

[QUOTE=Lennart;278485]Is it only me who can't reach any servers ?


Lennart[/QUOTE]
Everything's working fine for me right now--I can load all of the web pages, and my clients on ports 9000, 1400, and 13000 are getting through as well. :huh:

Lennart 2011-11-15 15:06

[QUOTE=mdettweiler;278486]Everything's working fine for me right now--I can load all of the web pages, and my clients on ports 9000, 1400, and 13000 are getting through as well. :huh:[/QUOTE]


Yes it seems to work now. :smile: Strange but good.


Lennart

AMDave 2011-11-17 11:50

[QUOTE=Lennart;278485]Is it only me who can't reach any servers ?[/QUOTE]
Don't feel too bad.
From where I am sitting I can barely reach my own server.
stretching ... stretching ... got it.
j/k

That was just me being incredibly lazy. My chair is on castors. :smile:

Mini-Geek 2011-11-22 13:48

[url]http://noprimeleftbehind.net:1400/all.html[/url] does not respond for me, but [url]http://nplb-gb1.no-ip.org:1400/all.html[/url] does. Is something wrong with the NPLB domain?

paleseptember 2011-11-22 21:58

Seconded to Mini-Geek (ie. it's not just him.)

mdettweiler 2011-11-23 00:18

Ah yes...it seems Gary's IP address changed again. I went into the DNS settings and updated the IP--should be all set within an hour or so when the changes farm out to various DNS servers worldwide. :smile:

Oddball 2011-12-04 21:01

The server has been down for the past hour.

Mini-Geek 2011-12-05 03:09

[QUOTE=Oddball;280992]The server has been down for the past hour.[/QUOTE]

And it's still down. Not just the NPLB domain either, this time. :yucky:
Unfortunately, all of my fallback servers were hosted at NPLB, as well, so I've got 6+ hours of idle time. Are there any active public PRPnet servers hosted elsewhere for such an occasion?

gd_barnes 2011-12-05 05:33

I'm very sorry guys. My buddy knocked the downstairs plug out of the wall that controls the internet, phone, and cable when bringing up the XMAS tree early this afternoon. Unfortunately I've been out all day and just now noticed it. Needless to say, it's plugged back in now.

AMDave 2011-12-05 09:17

Santa's little helper has been a very naughty boy.
1 lump of charcoal for him.
Ho! Ho! Ho!
:razz:

j/k.
The holiday season involves a lot of activities that do not occur throughout the rest of the year.
In terms of process analysis, control and quality, Deming called these Uncontrolled Factors 'special causes'.
In simpler terms, 'Stuff happens that you can't control' ... AND ... your buddy is 'special' :P
Situation normal. No big deal. :)


All times are UTC.