mersenneforum.org (https://www.mersenneforum.org/index.php)
-   No Prime Left Behind (https://www.mersenneforum.org/forumdisplay.php?f=82)
-   -   LLRnet servers for NPLB (https://www.mersenneforum.org/showthread.php?t=10042)

Flatlander 2009-01-11 08:54

[quote=IronBits;158009]For those of you that might be curious what the console scrolling looks like on a busy LLRnet NPLB server handing out ~200,000 knpairs per day, I give you:
[URL]http://nplb.ironbits.net/images/port9000action.mov[/URL]
Warning: It is about 25 MB in size
Hopefully it won't get /. type coverage and bring us to our knees. :wink:[/quote]
Okay, I've watched the movie. Will there be a book?

IronBits 2009-01-11 10:03

:missingteeth: I think we can add another 50-100 cores tho...
looking to see where/when it might break down.
No books, can't read :wink:

Getting Mr. Sheep to come back on will help.

gd_barnes 2009-01-11 11:01

[quote=henryzz;158060]I actually meant a time gap.
When are the timestamps set?
Is it before the client fails to send the results to the server, or after?[/quote]

Max,

I think Henry meant that we could determine how long my server was down by looking at the gap in the timestamp of the results. I think I'll go check that now. Good idea Henry.


Gary

gd_barnes 2009-01-11 11:12

[quote=MyDogBuster;157844]Looks like G3000 & G4000 are down.

It's Gary, we had no problems when he was out of town. LOL[/quote]

[quote=gd_barnes;158081]Max,

I think Henry meant that we could determine how long my server was down by looking at the gap in the timestamp of the results. I think I'll go check that now. Good idea Henry.


Gary[/quote]


Well, I just analyzed the results from Jan. 9th. It was not too bad compared to last time. Results were returned at:

18:20:56
-and-
20:08:39


So my IP address changing caused my server to be down for a little over 1 hour 47 mins.
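
For anyone who wants to repeat this check on their own results, here is a minimal sketch of the gap calculation. It assumes the return times have already been pulled out as HH:MM:SS strings from a single day; the function name is made up for illustration, and only the two times quoted above come from the actual results.

```python
from datetime import datetime

def largest_gap(timestamps):
    """Return (gap, before, after) for the widest gap between consecutive times.

    Assumes every timestamp is an HH:MM:SS string from the same day.
    """
    times = sorted(datetime.strptime(t, "%H:%M:%S") for t in timestamps)
    if len(times) < 2:
        raise ValueError("need at least two timestamps")
    gap, before, after = max(
        (later - earlier, earlier, later)
        for earlier, later in zip(times, times[1:])
    )
    return gap, before.time(), after.time()

# The two return times on either side of the outage, as quoted above.
gap, before, after = largest_gap(["18:20:56", "20:08:39"])
print(gap)  # 1:47:43
```

With a full day of timestamps the same call would pick out the longest quiet stretch, which is only an upper bound on the real downtime since clients may simply not have reported in.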

Now that we hopefully have the no-IP thing set up properly, I think we can assume that any IP address change on my end will "only" (lol) cause a 2-hour outage or less of my servers.

Ian or whoever connects to port G4000 or any other server of mine for NPLB or CRUS, just cache enough pairs on each of your machines to last 2-3 hours. For this drive at the testing time of the current n-range, I think caching 25 pairs should be sufficient. If you do that and the server later drops, your CPUs will just keep right on crunching and when it comes back up, your machines should send all of the queued results.
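
The cache size is just the outage length divided by the time per test, rounded up. A rough sketch of that arithmetic follows; the 8-minute test time in the example is a made-up figure, not a measurement from this drive, so plug in your own timing.

```python
def pairs_to_cache(outage_seconds, seconds_per_pair):
    """Smallest number of pairs that keeps one client busy through the outage."""
    if seconds_per_pair <= 0:
        raise ValueError("seconds_per_pair must be positive")
    # Ceiling division: a partly finished pair still needs to be in the cache.
    return -(-outage_seconds // seconds_per_pair)

# Hypothetical example: a 3-hour outage with tests taking 8 minutes each.
print(pairs_to_cache(3 * 3600, 8 * 60))  # 23
```

The number is per client instance, so a machine running one client per core needs that many pairs cached for each of them.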

I will note the dates of these outages starting now. I want to determine whether there is a pattern in when they occur and how often.


Gary

MyDogBuster 2009-01-11 13:07

[QUOTE]Well, I just analyzed the results from Jan. 9th. It was not too bad compared to last time. Results were returned at:

18:20:56
-and-
20:08:39


So my IP address changing caused my server to be down for a little over 1 hour 47 mins.
[/QUOTE]

Maybe, maybe not.

I had to run ipconfig /flushdns on all my machines. When I did, they all got work. SOOOOOO if I had found the problem sooner, I would have fixed it sooner. Hard to say just how long your machine was down. It could have been only 5 minutes, or anything in between. What was strange on this end was that I had to "flush" them all. Usually some start up by themselves (I guess they couldn't care less what your DNS is), but not this time. And the plot thickens.

Late entry, I have 2 cores on 2 different machines running PRPNet. They hung also (and needed the flush).

em99010pepe 2009-01-11 13:34

Gary's famous stable internet connection...lol

Mini-Geek 2009-01-11 14:24

[quote=MyDogBuster;158094]Maybe, maybe not.

I had to run ipconfig /flushdns on all my machines. When I did, they all got work. SOOOOOO if I had found the problem sooner, I would have fixed it sooner. Hard to say just how long your machine was down. It could have been only 5 minutes, or anything in between. What was strange on this end was that I had to "flush" them all. Usually some start up by themselves (I guess they couldn't care less what your DNS is), but not this time. And the plot thickens.

Late entry, I have 2 cores on 2 different machines running PRPNet. They hung also (and needed the flush).[/quote]
The DNS record at No-IP is set to expire after 1 minute. In a test I just did, I pinged the server continuously: while the ping kept running, it never stopped to fetch a new IP once the normal expiration time had passed, but when I stopped the ping and ran the command again, it did check whether the record had expired. (This may have more to do with the ping command in Windows, which seems to resolve the name once and then ping the IP. A program that looks up the hostname every time it needs to reach the outside system should, I'd expect, see the 1 minute TTL work correctly.) Maybe if you still have open connections it won't check, no matter the expiration time. All that flushing the DNS does is force the cached record to expire and be looked up again, which probably should have happened by now anyway with the 1 minute TTL (time to live).
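
To make the "resolve once" versus "look up every time" difference concrete, here is a small sketch. It is not how ping or LLRnet is actually written; the function names are mine, and the demo feeds in canned answers in place of a real lookup so it runs anywhere. Note that even the real lookup goes through the operating system's resolver, so a stale OS cache would still hand back the old address.

```python
import socket
import time

def resolve(host, port):
    """Ask the OS resolver for the host's current IPv4 addresses."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

def watch(host, port, lookups, pause=0.0, resolver=resolve):
    """Look the name up `lookups` times and record every change of address.

    A tool that resolves once and then reuses the IP would never notice
    any of the changes this function reports.
    """
    changes = []
    last = None
    for attempt in range(lookups):
        current = resolver(host, port)
        if last is not None and current != last:
            changes.append((last, current))
        last = current
        if attempt < lookups - 1:
            time.sleep(pause)
    return changes

# Demo with canned answers (documentation addresses, not the real server).
canned = iter([["192.0.2.10"], ["192.0.2.10"], ["192.0.2.99"]])
print(watch("server.example", 9000, 3, resolver=lambda h, p: next(canned)))
# [(['192.0.2.10'], ['192.0.2.99'])]
```

Dropping the `resolver` argument and passing a real hostname with a `pause` of 60 or so would show whether a given machine ever picks up the new address without a flush.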

mdettweiler 2009-01-11 14:32

[quote=MyDogBuster;158094]Maybe, maybe not.

I had to run ipconfig /flushdns on all my machines. When I did, they all got work. SOOOOOO if I had found the problem sooner, I would have fixed it sooner. Hard to say just how long your machine was down. It could have been only 5 minutes, or anything in between. What was strange on this end was that I had to "flush" them all. Usually some start up by themselves (I guess they couldn't care less what your DNS is), but not this time. And the plot thickens.

Late entry, I have 2 cores on 2 different machines running PRPNet. They hung also (and needed the flush).[/quote]
Yes, indeed...it appears that the DNS records themselves are being updated quite quickly; the main problem that makes the servers unreachable is instead this infernal DNS cache that needs to be flushed on the clients. I haven't really had much of a problem with this, but then again, some Google research I did indicated that most Linux distros don't cache DNS records at all unless DNS caching software is specifically installed, so my computer can reach the server as soon as the DNS records are updated.

henryzz 2009-01-11 14:34

[quote=mdettweiler;158109]Yes, indeed...it appears that the DNS records themselves are being updated quite quickly; the main problem that makes the servers unreachable is instead this infernal DNS cache that needs to be flushed on the clients. I haven't really had much of a problem with this, but then again, some Google research I did indicated that most Linux distros don't cache DNS records at all unless DNS caching software is specifically installed, so my computer can reach the server as soon as the DNS records are updated.[/quote]
Is there a way for people to have their crunching PCs flush the cache automatically every 15 mins or so?

mdettweiler 2009-01-11 14:35

[quote=Mini-Geek;158107]The DNS record at No-IP is set to expire after 1 minute. In a test I just did, I pinged the server continuously: while the ping kept running, it never stopped to fetch a new IP once the normal expiration time had passed, but when I stopped the ping and ran the command again, it did check whether the record had expired. (This may have more to do with the ping command in Windows, which seems to resolve the name once and then ping the IP. A program that looks up the hostname every time it needs to reach the outside system should, I'd expect, see the 1 minute TTL work correctly.) Maybe if you still have open connections it won't check, no matter the expiration time. All that flushing the DNS does is force the cached record to expire and be looked up again.[/quote]
Hmm...that's interesting. I figured that No-IP would have a very low TTL, as all dynamic DNS services must by necessity, but wasn't sure whether Windows's cache would honor it or not. Interesting theory about the active connections messing something up, though I don't think LLRnet keeps its connections open between contacts with the server, so I can't see how that would affect things for more than a few seconds, and even then only on one or two clients that just *happened* to talk to the server during that maximum-1-minute downtime on No-IP's end.

mdettweiler 2009-01-11 14:36

[quote=henryzz;158111]Is there a way for people to have their crunching PCs flush the cache automatically every 15 mins or so?[/quote]
Maybe setting a scheduled task to run "ipconfig /flushdns" every 5 or 15 minutes would do the trick?
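
If setting up a scheduled task is a hassle, the same idea can be done with a small script left running in the background. This is only a sketch of that alternative, not a tested fix for the problem above: the function name and its parameters are mine, and the only real command in it is the Windows "ipconfig /flushdns" already mentioned.

```python
import subprocess
import time

FLUSH_COMMAND = ["ipconfig", "/flushdns"]  # Windows only

def flush_periodically(interval_seconds, rounds=None,
                       run=subprocess.run, sleep=time.sleep):
    """Run the flush command, wait, and repeat.

    rounds=None loops until interrupted; `run` and `sleep` can be swapped
    out so the loop can be tried without touching the real DNS cache.
    """
    done = 0
    while rounds is None or done < rounds:
        run(FLUSH_COMMAND, check=False)  # keep going even if one flush fails
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_seconds)
    return done

# On a crunching PC this would be called as, for example:
# flush_periodically(15 * 60)   # every 15 minutes, until Ctrl+C
```

A scheduled task has the advantage of surviving reboots on its own, whereas a script like this has to be started again each time.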


All times are UTC.