mersenneforum.org (https://www.mersenneforum.org/index.php)
-   No Prime Left Behind (https://www.mersenneforum.org/forumdisplay.php?f=82)
-   -   LLRnet servers for NPLB (https://www.mersenneforum.org/showthread.php?t=10042)

MyDogBuster 2009-06-29 19:09

Looks like port IB8000 is hung

IronBits 2009-06-29 23:22

connection request from d7f50f79:3042 (socket 6)
net_Recv : bad header 'GET '
connection closed (socket 6)
connection request from d7f50f79:3176 (socket 6)
recv error res=0, errno=107
connection closed (socket 6)
connection request from d7f50f79:4415 (socket 6)
net_Recv : bad header 'P'
connection closed (socket 6)

Restarted ok.

IronBits 2009-06-30 03:43

Restarted it just now, again.
finished ...
write 'joblist.txt' ...
finished ...
connection closed (socket 6)
connection request from 2ce13f4a:1483 (socket 6)
net_Recv : bad header 'GET '
connection closed (socket 6)
connection request from 2ce13f4a:1720 (socket 6)
net_Recv : bad header 'P'
connection closed (socket 6)
connection request from 2ce13f4a:3033 (socket 6)
recv error res=-1, errno=104
shutdown: errno 107
connection closed (socket 6)
connection request from 2ce13f4a:3641 (socket 6)
net_Recv : bad header 'GET '
connection closed (socket 6)
connection request from 2ce13f4a:3885 (socket 6)
net_Recv : bad header 'P'
connection closed (socket 6)
connection request from 2ce13f4a:2503 (socket 6)
recv error res=-1, errno=104
shutdown: errno 107
connection closed (socket 6)

mdettweiler 2009-07-05 13:50

Looks like someone's attempting to connect to the server directly with a web browser. Should be harmless, since the server is rejecting such connections out of hand.
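
To check how often this happens, a rough sketch like the one below could count rejected connections per client. It assumes the log keeps the format shown in the posts above; the function name `count_bad_headers` and the file name in the usage line are hypothetical, not part of LLRnet.

```shell
#!/bin/bash
# Count "bad header" rejections per client in an LLRnet-style server log.
# Assumption: each rejection follows a "connection request from ID:PORT" line,
# as in the excerpts posted in this thread.
count_bad_headers() {
    awk '
        /^connection request from / { split($4, a, ":"); client = a[1] }
        /bad header/                { bad[client]++ }
        END { for (c in bad) print c, bad[c] }
    ' "$@" | sort
}
```

Usage would be something like `count_bad_headers llrserver.log` (hypothetical file name); with no argument it reads the log from standard input.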

gd_barnes 2009-07-06 04:18

Port 2000 JobMaxtime has been changed back to 1 day.

kar_bon 2009-07-13 21:24

Gary's servers are down!
no response from [URL]http://nplb-gb1.no-ip.org/llrnet/[/URL] too.

Edited: Yup, I moved to my independent search. If it is down for more than a day, you may have to adjust the prune period and jobmaxtime (before restarting) so Karsten doesn't lose his uploads.

Edit2: yep, 400 results from G8000 are waiting for upload!

mdettweiler 2009-07-13 23:16

[quote=kar_bon;180870]Gary's servers are down!
no response from [URL]http://nplb-gb1.no-ip.org/llrnet/[/URL] too.

Edited: Yup, I moved to my independent search. If it is down for more than a day, you may have to adjust the prune period and jobmaxtime (before restarting) so Karsten doesn't lose his uploads.

Edit2: yep, 400 results from G8000 are waiting for upload![/quote]
Ouch. Might be a power outage or something. Fortunately, all of my cores are using PRPnet and thus will fall back to the PrimeGrid servers when Gary's are down. Between that and your cores and Ian's having moved off, it doesn't look like we'll have too serious a problem even if the servers remain down until Gary gets back.

Mini-Geek 2009-07-13 23:25

One more side effect of Gary's servers going down: [URL]http://www.mersenneforum.org/showthread.php?p=180865#post180865[/URL]
The relations for [URL="http://www.mersenneforum.org/showthread.php?t=12119"]a factorization[/URL] are being uploaded to/downloaded from one of Gary's boxes, so that's unavailable.

AMDave 2009-07-14 10:16

Stats updates blew out from 4.5 minutes to over 40 minutes waiting for HTTP responses.
I am suspending updates from GB until the issue is fixed.

/ed- done -ed/

IronBits 2009-07-15 03:51

Use a 20 second timeout in your script.
Some ideas to chew on, not sure if you are using bash or perl. Perl makes it really easy to do...

[url]http://redflo.de/tiki-index.php?page=Bash+script+with+timeout+function[/url]

./cmdtimeout "ssh server2 /usr/sbin/command_that_may_hang" 5

vim cmdtimeout
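#!/bin/bash
# Usage: cmdtimeout "command with args" timeout_in_seconds
command=$1
timeout=$2
# Run $command in the background (left unquoted on purpose so it splits into
# words), then schedule an at(1) job that sleeps for the timeout and kills the
# process if it is still running.
$command &
pid=$!
echo "sleep $timeout; kill $pid" | at now
wait "$pid" &> /dev/null
# 143 = 128 + 15: the command was terminated by the SIGTERM that kill sends.
if [ $? -eq 143 ]; then
    echo "WARNING - command was terminated - timeout of $timeout secs reached."
    echo
fi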
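
A possible alternative, assuming GNU coreutils is installed: its `timeout` command does the same job without at(1) and exits with status 124 when the limit is reached. The sketch below wraps it. `run_with_timeout` is a made-up name, and the URL in the usage comment is a placeholder, not the real stats page.

```shell
#!/bin/bash
# Run a command under GNU coreutils `timeout` and warn if the limit was hit.
# Assumption: `timeout` is available (it ships with GNU coreutils).
run_with_timeout() {
    local secs=$1
    shift
    local status=0
    # `timeout` sends SIGTERM after $secs seconds and then exits with 124.
    timeout "$secs" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "WARNING - command was terminated - timeout of $secs secs reached." >&2
    fi
    return "$status"
}

# Example with a placeholder URL: give up on a slow page after 20 seconds.
# run_with_timeout 20 wget -q -O stats.html "http://example.invalid/llrnet/"
```

Because the command is passed as separate arguments rather than one quoted string, arguments containing spaces survive intact.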

IronBits 2009-07-15 06:13

As you can see here, either no one is working on port 4000 or the server crashed...
[url]http://nplb.ironbits.net/[/url]


kar_bon: it's OK, I'll spend one core on this :smile:


All times are UTC.