mersenneforum.org

LLRnet servers for NPLB (https://www.mersenneforum.org/showthread.php?t=10042)

gd_barnes 2009-02-23 23:43

[quote=mdettweiler;163711]OH MAN! I just figured out what happened. :rolleyes: When I tried logging on to the server's "terminal session" (as opposed to the session it gives you when you log on through VNC), lo and behold--there was a copy of the G8000 LLRnet server that had been running all along! No wonder port 8000 was already tied up by something--it was tied up by the G8000 LLRnet server which Gary had already restarted! :smile:

Gary, in the future, when you restart servers on crunchford could you possibly log into the machine via VNC, and then restart the server from there? Otherwise I can't see the restarted server's terminal window, and I think it's not even running. Thus I kept trying to restart it over on the VNC session, not realizing that it was already up and running on the console session and that that was what was hogging the port. (That would also explain why results have kept coming in for that port even after I thought the server went down. :wink:)

Okay, long story short: turns out the crashed server hadn't hung on the port after all. A new server was running the whole time! :smile:

I'll go and get the server moved over to the VNC session now so I can see it better, and get it into the while loop in case of crashes. :smile:[/quote]
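
A port that is "already tied up" generally means some process is still listening on it, which here turned out to be the restarted server in the other session. A minimal Python sketch of that check follows, assuming a Linux machine and a plain TCP server. `port_is_free` is a hypothetical helper written for illustration and is not part of LLRnet.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if this process can bind host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding failed, most likely because another process
            # (for example an already-running server) holds the port.
            return False
        return True


if __name__ == "__main__":
    # 8000 is the port discussed in the thread.
    print("port 8000 free:", port_is_free(8000))
```

If the check reports the port as busy right after a supposed crash, it is worth looking for a live server process before starting another copy.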


OK, sorry about that. But after I stopped and restarted the server, only one of my clients would run it. I even waited 10-15 mins. one time after stopping the server and then restarted it. The other clients still hung and were sleeping, even after multiple attempts to stop and restart each one of them after stopping and restarting the server. I'll go downstairs and see if they are running now.

You were seeing results because I had ONE client running against the server but the other 6 that I was running against it wouldn't connect.

What's the deal with the VNC thing anyway? I've noticed the same thing that you have. If you have a terminal window up from directly messing with the machine, it won't show in VNC. When you have a terminal window up in VNC, it won't show when you are directly messing with the machine. That seems like a bug to me. It doesn't make sense because I pulled up the task manager (Linux version) and verified exactly what was running on the machine. I then killed the server, waited an appropriate amount of time and restarted it. I tried this twice to unhang my clients. Shouldn't that have killed any terminal window in VNC or non-VNC?

Is there a way around this confusing VNC (remote access) vs. non-VNC problem? English laymen's terms please.

Also, why do you have to keep running this "while" loop? Is that because of my constant IP address changes? If so, why didn't it work this time? It was well over an hour after the crash or IP address change before I tried starting and stopping it.
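
Going by the quoted post, the loop is there "in case of crashes": it starts the server again whenever the server process exits. The exact loop used on the machine is not shown in the thread, so the following is only a rough Python sketch of the idea. `keep_running` and its parameters are hypothetical, and the restart count is capped so the example terminates.

```python
import subprocess
import sys
import time


def keep_running(command, max_restarts=3, pause_seconds=0.0):
    """Run command and start it again each time it exits.

    Stops after max_restarts restarts and returns the exit code of every run.
    A real supervisor loop would typically run without a cap.
    """
    exit_codes = []
    while len(exit_codes) <= max_restarts:
        if exit_codes:
            # Brief pause between runs so a crashing program is not respawned in a tight loop.
            time.sleep(pause_seconds)
        exit_codes.append(subprocess.run(command).returncode)
    return exit_codes


if __name__ == "__main__":
    # Stand-in for a server that crashes immediately with exit code 1.
    crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(keep_running(crashing, max_restarts=2))
```

A loop like this only restarts the copy it launched itself. It does nothing about address changes on the client side, and it cannot take over a server that was started by hand in a different session.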

Gawd, this server stuff is confusing. My question is: Why did port 8000 have the problem and port 4000 didn't? Also, why should it be so difficult to kill the server from the task manager, wait 10-15 mins., and then restart it? These servers shouldn't be rocket science but they are from my perspective.

I think in the future, I'm not going to do any kind of attempt at stopping and restarting of the servers. I just end up creating more problems than there were originally. If there's a problem, I'll just move my machines to something else and you can fix them the next day.

Personally, I think having the servers on my machines has turned out to be a bad idea. My internet connection is quite stable but keeps changing addresses. I've had a mobo crash one time but I think I've gotten that problem resolved. Knock on wood. Ian has been very patient with port GB4000 and diligently does the flushdns thingy when the IP address changes. Others, I'm sure, won't be so patient.


Gary

gd_barnes 2009-02-23 23:59

Port 3000 has now dried out and the range is complete. It has been removed from the 1st post of this thread.


Gary

IronBits 2009-02-24 00:34

ps aux --forest is your friend. :wink:

AMDave 2009-02-24 02:06

ps -ef | grep llr :razz:
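
In plain terms, both one-liners list every process running on the machine, whichever session started it, and the `grep llr` part keeps only the lines that mention "llr". A small Python sketch of the same filtering is below. The function names and the sample listing are made up for illustration, and the process name `llrserver` is an assumption rather than something stated in the thread.

```python
import subprocess


def list_processes():
    """Return the text printed by `ps -ef` (all processes, full format)."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return result.stdout


def matching_processes(ps_output, needle):
    """Return the lines of a process listing that contain needle (case-sensitive, like grep)."""
    return [line for line in ps_output.splitlines() if needle in line]


if __name__ == "__main__":
    # Made-up sample listing; on a real Linux machine use list_processes() instead.
    sample = "UID PID CMD\ngary 101 ./llrserver\ngary 202 bash\n"
    for line in matching_processes(sample, "llr"):
        print(line)
```

If a server line shows up in the listing, a copy is already running somewhere, even when no terminal window for it is visible in the current session.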

gd_barnes 2009-02-24 02:27

Are you guys answering anything in my post because I can't tell?

As stated above:

[quote]
English laymen's terms please.
[/quote]


:smile: Gary

gd_barnes 2009-02-24 03:02

[quote=mdettweiler;163687]In fact, we *do* use fixed IPs for all of Gary's dedicated crunch machines, one of which is running all the servers. :smile:

Gary, as David said--whenever you have a client within your network running one of the nplb-gb1.no-ip.org servers, you can configure the client to say 192.168.2.100 instead. That's the *internal* IP address of the server and it will be a much more direct connection that will never cut off when any No-IP things have to change. :smile: (Note that you can't do this on your laptop, since then you wouldn't be able to connect to the server when away from your home network.)[/quote]

OK, thanks guys. I did that. Hopefully that will resolve the reconnect problem when my public IP address changes.

I'm still wondering about some of the issues that came up per my last post.

Everything seems to be working fine now. Thanks for the help.


Gary

gd_barnes 2009-02-24 03:19

About 15 mins. ago, I just got inundated with 400+ Emails of old primes found. Can you guys look into that quickly? Thanks.

Lennart 2009-02-24 03:25

SPAM :)
 
What the f***k are you doing ?????????:shock:


Creating ads ? :big grin:

/Lennart

Brucifer 2009-02-24 03:40

You guys been hacked or something? I just got 31 notifications for primes found under ports that haven't been run for ages, and aren't running now, plus others that are but I don't have systems crunching on those ports. ???????????????????????????????????

SUM TING WONG

PCZ 2009-02-24 03:48

Holy Spam Batman !!!

AMDave 2009-02-24 03:58

No no no.
Everything is ok.

It was me.

Whoo.
When I fail, I fail spectacularly.

I forgot to update the mail_sent flag on the prime_list table when I re-activated the mail_notification script on the new server.

I completely missed it.
It was not on my checklist.

409 emails went out - that's the difference between the snapshot I took when I started the database migration and the current status of the old database.
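
The failure mode can be sketched in a few lines. The real database engine and schema are not given in the thread, so this uses an in-memory SQLite database as a stand-in. Only the table name `prime_list` and the column `mail_sent` come from the post; the `id` and `prime` columns, the sample rows, and both helper functions are hypothetical.

```python
import sqlite3


def unsent_primes(conn):
    """Return the ids of rows a notification script would still mail (mail_sent = 0)."""
    rows = conn.execute("SELECT id FROM prime_list WHERE mail_sent = 0 ORDER BY id")
    return [row[0] for row in rows]


def mark_all_sent(conn):
    """Flag every existing row as already mailed; return how many rows were changed."""
    cursor = conn.execute("UPDATE prime_list SET mail_sent = 1 WHERE mail_sent = 0")
    conn.commit()
    return cursor.rowcount


# Stand-in for the migrated table: rows copied over with the flag still at 0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prime_list ("
    "id INTEGER PRIMARY KEY, prime TEXT, mail_sent INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO prime_list (prime) VALUES (?)",
    [("prime A",), ("prime B",), ("prime C",)],
)
conn.commit()

if __name__ == "__main__":
    print("would be mailed again:", unsent_primes(conn))
    print("rows flagged as sent:", mark_all_sent(conn))
    print("left to mail:", unsent_primes(conn))
```

Running the flag update before re-activating the notification script is the step that keeps old primes from being mailed a second time.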


I do apologise to Gary, Max, and everyone who just got their notifications again.

Please delete all of the emails from nplb_stats received in the last hour.
The table is up to date and there are no more coming.

:sorry:


All times are UTC.