mersenneforum.org  

Old 2008-12-11, 05:18   #485
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

14151₈ Posts

Quote:
Originally Posted by gd_barnes
Thanks for your attention to detail on this Max. Sounds like a mess.

Any luck with stopping the crashing with this last attempt?

I just checked the server...and it looks like all is working well now! I'll still leave it on the while-loop thingie so that it will automatically restart if it does go down for whatever reason, but it looks like it should be good now.

Thanks David for your help!
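
For readers wondering what a "while-loop thingie" looks like, here is a minimal sketch in Python. It is not the script actually running on the server (the thread does not show it, and the real one may well be a shell loop); the function name `keep_running`, its parameters, and the stand-in command are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def keep_running(cmd, max_launches=None, delay_s=5.0):
    """Launch cmd and relaunch it every time it exits.

    max_launches=None loops forever, which is what a restart wrapper
    normally does; a finite value is only useful for trying the sketch out.
    Returns the number of times cmd was launched.
    """
    launches = 0
    while max_launches is None or launches < max_launches:
        launches += 1
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        print(f"launch #{launches} exited with code {exit_code}")
        time.sleep(delay_s)  # short pause so a crash loop does not spin the CPU
    return launches


if __name__ == "__main__":
    # Stand-in command that "crashes" immediately (hypothetical; the real
    # server command is not given in the thread).
    crashing_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
    keep_running(crashing_cmd, max_launches=2, delay_s=0)
```

The pause between launches is a design choice: without it, a server that dies instantly on startup would be relaunched as fast as the machine allows.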
Old 2008-12-11, 05:39   #486
gd_barnes
 
May 2007
Kansas; USA

10414₁₀ Posts

Quote:
Originally Posted by mdettweiler
I just checked the server...and it looks like all is working well now! I'll still leave it on the while-loop thingie so that it will automatically restart if it does go down for whatever reason, but it looks like it should be good now.

Thanks David for your help!

Very good. I'm glad we have some server gurus around here. Thanks for your diligence Max and the helpful hint David.


Gary
Old 2008-12-12, 02:00   #487
gd_barnes
 
May 2007
Kansas; USA

2×41×127 Posts

At the current processing rate, port 400 will finish well ahead of port 4000 even if you add the last unreserved range into port 400.

Therefore, I'm pulling a quad off of port 400 and moving it to port 4000 within the next hour or so. We'll see if that balances things out more. If not, I'll move a 2nd quad over.

This is partly a result of David's large extra boost on port 400, but I would have had to move a quad over even without it. As it is now, I may need to move a 2nd quad over in the next couple of days. Thanks for the extra boost David!

No one else needs to move any of their machines at this point. I'll keep things balanced out by moving around as needed...the intent being that port 400 is the last port to finish n<600K.


Gary
Old 2008-12-12, 04:55   #488
gd_barnes
 
May 2007
Kansas; USA

10100010101110₂ Posts

I was off on my "mental" calculation for port 4000, thinking that the daily cutoff was midnight like it is for port 400. Doing a more exact calculation using the true daily cutoff of 7 AM for port 4000, assuming that there are 15,000 k/n pairs in the remaining unreserved range and that range is eventually loaded into port 400, at the current processing rate we have:

Port 400 would finish in 9.6 days.

Port 4000 would finish in 10.1 days.


So we're surprisingly close but since we'd like port 400 to finish last, I'll still move a quad to port 4000 but shouldn't need to move any more. Doing this will make it 10.4 days for port 400 and 8.4 days for port 4000; which would complete the main ports by Dec. 22nd.
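
The arithmetic behind estimates like these can be sketched as below. This is only an illustration: the function names are made up, the 5,000 pairs/day rate is hypothetical (the actual per-port rates are not given in the post), and the finish date is counted from the post's timestamp, which may not be the starting point used for the estimate. What the sketch does show is that, with the same work left, a change in days-to-finish implies a rate change of days_before / days_after.

```python
from datetime import datetime, timedelta


def days_to_finish(remaining_pairs, pairs_per_day):
    """Days left if the processing rate stays constant."""
    return remaining_pairs / pairs_per_day


def implied_rate_change(days_before, days_after):
    """Fractional rate change implied by a new ETA for the same remaining work.

    rate = work / days, so rate_after / rate_before = days_before / days_after.
    """
    return days_before / days_after - 1.0


# Hypothetical rate, only to show the basic formula on the 15,000-pair range.
print(days_to_finish(15_000, 5_000))  # 3.0

# Days-to-finish figures quoted in the post, before and after moving one quad.
print(f"port 400:  {implied_rate_change(9.6, 10.4):+.1%}")   # -7.7%
print(f"port 4000: {implied_rate_change(10.1, 8.4):+.1%}")   # +20.2%

# 10.4 days counted from the post's timestamp (assumption, see above).
posted = datetime(2008, 12, 12, 4, 55)
print((posted + timedelta(days=10.4)).date())  # 2008-12-22
```

Read this way, the quoted figures suggest one quad is a much larger share of port 4000's throughput than of port 400's, which fits the extra boost on port 400 mentioned earlier.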

If anyone pulls off of port 400 for an extended period, I may move the quad back over.

Mini-Geek, this means when your manual ranges are done, you can move to port 400 for the remainder of the time. Also, assuming that port 443 dries before this time, then people can move to port 400. Since it wouldn't complete too much ahead of the other servers, even having several extra quads on port 400 for just a few days should still put it completing behind port 4000.


Gary
Old 2008-12-12, 05:05   #489
gd_barnes
 
May 2007
Kansas; USA

2·41·127 Posts

Max,

You talked in another post somewhere (can't remember where) about port 8000 on my machine and are showing it on the port 4000 web page. I seem to recall from our PM exchange with David that for n>600K, we would use:

Port IB400 for k=400-600
Port IB5000 for k=600-800
Port G4000 for k=800-1001
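
As a rough sketch, the plan listed above can be written as a lookup table. The server names and k-ranges come straight from the list; the names `PORT_PLAN` and `server_for_k`, and the way the shared boundaries (k=600, k=800) are assigned, are assumptions, since the list does not say which side a boundary value belongs to.

```python
# (k_low, k_high, server) for n > 600K, as listed in the post.
PORT_PLAN = [
    (400, 600, "IB400"),
    (600, 800, "IB5000"),
    (800, 1001, "G4000"),
]


def server_for_k(k, plan=PORT_PLAN):
    """Return the server that would hand out work for this k.

    Assumption: each range is treated as k_low <= k < k_high, except that
    the final range also includes its upper bound (k=1001).
    """
    top = plan[-1][1]
    for low, high, server in plan:
        if low <= k < high or (k == high == top):
            return server
    raise ValueError(f"k={k} is outside the ranges in the plan")


print(server_for_k(401))   # IB400
print(server_for_k(799))   # IB5000
print(server_for_k(1001))  # G4000
```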

Can you inform me what the purpose of port 8000 is? Is that going to be a roving server amongst the drives as needed? If so, then cool. If not, then I'm confused.

Assuming the above, within the next 5-6 days, we'll probably want to load the above ports with n=600K-605K for the applicable k-ranges.

We've been using the term "dried" for the servers upon completion of this drive but technically they won't dry; they'll just start handing out n>600K. I really don't want them to dry because we have a lot of CPUs on them now and I'd like people to just keep right on crunching without stopping if they choose to for n>600K.

Max and David,

Once the servers start handing out n>600K, we need to reduce the JobMaxTime to 1 day on ports 400 and 4000 temporarily to get n<=600K cleared out quickly and the drive finished. Once the lower n-range is cleared, then we can go back to 3 and 5 days respectively.
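
To illustrate why a shorter JobMaxTime clears stragglers faster, here is a small sketch. It assumes JobMaxTime is the time a client may hold a pair before the server treats it as abandoned and hands it out again; that reading, the function names, and the sample pairs are assumptions for illustration, not the server's actual code or data.

```python
from datetime import datetime, timedelta


def is_overdue(handed_out_at, now, job_max_time_days):
    """True if a pair has been out longer than the allowed time."""
    return now - handed_out_at > timedelta(days=job_max_time_days)


def overdue_pairs(outstanding, now, job_max_time_days):
    """outstanding maps (k, n) -> time the pair was handed out."""
    return sorted(pair for pair, handed_out_at in outstanding.items()
                  if is_overdue(handed_out_at, now, job_max_time_days))


# Hypothetical outstanding pairs, made up for the example.
now = datetime(2008, 12, 20, 12, 0)
outstanding = {
    (401, 599_990): now - timedelta(days=2),    # out for 2 days
    (403, 599_995): now - timedelta(hours=6),   # out for 6 hours
}

print(overdue_pairs(outstanding, now, job_max_time_days=3))  # []
print(overdue_pairs(outstanding, now, job_max_time_days=1))  # [(401, 599990)]
```

With the 3-day setting the two-day-old pair just sits there; at 1 day it becomes available for another client right away.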


Gary
Old 2008-12-12, 05:21   #490
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by gd_barnes
Max,

You talked in another post somewhere (can't remember where) about port 8000 on my machine and are showing it on the port 4000 web page. I seem to recall from our PM exchange with David that for n>600K, we would use:

Port IB400 for k=400-600
Port IB5000 for k=600-800
Port G4000 for k=800-1001

Can you inform me what the purpose of port 8000 is? Is that going to be a roving server amongst the drives as needed? If so, then cool. If not, then I'm confused.

I was thinking that, considering how we selected the port numbers 400 and 4000 due to their being loaded with work for k=400-1001, that port 8000 would be most appropriate for k=800-1001 in keeping with that convention. And since you and Ian are the only users on G4000, I figured that it wouldn't be too big a deal to phase out G4000 in favor of G8000 for the new >600K work.

This would also make things a bit easier when cleaning up straggling k/n pairs for n<600K--we don't have to worry about setting an active server's jobMaxTime to 1 day; instead, G8000 can be on the normal 5 days, and I can set G4000 to 1 day (or possibly even lower) while, say, I have my two cores on it. Since we'd only be dealing with (hopefully) a few stragglers, I'm sure that my dualcore would be enough to clean up the last few within a few hours.

Of course, if after all this we'd still like to keep G4000 around as a "roving" server in case we ever need it, that would work out fine--I've already got it all set up on the web page and everything, so having it on standby should work quite well.

Quote:
Assuming the above, within the next 5-6 days, we'll probably want to load the above ports with n=600K-605K for the applicable k-ranges.

Agreed. If you're OK with my plan for G4000 and G8000 as outlined above, I can load n=600K-605K for k=800-1001 into G8000 as soon as you'd like.

Which just got me thinking: why don't we just load up *all* of our servers with their respective n>600K work right now, rather than waiting 5-6 days? (Of course, IB400 would have all of the remaining Drive #1 work loaded before any n>600K work.)

Max

Last fiddled with by mdettweiler on 2008-12-12 at 05:22
Old 2008-12-12, 05:47   #491
MyDogBuster
 
May 2008
Wilmington, DE

B24₁₆ Posts

Quote:
Port 400 would finish in 9.6 days.

Port 4000 would finish in 10.1 days.

I've added 3 cores already today, and I'll add another 1 in about 2 hours as I wind down my manual range work. I'm figuring I'm doing about 4200/day now on G4000. Maybe we'll get this figured out about the time we run out of work on Drive 1. LOL

By my reckoning, that's 9.2 days left on 4000. BTW, I should finish my manual ranges on the 16th. That will free up 2 more cores.

Last fiddled with by MyDogBuster on 2008-12-12 at 05:56
Old 2008-12-12, 05:53   #492
gd_barnes
 
May 2007
Kansas; USA

2·41·127 Posts

Quote:
Originally Posted by mdettweiler
I was thinking that, considering how we selected the port numbers 400 and 4000 due to their being loaded with work for k=400-1001, that port 8000 would be most appropriate for k=800-1001 in keeping with that convention. And since you and Ian are the only users on G4000, I figured that it wouldn't be too big a deal to phase out G4000 in favor of G8000 for the new >600K work.

This would also make things a bit easier when cleaning up straggling k/n pairs for n<600K--we don't have to worry about setting an active server's jobMaxTime to 1 day; instead, G8000 can be on the normal 5 days, and I can set G4000 to 1 day (or possibly even lower) while, say, I have my two cores on it. Since we'd only be dealing with (hopefully) a few stragglers, I'm sure that my dualcore would be enough to clean up the last few within a few hours.

Of course, if after all this we'd still like to keep G4000 around as a "roving" server in case we ever need it, that would work out fine--I've already got it all set up on the web page and everything, so having it on standby should work quite well.

Agreed. If you're OK with my plan for G4000 and G8000 as outlined above, I can load n=600K-605K for k=800-1001 into G8000 as soon as you'd like.

Which just got me thinking: why don't we just load up *all* of our servers with their respective n>600K work right now, rather than waiting 5-6 days? (Of course, IB400 would have all of the remaining Drive #1 work loaded before any n>600K work.)

Max

Ugh, and I was sure that we had this settled already.

Not to beat a dead horse but what you're trying to avoid on port 4000 is what is going to happen on port 400. When any port dries, it means lost processing time because it's bound to happen at some point when at least one person is not right there at his machine to switch it over to the other port.

I just now reread our PM exchange. Here is how it went:

Me:
Quote:
We don't want more new servers than really needed. Let's just load k=400-600 for n>600K right into port IB400 right behind the current k/n pairs.

The ideal situation would be that the huge resources that we currently have connected to port IB400 will just keep right on with it even after the first drive is done.

I would then suggest using port IB5000 for k=600-800 and port G4000 for k=800-1000. We could create some new port for the roving server. I'm indifferent on that.

You:
Quote:
Okay, that sounds like a good plan. Though, in all honesty, it wouldn't be too terribly hard at all for me to change G4000 to G8000 once the 1st Drive is finished--then it can still be in keeping with our existing pseudo-plan for port numbers and k-ranges.

Or would changing the port number make things harder on David's end as far as importing the stats from G4000/G8000 goes?

It ended there without any kind of additional suggestion to use port 8000 for k=800-1001. I guess part of my point is to stop the whole "new server" thing and just stick with what we already have to avoid lost processing time and more new servers. We've had enough servers already to choke a horse. (Ian's phrase that I kind of like, lol) Any new server can be used as a roving server.

What I suppose I didn't pick up on is: what do you mean by a "pseudo-plan for port #'s and k-ranges"? I wasn't aware of any such plan. If that's the case, then we need to go all the way with it: port 400 for k=400-600, port 600 for k=600-800, and port 800 for k=800-1001, but that would be 2 new servers. bleh!

As for switching JobMaxTime, that's a 5 min. task twice. As for loading n>600K pairs, that could be done now in the existing ports.

One final thing on this: Don't forget that Ian and me have to switch upwards of 40 cores any time a port changes. It's a pain. Others a little less but still an annoyance.

I'll send you the k=400-1001 for n=600K-1M file shortly.


Gary
Old 2008-12-12, 05:59   #493
gd_barnes
 
May 2007
Kansas; USA

10414₁₀ Posts

Quote:
Originally Posted by mdettweiler

Which just got me thinking: why don't we just load up *all* of our servers with their respective n>600K work right now, rather than waiting 5-6 days? (Of course, IB400 would have all of the remaining Drive #1 work loaded before any n>600K work.)

Max

Not a bad idea but it sure is easy right now to calculate time left to completion and we don't have all of the n-ranges loaded for n<600K yet. That's why I suggested waiting.

Of course we can load port 5000 with k=600-800 right now. As for the others, I'd just as soon wait. When it gets down to loading them, it'll just take an extra step to look in the sieve file and compare to the current pair being handed out to do the calculation of time remaining for the 1st drive pairs.
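
That extra step could look something like the sketch below. It does not read a real sieve file (the file format is not described in this thread); instead it assumes the pairs are already available as a list of (k, n) tuples in the order the server hands them out. The function name, the hand-out order, and the sample queue are all assumptions for illustration.

```python
def first_drive_pairs_left(queue, current_pair, n_cutoff=600_000):
    """Count pairs below the n cutoff that still have to be handed out.

    queue: list of (k, n) in hand-out order (assumption).
    current_pair: the (k, n) the server is handing out right now.
    Pairs with n >= n_cutoff belong to the next drive and are not counted.
    """
    start = queue.index(current_pair)
    return sum(1 for _k, n in queue[start:] if n < n_cutoff)


# Tiny made-up queue: three 1st-drive pairs followed by two n>600K pairs.
queue = [
    (401, 599_900), (403, 599_950), (405, 599_990),
    (401, 600_010), (403, 600_020),
]

left = first_drive_pairs_left(queue, current_pair=(403, 599_950))
print(left)  # 2
```

Dividing that count by the port's current pairs/day then gives the time remaining for the 1st drive pairs, exactly as before.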


Gary
Old 2008-12-12, 06:03   #494
gd_barnes
 
May 2007
Kansas; USA

28AE₁₆ Posts

Quote:
Originally Posted by MyDogBuster
I've added 3 cores already today, and I'll add another 1 in about 2 hours as I wind down my manual range work. I'm figuring I'm doing about 4200/day now on G4000. Maybe we'll get this figured out about the time we run out of work on Drive 1. LOL

By my reckoning, that's 9.2 days left on 4000. BTW, I should finish my manual ranges on the 16th. That will free up 2 more cores.

Oh, OK, great. I'll likely move my quad back over to port 400 in just a few days then.
Old 2008-12-12, 06:14   #495
MyDogBuster
 
May 2008
Wilmington, DE

2²×23×31 Posts

Quote:
Don't forget that Ian and me have to switch upwards of 40 cores any time a port changes.

Now that's fun. I can do a whole machine blindfolded once I get the rhythm.

Last fiddled with by MyDogBuster on 2008-12-12 at 06:18

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.