#826 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts |
Quote:
Gary, one thing to keep in mind: if you ever need to restart any of the servers, make sure you do it by logging into the VNC desktop on crunchford, and doing it from there. That's where I've got all the terminal windows already open for the servers. It sounds like you tried to start up the server a *second* time on the main desktop, and (fortunately!) it gave you the cryptic error message and didn't start since there was already an instance of that same server running.

Oh, wait--doh! I think I know what may be causing the problem. I don't think I ever remember setting up the port forwarding for port 8000... *bangs head*

Okay, I've just checked and it turns out that the absence of port forwarding on that port was indeed the cause of the problem. It is now remedied.

BTW--one other thing I'd like to mention. I've got all the public servers running within a while loop, which means that if the server stops running for whatever reason, it will instantaneously and automatically be restarted. So, in most cases, even if the server crashes, it shouldn't need to be restarted manually; and Gary, if you do ever think it needs to be restarted manually and I'm not around at the time, make sure you check the VNC desktop *first* to see if it's still running on there. If, say, the server is frozen and thus didn't exit and allow the while loop to restart it, then just hit Ctrl-C on the server and it will shut off, then automatically be restarted anew.

Max
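For readers unfamiliar with the restart-in-a-loop idea described above, here is a minimal sketch of it in Python. It is not the actual setup on crunchford (the thread does not show that script); the function name `supervise`, the `max_starts` limit, and the stand-in command are all assumptions made for illustration. It also does not model the Ctrl-C behavior mentioned in the post.

```python
import subprocess
import sys
import time


def supervise(cmd, max_starts=None, delay_seconds=0.0):
    """Run cmd, and start it again every time it exits.

    max_starts exists only so this sketch can be tried out safely;
    None means "loop forever", as a real restart loop would.
    Returns the list of exit codes that were observed.
    """
    exit_codes = []
    while max_starts is None or len(exit_codes) < max_starts:
        # Blocks until the child process exits (crash or normal shutdown).
        finished = subprocess.run(cmd)
        exit_codes.append(finished.returncode)
        # Optional pause so a server that dies at once does not spin the CPU.
        time.sleep(delay_seconds)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical stand-in for the server command: a child that exits with status 3.
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(demo_cmd, max_starts=2))
```

Note that a loop like this only helps when the server process actually exits; a frozen process still has to be stopped by hand, which is the case the post describes.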
#827 | |
|
May 2007
Kansas; USA
3²×13×89 Posts |
Quote:
What do you mean it was "still" running? It was never running to begin with! I tried moving my 2 cores just a few hours after you presumably started it. lol

Oh well, I'm glad you got it running. Unfortunately I don't have time to move my 2 cores over from port 7000 at the moment.

Can I make a request? When starting a new server, please run a couple of tests on it to make sure it works before opening it up to the public. Thanks.

Gary |
#828 |
|
May 2007
Kansas; USA
3²·13·89 Posts |
Max,
Two problems on the port G8000 status page:

1. The "All primes found so far on this server" is a dead link. Perhaps it is because no primes have been found yet. If so, it should still link to a page that says "No primes found yet". (Be sure to delete the message once a prime is found.)
2. There is no previous results file in the "All results files for NPLB servers". There should be one because the first k/n pair processed today was n=600285 and I know there were some pairs for lower n's loaded into the server.

In the future, I'd like to work on a "test plan" for all of the scenarios on new servers before rolling them out publicly, especially when we do the PRPnet server.

Thanks,
Gary
Last fiddled with by gd_barnes on 2009-02-13 at 23:02 |
#829 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
1869₁₆ Posts |
Quote:
Regarding #2: Ouch! I just realized why that's happening: my copy-off script was accidentally set to use "4000" in the results file names for port 8000, instead of "8000"! And since the script does the port 8000 results files *after* it does the results for port 4000 from that day, the 8000 results file would overwrite the 4000 results file for that day! This means that we are essentially missing all of our results from port 4000 for 2/13 and 2/14.

I've got the results file name fixed so it won't happen again, and I also renamed the mislabeled files to their correct port #, but there's nothing I can do to bring back the two days' worth of results we lost from G4000.

David, would it be possible for you to take all the G4000 results in the DB imported on 2/13 and 2/14, dump them to a file, and send them to me? We may be able to salvage most of the lost work that way. (Edit: Correction, make that the results from 2/12 and 2/13. I forgot for a moment that since the daily files are copied off at midnight, the datestamp on the file is actually a day later than what the individual entries in the DB will show.)

Well, to look on the bright side, at least most of the missing results still made it into the DB. That's because David's server copies off my 15-minute results file updates, as well as the daily files. So, it sounds like at the most we've lost 30 minutes of results (15+15, from the two days) since that's the only thing that (theoretically) doesn't make it into the 15-minute update files. I say "theoretically" because sometimes in actual practice the last day's results will stick around for ~15 minutes after the daily copy-off since the status page updated just before the copy-off script ran. So, we may have been fortunate and didn't lose anything.

This also means that a few G8000 results were mistakenly imported as G4000. This shouldn't be a problem; all it will mean is that there'll be a few extra results showing up on the progress table for G4000 on 2/13 and 2/14. Once the numbers drop off the progress table, all that matters is that the results are in the database and have the "GB" server code on them; at that point, the port number isn't displayed on any chart and doesn't really make that much of a difference.

Gary, as you said in your previous post, you're right, I really should have done some more testing before telling everyone this server was online. However, if it's any comfort, the only thing that would have fixed is the thing with the port forwarding not being online at first--the results file problem was somewhat buried and hard to detect. I missed it when I originally added port 8000 to the copy-off script a few months ago, and we never saw it give us any problems since there were no results from port 8000 to overwrite the port 4000 results--until now. Thus, that error, at least, most likely could not have been detected until we actually saw it mess up.

Next time I'll be sure to watch veeeeery closely when I make such changes to my scripts!

Max
Last fiddled with by mdettweiler on 2009-02-14 at 12:15 |
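To illustrate the overwrite problem described in this post, here is a minimal Python sketch of a copy-off step that derives the file name from the port number and refuses to overwrite an existing file. The real copy-off script is not shown in the thread, so the names `daily_results_name` and `copy_off`, and the `results_<port>_<date>.txt` naming scheme, are assumptions made purely for illustration.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def daily_results_name(port, day):
    # Hypothetical naming scheme. The port comes from the argument,
    # so it cannot be mistyped separately for each server.
    return f"results_{port}_{day.isoformat()}.txt"


def copy_off(source, dest_dir, port, day):
    """Copy one server's results file into dest_dir, refusing to overwrite."""
    target = Path(dest_dir) / daily_results_name(port, day)
    if target.exists():
        # A second file with the same name signals a labeling mistake.
        raise FileExistsError(f"{target} already exists; not overwriting")
    shutil.copyfile(source, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "live_results.txt"
        src.write_text("made-up test result\n")
        out = Path(tmp) / "archive"
        out.mkdir()
        print(copy_off(src, out, 4000, date(2009, 2, 13)).name)
        print(copy_off(src, out, 8000, date(2009, 2, 13)).name)
```

With a guard like this, a mislabeled port would have produced an error on the first day that both servers had results, rather than silently replacing the earlier file.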
#830 |
|
Jan 2006
deep in a while-loop
658₁₀ Posts |
csv.zip file sent via email
#831 |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
14151₈ Posts |
Thanks! I just reviewed the data and it looks pretty well complete. I'll get it converted back to the original LLRnet results format and posted on the http://nplb-gb1.no-ip.org/llrnet/results/ website shortly.
#832 |
|
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2×3³×109 Posts |
Has the redirect from port 400 to port 4000 been removed today, ironbits?
I have just had to change my client to 4000. I have just installed a second network card in my PC, and I thought for ages that that was causing the problem, but after lots of fiddling around I remembered the port number had changed and I hadn't changed my clients.
Last fiddled with by henryzz on 2009-02-14 at 20:24 |
#833 | |
|
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
#834 |
|
I ♥ BOINC!
Oct 2002
Glendale, AZ. (USA)
3·7·53 Posts |
The redirect for port 400 is no longer there, as you found out.
#835 |
|
May 2008
Wilmington, DE
2²×23×31 Posts |
Looks like Gary's servers changed their IP address again. I had to do the flushdns thingy.
#836 | |
|
May 2007
Kansas; USA
3²×13×89 Posts |
Quote:
I have to beg to differ with you. Correct test plans would quickly catch this. I've worked in the programming industry long enough to know it. You pick what and when you want tested and run TEST data through it at the time that simulates the test that you want to conduct. In this case, run test data from 11:45 PM to 12:15 AM local time and verify that the results are properly being copied off. If not, it's not a big deal because it's misc. test data that we care nothing about.

When we say a test plan, we're talking about using test and not production data. If we "test in production" like this, we get burned.

I know I'm beating a dead horse now but: For the PRPnet server, I would suggest that we start putting a test plan together and creating some test data for it fairly soon and long before we actually get to loading production data into it.

BTW, I can break anything. lol I've tested legacy stuff for so many years; if there's a bug in something that I'm quite familiar with, I will find the scenario(s) where it won't work. Ian did the same type of work that I did on legacy systems. I'm sure he can do the same if he is familiar with the process being tested.

One final thing: Code reviews can help too. For the PRPnet server, I'd suggest having David and/or Rogue do a code-review on your code before beginning testing. Frequently that will catch more than testing can.

Gary |
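As one example of the kind of check such a test plan might include, here is a small Python sketch that asks a file-naming function for the daily results name of every port and reports any two ports that would end up writing to the same file. Everything here is hypothetical: `find_name_collisions`, `buggy_name`, `fixed_name`, and the naming scheme are made up for illustration and are not taken from the actual copy-off script.

```python
from datetime import date


def find_name_collisions(name_for_port, ports, day):
    """Return (first_port, second_port, name) for every reused file name."""
    seen = {}
    collisions = []
    for port in ports:
        name = name_for_port(port, day)
        if name in seen:
            collisions.append((seen[name], port, name))
        else:
            seen[name] = port
    return collisions


def buggy_name(port, day):
    # Reproduces the kind of slip discussed in the thread:
    # port 8000 is labeled "4000" in the file name.
    label = 4000 if port == 8000 else port
    return f"results_{label}_{day.isoformat()}.txt"


def fixed_name(port, day):
    return f"results_{port}_{day.isoformat()}.txt"


if __name__ == "__main__":
    test_day = date(2009, 2, 13)
    print(find_name_collisions(buggy_name, [4000, 8000], test_day))
    print(find_name_collisions(fixed_name, [4000, 8000], test_day))
```

A check like this needs no production data at all, which fits the point made above about testing with test data only.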