[quote=gd_barnes;162557]So much for moving 2 cores from port IB4000 to G8000. Port G8000 isn't working.
Max, I tried the "./llrnet llrserver.lua" command to attempt to start it...no go. It gives a cryptic error message. Can you look into the problem? Gary[/quote]
Hmm...I just checked the server and it is indeed still running. :huh:
Gary, one thing to keep in mind: if you ever need to restart any of the servers, make sure you do it by logging into the VNC desktop on crunchford and doing it from there. That's where I've got all the terminal windows already open for the servers. It sounds like you tried to start up the server a *second* time on the main desktop, and (fortunately!) it gave you the cryptic error message and didn't start, since there was already an instance of that same server running.
Oh, wait--doh! I think I know what may be causing the problem. I don't remember ever setting up the port forwarding for port 8000... *bangs head* :redface:
Okay, I've just checked, and the absence of port forwarding on that port was indeed the cause of the problem. It is now remedied. :smile:
BTW--one other thing I'd like to mention. I've got all the public servers running within a while loop, which means that if a server stops running for whatever reason, it will be restarted immediately and automatically. So, in most cases, even if the server crashes, it shouldn't need to be restarted manually. Gary, if you do ever think it needs to be restarted manually and I'm not around at the time, make sure you check the VNC desktop *first* to see if it's still running there. If, say, the server is frozen and thus didn't exit and let the while loop restart it, just hit Ctrl-C on the server; it will shut off and then automatically be restarted. :smile:
Max :smile: |
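For readers who have not seen this kind of setup, the restart loop Max describes can be sketched roughly as follows. This is only an illustration, not Max's actual setup: the thread says nothing more than that the servers run "within a while loop". The Python wrapper, the function name `run_forever`, and the `max_restarts` and `pause` parameters are all assumptions made for this sketch.

```python
import subprocess
import time


def run_forever(cmd, max_restarts=None, pause=1.0):
    """Run cmd and start it again every time it exits.

    max_restarts and pause are hypothetical knobs added for this sketch;
    max_restarts=None means keep restarting until interrupted.
    Returns the number of times the command was started.
    """
    starts = 0
    while True:
        starts += 1
        # Blocks until the server process exits or crashes.
        result = subprocess.run(cmd)
        if max_restarts is not None and starts > max_restarts:
            return starts
        print(f"server exited with code {result.returncode}, restarting")
        time.sleep(pause)  # short pause so a crash loop does not spin
```

With the command quoted in the thread, the call would be `run_forever(["./llrnet", "llrserver.lua"])`, started from the server's directory. One difference from what Max describes: in this sketch a Ctrl-C in the terminal would normally interrupt the Python wrapper as well, so it would need to catch `KeyboardInterrupt` to keep restarting.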
[quote=mdettweiler;162579]Hmm...I just checked the server and it is indeed still running. :huh:
Gary, one thing to keep in mind: if you ever need to restart any of the servers, make sure you do it by logging into the VNC desktop on crunchford, and doing it from there. That's where I've got all the terminal windows already open for the servers. It sounds like you tried to start up the server a *second* time on the main desktop, and (fortunately!) it gave you the cryptic error message and didn't start since there was already an instance of that same server running. Oh, wait--doh! I think I know what may be causing the problem. I don't think I ever remember setting up the port forwarding for port 8000... *bangs head* :redface: Okay, I've just checked and it turns out that the absence of port forwarding on that port was indeed the cause of the problem. It is now remedied. :smile: BTW--one other thing I'd like to mention. I've got all the public servers running within a while loop, which means that if the server stops running for whatever reason, it will instantaneously and automatically be restarted. So, in most cases, even if the server crashes, it shouldn't need to be restarted manually; and Gary, if you do ever think it needs to be restarted manually and I'm not around at the time, make sure you check the VNC desktop *first* to see if it's still running on there. If, say, the server is frozen and thus didn't exit and allow the while loop to restart it, then just hit Ctrl-C on the server and it will shut off, then automatically be restarted anew. :smile: Max :smile:[/quote] What do you mean it was "still" running? It was never running to begin with! I tried moving my 2 cores just a few hours after you presumably started it. lol Oh well, I'm glad you got it running. Unfortunately I don't have time to move my 2 cores over from port 7000 at the moment. Can I make a request? When starting a new server, please run a couple of tests on it to make sure it works before opening it up to the public. Thanks. Gary |
Max,
Two problems on the port G8000 status page:
1. The "All primes found so far on this server" is a dead link. Perhaps it is because no primes have been found yet. If so, it should still link to a page that says "No primes found yet". (Be sure and delete the message once a prime is found.)
2. There is no previous results file in the "All results files for NPLB servers". There should be one because the first k/n pair processed today was n=600285 and I know there were some pairs for lower n's loaded into the server.
In the future, I'd like to work on a "test plan" for all of the scenarios on new servers before rolling them out publicly, especially when we do the PRPnet server.
Thanks, Gary |
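Gary's first point could be handled along these lines. This is a minimal sketch under assumptions: the function name `render_primes_page` and the idea that the page is plain text built from a list of strings are made up for illustration, since the thread does not show how the status page is generated.

```python
def render_primes_page(primes):
    """Return the text of the "all primes found so far" page.

    primes is a list of strings, one per prime found so far
    (a hypothetical representation chosen for this sketch).
    """
    if not primes:
        # Placeholder text so the status-page link is never dead.
        return "No primes found yet\n"
    # Once a prime is listed, the placeholder is no longer emitted,
    # so nobody has to remember to delete it by hand.
    return "\n".join(primes) + "\n"
```

Because the placeholder is produced only when the list is empty, the "be sure and delete the message" step happens by itself.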
[quote=gd_barnes;162750]Max,
Two problems on the port G8000 status page: 1. The "All primes found so far on this server" is a dead link. Perhaps it is because no primes have been found yet. If so, it should still link to a page that says "No primes found yet". (Be sure and delete the message once a prime is found.) 2. There is no previous results file in the "All results files for NPLB servers". There should be one because the first k/n pair processed today was n=600285 and I know there were some pairs for lower n's loaded into the server. In the future, I'd like to work on a "test plan" for all of the scenarios on new servers before rolling them out publicly, especially when we do the PRPnet server. Thanks, Gary[/quote]
Regarding #1: ah, yes, I forgot about that. With G4000 we never ran into that problem because I'd run a few small known primes through the server anyway to test the functionality of my scripts. I'll see what I can do about it, though as you'll see below this is the least of our worries for now...
Regarding #2: Ouch! I just realized why that's happening: my copy-off script was accidentally set to use "4000" in the results file names for port 8000, instead of "8000"! And since the script does the port 8000 results files *after* it does the results for port 4000 from that day...the 8000 results file would overwrite the 4000 results file for that day! This means that we are essentially missing all of our results from port 4000 for 2/13 and 2/14. :ouch2:
I've got the results file name fixed so it won't happen again, and I also renamed the mislabeled files to their correct port #, but there's nothing I can do to bring back the two days' worth of results we lost from G4000.
David, would it be possible for you to take all the G4000 results in the DB imported on 2/13 and 2/14, dump them to a file, and send them to me? We may be able to salvage most of the lost work that way. [I](Edit: Correction, make that the results from 2/12 and 2/13. I forgot for a moment that since the daily files are copied off at midnight, the datestamp on the file is actually a day later than what the individual entries in the DB will show.)[/I]
Well, to look on the bright side, at least most of the missing results still made it into the DB. That's because David's server copies off my 15-minute results file updates, as well as the daily files. So, it sounds like at the most we've lost 30 minutes of results (15+15, from the two days), since that's the only thing that (theoretically) doesn't make it into the 15-minute update files. I say "theoretically" because in actual practice the last day's results will sometimes stick around for ~15 minutes after the daily copy-off, since the status page updated just before the copy-off script ran. So, we may have been fortunate and not lost anything.
This also means that a few G8000 results were mistakenly imported as G4000. This shouldn't be a problem; all it will mean is that there'll be a few extra results showing up on the progress table for G4000 on 2/13 and 2/14. Once the numbers drop off the progress table, all that matters is that the results are in the database and have the "GB" server code on them; at that point, the port number isn't displayed on any chart and doesn't really make that much of a difference.
Gary, as you said in your previous post, I really should have done some more testing before telling everyone this server was online. :rolleyes: However, if it's any comfort, the only thing that would have caught is the port forwarding not being set up at first--the results file problem was somewhat buried and hard to detect. I missed it when I originally added port 8000 to the copy-off script a few months ago, and it never gave us any problems since there were no results from port 8000 to overwrite the port 4000 results--until now. Thus, that error, at least, most likely could not have been detected until we actually saw it mess up.
Next time I'll be sure to watch veeeeery closely when I make such changes to my scripts! :smile:
Max :smile: |
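To make the overwrite problem concrete, here is a minimal sketch of a copy-off step that builds each file name from the port number and refuses to overwrite an existing file. It is not Max's script, which the thread never shows: the naming scheme `results_<port>_<date>.txt` and the function names `daily_results_name` and `copy_off` are assumptions for illustration.

```python
import datetime
import shutil
from pathlib import Path


def daily_results_name(port, copy_date):
    """Build a per-port daily file name (hypothetical naming scheme).

    copy_date is the date on which the midnight copy-off runs, so a file
    stamped with it holds the results recorded on the previous day.
    """
    return f"results_{port}_{copy_date:%Y-%m-%d}.txt"


def copy_off(live_file, archive_dir, port, copy_date):
    """Copy one port's live results file into the archive directory.

    Raises FileExistsError instead of overwriting, so a wrong port number
    in the file name fails loudly as soon as two ports collide.
    """
    target = Path(archive_dir) / daily_results_name(port, copy_date)
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.copyfile(live_file, target)
    return target
```

Deriving the name from the `port` argument, rather than typing it separately for each server, removes the place where "4000" could be pasted in for port 8000. The existence check is a second line of defence: it would have stopped the script on the first night instead of silently replacing the G4000 file.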
csv.zip file sent via email
|
[quote=AMDave;162805]csv.zip file sent via email[/quote]
Thanks! I just reviewed the data and it looks pretty well complete. I'll get it converted back to the original LLRnet results format and posted on the [url]http://nplb-gb1.no-ip.org/llrnet/results/[/url] website shortly. :smile: |
Has the redirect from port 400 to port 4000 been removed today, ironbits?
I have just had to change my client to 4000. I had just installed a second network card in my PC, and I thought for ages that it was causing the problem, but after lots of fiddling around I remembered that the port number had changed and I hadn't changed my clients. |
[quote=mdettweiler;162827]Thanks! I just reviewed the data and it looks pretty well complete. I'll get it converted back to the original LLRnet results format and posted on the [URL]http://nplb-gb1.no-ip.org/llrnet/results/[/URL] website shortly. :smile:[/quote]
Okay, I've uploaded the reproduced results files to the web site. The results are a little out of order within the files (possibly due to how they're stored in the DB?), but they should all be in the correct day's results file. So, even though they may look a little funny at first glance, everything is still where it's supposed to be. :smile: |
The redirect for port 400 is no longer there, as you found out.
|
Looks like Gary's servers changed their IP address again. I had to do the flushdns thingy.
|
[quote=mdettweiler;162798]
Regarding #2: Ouch! I just realized why that's happening: I just realized that my copy-off script was accidentally set to use "4000" in the results file names for port 8000, instead of "8000"! And since the script does the port 8000 results files *after* it does the results for port 4000 from that day....the 8000 results file would overwrite the 4000 results file for that day! This means that we are essentially missing all of our results from port 4000 for 2/13 and 2/14. [/quote]
So, how are Karsten or I supposed to balance the results? Ugh!
I have to beg to differ with you. Correct test plans would quickly catch this. I've worked in the programming industry long enough to know it. You pick what and when you want tested and run TEST data through it at the time that simulates the test that you want to conduct. In this case, run test data from 11:45 PM to 12:15 AM local time and verify that the results are properly being copied off. If not, it's not a big deal because it's misc. test data that we care nothing about.
When we say a test plan, we're talking about using test and not production data. If we "test in production" like this, we get burned.
I know I'm beating a dead horse now but: For the PRPnet server, I would suggest that we start putting a test plan together and creating some test data for it fairly soon and long before we actually get to loading production data into it.
BTW, I can break anything. lol I've tested legacy stuff for so many years; if there's a bug in something that I'm quite familiar with, I will find the scenario(s) where it won't work. Ian did the same type of work that I did on legacy systems. I'm sure he can do the same if he is familiar with the process being tested.
One final thing: Code reviews can help too. For the PRPnet server, I'd suggest having David and/or Rogue do a code-review on your code before beginning testing. Frequently that will catch more than testing can.
Gary |
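The kind of test Gary has in mind might look something like the sketch below: throwaway test data is pushed through the file-naming step for every port, and the check confirms that each port's results are still there afterwards. Everything here is hypothetical (the function `check_copy_off`, the two naming functions, the file names), and it leaves out the timing side of Gary's suggestion, i.e. actually running across the midnight copy-off.

```python
import tempfile
from pathlib import Path


def check_copy_off(name_for_port, ports):
    """Run throwaway test data through a copy-off step and report problems.

    name_for_port maps a port number to the archive file name the script
    under test would use (a stand-in for the real script).
    Returns a list of problem descriptions; an empty list means a pass.
    """
    problems = []
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp)
        # Test data only: one fake result line per port.
        expected = {port: f"test result from port {port}" for port in ports}
        # Copy-off pass: ports are processed one after another.
        for port in ports:
            (archive / name_for_port(port)).write_text(expected[port] + "\n")
        # Verification pass: every port's line must still be in its file.
        for port in ports:
            path = archive / name_for_port(port)
            if expected[port] not in path.read_text().splitlines():
                problems.append(f"port {port}: results missing from {path.name}")
    return problems


def correct_name(port):
    return f"results_{port}.txt"


def buggy_name(port):
    # Mimics the mistake described in the thread: port 8000 labeled as 4000.
    return "results_4000.txt" if port == 8000 else f"results_{port}.txt"
```

Calling `check_copy_off(correct_name, [4000, 8000])` returns an empty list, while `check_copy_off(buggy_name, [4000, 8000])` reports that the port 4000 results are missing, because the port 8000 file was written over them. Since only test data is involved, a failure costs nothing.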