 2010-02-23, 09:17 #67 gd_barnes     "Gary" May 2007 Overland Park, KS 2²·5·593 Posts
Holy cow. It looks like port 9950 has already dried out again. That's 30K+ pairs. I'm not going to load it again because that was a sufficient stress test. If you guys want, try port 9975. It has a bunch of pairs at n=~540K from the 11th drive and will NOT dry out. :-)

Max, I'm curious as to which pairs the server couldn't process overall. Clearly it cannot seem to handle the small k=3 pairs that are prime right at the beginning of the file. Like before, my client shows them as processed but rejected, yet the server shows no rejected pairs; they are just sitting in the joblist. I suspect we encountered a load-related issue with LLRnet there. The tests are all n=1K-10K primes.

I'm pretty sure the above is an already existing LLRnet issue. What I will do is set up another personal server and run the same initial tests through it to recreate the problem. I'll then run an "old" client on the same file to make sure that we have not introduced a new problem here. If not, we can let it go. Few people would ever use LLRnet for testing n<10K.

I'll go ahead and start port 6000 up on all of my machines again for the night. There's no reason to let them sit idle overnight.

Excluding the issue with the server not "taking" the small k=3 tests right at the beginning, with this testing we've identified 3 issues on the Linux client. Max has now fixed 2 of them, and it appears that on Tues. he'll look into the 3rd issue, which is related to stopping a client when the server is down.

I know that on the Windows side, the primes.txt file was being written correctly. What we'll also still need to do there is see whether the # of iterations issue and the issue with stopping a client when the server is dried/down still exist in Windows.

Gary

Last fiddled with by gd_barnes on 2010-02-23 at 09:27
 2010-02-23, 09:30 #68 kar_bon     Mar 2006 Germany BBB₁₆ Posts
good work!

the cLLR option OutputIterations works fine in my DOS version: i stopped the script, edited that line to 100000 iterations and started the script again. with the next load of workunits, cLLR displays the new iteration count properly.

to G4000: i will try to submit the other results to the server when i'm home again.

while testing port 9950 i found no prime, because of Gary's 31 cores, but my prime-logging works, too.

to your last post:
- iterations ok
- server down/dried: see post #54

Last fiddled with by kar_bon on 2010-02-23 at 18:08
Quote:
 Originally Posted by gd_barnes I thought you fixed #1 on Jeepford also.
No, I didn't, sorry.

 2010-02-23, 18:08 #70 kar_bon     Mar 2006 Germany 5673₈ Posts
issue from post #54 solved:

in function DialogWithServer() change some lines this way:

Code:
  t, k, n = GetPair()
  if t and k and n then
    print(format(" Fetching WU #1/%d: %s %s",WUCacheSize,k,n))
    changed = 1
  else
    return
  end

so the output "Fetching..." will only be displayed when getting a new pair was successful; otherwise the function exits.
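For readers who don't use Lua, here is a minimal Python sketch of the same guard idea. It is not the actual llrnet.lua code: `get_pair`, `dialog_with_server`, the `queue` list and the `log` list are hypothetical stand-ins, and the meaning of `t` is not stated in the thread, so it is treated as an opaque value.

```python
def get_pair(queue):
    """Hypothetical stand-in for GetPair(): next (t, k, n), or Nones if nothing came back."""
    if queue:
        return queue.pop(0)
    return (None, None, None)


def dialog_with_server(queue, wu_cache_size, log):
    """Log the 'Fetching' line only when a pair was actually received."""
    t, k, n = get_pair(queue)
    if t is None or k is None or n is None:
        # Server down or dried out: leave quietly, without a misleading message.
        return False
    log.append(" Fetching WU #1/%d: %s %s" % (wu_cache_size, k, n))
    return True
```

The point is only the ordering: check the result of the fetch first, and print afterwards.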
Quote:
 Originally Posted by kar_bon
issue from post #54 solved:

in function DialogWithServer() change some lines this way:

Code:
  t, k, n = GetPair()
  if t and k and n then
    print(format(" Fetching WU #1/%d: %s %s",WUCacheSize,k,n))
    changed = 1
  else
    return
  end

so the output "Fetching..." will only be displayed when getting a new pair was successful; otherwise the function exits.
Okay, I've applied that fix in the llrnet.lua file for my Perl script as well (just finished uploading).

Meanwhile, I also clarified the documentation a bit: Gary mentioned that it was a tad wordy and thus he missed some key setup instructions, so I added a bit saying "start here if you just want to find out how to get started".

 2010-02-23, 20:36 #73 mdettweiler A Sunny Moo     Aug 2007 USA (GMT-5) 2·5⁵ Posts
Sounds like a good plan. BTW, I'll be out of the house from about 5:30-10:30 EST; that works out to 4:30-9:30 CST. Not that it makes much of a difference since I don't really have enough cores to contribute meaningfully to any stress test, but I figured I'd let you guys know.

Way cool on being able to figure out the equivalent # of cores at 500K where we start running into problems with LLRnet. Once we've narrowed it down to a more exact figure with these stress tests, that should be immensely helpful for future rallies.

Come to think of it, I should be able to apply a similar method to determine more exactly the load of PRPnet 2.4.6 at n=500K; that way we can find out whether a PRPnet rally would even be feasible prior to the perfecting of 3.x.

Last fiddled with by mdettweiler on 2010-02-23 at 20:38
Quote:
 Originally Posted by gd_barnes Hey guys. Excellent work! What a great team effort! (...) ...hope you guys aren't too bleary-eyed from the late nights. :-) Thanks for all of your hard work.
yep, great work and another issue solved (not that bad, only a cosmetic operation).

no, not really hard work for me! just 2 weeks ago i began to test this idea, and after the first script ran fine, it was fun to do. nobody had thought of this option before!

and it's working great! (ok, small n-values could be tested by individuals in a few days).
and what about other possibilities? think about it! using LLRnet only as server/client and changing the script, other programs should run as well.

now we are able to use LLRnet's abilities and we're independent from others!
that's the great chance we must use! although we should inform Jean (and he informs Vincent) of this new use of LLRnet!

 2010-02-23, 22:22 #75 gd_barnes     "Gary" May 2007 Overland Park, KS 10111001010100₂ Posts
Port 9975 is loaded up and ready to go with the previously mentioned pairs. There's a bunch of primes at the beginning. I'm still working on port 9950.
 2010-02-23, 23:25 #76 gd_barnes     "Gary" May 2007 Overland Park, KS 10111001010100₂ Posts
Port 9950 is loaded up and ready to go.

The delay was caused because I'm having tremendous difficulty getting my main machine, Jeepford, to connect to the server. (Its internet access is fine because I'm typing from it right now.) It keeps saying "could not connect to server after 5 tries". I've checked everything in the llr-clientconfig.txt file and it looks good. I've stopped and restarted the server; no luck. I've stopped and restarted the clients; no luck. I'm going to try loading the client on some other machines now.

FYI, I'm using the "Perl do.pl" command. I thought it would be easier and faster today. I was mistaken as usual. It's always something. Ergh.

Last fiddled with by gd_barnes on 2010-02-23 at 23:25
Quote:
 Originally Posted by gd_barnes I thought it would be easier and faster today. I was mistaken as usual. It's always something. Ergh.
check whether you deleted all old files like tosend.txt, workfile.txt and workfile.res, and check the entries in llr-clientconfig.txt again!
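a quick way to do that check could look like the Python sketch below. it only uses the three file names from this post; the function name `find_stale_files` and the idea of pointing it at the client directory are my assumptions, not part of LLRnet or of the do.pl script.

```python
from pathlib import Path

# Leftover files named in this post; other leftovers may exist.
STALE_FILES = ("tosend.txt", "workfile.txt", "workfile.res")


def find_stale_files(client_dir):
    """Return the names of leftover files still present in client_dir."""
    base = Path(client_dir)
    return [name for name in STALE_FILES if (base / name).is_file()]
```

calling `find_stale_files(".")` from inside the client folder returns a list of whatever is still lying around; an empty list means none of those three files is left. it does not delete anything, so it's safe to run before starting a client.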

