mersenneforum.org

LLRnet servers for NPLB (https://www.mersenneforum.org/showthread.php?t=10042)

gd_barnes 2009-05-01 22:34

Can someone analyze the following problem:

k/n pair 1709 557812 is in the knpairs.txt file but not in the joblist.txt file.

3 things need to be done:

1. Find out why the above happened.

2. Look to see if it is in any of our results files.

3. If no on #2, test and process the pair to the server.

When #2 and/or #3 are done, Karsten can then send me the results for n=520K-560K.

Assuming that it hasn't been done, the above pair is the only one holding up completion of the n=520K-560K range.
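
For anyone wanting to script the check in #2, here is a minimal Python sketch. It is only an illustration under assumptions: that knpairs.txt and joblist.txt lines start with whitespace-separated `k n` fields (the joblist.txt layout is not confirmed here), and that results lines contain an expression of the form `k*2^n-1`. The function names and the second pair in the demo are hypothetical.

```python
import re

# Assumed format: results lines contain an expression such as "1709*2^557812-1".
RESULT_RE = re.compile(r"(\d+)\*2\^(\d+)-1")


def pairs_from_pair_lines(lines):
    """Collect (k, n) from lines whose first two fields are integers."""
    pairs = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            pairs.add((int(fields[0]), int(fields[1])))
    return pairs


def pairs_from_result_lines(lines):
    """Collect (k, n) from every k*2^n-1 expression found in result lines."""
    pairs = set()
    for line in lines:
        for k, n in RESULT_RE.findall(line):
            pairs.add((int(k), int(n)))
    return pairs


def unaccounted(knpairs, joblist, results):
    """Pairs still queued in knpairs but neither handed out nor reported."""
    return sorted(knpairs - joblist - results)


if __name__ == "__main__":
    # Hypothetical in-memory stand-ins for the three files.
    knpairs = pairs_from_pair_lines(["1709 557812", "1711 557813"])
    joblist = pairs_from_pair_lines(["1711 557813"])
    results = pairs_from_result_lines([])
    print(unaccounted(knpairs, joblist, results))
```

With real files, each list of lines would come from `open(path).read().splitlines()`.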


Gary

kar_bon 2009-05-01 22:44

I've processed all files for drive #10 up to n=580k (all done) so far, and I'm missing this pair too!

And it's not in any results file!

gd_barnes 2009-05-01 22:55

Oops, I meant that you could process all n=540K-580K to me, not 520K-560K, when the missing pair is processed.

Could someone test and send the missing pair to the server please?


Thanks,
Gary

kar_bon 2009-05-01 23:09

Done, but it was rejected!

1709*2^557812-1 is not prime. Res64: F2F7FD8D247BB5B2 Time : 456.778 sec.

PS: The pair is in knpairs.txt (again?) for port 8000!
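
A result line like the one above can be picked apart with a small regex when checking results files by script. This is a rough sketch based only on the single line quoted in this post; the wording of a "prime" line and the optional fields are assumptions, and `LlrResult` and `parse_result` are made-up names.

```python
import re
from typing import NamedTuple, Optional


class LlrResult(NamedTuple):
    k: int
    n: int
    is_prime: bool
    res64: Optional[str]
    seconds: Optional[float]


# Modelled on: "1709*2^557812-1 is not prime. Res64: ... Time : 456.778 sec."
# Assumption: a prime would be reported with the same wording minus "not".
LINE_RE = re.compile(
    r"(?P<k>\d+)\*2\^(?P<n>\d+)-1 is (?P<neg>not )?prime"
    r"(?:.*?Res64: (?P<res>[0-9A-Fa-f]{16}))?"
    r"(?:.*?Time : (?P<sec>[\d.]+) sec)?"
)


def parse_result(line):
    """Return an LlrResult for a recognised line, or None otherwise."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return LlrResult(
        k=int(m["k"]),
        n=int(m["n"]),
        is_prime=m["neg"] is None,
        res64=m["res"],
        seconds=float(m["sec"]) if m["sec"] else None,
    )


if __name__ == "__main__":
    line = ("1709*2^557812-1 is not prime. "
            "Res64: F2F7FD8D247BB5B2 Time : 456.778 sec.")
    print(parse_result(line))
```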

AMDave 2009-05-01 23:46

[QUOTE=kar_bon;171860]I've processed all files for drive #10 up to n=580k (all done) so far, and I'm missing this pair too!

And it's not in any results file![/QUOTE]

Confirmed as missing from the results table in the stats db as well.
It has not been received for processing in any of the server files.
That points to a hiccup in the server software.

I guess you can tell me how many times this has happened before and then we can quantify that against the 10,686,829 results that have been processed.

My spin: 1 in several million is not awful odds for a data processing error.

Maybe it is acceptable, maybe not?

Ooh!
By the way, the stats database has passed 10 million results!

gd_barnes 2009-05-02 04:19

Excellent odds Dave. Now: How can we make it process this one k/n pair that Karsten says it is rejecting?

David or Dave, can we stop the server, delete the pair from knpairs.txt, restart the server for 5 mins., stop it again, then add it back again to the TOP of knpairs.txt? The hope is that it will then hand it out to someone immediately and that it immediately gets tested and returned to the server properly.

It's not in the results and it's not in joblist.txt but it is in knpairs.txt. I can think of no other way to make it process this pair. But what is making it reject it if it's not in joblist.txt? This is VERY strange!

If it won't process the pair, I guess we'll just leave it out. If that is the case, Karsten, just include it in your results to me and the results that you have saved off. It will just be missing in the server.


Gary

AMDave 2009-05-02 04:49

The stats db hungers for this pair.

I reckon we can reload it at the top of the knpairs.
It could have a ctrl-char in there somewhere that stopped it from going through. (Speculation - I have not tested that)

gd_barnes 2009-05-02 04:53

[quote=AMDave;171892]The stats db hungers for this pair.

I reckon we can reload it at the top of the knpairs.
It could have a ctrl-char in there somewhere that stopped it from going through. (Speculation - I have not tested that)[/quote]

I would say that is an excellent speculation! I too have had severe problems with carriage control (line-ending) character differences between Windows and Linux, and between Notepad and WordPad. Sometimes when I've manually combined files in Windows by cutting and pasting one to the end of another and then loaded that file into a Linux machine, the Linux machine just flat out skips testing a k/n pair. If I want to avoid the problem, I do the combining on the Linux machine instead. It's really stupid.

It's already at the top of knpairs.txt. Perhaps stopping it, deleting the entire line including the carriage control character, hitting enter at the end of the line above it, and then manually re-entering it on the new blank line will fix it. That is how I've fixed similar types of errors before.
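
To see whether a stray control character really is hiding in the file, the raw bytes can be checked line by line. The Python sketch below is just one way to do it, assuming every valid line is exactly `k`, one space, `n`, ending in a bare LF; any header or comment line would also be flagged. `find_suspect_lines` is a made-up name and the second pair in the demo is invented.

```python
def find_suspect_lines(data: bytes):
    """Return (line_number, raw_line) for lines that are not plain 'k n'."""
    suspects = []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        if raw == b"":
            continue  # blank line, or the empty piece after the final newline
        fields = raw.split(b" ")
        # A clean line has exactly two fields made only of ASCII digits,
        # so a trailing CR or any other control byte makes it suspect.
        clean = len(fields) == 2 and all(f.isdigit() for f in fields)
        if not clean:
            suspects.append((number, raw))
    return suspects


if __name__ == "__main__":
    # Hypothetical file contents: the first line carries a Windows CR.
    sample = b"1709 557812\r\n1711 557813\n"
    for number, raw in find_suspect_lines(sample):
        print(number, repr(raw))
```

For a real file, read it in binary mode (`open("knpairs.txt", "rb").read()`) so that no line-ending translation hides the problem.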

David (Ironbits), can you please stop port 8000 and try this with the 1709 557812 pair at the top of the knpairs.txt file? Hopefully it will then quickly hand it out for testing.


Gary

IronBits 2009-05-02 05:01

1709 557812 removed, saved knpairs.txt
ran ./llrnet llrserver.lua -s
That flushed everything.
put 1709 557812 at the top again and restarted server
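
The file edit itself (remove the pair, put a clean copy back at the top) could also be scripted instead of typed by hand. This is only a sketch of the text handling, assuming one `k n` pair per line; it does not touch the server, and it only cleans up whitespace-type characters such as CR, so a line with some other control byte inside a number would be left as-is. `move_pair_to_top` is a hypothetical helper and the second pair in the demo is invented.

```python
def move_pair_to_top(text: str, k: int, n: int) -> str:
    """Return the pair list with 'k n' first and every line ending in LF."""
    target = (str(k), str(n))
    kept = []
    for line in text.splitlines():
        fields = tuple(line.split())
        if not fields or fields == target:
            continue  # drop blank lines and every old copy of the pair
        kept.append(" ".join(fields))
    return "\n".join([" ".join(target)] + kept) + "\n"


if __name__ == "__main__":
    # Hypothetical contents with Windows line endings.
    old = "1711 557813\r\n1709 557812\r\n"
    print(repr(move_pair_to_top(old, 1709, 557812)))
```

When writing the result back, opening the file with `newline="\n"` keeps Python from converting the line endings again on Windows.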

gd_barnes 2009-05-02 05:15

[quote=IronBits;171895]1709 557812 removed, saved knpairs.txt
ran ./llrnet llrserver.lua -s
That flushed everything.
put 1709 557812 at the top again and restarted server[/quote]

Great thanks!

I see you posted this at just a minute after the hour. I hope we didn't interfere with any hourly copying off of results files.

gd_barnes 2009-05-02 06:30

Karsten,

The rogue k/n pair has now been processed. Although it still shows in the knpairs.txt and joblist.txt, that should clear out within a little while (it's probably related to the prune period, which I believe is pretty short) because it has now shown up in the lresults.txt file.

n=540K-580K for port 8000 is now ready to be processed to me when you're ready.

AMDave,

Good call on the bad carriage control character! Deleting it and re-entering it did the trick. :smile:


Gary


All times are UTC.