mersenneforum.org

kar_bon 2010-02-18 16:46

Testing....
 
[center][size=+2]Background and purpose[/size][/center]

LLRnet is a client/server program for finding primes with LLR.
LLRnet was written in 2004-2005 by Vincent Penne in Lua. Internally it uses LLR version 3.5, written by Jean Penne.

The latest available LLR version is 3.8, which is 10%-20% faster than V3.7, offers more options for testing different kinds of values, and has fewer issues with small n-values.

Vincent does not maintain a newer version of LLRnet, so the idea is to make LLRnet and LLR work together through a script, using LLRnet for the client/server communication and the new LLR V3.8 for its speed!

The LLRnet client consists of an EXE file plus several Lua source files (the syntax resembles Pascal or C), which can easily be edited.


[center][size=+2]First attempts[/size][/center]
The idea was to skip the prime test in the Lua source by commenting out that line.
LLRnet then reserves some workunits from the server, saves them in a test file and quits.
Next, LLR takes this test file and does the prime tests. After completion, the LLR result file has to be converted into a format LLRnet can read. In the last step LLRnet sends the converted result file to the server and grabs new workunits.

That's the (quite easy) theory.

The first small version of the script looked like this:
[code]
1 :start
2 if exist z* goto do_cllr
3 gawk -f do_tosend.awk lresults.txt
4 type lresults.txt >>lresults_hist.txt
5 del lresults.txt workfile.res workfile.txt llr.ini
6 llrnet
7 del tosend.txt
8 :do_cllr
9 cllr -d workfile.txt
10 goto start
[/code]

2 if there is a cLLR save file, continue the interrupted test
3 convert the cLLR result file 'lresults.txt' into 'tosend.txt' for submission with LLRnet
4 append the current results to a history file
5 delete all files that are no longer needed (they are written again at every call of do.bat)
6 call LLRnet to send the old results and receive new pairs
7 delete the old LLRnet submit file
9 call cLLR to test all workunits in 'workfile.txt'
10 start over from the beginning
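
For illustration only: step 3 depends on reading LLR's result lines. The sketch below is in Python, not the gawk script from the package (the contents of do_tosend.awk are not shown in this thread). It only splits a result line of the form quoted later in this thread ("2001*2^84543-1 is not prime. LLR Res64: ... Time : ... sec.") into fields. The names LlrResult and parse_llr_line are made up for this example, the handling of an "is prime" line is an assumption, and the layout of 'tosend.txt' is deliberately not guessed.

```python
import re
from typing import NamedTuple, Optional


class LlrResult(NamedTuple):
    """One parsed LLR result line (hypothetical helper type)."""
    k: int
    n: int
    is_prime: bool
    res64: Optional[str]  # residue string; None when the line carries none
    seconds: float


# Matches lines such as:
#   2001*2^84543-1 is not prime. LLR Res64: E10FDBF18C77D779 Time : 7.978 sec.
# Assumption: a prime is reported as "... is prime" without a residue.
LINE_RE = re.compile(
    r"(?P<k>\d+)\*2\^(?P<n>\d+)-1 is (?P<verdict>not prime|prime)"
    r"(?:\.\s+LLR Res64: (?P<res64>[0-9A-Fa-f]{16}))?"
    r".*?Time : (?P<secs>\d+(?:\.\d+)?) sec"
)


def parse_llr_line(line: str) -> Optional[LlrResult]:
    """Return the parsed fields, or None if the line is not a result line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LlrResult(
        k=int(m["k"]),
        n=int(m["n"]),
        is_prime=(m["verdict"] == "prime"),
        res64=m["res64"],
        seconds=float(m["secs"]),
    )
```

A converter would call parse_llr_line on every line of 'lresults.txt' and write the fields out in whatever layout LLRnet expects.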

[center][size=+2]Downloads[/size][/center]

The current official Version V0.7 can be downloaded [url=www.rieselprime.de/dl/LLRnetV07.zip]here[/url].
The zip contains all files for the latest LLRnet WIN-version 0.9b7 (client and server) and also the scripts and gawk.exe.

A UNIX version of the script will be written by M.Dettweiler.

The package is pre-configured for a local server (127.0.0.1) and with my user name.
To test, start the server (llrserver.exe in the server folder) and then run 'do' (in the client folder).
You can interrupt the script by pressing CTRL-C while cLLR is running (testing a pair).
To resume, just call 'do' again; to cancel the reserved jobs, call 'do -c'.

ToDo: include a ReadMe.txt for the Win-Version.

kar_bon 2010-02-18 19:01

[center][size=+2]ToDo list and suggestions for later versions[/size][/center]

[b]ToDo[/b]:

[strike][b]1.[/b] The choice command differs between Windows XP and Vista, so the command the script uses to wait 60 seconds before reconnecting to the server:[code]choice /t 60 /d j >nul[/code]
won't work under XP![/strike]
[color=red][b]Solved on 2010-02-19.[/b][/color]

[strike][b]2.[/b] If workfile.txt contains more than one workunit and cllr has not tested all pairs yet, stopping the script and cancelling with "do -c" will cancel [b]all[/b] reserved pairs! -> only the pairs not yet tested should be cancelled![/strike]
[color=red][b]Solved in Version 0.7 from 2010-03-06[/b][/color]
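
A rough sketch of the idea behind item 2, in Python rather than the batch/gawk/Lua actually used in the package: pick out the pairs in workfile.txt that have no result yet, so that only those get cancelled. It assumes the workfile is a header line (like 30000000000000:M:1:2:258) followed by "k n" lines, as in the pairs posted in this thread; the function name untested_pairs is hypothetical and this is not the cleaning script shipped with V0.7.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[int, int]  # (k, n)


def untested_pairs(workfile_lines: Iterable[str], tested: Set[Pair]) -> List[Pair]:
    """Return the (k, n) pairs from the workfile that are not in `tested`.

    Lines that are not exactly two integers (blank lines, the header line)
    are skipped. The order of the workfile is preserved.
    """
    remaining: List[Pair] = []
    for line in workfile_lines:
        fields = line.split()
        if len(fields) != 2 or not all(f.isdigit() for f in fields):
            continue  # header or blank line, not a "k n" pair
        pair = (int(fields[0]), int(fields[1]))
        if pair not in tested:
            remaining.append(pair)
    return remaining
```

The set of tested pairs would come from the result file cLLR has written so far; only the returned pairs would then be handed back to the server.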

[b]Suggestions[/b]:

1. when cancelling, write the cancelled pairs to lresults_hist.txt with date/time.
-> only the count is written to lresults_hist.txt; the pairs themselves are shown only on screen

2. make a sound when the script ends (imagine the PC runs offline with no monitor or the monitor off -> you notice when it ends)
-> done with a script option the user can set

3. set the LLR option OutputIterations via a parameter to the script
-> done with a script option the user can set

4. manipulate the LLRnet options "WUCacheSize" and "once" with a separate script -> stopping LLRnet
-> not needed: 'once' is not used anymore -> cancelling pairs with the script is working

5. submit the LLR timings to the server, rather than the time between receive and submit in the server results
-> done in Version 0.8

kar_bon 2010-02-18 19:02

[center][size=+2]Test cases[/size][/center]

[b]1.[/b] Receiving workunits from server and server crashes.

[b]2.[/b] Cancelling workunits with LLRnet -c with the script.

[b]3.[/b] Starting the script when the server has dried out (no workunits left).

[b]4.[/b] Processing a different type of prime search (normally k*2^n-1 with the header 30000000000000:M:1:2:258), for example twins or Sophie Germain.

[b]5.[/b] Process different n-values with many cores (stress-test).

[b]6.[/b] Processing small-n/big-k-values with LLR and the conversion script.

kar_bon 2010-02-18 19:03

[center][size=+2]Version history[/size][/center]

2010-03-08 V0.8 cancel all jobs at once in llrnet.lua, submitting LLR-timings to server
2010-03-06 V0.7 cancelling works, new script to clean workfile.txt, new option to set by user
2010-02-21 V0.61 waitloop set before loop, comment out 2 lines in llrnet.lua (grabbing new pair although once=1)
2010-02-20 V0.6 cancel-option works, some more changes in llrnet.lua
2010-02-19 V0.51 using sleep.exe (renamed to waits.exe here) from MS Resource Kit
2010-02-17 V0.5 Option -c to cancel jobs (not working properly)
2010-02-14 V0.4 printing date/time in lresults.txt with set, beep when prime found, 5 attempts if no connection to server, primes.txt only created/added when prime found
2010-02-12 V0.3 Printing found primes in extra file, commenting out in llrnet.exe, date/time in lresults.txt
2010-02-11 V0.2 Error handling included, commented LLRnet-outputs, llr.ini with OutputIterations set
2010-02-10 V0.1 First small version worked

kar_bon 2010-02-18 22:34

CANCEL jobs issue
 
[strike]i'm currently working on the cancel-issue. a first test seems (almost) perfect checked by Max: the reserved pairs were all canceled from/at the server.
one thing: yesterday i've changed the reservation-loop in the lua and now the first pair was written twice in the workfile.txt and also cancelled twice! (seems not possible but it was so).[/strike]


2010-02-20:
first tests: with some changes in llrnet.lua the cancel-function seems to work! some more testing needed!

2010-02-21: version 0.61 now has the fully working code for cancelling reserved pairs.
it was tested by different users without any issue.

kar_bon 2010-02-19 14:02

CHOICE - waiting in script for n seconds
 
Because CHOICE behaves differently on my XP and Vista boxes, i will try another tool.

Possibilities:

[b]1.[/b] With PING:
ping 127.0.0.1 -n 60 -w 1000>NUL

waits for roughly 60 seconds (one ping per second).

[b]2.[/b] [url=http://www.microsoft.com/downloads/details.aspx?FamilyID=9d467a69-57ff-4ae7-96ee-b18c4790cffd&DisplayLang=en]Windows Server 2003 Resource Kit Tools[/url]:
sleep.exe

[b]3.[/b] [url=ftp://ftp.microsoft.com/Services/TechNet/samples/ps/win98/reskit/scrpting/SLEEP.EXE]Win98 Resource Kit[/url]:
sleep.exe

[b]4.[/b] [url=http://unxutils.sourceforge.net/]UnixTools[/url]:
sleep.exe

I think, no.3 is the best choice instead of CHOICE! :grin:

[quote=mdettweiler]Confirmed, number 3 works on my Windows XP setup as well. If it was designed for Win98, and works on XP and Vista as well, then it should work on pretty much anything. :smile:[/quote]

With Win Vista it's ok, too, so this tool is used in the Win-Version of the script.
I've renamed it to 'waits.exe' (wait seconds), because there are other tools around named 'sleep' which put the PC into sleep mode! But this is not what we want! :smile:
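
To show the pattern the wait is needed for (V0.4 in the version history: 5 attempts if there is no connection to the server), here is a small Python sketch. It is not the batch code from do.bat; connect_with_retries and its parameters are invented for this example, and the 60 seconds are taken from the CHOICE/PING examples above.

```python
import time
from typing import Callable


def connect_with_retries(
    connect: Callable[[], bool],
    attempts: int = 5,
    wait_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `connect` up to `attempts` times, waiting between failed tries.

    Returns True as soon as one call succeeds, False if all attempts fail.
    There is no wait after the last failed attempt. `sleep` can be replaced
    in tests so that nothing actually waits.
    """
    for attempt in range(1, attempts + 1):
        if connect():
            return True
        if attempt < attempts:
            sleep(wait_seconds)
    return False
```

In the batch script the same loop is a counter, a goto and a call to waits.exe; the sketch only makes the control flow explicit.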

gd_barnes 2010-02-19 21:29

Karsten,

I need to get a good idea of where we are at in all of this on the Windows client testing. Can you answer the following questions:

1. Please post remaining problems and how they will be resolved.
2. When should I get involved in a Windows test?
3. Can you make all Linux changes except for the script?
4. If no to #3, please post the files/programs changed in Windows.
5. If no to #3, please post the files/programs that are stable right now so that we can begin making changes on the Linux side.

On #2, I'm holding off a little until the files/programs stabilize. Once that happens, I can get my new I7 connected and add that to my 2 slow laptop cores for a better test than what I could do before.

#'s 4 & 5 will allow Max or me to start making changes to the Linux client so that we can speed things up a little.

Please don't add any more new features at this point. We want to get this rolled out fairly soon.


Thank you,
Gary

kar_bon 2010-02-20 01:04

[QUOTE=gd_barnes;206109]
1. Please post remaining problems and how they will be resolved.[/quote]
The major problem (oh, wait, the only problem) now is to cancel jobs with LLRnet. commenting out the LLRMain()-call in llrnet.lua is the way i used, and my Quad has been working fine with this for several days now. to cancel jobs with LLRnet i only have to comment out the real primetest-call, but this doesn't work yet. i'm just starting to understand more of lua and the way the job-queue is handled. i think i can fix this problem this weekend!

[QUOTE=gd_barnes;206109]
2. When should I get involved in a Windows test?[/quote]
as above, i think this weekend we could test a little bit more.

[QUOTE=gd_barnes;206109]
3. Can you make all Linux changes except for the script?[/quote]
yes. as i mentioned, the lua-files for WIN and LINUX are the same, so once i get the WIN-version running, this should be no problem for the LINUX-side. but this has to be tested on LINUX also.

[QUOTE=gd_barnes;206109]
4. If no to #3, please post the files/programs changed in Windows.[/quote]
i will upload the new version of the script and all files needed when the above issue is solved.

[QUOTE=gd_barnes;206109]
5. If no to #3, please post the files/programs that are stable right now so that we can begin making changes on the Linux side.[/quote]
as a starting point for Max/you i can also upload the version i've been running the last few days.

[QUOTE=gd_barnes;206109]
On #2, I'm holding off a little until the files/programs stabilize. Once that happens, I can get my new I7 connected and add that to my 2 slow laptop cores for a better test than what I could do before.[/quote]
sure, a maxload-test should be done, also. i will add my Quad for this test, then.

[QUOTE=gd_barnes;206109]
#'s 4 & 5 will allow Max or me to start making changes to the Linux client so that we can speed things up a little.

Please don't add any more new features at this point. We want to get this rolled out fairly soon.[/QUOTE]

me, too. there are no more features to add, i think. support for some other, non-standard uses of the script can be added later.

kar_bon 2010-02-20 09:51

can someone please check, if these pairs have been canceled correctly:

[code]
2001/84577
2001/84579
2001/84619
[/code]

i think, those were not canceled! (just found the error).

so try these:
[code]
2001/84686
2001/84689
2001/84709
[/code]

hope these are ok!


please also check whether these results have been submitted correctly:
[code]
2001*2^84543-1 is not prime. LLR Res64: E10FDBF18C77D779 Time : 7.978 sec.
2001*2^84559-1 is not prime. LLR Res64: AAC6FB72FD70DD13 Time : 7.962 sec.
2001*2^84561-1 is not prime. LLR Res64: 6EC9994879661F83 Time : 8.007 sec.
[/code]

i think, the latter is correct as usual.
if the former is correct (crossing fingers) that would be version 1.0!

thanks

mdettweiler 2010-02-20 18:08

[quote=kar_bon;206155]can someone please check, if these pairs have been canceled correctly:

[code]
2001/84577
2001/84579
2001/84619
[/code]

i think, those were not canceled! (find just the error).[/quote]
These are all listed as "in progress" for user kar_bon.

[quote]so try these:
[code]
2001/84686
2001/84689
2001/84709
[/code]

hope these ok![/quote]
The first one (84686) is listed as "in progress" under user kar_bon. The other two are complete, by user MyDog. (Presumably, they were properly canceled, then reassigned.)


[quote]further check, if these pairs have been submitted correctly:
[code]
2001*2^84543-1 is not prime. LLR Res64: E10FDBF18C77D779 Time : 7.978 sec.
2001*2^84559-1 is not prime. LLR Res64: AAC6FB72FD70DD13 Time : 7.962 sec.
2001*2^84561-1 is not prime. LLR Res64: 6EC9994879661F83 Time : 8.007 sec.
[/code]

i think, the latter is correct as usual.
if the former is correct (crossing fingers) that would be version 1.0!

thanks[/quote]
Yes, all three of those results are present in the results file from user kar_bon.

kar_bon 2010-02-20 22:51

@anyone: please check the server if those pairs were reserved by me and canceled properly:
[code]
2001 94893
2001 94899
2001 94901
[/code]

thanks.


All times are UTC.