mersenneforum.org Testing....

2010-02-24, 09:07   #89
kar_bon

Mar 2006
Germany

3003₁₀ Posts

Quote:
 Originally Posted by mdettweiler One other consideration that I can see being an issue with this is screen spam; the way LLR does its screen output, for most k*2^n-1 numbers the status line is just longer than the size of a standard console window, so each update rolls over into a new line. (This issue doesn't usually happen for PRP or Proth tests, since the word "bit" is a lot smaller than "iteration", which is enough to make it fit.) One way to fix this is to expand the console window to a larger width, say 90 characters (instead of the standard 80). Or one could just use the GUI LLR which I think has plenty of room to avoid rollovers anyway.
i've not yet tested the LLR-GUI version with this script, because you first have to reduce llr.ini to a minimum set of options:
Code:
PgenInputFile=t17_b2.prp
PgenLine=1
(the test file, and the line to begin at)

seems to work without script.
but: you have to end LLR-GUI by hand!?

the long line in the cLLR output is the result line, like:

7095*2^137576-1 is not prime. LLR Res64: 7E4DD3D5AA6CE447 Time : 15.954 sec.

and this is less than 80 characters only for smaller k-values.
there are two spaces that could be omitted, which would make it fit (but for larger k it is again longer than 80 chars).
so the best way is to set the output window to a width of 100 or so.
in Windows the properties of a CMD window can be saved as the default for all windows (i use that, so no problems)!
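
To make the width argument concrete, here is a minimal sketch that rebuilds a result line in the layout of the samples in this thread and checks it against a console width. The helper names `result_line` and `fits` are made up for illustration, and the two-space separators are an assumption based on the lresults.txt samples; the strict "less than" comparison is a conservative choice, not something LLR itself defines.

```python
# Sketch: does a cLLR-style result line fit in a console of a given width?
# Layout copied from the samples in this thread; names are hypothetical.

def result_line(k: int, n: int, res64: str, seconds: float) -> str:
    """Build a 'not prime' result line with two-space separators (assumed)."""
    return (f"{k}*2^{n}-1 is not prime.  LLR Res64: {res64}"
            f"  Time : {seconds:.3f} sec.")

def fits(line: str, width: int = 80) -> bool:
    """True if the line is strictly shorter than the console width."""
    return len(line) < width

small_k = result_line(7095, 137576, "7E4DD3D5AA6CE447", 15.954)
large_k = result_line(742837095, 137576, "7E4DD3D5AA6CE447", 15.954)

for line in (small_k, large_k):
    # Print the length and whether it fits at widths 80 and 100.
    print(len(line), fits(line, 80), fits(line, 100))
```

With a four-digit k the line stays under 80 characters, while a nine-digit k pushes it over, which is why widening the window to about 100 is the simpler fix.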

2010-02-24, 09:16   #90
kar_bon

Mar 2006
Germany

3·7·11·13 Posts

Quote:
 Originally Posted by gd_barnes Karsten, what about your Windows .awk script? How does it handle it? Will it work correctly if the first pair or pairs in the knpairs.txt file are prime? I don't remember it having a problem with any primes, even ones at the beginning of the file but I may not have been specifically looking for it because I didn't have the cores to stress test it a lot.
i've just tested this:
lresults.txt contains this:
Code:
742837095*2^3190-1 is prime!  Time : 38.325 ms.
742837095*2^3193-1 is not prime.  LLR Res64: 9F1CECDF79EA85C4  Time : 37.998 ms.
742837095*2^3197-1 is not prime.  LLR Res64: 711C29862057740B  Time : 38.217 ms.
742837095*2^3201-1 is not prime.  LLR Res64: 33EB063C96F7756C  Time : 39.356 ms.
742837095*2^3205-1 is prime!  Time : 39.301 ms.
742837095*2^3207-1 is not prime.  LLR Res64: 689F3E5F8EA36E7F  Time : 38.635 ms.
742837095*2^3210-1 is not prime.  LLR Res64: 9876B6D0612180D6  Time : 38.365 ms.
workfile.bak contains this:
5000000000000:M:1:2:258

and do.bat contains this:
gawk -f do_tosend.awk lresults.txt

put the awk-script (do_tosend.awk) and gawk.exe in the same folder and run it.
the resultfile for LLRnet, tosend.txt then contains:
Code:
5000000000000:M:1:2:258 742837095 3190 0 0
5000000000000:M:1:2:258 742837095 3193 -2 9F1CECDF79EA85C4
5000000000000:M:1:2:258 742837095 3197 -2 711C29862057740B
5000000000000:M:1:2:258 742837095 3201 -2 33EB063C96F7756C
5000000000000:M:1:2:258 742837095 3205 0 0
5000000000000:M:1:2:258 742837095 3207 -2 689F3E5F8EA36E7F
5000000000000:M:1:2:258 742837095 3210 -2 9876B6D0612180D6
all is ok, nothing unusual!
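
For readers without gawk, the conversion shown above can be sketched in Python. This is not Karsten's do_tosend.awk, only a re-expression of the input and output samples in this post: the function name `to_send_lines` and the regular expression are assumptions, and lines that match neither pattern are skipped rather than guessed at.

```python
import re

# Matches lresults.txt lines as shown above: "k*2^n-1 is prime!" or
# "k*2^n-1 is not prime.  LLR Res64: <16 hex digits>".
LINE_RE = re.compile(
    r"^(?P<k>\d+)\*2\^(?P<n>\d+)-1 is "
    r"(?:(?P<prime>prime!)|not prime\.\s+LLR Res64: (?P<res>[0-9A-F]{16}))"
)

def to_send_lines(result_lines, work_prefix):
    """Convert LLR result lines into tosend.txt-style lines.

    Primes get result code 0 and residue "0"; composites get -2 and the
    Res64 value, matching the sample output in this post.
    """
    out = []
    for line in result_lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # unrecognised line: skip it instead of guessing
        if m.group("prime"):
            code, residue = "0", "0"
        else:
            code, residue = "-2", m.group("res")
        out.append(f"{work_prefix} {m.group('k')} {m.group('n')} {code} {residue}")
    return out

sample = [
    "742837095*2^3190-1 is prime!  Time : 38.325 ms.",
    "742837095*2^3193-1 is not prime.  LLR Res64: 9F1CECDF79EA85C4  Time : 37.998 ms.",
]
rows = to_send_lines(sample, "5000000000000:M:1:2:258")
for row in rows:
    print(row)
```

The point of the sketch is the prime case: it always writes something ("0") into the residue field, even when the very first line of the file is a prime.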

2010-02-24, 09:41   #91
gd_barnes

"Gary"
May 2007
Overland Park, KS

3³·439 Posts

Very good. So Karsten chose to put a "0" in the residue for a prime and I chose to put a "xxxxxxxxxxxxxxxx" in the residue for a prime. I chose that because I wasn't sure if it was expecting 16 digits.

Karsten, I got the same type of output as you did except with the 16 x's in the residue for primes. I'll try it with just "0" also. If it works, I'll go with that to keep the Windows and Linux clients as close to the same as possible. "0" is technically more correct anyway since the residue really is zero if the number is prime.

Max, this confirms that the blank residue that you have in there doesn't work for primes. It just needs "something" in there.

One note of historical interest on all of this: We actually corrected a long-standing LLRnet bug with this. I confirmed by running the old client that if the first few pairs of the files were primes, the server would never get them because the residue came out as blank in tosend.txt file. Once a composite came through, then all future primes would be good because LLRnet just kept the previous pair's residue. That's pretty poor coding in the original design.

Gary

Last fiddled with by gd_barnes on 2010-02-24 at 09:44

2010-02-24, 10:29   #92
kar_bon

Mar 2006
Germany

3·7·11·13 Posts

Quote:
 Originally Posted by gd_barnes One note of historical interest on all of this: We actually corrected a long-standing LLRnet bug with this. I confirmed by running the old client that if the first few pairs of the files were primes, the server would never get them because the residue came out as blank in tosend.txt file. Once a composite came through, then all future primes would be good because LLRnet just kept the previous pair's residue. That's pretty poor coding in the original design.
i think, i found the reason why!

in llrnet.lua the call is
Code:
 result, residue = primeTest(t, format("%s %s", k, n))
in Lua a function can return more than one result (even a variable number of results is possible) (see here).

so the Lua side expects 2 return values: result and residue.

in the LLRnet source, in llr2.c, the primeTest function is declared as
Code:
int primeTest (const char * type, const char * input, char * residue)
and it ends with

Code:
  strcpy(residue, res64);
  return retval;
so in C only the result is returned, as an integer; the residue is not a second return value but is written back through the char * argument in the parameter list (an output parameter).

perhaps changing the Lua line to
Code:
 result = primeTest(t, format("%s %s", k, n),residue)
could do the job right!

i will test this afternoon.

2010-02-24, 13:14   #93
gd_barnes

"Gary"
May 2007
Overland Park, KS

2E4D₁₆ Posts

I'll be interested to see what you come up with Karsten.

In the mean time, I have confirmed that putting a single "0" (vs. 16 x's) in the residue for primes works on the Linux side. Therefore I have made that an official modification to the Linux do.pl script. In that regard, the Windows and Linux clients are now in sync.

Max, attached is the updated do.pl script. As discussed, I put it at 10000 iterations, beep at true, and individualPrimeLog at false.

I've calculated that a single quad running n=~20K tests is like 1000+ cores running n=~400K tests. Therefore a stress test with a single quad is not only sufficient, it's a lot easier to manage and find bugs from my perspective. I may try a 2nd quad for a while just to "feel like" I'm really stressing the server AFTER I can verify that all small bugs like what I've found so far are fixed.
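
The estimate above can be sanity-checked with a rough cost model. This is a sketch under an assumption that is not stated in the post: that one LLR test at exponent n costs roughly n² · log n (about n squarings, each an FFT multiplication of an n-bit number). Under that model the rate at which a core returns results to the server scales as the inverse of the cost.

```python
import math

def relative_cost(n: float) -> float:
    """Assumed cost model for one LLR test at exponent n: n^2 * log(n)."""
    return n * n * math.log(n)

small_n, large_n = 20_000, 400_000

# How many large-n cores return results as often as one small-n core.
per_core_ratio = relative_cost(large_n) / relative_cost(small_n)
quad_equivalent = 4 * per_core_ratio  # four cores on one quad

print(f"one small-n core ~ {per_core_ratio:.0f} large-n cores")
print(f"one quad         ~ {quad_equivalent:.0f} large-n cores")
```

Under this model a quad at n≈20K loads the server like well over 1000 cores at n≈400K, which is consistent with the "1000+" figure. Real timings also depend on FFT size steps and k, so treat the exact number as indicative only.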

Max, if you want to see what I've been running the last several hours, look at port 9985. There were 180K+ pairs in there when I started it 3-4 hours ago and it should dry out within the next 1-2 hours.

The last thing for me to test is the cancellation of pairs on the Linux client. I'll be fairly busy today (Weds.) but will make time to do that test.

Gary
Attached Files
 do.pl.tar.gz (3.0 KB, 148 views)

Last fiddled with by gd_barnes on 2010-02-24 at 13:15

2010-02-24, 15:37   #94
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

2·5⁵ Posts

Quote:
 Originally Posted by gd_barnes On another topic: Should we use Karsten's client with the "awk" script or Max's client with the Perl script for the public Windows client?
Definitely Karsten's--since mine requires Perl, it would be a pain in the butt for many Windows users to run. I mainly developed it with Linux in mind, but since Perl is rather cross-compatible between the two, I figured that it should work just as well on both and that as such I'd test it on both platforms. As I said in the readme file, it never hurts to have options.
Quote:
 Originally Posted by gd_barnes I'll be interested to see what you come up with Karsten. In the mean time, I have confirmed that putting a single "0" (vs. 16 x's) in the residue for primes works on the Linux side. Therefore I have made that an official modification to the Linux do.pl script. In that regard, the Windows and Linux clients are now in sync. Max, attached is the updated do.pl script. Like discussed, I put it at 10000 iterations, beep at true, and individualPrimeLog at false.
AMAZING! Thanks for fixing that--I should have figured it was something simple like that. This should end up saving me quite a bit of time since I imagine it would have taken me a little while to find that.

I'll get the new version uploaded shortly. (Edit: done)

Quote:
 I've calculated that a single quad running n=~20K tests is like 1000+ cores running n=~400K tests. Therefore a stress test with a single quad is not only sufficient, it's a lot easier to manage and find bugs from my perspective. I may try a 2nd quad for a while just to "feel like" I'm really stressing the server AFTER I can verify that all small bugs like what I've found so far are fixed.
Okay, that's good. Two quads, I'd think, would be good to make sure we account for even the biggest rally (I think our biggest ones so far may have gone over 1000 cores, with you, Lennart, Beyond, IronBits, and quite a number of others all banging away full-steam).

Last fiddled with by mdettweiler on 2010-02-24 at 15:47 Reason: upload done

2010-02-24, 15:47   #95
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

2·5⁵ Posts

Quote:
 Originally Posted by kar_bon i've not yet tested LLR-GUI version with this script because you have to put the llr.ini first to a minimum of the options set: Code: PgenInputFile=t17_b2.prp PgenLine=1 - the testfile - and the line to begin seems to work without script. but: you have to end LLR-GUI by hand!? the long line in the cLLR output is the result like: 7095*2^137576-1 is not prime. LLR Res64: 7E4DD3D5AA6CE447 Time : 15.954 sec. and this is less than 80 characters only for smaller k-values. there're two spaces, which could be omitted, so it's possible to do (but for greater k it's again longer than 80 chars). so the best way is to set the output window to a width of 100 or so. in Win the properties of a CMD-win can be set for all screens (i use it so no problems)!
Ah, whoops, that's not quite what I meant...when I said "or one could just use the GUI LLR" I was referring to it in the more general context of running manual LLR. Not that that's used that commonly nowadays, though...

I had actually intended to try the GUI-LLR version with the script just for the fun of it; I hadn't gotten around to it yet with all the "real" issues to worry about, but I imagine it could still be done if you set NoTrayIcon=0 and NoIcon=1 (I think those are the correct options) in llr.ini to have LLR run without a tray icon, and without displaying any window whatsoever. I think the combination of those two should do the trick; at any rate, I know it worked for Riesel Sieve and PrimeGrid, who had to use the GUI LLR in their BOINC setups prior to the introduction of cllr. Of course, they may well have been using a modified version of the GUI LLR.

Anyway, though, not a particularly big deal...as you said, it's not hard to just make the command window bigger. I imagine a similar "permanent" setting should be possible on Linux as well.

2010-02-24, 15:58   #96
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

2·5⁵ Posts

Hmm, I just discovered something interesting. It turns out you CAN kill the do.pl script when it can't connect to a server. You just have to get it while it's in "sleeping 60 seconds" mode, which it does after every 5 failed tries.

Guys, which do you feel would be a better way to handle this: just have the script try 5 times then exit (like Karsten's script) or keep trying over and over like it does now (so that unattended clients don't go kapooey after a longish temporary outage)?

Or, the best of both worlds: add another option for it! That shouldn't be too hard...

P.S. @Gary: I have a Windows client running do.pl on 9975 now and I just ran into a pair with a small factor. It seemed to handle it pretty well, though--from what I can tell it only rejected the one with the small factor and accepted the rest. I don't have time right now as I'm leaving the house in 15 minutes, but can you check to see if all of these were received (except the small factor of course)?
Code:
2013*2^221147-1 is prime! Time : 73.842 sec.
2015*2^50601-1 has a small factor : 3 !!
2015*2^56660-1 is prime! Time : 3.520 sec.
2015*2^63662-1 is prime! Time : 4.571 sec.
2015*2^104784-1 is prime! Time : 15.064 sec.
Submitted to server at [2010-02-24 10:56:39]

Last fiddled with by mdettweiler on 2010-02-24 at 16:01

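
The two behaviours discussed in post #96 (try 5 times then exit, or sleep 60 seconds and keep trying) can be covered by one loop with an option. The following is a sketch only, not code from do.pl or from Karsten's script: `connect_with_retry`, `try_connect`, and `max_rounds` are hypothetical names, and the sleep function is injectable purely so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional

def connect_with_retry(
    try_connect: Callable[[], bool],
    tries_per_round: int = 5,            # failed tries before sleeping
    sleep_seconds: float = 60.0,         # pause between rounds
    max_rounds: Optional[int] = None,    # None = keep trying forever
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once connected, False if max_rounds is exhausted."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for _ in range(tries_per_round):
            if try_connect():
                return True
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            break  # give up without a final pointless sleep
        sleep(sleep_seconds)
    return False

# Demo: a fake server that refuses the first 7 attempts.
attempts = {"count": 0}

def flaky() -> bool:
    attempts["count"] += 1
    return attempts["count"] > 7

naps = []  # records sleeps instead of actually waiting
ok = connect_with_retry(flaky, sleep=naps.append)
print(ok, attempts["count"], naps)
```

Setting `max_rounds=1` gives the "try 5 times then exit" behaviour, while the default keeps unattended clients alive through a longer outage.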
2010-02-24, 18:24   #97
kar_bon

Mar 2006
Germany

101110111011₂ Posts

Quote:
 Originally Posted by kar_bon i think, i found the reason why! in llrnet.lua the call is Code:  result, residue = primeTest(t, format("%s %s", k, n))
it's not as easy as i thought, but the following lines will do the trick.
Code:
result, residue = primeTest(t, format("%s %s", k, n))
if result == 0 then
  residue = "0"
end
so, if a prime is found, set the residue to '0' and all is ok!

Note: not needed for the script, only for the 'old' version of the LLRnet-client.
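
To illustrate why this small fix is enough, here is a toy model of the behaviour Gary described: a prime leaves the residue blank or stale, and a composite overwrites it. The class `FakeTester` is entirely hypothetical and is not the real primeTest from llr2.c; it only reproduces the observed symptoms so the effect of the fix can be seen.

```python
PRIME, COMPOSITE = 0, -2  # result codes as they appear in tosend.txt

class FakeTester:
    """Hypothetical stand-in for primeTest: the residue buffer is only
    overwritten for composites, so a prime returns whatever was there."""

    def __init__(self):
        self.buffer = ""  # blank until the first composite is seen

    def prime_test(self, is_prime: bool, res64: str = ""):
        if not is_prime:
            self.buffer = res64
            return COMPOSITE, self.buffer
        return PRIME, self.buffer  # blank or stale residue leaks out

def normalise(result: int, residue: str):
    """The fix from this post: force the residue to "0" for a prime."""
    if result == PRIME:
        residue = "0"
    return result, residue

t = FakeTester()
first_prime = t.prime_test(True)                       # blank residue
composite = t.prime_test(False, "9F1CECDF79EA85C4")    # real residue
later_prime = t.prime_test(True)                       # stale residue

for raw in (first_prime, composite, later_prime):
    print(raw, "->", normalise(*raw))
```

In the model, both the blank residue of an early prime and the stale residue of a later prime are replaced by "0", while composites pass through unchanged.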

Last fiddled with by kar_bon on 2010-02-24 at 18:26

2010-02-24, 20:34   #98
gd_barnes

"Gary"
May 2007
Overland Park, KS

3³·439 Posts

Quote:
 Originally Posted by kar_bon it's not so easy as thought, but the following lines will do the trick. Code: result, residue = primeTest(t, format("%s %s", k, n)) if result == 0 then residue = "0" end so, if a prime is found, set the residue to '0' and all is ok! Note: not needed for the script, only for the 'old' version of the LLRnet-client.
Which file in the old client needs to be changed? We should go ahead and get this corrected for users who like to "fiddle" with code and stuff.

My opinion on what to do when the server dries or goes down: Keep trying. Karsten, this would be a modification to your script. I know firsthand how maddening it would be if all of my clients stopped after a small 5-minute internet blip. IMHO, that should not be an option. It should be the default to keep trying.

Max,

It took me a while but I finally concluded what you did: It's much better to "kill" the Linux client with the system manager than it is to do Ctrl-C. There are several times I noticed when it took 3-4 Ctrl-C's to kill it; usually on small tests -or- when the server had dried. (Don't quote me on the exact scenarios but I do know that sometimes it didn't want to "die" on the first Ctrl-C.)

Can you please put something in the documentation about it being best to kill the clients when stopping them?

Karsten,

Can you please modify the Windows script to do the following?:

1. Keep trying to connect when the server is dried or down.

2. Add an option to allow the user to change how often it tries to connect when the server is down. Both of you have that as 60 secs. but the Linux client allows the user to change it. Regardless, I think that is a good default for the value of a variable field that the user can change.

3. To sync up the Windows and Linux clients, would it be too much effort to allow the user an option to set the beep on or off and to put the primes in the current folder or the parent folder?

I think it makes sense to have the 2 clients be as close to the same as possible. The above would accomplish that. Note: I think Karsten probably stopped adding options earlier because we had said that we don't want to add any new features. Sorry we're being a little wishy-washy here Karsten.

One more thing Karsten: There have been so many posts and changes here. Can you provide a link in the next post to your latest client? Sometime today or tomorrow after you make the above changes, I'd like to run a short test on it using 4 cores on my I7.

Gotta run...busy day for me. I'll probably get a little bit of testing in on cancelling pairs on the Linux client before about 5 PM CST.

Thanks everyone.

Gary

Last fiddled with by gd_barnes on 2010-02-24 at 20:44

2010-02-24, 20:48   #99
kar_bon

Mar 2006
Germany

5673₈ Posts

Quote:
 Originally Posted by gd_barnes Which file in the old client needs to be changed? We should go ahead and get this corrected for users who like to "fiddle" with code and stuff.
see post #97 in the first code-block: it's llrnet.lua.

Quote:
 Originally Posted by gd_barnes One more thing Karsten: There have been so many posts and changes here. Can you provide a link in the next post to your latest client?
i'll use the same link as in post #1 for any new version.

i'll try to implement the other options next; not sure if i'll get to all of them today.

one idea: when the script starts, first display the most important settings from llr-clientconfig.txt!

Code:
+-------------------------------------+
| LLRnet client V0.9b7 with cLLR V3.8 |
| K.Bonath, 2010-02-10, Version 0.61  |
+-------------------------------------+

Current configuration:
server = "nplb-gb1.no-ip.org"
port = 9950
WUCacheSize=1
that would have saved some time when running and checking for errors in the first tests with the script (you know: forgot to change my username in your settings).

suggestions?
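
A minimal sketch of the startup summary proposed above, assuming llr-clientconfig.txt consists of simple `key = value` lines like the three shown in the banner. The function names, the list of "important" keys, and the treatment of lines starting with `--` as comments are assumptions for illustration, not the script's actual behaviour.

```python
IMPORTANT_KEYS = ("server", "port", "WUCacheSize")  # assumed selection

def read_settings(text: str) -> dict:
    """Parse simple 'key = value' lines; quotes around values are dropped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, assumed '--' comment lines, and lines without '='.
        if not line or line.startswith("--") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings

def summary(settings: dict) -> str:
    """Format the important settings for display at startup."""
    lines = ["Current configuration:"]
    for key in IMPORTANT_KEYS:
        lines.append(f"{key} = {settings.get(key, '<not set>')}")
    return "\n".join(lines)

sample_config = '''
-- example values taken from the banner in this post
server = "nplb-gb1.no-ip.org"
port = 9950
WUCacheSize=1
'''
parsed = read_settings(sample_config)
print(summary(parsed))
```

Adding the username to the displayed keys would directly catch the mistake mentioned above, provided the config file stores it under a known key.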
