mersenneforum.org Testing....

2010-02-21, 00:08   #12
gd_barnes

May 2007
Kansas; USA

3×17×223 Posts

Quote:
 Originally Posted by kar_bon
 @anyone: please check the server if those pairs were reserved by me and canceled properly:
 Code:
 2001 94893
 2001 94899
 2001 94901
 thanks.
Here is what I have in the joblist (all times CST U.S. or GMT-6)
2001 94893:
reserved by kar_bon 16:48:41
canceled by kar_bon 16:49:05
reserved by mdettweiler 16:56:59
solved by mdettweiler 16:57:36
(Checked in results; confirmed completed by mdettweiler 16:57:36)

2001 94899:
reserved by kar_bon 16:48:42
canceled by kar_bon 16:49:06
reserved by mdettweiler 16:57:36
(Max still has it for testing as of this moment at ~18:05)

2001 94901:
reserved by kar_bon 16:48:42
canceled by kar_bon 16:49:07
(At ~18:05, pair has not been handed back out yet.)

All 3 pairs are still in the knpairs file. The only one that should have been pruned out at this point is 2001 94893. I'm sure it will get pruned out within the next 1-2 hours.
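
The joblist entries above can be checked mechanically by replaying each pair's events in order. The following is only a sketch: the event tuples are typed in by hand from the listing above, and `final_state` is a hypothetical helper, not part of LLRnet or the server.

```python
def final_state(events):
    """Replay (action, user) events for one pair and return (state, holder).

    Hypothetical helper: raises ValueError if the sequence is inconsistent,
    e.g. a cancel that does not match the current reservation.
    """
    state, holder = "available", None
    for action, user in events:
        if action == "reserved":
            if state != "available":
                raise ValueError(f"pair already {state}")
            state, holder = "reserved", user
        elif action == "canceled":
            if state != "reserved" or holder != user:
                raise ValueError("cancel without matching reservation")
            state, holder = "available", None
        elif action == "solved":
            if state != "reserved" or holder != user:
                raise ValueError("result without matching reservation")
            state, holder = "solved", user
        else:
            raise ValueError(f"unknown action: {action}")
    return state, holder


# Events copied by hand from the joblist listing (timestamps omitted).
joblist = {
    "2001 94893": [("reserved", "kar_bon"), ("canceled", "kar_bon"),
                   ("reserved", "mdettweiler"), ("solved", "mdettweiler")],
    "2001 94899": [("reserved", "kar_bon"), ("canceled", "kar_bon"),
                   ("reserved", "mdettweiler")],
    "2001 94901": [("reserved", "kar_bon"), ("canceled", "kar_bon")],
}
states = {pair: final_state(events) for pair, events in joblist.items()}
```

A pair that ends as `("available", None)` was canceled cleanly and is waiting to be handed out again, which is what the listing shows for 2001 94901.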

As far as I can tell, this looks very good!

Gary

Last fiddled with by gd_barnes on 2010-02-21 at 00:39

2010-02-21, 00:15   #13
kar_bon

Mar 2006
Germany

13×229 Posts

very good! seems, i got it! i reserved those pairs and immediately canceled them, and the timings from the server confirm this, too! i'll clean up some things (additional output and commented lines) and will upload this new version within the next hour for more testing.

2010-02-21, 00:38   #14
gd_barnes

May 2007
Kansas; USA

3·17·223 Posts

Excellent! So I don't have to dig back through 30-40 PMs, can you provide that link to the Windows client again? Will you have the changes for the Linux client (except the script) ready tonight? If not, I understand. If so, can you provide a link for that one too? This is exciting!

2010-02-21, 01:02   #15
kar_bon

Mar 2006
Germany

13·229 Posts

new version V0.6 uploaded, for download see post #1 first link: cancel-option in Win works fine!

Last fiddled with by kar_bon on 2010-02-21 at 01:02

2010-02-21, 02:24   #16
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3·2,083 Posts

Quote:
 Originally Posted by gd_barnes Will you have the changes for the Linux client (except the script) ready tonight? If not, I understand. If so, can you provide a link for that one too?
I've been working on the script throughout the day and have a mostly working version at this point. I should be able to finalize it tonight. BTW, as for the Linux version of the modified client, I believe all the substantial modifications are in the .lua files, right? If so, then all we have to do is swap those in and they'll work with the existing Linux client.

Last fiddled with by mdettweiler on 2010-02-21 at 02:25

2010-02-21, 02:25   #17
kar_bon

Mar 2006
Germany

13×229 Posts

Quote:
 Originally Posted by mdettweiler I've been working on the script throughout the day and have a mostly working version at this point. I should be able to finalize it tonight. BTW, as for the Linux version of the modified client, I believe all the substantial modifications are in the .lua files, right? If so, then all we have to do is swap those in and they'll work with the existing Linux client.
correct!

2010-02-21, 05:43   #18
gd_barnes

May 2007
Kansas; USA

3×17×223 Posts

OK, I did a fairly extensive Windows test with 4 cores on a new I7 with the following conditions:
1. Set the cache to 25 pairs on 4 cores.
2. Ran all 4 cores for 15-20 mins.
3. Hit Ctl-C to stop on all of them.
4. Ran do -c to cancel all of the pairs.
5. Changed cache to 3 pairs on 4 cores.
6. Ran all 4 cores for 5-10 mins.

After I read the thread a little closer and used the do -c command instead of llrnet -c to cancel the pairs, everything looked excellent! I now have just one concern and a nitpick:

Nitpick: When hitting Ctl-C to stop the client, it should stop right away and not ask another question. As it is now, it asks you if you want to with: "Terminate batch job (Y/N)?". I don't believe any other version of LLRnet or PRPnet that I'm aware of does this so I just thought I'd bring it up. It is something we could live with but a real novice user would be somewhat confused by the question.

Concern: When you run do -c, it appears to cancel ALL pairs that are in the cache. This means that pairs already processed and sitting in lresults.txt waiting to be sent to the server appear to be ignored, and that testing will potentially be lost if you don't start the same client up again in the near future. It also opens the door to rejected pairs, either for you or for someone else, since the pairs have been sent back to the server for reprocessing. Am I understanding everything right with the concern?

Excellent work Karsten! I particularly like the primes file in the main directory one level above the clients. Very cool!

Gary

Last fiddled with by gd_barnes on 2010-02-21 at 06:16

2010-02-21, 06:09   #19
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
 Originally Posted by gd_barnes Nitpick: When hitting Ctl-C to stop the client, it should stop right away and not ask another question. As it is now, it asks you if you want to with: "Terminate batch job (Y/N)?". I don't believe any other version of LLRnet or PRPnet that I'm aware of does this so I just thought I'd bring it up. It is something we could live with but a real novice user would be somewhat confused by the question.
That's a built-in function of Windows' batch file processing engine. Sometimes Windows will just go right ahead and not ask, though most of the time it will, and I don't believe there's anything you can do about it.

In other news: I have a mostly working Perl script that has passed a decent amount of testing on my dualcore. (There are still a few more scenarios I'd like to test, though, before posting it, including a test on an actual Linux machine. All the testing I've done so far has been on Windows/Cygwin.)

The only thing that doesn't work is the cancellation mechanism. I tried to implement it from the get-go based on my earlier suggestion to Karsten of generating a tosend.txt file myself, then submitting it with LLRnet, but it wasn't working too well. For whatever reason, it kept messing up big time and instead of canceling a job, it would just throw it out the window and grab a new one. I ended up throwing the whole subroutine out the window and tried to have it just use "llrnet -c" instead as a bare-minimum option, but even that didn't work. It's too late for me to do any more work on it today, but tomorrow I'll try it with Karsten's latest LLRnet modification, which should work that way. I should be able to implement it successfully in a manner similar to how Karsten did it.

P.S.: One thing I've been planning to address (before Gary mentioned it in fact) was his concern about pairs being canceled even if they have results already completed for them. That shouldn't be too hard as long as I can get the actual cancellation working properly.
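
A minimal sketch of that idea: before canceling, split the cached pairs into those that already have a result and those that don't, and only cancel the second group. This is not the actual Perl script. The function name `split_cache` is hypothetical, and the file layout is an assumption made for illustration (one pair per line in workfile.txt, and each line of lresults.txt starting with the same two fields); the real LLRnet formats may differ.

```python
def split_cache(work_lines, result_lines):
    """Return (to_cancel, to_submit) as lists of (k, n) string tuples.

    Assumption: the first two whitespace-separated fields of every line
    identify the pair. Lines with fewer than two fields are ignored.
    """
    def pair(line):
        parts = line.split()
        return (parts[0], parts[1]) if len(parts) >= 2 else None

    # Pairs that already have a completed result waiting to be sent.
    done = {pair(line) for line in result_lines} - {None}

    to_cancel, to_submit = [], []
    for line in work_lines:
        p = pair(line)
        if p is None:
            continue
        (to_submit if p in done else to_cancel).append(p)
    return to_cancel, to_submit


# Example with made-up result text after the pair.
work = ["2001 94893", "2001 94899", "2001 94901"]
results = ["2001 94893 is not prime."]
to_cancel, to_submit = split_cache(work, results)
```

Only the pairs in `to_cancel` would be handed back to the server; the ones in `to_submit` still need their results sent.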

Last fiddled with by mdettweiler on 2010-02-21 at 06:11

2010-02-21, 06:32   #20
gd_barnes

May 2007
Kansas; USA

3×17×223 Posts

Quote:
 Originally Posted by mdettweiler That's a built-in function of Windows' batch file processing engine. Sometimes Windows will just go right ahead and not ask, though most of the time it will, and I don't believe there's anything you can do about it. P.S.: One thing I've been planning to address (before Gary mentioned it in fact) was his concern about pairs being canceled even if they have results already completed for them. That shouldn't be too hard as long as I can get the actual cancellation working properly.
Great on getting close on a working Linux client! :-)

That's typical Windows for you. Forcing something on you that you cannot get around or that is very difficult to get around.

I think the best way to address the straggling results that get canceled issue is to not address them at cancellation at all. They should be addressed when processing is stopped. So if you hit Ctl-C, I think the client should right away send all completed pairs as processed to the server. Logically, it would go something like this:

1. Ctl-C is hit with a response of "Y" subsequently pressed.
2. Pairs in lresults.txt are converted to LLRnet format and sent to the server.
3. Pairs in lresults.txt are moved to lresults_hist.txt and removed from workfile.txt.

If you think about it, that's the way it should happen. If someone stops LLRnet in order to run something else for a few hours or to shut the machine off for the night, LLRnet should send everything already processed, even though the current batch is not done yet.

By doing this, it completely separates the issue from the cancellation of pairs. In effect, it's an issue completely separate from it that can be addressed on its own.
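
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the client's actual code: `flush_completed` and the `send` callback are hypothetical, the conversion to LLRnet format is left to `send`, and it is assumed that every line of lresults.txt and workfile.txt begins with the two fields that identify the pair.

```python
from pathlib import Path


def flush_completed(workdir, send):
    """Send finished results, archive them, and drop them from the work file.

    `send` is a hypothetical callback that takes the list of result lines
    and submits them to the server; if it raises, no file is modified.
    Returns the number of results flushed.
    """
    workdir = Path(workdir)
    results = workdir / "lresults.txt"
    history = workdir / "lresults_hist.txt"
    work = workdir / "workfile.txt"

    if not results.exists():
        return 0
    lines = [l for l in results.read_text().splitlines() if l.strip()]
    if not lines:
        return 0

    send(lines)  # step 2: submit completed results first

    # Step 3a: append the sent results to the history file.
    with history.open("a") as fh:
        fh.writelines(l + "\n" for l in lines)

    # Step 3b: remove the finished pairs from the work file.
    done = {tuple(l.split()[:2]) for l in lines}
    if work.exists():
        remaining = [l for l in work.read_text().splitlines()
                     if l.strip() and tuple(l.split()[:2]) not in done]
        work.write_text("".join(l + "\n" for l in remaining))

    results.write_text("")  # nothing left waiting to be sent
    return len(lines)
```

In a Python-based wrapper this could be called from an `except KeyboardInterrupt:` handler around the main loop, so that stopping the client and canceling pairs stay two separate operations.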

BTW, I could "kind of" fudge my way around the issue although the possibility of rejected pairs would still exist. It would just be very small. Here is what I would do:
1. Stop the client with Ctl-C.
2. Change the cache to 1.
3. Cancel all of the pairs with do -c.
4. Quickly after #3, restart the client.

Almost immediately after restarting in #4, it would send all pairs in lresults.txt to the server, move them to lresults_hist.txt, and request a single new pair immediately after that. I would then stop the client and cancel that one final pair with do -c. Everything got sent, nothing was lost, and all unused pairs were returned to the server. There was only a minuscule chance that there would be a rejected pair somewhere in there. It doesn't take long and is a few hoops to jump through but wasn't bad for 4 cores. But it's a matter of magnitude. I certainly don't want to be doing it on 30-50 cores at any one time so I hope we can automate the sending of fully processed pairs to the server right away when a client is stopped.

Gary

Last fiddled with by gd_barnes on 2010-02-21 at 08:04

2010-02-21, 06:37   #21
MyDogBuster

May 2008
Wilmington, DE

2852₁₀ Posts

I've tested everything locally using my own DB and cannot bust it. Nice work Karsten. I have NO nitpicks. LOL

The only problem I see in the future is if a newer version of LLR is released. That would require changing LUA files and could get messy. I guess as long as you specify what version of LLR the scripts work with, it won't be all that bad.

I especially like the very low amount of output verbiage. I don't have to read War and Peace to find something. It's also neat that the #1 test is offset in the results file showing where a server communication began. Also, that primes.txt file 1 level up is really nice.

5 gold stars for Karsten

Edited: I didn't test the do -c so I'll attack that in the AM. I agree with Gary. If there is a way to process all completed pairs when ctrl-c is hit, then that is the best scenario.

Last fiddled with by MyDogBuster on 2010-02-21 at 06:51

2010-02-21, 07:35   #22
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

1869₁₆ Posts

Quote:
 Originally Posted by MyDogBuster The only problem I see in the future is if a newer version of LLR is released. That would require changing LUA files and could get messy. I guess as long as you specify what version of LLR the scripts work with, it won't be all that bad.
Actually, that shouldn't be a problem at all. The only changes Karsten made to cllr.exe were cosmetic (removing a couple extra output lines); I've verified that the stock version of cllr does indeed work just as well. Thus, when new versions of LLR are released, we should be able to just swap them in without a problem.

@Gary: good idea about sending the results first regardless. I'll have to give some thought to how to implement that. Actually, though, that might be better saved for the 1.0 version. My script should be able to do at least everything that standard LLRnet can do without that feature, and you'd like to have this tested and released before you leave town Tuesday, so it should be OK to get it out like this (especially alongside Karsten's version, which doesn't have that feature either).
