mersenneforum.org  

2012-02-16, 01:54   #749
LaurV
Romulan Interpreter
Jun 2011
Thailand
25BF₁₆ Posts

It is not your batch file. I have seen this behavior too, when an exponent finishes. It did not bother me, apart from the message "could not find a file to resume from" being printed at the end (after the test is done), which I already mentioned. I think there is a mis-synchronization between CPU and GPU: in trying to do things faster, the latter does not wait for the former to finish its part (printing, returning the exit codes), and then things go wild. It does not affect the computation; it is only cosmetic. What did bother me a bit is that I have to edit the batch file (to remove the first row) and restart every time an exponent finishes. Right now I am LL-ing at the LL front, so an exponent takes about 5 days. I can live with doing that editing every 5 days.

2012-02-16, 02:30   #750
flashjh
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts

Quote:
Originally Posted by LaurV
It is not your batch file. I have seen this behavior too, when an exponent finishes. It did not bother me, apart from the message "could not find a file to resume from" being printed at the end (after the test is done), which I already mentioned. I think there is a mis-synchronization between CPU and GPU: in trying to do things faster, the latter does not wait for the former to finish its part (printing, returning the exit codes), and then things go wild. It does not affect the computation; it is only cosmetic. What did bother me a bit is that I have to edit the batch file (to remove the first row) and restart every time an exponent finishes. Right now I am LL-ing at the LL front, so an exponent takes about 5 days. I can live with doing that editing every 5 days.
I will have to switch to LL for the same reason. Right now DCs finish every 24 hours, so I lose a lot of time when I can't be there to get it started again.

Thanks for the info.

2012-02-16, 15:19   #751
kjaget
Jun 2005
201₈ Posts

Quote:
Originally Posted by flashjh
I just finished another exponent with 1.50 and it force closed again. Maybe it's my batch file?
I still think this is related to a problem in the code. I reviewed the 1.50 source just posted, and I believe we need to add a line with

*infp = NULL;

after each of the fclose(*infp) calls at lines 1525, 1587, and possibly 1494 of rw.cu.

Since I'm not running or compiling the code currently I can't be sure, but it was needed before (see the code in http://mersenneforum.org/showpost.ph...&postcount=405). If this isn't the correct fix in the current code it's still a place to start looking - a file was being closed but since the file pointer wasn't being set to NULL other code was trying to use it. Chaos resulted.

If I remember correctly, it didn't always happen: either it happened when restarting from a checkpoint but not from scratch, or vice versa. I wish I could remember which case was the problem, but it's been a while.

If someone can save a checkpoint file that's nearly finished it would make this easier to test.

Last fiddled with by kjaget on 2012-02-16 at 15:28

2012-02-16, 19:06   #752
Brain
Dec 2009
Peine, Germany
101001011₂ Posts

Quote:
Originally Posted by kjaget
I still think this is related to a problem in the code. I reviewed the 1.50 source just posted, and I believe we need to add a line with

*infp = NULL;

after each of the fclose(*infp) calls at lines 1525, 1587, and possibly 1494 of rw.cu.
I feel addressed...

Find attached a zip containing an experimental exe and the modified rw.cu. I have no time for testing... :-(
Attached Files
File Type: zip CUDALucas.1.50.test.cuda4.0.sm_20.WIN64.zip (107.8 KB, 64 views)

2012-02-16, 20:13   #753
flashjh
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts

Quote:
Originally Posted by Brain
I feel addressed...

Find attached a zip containing an experimental exe and the modified rw.cu. I have no time for testing... :-(
I'll test tonight. Can you also compile a CUDA 4.1 / sm_21 build? Thanks.

2012-02-16, 20:16   #754
flashjh
"Jerry"
Nov 2011
Vancouver, WA
463₁₆ Posts

Quote:
Originally Posted by kjaget
If someone can save a checkpoint file that's nearly finished it would make this easier to test.
I can save a file. Who would like a copy?

2012-02-16, 20:40   #755
Brain
Dec 2009
Peine, Germany
331 Posts

Quote:
Originally Posted by flashjh
I'll test tonight. Can you also compile a CUDA 4.1 / sm_21 build? Thanks.
Here it is.
Attached Files
File Type: zip CUDALucas.1.50.test.cuda4.1.sm_21.WIN64.zip (106.9 KB, 79 views)

2012-02-16, 21:34   #756
nucleon
Mar 2003
Melbourne
5·103 Posts

My scripts for invoking CUDALucas:

Start with a file 'exp' (list of candidates to check):
Code:
25431071
27961807
25019459
Invoking command to 'kick' things off:
Code:
tail -n 1 -F exp | ./runme.sh
All that does is do a permanent 'tail' on the exponents list. You can add additional exponents like this:

Code:
echo 25888663 >> exp
You can do this even while running a previous exponent or even if the last exponent has stopped. With the capital 'F' option, you can even delete the exp file if it gets too cumbersome. Tail will re-read the new file as long as it has the same name.

runme.sh:
Code:
#!/bin/bash

# Read exponents from stdin (fed by tail) and test them one at a time.
while read -r line
 do
  echo "Starting M$line"
  ./start.sh "$line"
 done
Basically, it takes an exponent from tail and kicks off this script (start.sh):

Code:
#!/bin/bash
# limit.gpu holds the GPU device number; $1 is the exponent.
GPU=$(cat limit.gpu)
./CUDALucas.cuda3.2.sm_13.WIN64.exe -c10000 -D"$GPU" -t "$1"
This is where you set the options. Note: There's also a file called limit.gpu which sets the gpu number.

To make it more extensible, you could do something like:

Code:
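#!/bin/bash
# limit.options holds the command-line options; $1 is the exponent.
OPTIONS=$(cat limit.options)
# $OPTIONS is deliberately unquoted so it splits into separate options.
./CUDALucas.cuda3.2.sm_13.WIN64.exe $OPTIONS "$1"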
This would completely abstract the options, so you could change them for the next exponent without stopping the process. Note: I haven't tested this.
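
Editor's sketch of the options-file idea, untested against CUDALucas itself. It assumes limit.options holds whitespace-separated flags on one line; `run_one` and the `RUNNER` variable are hypothetical names, and `RUNNER` defaults to a harmless `echo` so the sketch runs without the binary.

```shell
#!/bin/bash
# Sketch only: run_one and RUNNER are illustrative names, not part of CUDALucas.

# Command to launch; defaults to echo so nothing is actually started.
RUNNER=${RUNNER:-"echo run:"}

run_one() {
  # $1 = exponent. The options file is re-read on every call, so an edit
  # to limit.options takes effect for the next exponent.
  options=""
  if [ -r limit.options ]; then
    options=$(cat limit.options)
  fi
  # $RUNNER and $options are unquoted on purpose so they split into words.
  $RUNNER $options "$1"
}
```

Calling `run_one 25431071` with `-c10000 -t` in limit.options would print the command line that would be used; setting `RUNNER` to the real executable path is left to the reader.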

Pros:
You can add exponents any time. No need to ctrl-c to redo a batch file or stop the code in any way. It keeps on going as long as you have sufficient exponents in the 'exp' file.

Cons:
A little complicated. Maybe hard to understand.
Yes it's a hack - I didn't say it's pretty.
I'm not a coder, so the coding style may offend :) (My apologies)
I'm guessing it could work under linux.
Under Windows it uses cygwin tools, so you need to install that.
No warranty whatsoever. I'm sorry, I can't support this if it doesn't work for you. I provide this as information only.

-- Craig

2012-02-16, 21:52   #757
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts

I bet those who are familiar with cmd.exe can put something together for Windows that is similar, so that it doesn't necessitate cygwin. The only major drawback I see is that you'd spawn a child process for each script, though they would be idle and shouldn't affect performance in any way, but if you watch your process list carefully, it'll be annoying.

Perhaps an alternate runme.sh:
Code:
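#!/bin/bash

# Command substitution is needed here; single quotes would store the literal text.
FLAGS=$(cat [insert options file here, relative to above directory])

while read -r line
 do
  echo "Starting M$line"
  # $FLAGS is deliberately unquoted so it splits into separate options.
  ./CUDALucas.[blah blah insert proper name].exe $FLAGS "$line"
 done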
This file would be invoked with the same tail command (I don't know how that works, can't provide support there)
Code:
tail -n 1 -F exp | ./runme.sh
And add exponents to the file as described above.

Edit: When I tried that tail command with the example exp file, it printed the last line, not the first line. That means the first line of exp will never get tested unless you let the exp file get really low. Also, how do you remove lines once a test is done? And now that I think about it, I don't understand how getting to the end of the while loop will cause the tail command to run again and input another exponent.
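
Editor's sketch addressing the questions above, assuming GNU tail. With `-n 1` output starts at the last line of the file, while `-n +1` starts at the first. The tail command is not run again for each exponent: with `-F` it keeps running, and `read` simply waits on the pipe until tail passes along a newly appended line. The demo below omits `-F` so that it terminates; `process_queue` is a hypothetical stand-in for runme.sh.

```shell
#!/bin/bash
# Sketch only: process_queue is an illustrative stand-in for runme.sh.

process_queue() {
  # read returns one line per call and waits while the pipe is open but
  # empty; the loop ends only when the writer (tail) exits.
  while read -r line; do
    echo "Starting M$line"
  done
}

# Demo on a throwaway copy of the example list.
exp_file=$(mktemp)
printf '%s\n' 25431071 27961807 25019459 > "$exp_file"

# -n 1 starts at the LAST line, so the earlier exponents are skipped:
tail -n 1 "$exp_file" | process_queue

# -n +1 starts at the FIRST line, so the whole list is processed:
tail -n +1 "$exp_file" | process_queue

# For a live queue, add -F so tail stays open and follows appended lines:
#   tail -n +1 -F exp | ./runme.sh
rm -f "$exp_file"
```

One consequence of `-n +1` is that restarting the pipeline replays the whole file, so finished exponents would have to be removed from exp by hand before a restart.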

Last fiddled with by Dubslow on 2012-02-16 at 22:02

2012-02-17, 02:43   #758
flashjh
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts

Quote:
Originally Posted by Brain
Here it is.
Works great for batch file, have to test further.

It tested successfully:

Code:
 
M( 216091 )P, n = 524288, CUDALucas v1.50 
M( 216103 )C, 0xd27223d7dbf3febf, n = 524288, CUDALucas v1.50
Thanks.

2012-02-17, 11:39   #759
Brain
Dec 2009
Peine, Germany
331 Posts
CUDALucasWatchdog

Quote:
Originally Posted by flashjh
Works great for batch file, have to test further.
So the code change did have an effect?

New topic:
I'm currently developing a proof-of-concept CUDALucasWatchdog, just out of curiosity. It is written in Java and does the following:
1) Check via "taskmgr.exe" / "ps -e" if CUDALucas is running.
2) If not, go to worktodo.txt (new) and grab the top assignment line.
3) Check in mersarch.txt if this exponent has ever been finished.
4a) If yes, delete the assignment line in worktodo.txt and go to 2).
4b) If no, launch CUDALucas via a command-line call the old-fashioned way.
5) Quit
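
Editor's sketch of the steps above as a shell script rather than Java; it is not Brain's implementation. Assumptions not stated in the thread: worktodo.txt holds one exponent per line, mersarch.txt holds result lines of the form "M( 216091 )P, ...", and the process name contains "CUDALucas". All function names and the `LAUNCH` variable are hypothetical, and `LAUNCH` defaults to an `echo` so nothing is started.

```shell
#!/bin/bash
# Sketch only; file formats and names below are assumptions, see lead-in.

WORKTODO=${WORKTODO:-worktodo.txt}
RESULTS=${RESULTS:-mersarch.txt}
LAUNCH=${LAUNCH:-"echo would launch CUDALucas with"}

is_running() {            # step 1
  ps -e | grep -q '[C]UDALucas'
}

is_finished() {           # step 3: $1 = exponent
  [ -r "$RESULTS" ] && grep -q -F "M( $1 )" "$RESULTS"
}

drop_top_line() {         # step 4a: rewrite the file without its first line
  tail -n +2 "$WORKTODO" > "$WORKTODO.tmp" && mv "$WORKTODO.tmp" "$WORKTODO"
}

watchdog() {
  if is_running; then return 0; fi
  while [ -s "$WORKTODO" ]; do
    exponent=$(head -n 1 "$WORKTODO")            # step 2
    if [ -z "$exponent" ] || is_finished "$exponent"; then
      drop_top_line                               # step 4a, back to step 2
    else
      $LAUNCH "$exponent"                         # step 4b
      return 0                                    # step 5
    fi
  done
}
```

Running `watchdog` from cron or a scheduled task would give the periodic execution; note that a script whose own name contains "CUDALucas" would match the process check in `is_running`.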

This program would have to be executed periodically by the user/system, the way the Perl submit spider works.

I'm not sure yet if I will ever publish the code, but I would like to know whether a Cygwin-based solution or a Java solution would be preferred. I have no pros and cons yet. CUDALucasWatchdog would be designed to run on Windows, Linux (and Mac).
Example call:
Code:
java -jar CUDALucasWatchdog "F:\Computing\CUDALucas\CUDALucas.x.y.z.exe"

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.