mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

apsen 2011-08-25 12:08

[QUOTE=KingKurly;270061]Let me know if I can provide further help; ... :smile:[/QUOTE]

Hmm... when I created a new file it worked. The difference is the owner (this is under Cygwin, BTW). But if I put my own key in id_rsa I can still connect to my own machine (even if the owner remains Administrators). It looks like the remote server's settings can influence that...

[CODE]
~/.ssh# ls -l
total 21
-rwx------ 1 apsen root 1706 Aug 25 07:41 bzr_rsa
...
-rwx------+ 2 Administrators root 1679 Aug 25 07:49 id_rsa
...

~/.ssh# cmp id_rsa bzr_rsa

~/.ssh# md5sum bzr_rsa
58e8aa27f6f8b2aad21e5ec3d36e8ecb *bzr_rsa

~/.ssh# ssh -i id_rsa bzruser@utila.eocys.com
bzruser@utila.eocys.com's password:


~/.ssh# ssh -i bzr_rsa bzruser@utila.eocys.com
PTY allocation request failed on channel 0

[/CODE]

The only difference between my server and the one above that is visible under -vvv is that they seem to use different key exchange methods (Diffie-Hellman group exchange vs. ECDH):


[CODE]
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
[/CODE]

vs.

[CODE]
debug1: sending SSH2_MSG_KEX_ECDH_INIT
[/CODE]
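
The workaround described at the top of this post (recreating the key as a new file so that it is owned by the current user) can be sketched roughly as below. This is only a sketch under the assumption that file ownership or the extra ACL is what makes the client skip the key; `recreate_key` is a hypothetical helper name, not part of OpenSSH, and the file names in the usage comment are simply the ones from the listing.

```shell
#!/bin/sh
# Sketch: recreate a private key as a fresh file owned by the current user.
# "recreate_key" is a hypothetical helper, not part of OpenSSH.
recreate_key() {
    src=$1
    dst=$2
    # Writing through a redirection creates a new file owned by whoever runs this.
    ( umask 077 && cat "$src" > "$dst" ) || return 1
    # Restrict the copy to owner read/write only.
    chmod 600 "$dst" || return 1
    # cmp -s is silent; exit status 0 means the contents are identical.
    cmp -s "$src" "$dst"
}

# Example (file names taken from the listing above):
# recreate_key ~/.ssh/id_rsa ~/.ssh/bzr_rsa && ls -l ~/.ssh/bzr_rsa
```

The copy keeps the key material byte for byte, so any difference in behaviour afterwards should come from the file's metadata rather than its contents.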

monst 2011-08-25 13:33

I just set up another machine with a GPU in it. I'm trying to run a double check and am getting the following message:

device number >= device_count ... exiting

What is going on here and what can I do about it?

Thanks, in advance, for your help.

-- Rich

apsen 2011-08-25 13:38

[QUOTE=Ethan (EO);269958]Sure -- I think what you need to do is manually create the .ssh directory, and then inside that create a file named `id_rsa' and paste the whole private key block above into it. Then try reconnecting with the bzr client, which should be able to use the key to connect. Let me know if that works![/QUOTE]

Now I'm able to connect via ssh, but bzr still prompts for a password...

Ethan (EO) 2011-08-25 20:37

[QUOTE=apsen;270045]Let's do it step by step:

First of all I should be able to:

[CODE]ssh bzruser@utila.eocys.com[/CODE]

Correct?

I cannot - I'm being prompted for password (private key is being offered but falls back to password).[/QUOTE]

No -- bzruser can only execute bzr commands, and cannot get a pty or a shell. Sorry that I don't have more time to look at this right now but I will later this evening :)

msft 2011-08-26 05:58

Hi, monst
[QUOTE=monst;270078]device number >= device_count ... exiting
[/QUOTE]
Can you run another CUDA program (mfaktc, etc.)?

monst 2011-08-26 12:42

I see the problem. The CUDA drivers were at version 3.2. This permitted mfaktc to run but not CUDALucas. I updated to version 4.0 and now both programs run successfully. Thanks for your help.
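
For anyone who hits the same message: it reads like a bounds check of the selected device number against the number of devices the CUDA runtime reports. That is an assumption based on the message text alone, not something taken from the CUDALucas source. Under that assumption, a runtime that finds no usable device (for example because the driver is too old for it) would reject even device 0. A minimal sketch of such a check, with the hypothetical function name `check_device`:

```shell
#!/bin/sh
# Hypothetical re-creation of the kind of check that could print the message.
# Usage: check_device DEVICE_NUMBER DEVICE_COUNT
check_device() {
    device_number=$1
    device_count=$2
    # Valid device numbers run from 0 to device_count - 1.
    if [ "$device_number" -ge "$device_count" ]; then
        echo "device number >= device_count ... exiting"
        return 1
    fi
    echo "using device $device_number of $device_count"
}

# With zero usable devices, even device 0 is rejected:
# check_device 0 0
# With one usable device, device 0 is accepted:
# check_device 0 1
```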

Brain 2011-08-26 18:12

Testing
 
[QUOTE=Ethan (EO);268927]As a side effect, this build is _only_ for compute capability 2.0 devices, because I changed the block size for the transpose kernels to 32 from 16; I think that will overrun shared memory in sm<=1.3 or may fail silently and mess up your run. I've tested this build against 216091, but haven't tested it at larger fft sizes -- so please test! :)[/QUOTE]

I've run 3 successful 2M double checks on my GTX 560 Ti with the 280-series driver and CUDA 4.x libraries. Now starting 4M first-time tests.

Brain 2011-08-27 11:16

Ethan's 1.3 Alpha
 
Would it be possible to insert an artificial wait loop where our extraneous cudaMemcpy used to be? My PC has become very laggy since that was removed... I'm thinking of a command line parameter like "throttle" or "sleep time", maybe in microseconds: "-sleep 100" --> sleep for 0.1 ms after every iteration.

I'm running mfaktc concurrently and CUDALucas now seems to be very "egoistic". I'd like to enforce some free GPU cycles, especially for video playback...

apsen 2011-08-28 09:41

[QUOTE=Ethan (EO);268927]
As a bonus, this build is about 10% faster on my setup (280 series drivers, GTX 470)!:
[/QUOTE]

I've updated the drivers to the 280 series and now Ethan's CUDA 4.0 version is faster than my CUDA 3.2 build. The difference is about 3% on a GTX 465.

Christenson 2011-08-28 14:12

I'm having some trouble getting CudaLucas re-started. The machine it's on crashes reliably, about every week, whether for students that use it, Windows issues, or for minor power glitches that its less heavily loaded brethren can ignore.

Last week, I restarted it, and things went fine...or so it seemed, but soon after I left the building, the whole machine crashed. (It was running one instance of mfaktc, one of P95, and one of cudaLucas on its 4 cores, Win64, and GTX480). I base this assertion on the minimal progress (mfaktc's last class was still in the 3500 range on a billion digit number to 2^84).

This week, when I get there, I run the command line in batch mode as before, and don't get a window. The t25xxxxxx and c25xxxxxx files start with one of them at 0 bytes, the other at 16M or so. I check with Task Manager: the process isn't there. I run the batch file again, and the window lasts two or three seconds on the taskbar, not long enough for me to see what it says, and then it disappears.

1) How would I decide if cudaLucas had finished its exponent?
2) How would I tell the Windows 7 command processor to hold the window open after the process exited? (Or do I need to simply run without creating a window and see what I get?)
3) What's in the t25xxxxxx and c25xxxxxx files? Can I extract the residue to report to Primenet from there?

As a suggestion to the developers, when a test finishes, write a line to "results.txt" like mfaktc does...with everything that needs to be reported to primenet and a few words to help the silly humans like me that look at it. If you don't do it, I will eventually contribute it, but you may not want to wait for me.

apsen 2011-08-31 13:00

[QUOTE=Christenson;270241]
1) How would I decide if cudaLucas had finished its exponent?
2) How would I tell the Windows 7 command processor to hold the window open after the process exited? (Or do I need to simply run without creating a window and see what I get?)
3) What's in the t25xxxxxx and c25xxxxxx files? Can I extract the residue to report to Primenet from there?
[/QUOTE]

When cudaLucas finishes an exponent it writes the result to mersarch.txt. The line from that file is enough to report to primenet. Also once the exponent is done there should be no tXXX or cXXX files.

I see from reading the program's help that it should be possible to specify the file(s) to duplicate output to. But I haven't had any luck with it and haven't yet found enough incentive to spend time looking into it. I usually just pipe the output through (I have Cygwin) [CODE]tee -a logfile[/CODE]
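
As a rough sketch of that piping, the wrapper below runs any command and appends everything it prints to a log file while still showing it on the console. `run_logged` is a made-up helper name and the command in the usage comment is a placeholder, not a real CUDALucas invocation.

```shell
#!/bin/sh
# Sketch: run a command and append its output to a log file via tee.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    logfile=$1
    shift
    # 2>&1 merges stderr into stdout so error messages reach the log too;
    # tee -a appends to the log instead of overwriting it.
    "$@" 2>&1 | tee -a "$logfile"
}

# Example with a placeholder command:
# run_logged logfile echo "iteration output"
```

Because the log survives after the window closes, this also keeps whatever message the program printed just before it exited.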

In my experience cudaLucas crashes either because it's out of memory or because the driver has been reset by Windows. Actually I should not say crashes - it exits quite gracefully, giving you an error message.

I'm not sure what exactly is in the t/c files, but they are just backups of the current state, and if you look at the code it should be possible to figure out how to extract the intermediate residue from them.


All times are UTC.