mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   The P-1 factoring CUDA program (https://www.mersenneforum.org/showthread.php?t=17835)

owftheevil 2013-08-17 18:20

1 Attachment(s)
Many thanks to frmky, here's a 64-bit Windows build of CUDAPm1, built with CUDA toolkit 5.0. I have tested this very little, but it seems to be working OK.

kladner 2013-08-17 18:52

[QUOTE=owftheevil;349949]Many thanks to frmky, here's a 64bit windows build of CUDAPm1, using CUDA toolkit 5.0. I have tested this very little, but seems to be working OK.[/QUOTE]

OMG! Wow! Thanks to both of you! :grin:

frmky 2013-08-17 20:47

[QUOTE=nucleon;349939]GTX780 is Titan-lite. GTX780 is a different chip to GTX770.

If you can't get a GTX5x0, and a Titan is out of your price range then GTX780 is your better bet. Of course factoring in budget constraints.
[/QUOTE]

The DP performance of the GTX 780 has been cut to the level of the rest of the GTX 7xx line, so for DP compute it is really no different from the earlier chips. A GTX 580 should still give better performance at a much lower price.

James Heinrich 2013-08-17 20:52

[QUOTE=frmky;349965]The DP performance of the GTX 780 has been cut to GTX 7xx levels, so for DP compute it is really no different than the earlier chip. A GTX 580 should still give better performance at a much lower price.[/QUOTE]According to [url=http://www.mersenne.ca/cudalucas.php]benchmark data[/url] I have for CUDAlucas, the GTX 780 is still slightly ahead of the GTX 580, by roughly 5%.
I'm not sure how relative performance varies between CUDAlucas and CUDAPm1.

kladner 2013-08-17 20:57

[QUOTE=James Heinrich;349966]According to [URL="http://www.mersenne.ca/cudalucas.php"]benchmark data[/URL] I have for CUDAlucas, the GTX 780 is still slightly ahead of the GTX 580 by roughly 5%
I'm not sure how relative performance varies between CUDAlucas and CUDAPm1.[/QUOTE]

Too bad it also uses ~2.5% more power, too. I'd say this gives the edge to the 580 because of its lower price.

nucleon 2013-08-17 23:46

If you are after DP* result throughput efficiency, your best bet is to skip GPUs and buy multiple low-clocked quad-core machines with high-clocked RAM.

Capex might be higher, but opex is lower for a given throughput.

-- Craig
*I stress DP. TF - GPUs blow CPUs out of the water.

owftheevil 2013-08-18 02:11

[QUOTE=James Heinrich;349966]According to [URL="http://www.mersenne.ca/cudalucas.php"]benchmark data[/URL] I have for CUDAlucas, the GTX 780 is still slightly ahead of the GTX 580 by roughly 5%
I'm not sure how relative performance varies between CUDAlucas and CUDAPm1.[/QUOTE]

I haven't tested this very thoroughly yet, but it seems that on cards with smaller amounts of memory, e.g. a 560 with ~1GB, CUDALucas and CUDAPm1 have about the same throughput, whereas on a card with 6GB of memory, throughput for CUDAPm1 is about 15% greater than for CUDALucas.

kladner 2013-08-18 04:29

[QUOTE=kladner;349968]Too bad it [U]also[/U] uses ~2.5% more power, [U]too[/U]. [/QUOTE]

Brought to you by the Department of Redundancy Department. :blush:

Karl M Johnson 2013-08-18 08:47

[QUOTE=owftheevil;349949]Many thanks to frmky, here's a 64bit windows build of CUDAPm1, using CUDA toolkit 5.0. I have tested this very little, but seems to be working OK.[/QUOTE]
Thank you for the new binary.
I see some changes (like full S1 and S2 checkpoints) from the old one I've had (dated 06 May 2013).

Owners of the defective Titan may run CUDAPm1/CUDALucas on Windows with a batch file like this:[CODE]:start
CUDAPm1 [flags if not using ini file]
goto :start[/CODE]Whenever CUDAPm1 quits due to the vRAM being unstable, it will launch again and restart from the latest checkpoint.
For this to work effectively, I suggest setting the checkpoint interval to a thousand iterations, so checkpoints are written every couple of seconds, and running CUDAPm1 from a RAM disk so that the frequent checkpoints don't wear out your storage media.

One drawback of this method is that the loop never exits, even when there are no tasks left in the worktodo file.
Another is the volatile nature of RAM disks: if your system crashes or reboots, you lose all the checkpointed work.

Comments are welcome:smile:

owftheevil 2013-08-19 18:22

With the latest drivers, 326.41 for Windows and 325.15 for Linux, the unstable memory problem (if that's what it was) is fixed. There is still a driver bug that causes the FFTs to hang occasionally. It's been reported and is, I presume, being worked on. This bug affects all cards, not just the Titans.

I've been doing something similar to what you suggested, but instead looping only on a non-zero exit value. That way ^C still exits the program. I also don't think setting the checkpoint interval so low is necessary: you will lose as much time doing the extra checkpoints as you gain by having a more recent checkpoint when it dies.
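For the Linux side, the restart-only-on-failure idea can be sketched in POSIX shell like this. This is just an illustration, not owftheevil's actual script; `run_until_clean` and the command name passed to it are hypothetical, and CUDAPm1's real exit codes may differ:

```shell
#!/bin/sh
# Sketch of the restart-on-failure loop: rerun the worker only while it
# exits with a non-zero status, so a clean finish breaks out of the loop.
# "$@" stands in for the real CUDAPm1 command line (a hypothetical name here).
run_until_clean() {
    until "$@"; do
        echo "worker exited non-zero; restarting from last checkpoint" >&2
    done
}
```

On Windows the same effect can be had in batch by replacing the unconditional `goto :start` with a conditional `if errorlevel 1 goto start`, so that a zero exit (or answering yes to the ^C "Terminate batch job?" prompt) ends the loop.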

kladner 2013-08-19 22:55

[QUOTE=owftheevil;350145]With the latest drivers, 326.41 for windows and 325.15 for linux, the unstable memory problem (if that's what it was) is fixed. There is still a bug with the driver that causes the ffts to hang occasionally. Its been reported and I presume being worked on. This bug affects all cards, not just the titans.
[/QUOTE]

Does this affect only other [U]recent[/U] cards, i.e. the 600 and 700 series, or does it extend back to the 500s and 400s? I would love to find out that my 570 can actually run at stock RAM clock.


All times are UTC.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.