mersenneforum.org

saeres 2015-10-23 16:02

CUDALucas not fast on my slow hardware
 
I have CUDALucas set up on my desktop running a 100-million-digit LL test with FFT 19208 (auto-selected). I know GPUs are faster than CPUs, but my GTX 550 Ti is only getting 125 ms/it. Is there a setting I need to change to get the full benefit of the GPU?

GPU: GTX 550 Ti
900 MHz clock speed (even a minimal increase crashes CUDALucas)
Running CUDALucas 6.5

kladner 2015-10-23 16:45

Before addressing speed, may I ask whether you have run the CUDALucas self-tests (-r 0 and -r 1) successfully? You are embarking on a very long task. If you have not already done so, it would be a very good idea to run a few double-check (DC) LL tests to make sure your hardware is giving good results.

Even then, as I am sure you will hear from others, the chances of errors are substantial on such a run. From personal experience with nVidia cards, I would strongly suggest underclocking your GPU memory, if not the chip itself. The self-tests are good guides in this regard, but successful DC tests will be the real proof of basic stability.

EDIT: Oh! Welcome to the forum, too!:smile:

fivemack 2015-10-23 16:48

125 ms per iteration isn't actually all that bad. Hundred-million-digit LL jobs are very large, and the GPU you have is not very fast as GPUs go: one core of an i7-4770K gets about 12 GHz-days per day, while the GTX 550 Ti in [url]http://www.mersenne.ca/cudalucas.php?model=18[/url] gets about 11.2.

On a hundred-megadigit number, one core of a 2.6 GHz Ivy Bridge does about 200 ms/it, so you're a bit faster than that. I think you are getting about the right speed.
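
For a sense of scale, here is a rough back-of-envelope sketch in Python. It assumes an LL test needs about as many iterations as the exponent (roughly 332 million for a 100M-digit candidate) and that the per-iteration time stays constant for the whole run. The function name and the rounded iteration count are made up for illustration.

```python
# Rough wall-clock estimate for an LL test at a fixed per-iteration time.
# Assumption: the iteration count is about equal to the exponent, and
# ~332 million is an approximate exponent size for 100 million digits.

def ll_runtime_days(iterations: int, ms_per_iteration: float) -> float:
    """Return the estimated run time in days."""
    seconds = iterations * ms_per_iteration / 1000.0
    return seconds / 86400.0


APPROX_ITERATIONS = 332_000_000  # assumed round figure, not a specific assignment

for label, ms in [("GTX 550 Ti at 125 ms/it", 125.0),
                  ("one Ivy Bridge core at 200 ms/it", 200.0)]:
    days = ll_runtime_days(APPROX_ITERATIONS, ms)
    print(f"{label}: about {days:.0f} days")
```

Under those assumptions, either option is a run of well over a year, which is why the stability checks matter so much.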

saeres 2015-10-23 17:58

I haven't run a self-test; I haven't found anything on that. However, I have been comparing the CUDALucas reports to the Prime95 reports and, as stated, I'm getting 125 ms/it in CUDALucas whereas I'm getting 8 ms/it in Prime95. (both LL 100M...)

VBCurtis 2015-10-23 18:03

You're getting 8ms/it in Prime95.... for what exponent? For what FFT size? Compare apples to apples, sir.

Or run a test on your GPU with the same-size exponent you're doing in Prime95, and compare those speeds.

saeres 2015-10-23 19:41

Sorry, I'm not at my desktop right now, so I can't get the specific numbers. However, the number I'm running in CUDALucas is the same one I had running in Prime95, and that's what I'm basing the comparison on. Same 100M-digit number, and Prime95 is over 15x faster. This is based on stage 1, though I don't know whether that has any influence.

I believe the FFT for both is 19208, but I will verify when I get home.

blip 2015-10-23 19:46

Stage 1 is P-1 testing, not LL. So you cannot compare that, as they are completely different beasts.

saeres 2015-10-23 20:37

Okay, so the number I'm working on is M332213083. The GIMPS assignments page lists it as an LL test, acquired by selecting 100M-digit work. Both Prime95 and CUDALucas use the 19208 FFT for it. I'm new to this, but what I gather from your previous message is that starting the work in Prime95 forces it into P-1 first, whereas CUDALucas starts LL testing immediately. Is this correct, or am I missing something?

blip 2015-10-23 21:20

As you can see [URL="http://www.mersenne.org/report_exponent/?exp_lo=332213083&full=1"]here[/URL], no P-1 has been done on that exponent yet, so I assume your worktodo.txt reads
[CODE]Test=blabla,332213083,77,0[/CODE].
The "0" at the end instructs mprime to do P-1 before LL. Change to "1" to let it run LL, and then look at the timings again. Stop after a while and change back to "0" (P-1 should be done before LL).
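
To make the field layout concrete, here is a small Python sketch that splits a Test= line like the one above and flips its last field. It assumes the four comma-separated fields are assignment ID, exponent, trial-factoring depth in bits, and a P-1-done flag. The helper names are hypothetical, and none of this is part of mprime or CUDALucas.

```python
# Sketch: parse a "Test=" worktodo entry and toggle its final flag.
# Assumed field order: assignment id, exponent, factored-to bits, P-1 done (0/1).

def parse_test_line(line: str) -> dict:
    """Split a Test= entry into its four assumed fields."""
    key, _, rest = line.strip().partition("=")
    if key != "Test":
        raise ValueError("not a Test= entry")
    aid, exponent, bits, pm1_done = rest.split(",")
    return {
        "aid": aid,
        "exponent": int(exponent),
        "bits": int(bits),
        "pm1_done": int(pm1_done),
    }


def with_pm1_flag(line: str, done: bool) -> str:
    """Return the same entry with the last field set to 1 (done) or 0."""
    f = parse_test_line(line)
    return f"Test={f['aid']},{f['exponent']},{f['bits']},{int(done)}"


entry = "Test=blabla,332213083,77,0"
print(parse_test_line(entry))
print(with_pm1_flag(entry, True))   # skip P-1 for a timing run
print(with_pm1_flag(entry, False))  # restore so P-1 runs before LL
```

Editing the file by hand works just as well; the sketch only shows which field changes.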

saeres 2015-10-23 21:22

Awesome, Thanks for all the help!

Is there any way to start P-1 in CUDALucas, or do I need to run it through Prime95 first?

blip 2015-10-23 21:24

You could also run P-1 on the GPU with [URL="http://sourceforge.net/projects/cudapm1/"]CUDAPm1[/URL] to compare its timings with mprime :smile:

