mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Riesel Prime Search (https://www.mersenneforum.org/forumdisplay.php?f=59)
-   -   LLRcuda (https://www.mersenneforum.org/showthread.php?t=17069)

shanecruise 2012-08-10 05:26

LLRcuda
 
Is anyone working with LLRcuda? Or does anyone know about GPU-enabled prime searching?
I need help.


(OpEd: Moved from the "Post lots of primes" thread)

pinhodecarlos 2012-08-10 06:06

[QUOTE=shanecruise;307547]thanks a lot kosmaj :)

Is anyone working with LLRcuda? Or does anyone know about GPU-enabled prime searching?
I need help.[/QUOTE]

[URL="http://www.mersenneforum.org/member.php?u=1636"]VBCurtis[/URL] is.

Pages that might be of interest:

[URL]http://www.bc-team.org/downloads.php?cat=7[/URL]
[URL]http://primegrid.pytalhost.net/Mirror.htm[/URL]

Edit: Sorry for going off-topic.

VBCurtis 2012-08-11 02:35

[QUOTE=shanecruise;307547]thanks a lot kosmaj :)

Is anyone working with LLRcuda? Or does anyone know about GPU-enabled prime searching?
I need help.[/QUOTE]

Shane-
Indeed, I am. See the k=5 thread for my results with LLRcuda. Create a thread, or PM me, if you have any questions. Threads are wise for things others might want answers to!

In short, it's great for specific n-ranges just below FFT jumps; but there are few FFT sizes, so wide ranges of n are inefficient. Also, the bigger the n, the more efficient CUDA is compared to CPU testing.
-Curtis
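Curtis's point about FFT jumps can be sketched numerically. A minimal toy model, assuming an invented table of available FFT sizes and an assumed ~18 bits per FFT word (neither number comes from LLRcuda itself): per-test cost is a step function of n, so n-ranges just below a jump make the best use of a given FFT size.

```python
# Sketch of the FFT-jump effect: with only a few FFT sizes available,
# every n in a range pays for the smallest size that fits it, so n just
# below a jump is the most efficient. Size table and BITS_PER_WORD are
# invented for illustration, not taken from LLRcuda.

FFT_SIZES = [256_000, 320_000, 512_000, 1_024_000]  # hypothetical table
BITS_PER_WORD = 18                                   # assumed average

def fft_for(n):
    """Smallest FFT size in the table that can hold an n-bit number."""
    need = n // BITS_PER_WORD + 1   # FFT words required, roughly
    for size in FFT_SIZES:
        if size >= need:
            return size
    raise ValueError("n too large for this table")

# n = 4.0M and n = 4.6M share the same FFT size (the latter uses it
# most efficiently); n = 4.7M jumps to the next, larger size.
for n in (4_000_000, 4_600_000, 4_700_000):
    print(n, fft_for(n))
```

Under this model, a candidate at n = 4.6M does the same amount of FFT work per iteration as one at n = 4.0M, which is why narrow ranges just under a jump are the sweet spot.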

Dubslow 2012-08-11 03:14

[QUOTE=VBCurtis;307625]it's great for specific n-ranges below FFT jumps; but there are few FFT sizes, so wide ranges of n are inefficient.[/QUOTE]

I don't know anything about LLRcuda, but CUDALucas (the analogous Lucas-Lehmer program) was able to move from power-of-two-only FFT lengths to using [URL="http://developer.nvidia.com/cuda/cufft"]cufft[/URL], which supports any 7-smooth length. CUDALucas is now able to (efficiently) use all the same FFT lengths as Prime95.

I can't help with actually doing it though, you'd have to ask [URL="http://www.mersenneforum.org/member.php?u=9446"]msft[/URL] for more details.
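For context on what "7-smooth" means here: a 7-smooth FFT length has no prime factor larger than 7. A small sketch (illustrative only; not code from CUDALucas or cufft) enumerating such lengths shows how much denser they are than powers of two alone:

```python
# Enumerate 7-smooth integers (only prime factors 2, 3, 5, 7) up to a
# limit and compare their density with powers of two alone.
# Illustration of the claim above, not CUDALucas/cufft code.

def seven_smooth_lengths(limit):
    """Sorted list of all 2^a * 3^b * 5^c * 7^d <= limit."""
    lengths = set()
    a = 1
    while a <= limit:
        b = a
        while b <= limit:
            c = b
            while c <= limit:
                d = c
                while d <= limit:
                    lengths.add(d)
                    d *= 7
                c *= 5
            b *= 3
        a *= 2
    return sorted(lengths)

sizes = seven_smooth_lengths(4096)
pow2 = [n for n in sizes if n & (n - 1) == 0]
in_gap = [n for n in sizes if 2048 < n < 4096]
print(f"{len(sizes)} 7-smooth sizes up to 4096, {len(pow2)} powers of two,"
      f" {len(in_gap)} sizes strictly between 2048 and 4096")
```

Between consecutive powers of two there are many intermediate 7-smooth sizes, which is why a cufft-based CUDALucas can track Prime95's FFT table instead of jumping straight from one 2^x to the next.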

pinhodecarlos 2014-09-14 20:32

Is there any OpenCL LLR version?

kracker 2014-09-14 20:48

[QUOTE=pinhodecarlos;383033]Is there any OpenCL LLR version?[/QUOTE]

Sadly, no... There is clLucas (a Lucas-Lehmer tester) out there, but the clFFT library that drives it is quite unoptimized compared to Nvidia's FFT library at the moment...

LaurV 2014-09-15 03:12

Let's say not that it is inefficient, but that it only plays well with FFTs that are powers of two. So if you happen to do an LL test for an exponent in (say) the 37M range, where P95 and cudaLucas would also use a 2^x FFT, then clLucas is just as efficient as the other two; the same holds for other ranges where a 2^x FFT is theoretically optimal. But as long as you are LL-testing exponents for which P95 or cudaLucas would select a better, non-power-of-two FFT, clLucas will either be very slow with that non-power-of-two FFT, or you can force it to use the next larger 2^x FFT, which can be a little faster than the smaller non-2^x one, but is still far from what P95 or cudaLucas can do, because the FFT is now larger.
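LaurV's trade-off can be sketched with a toy model. Assume (purely for illustration; real programs have per-length accuracy limits) that an exponent p needs an FFT of roughly p/18 words. Then for an exponent whose best 7-smooth length sits well below the next power of two, forcing clLucas up to that 2^x means transforming nearly twice as many words:

```python
# Toy comparison: smallest adequate 7-smooth FFT length (what P95 or
# cudaLucas could pick) versus the next power of two (what clLucas is
# limited to). The ~18 bits-per-word figure is an invented assumption,
# not the selection rule of any of these programs.

BITS_PER_WORD = 18  # hypothetical average; real limits depend on length

def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p

def smallest_7smooth_at_least(n):
    """Smallest integer >= n with no prime factor larger than 7."""
    m = n
    while True:
        k = m
        for f in (2, 3, 5, 7):
            while k % f == 0:
                k //= f
        if k == 1:
            return m
        m += 1

p = 40_000_000                 # exponent just past a power-of-two jump
need = p // BITS_PER_WORD + 1  # FFT words required in this toy model
smooth = smallest_7smooth_at_least(need)
forced = next_power_of_two(need)
print(f"need {need} words: 7-smooth {smooth}, forced power of two {forced}")
```

In this toy case the forced 2^22 transform is nearly twice the size of the nearest 7-smooth length, matching LaurV's "still far away" observation; for exponents near 37M, where the required length lands just under 2^21, the two choices nearly coincide.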

pinhodecarlos 2014-09-15 08:23

I am not worried about the inefficiency because I will be using the company's electricity. Can you guys point me to a link for the client? I intend to PRP Riesel base 2 for k=5 at n>5M.

kracker 2014-09-16 02:09

[QUOTE=pinhodecarlos;383066]I am not worried about the inefficiency because I will be using the company's electricity. Can you guys point me to a link for the client? I intend to PRP Riesel base 2 for k=5 at n>5M.[/QUOTE]

Damn, I should have said up front that it doesn't exist... Sorry. :davieddy:

The main reason why: it's not worth it.
Besides, with FMA3 nowadays, CPUs usually beat or come very close to GPUs while using less power, and the gap is especially noticeable at smaller FFT sizes, where the CPU has no memory bottleneck...

