20120810, 05:26  #1 
Mar 2012
Hyderabad, India
3·17 Posts 
LLRcuda
Is anyone working with LLRcuda, or does anyone have any ideas about GPU-enabled prime searching?
I need help (OpEd: Moved from the "Post lots of primes" thread) Last fiddled with by Kosmaj on 20120815 at 06:22 
20120810, 06:06  #2  
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
2^{4}·313 Posts 
Pages that might be of interest: http://www.bcteam.org/downloads.php?cat=7 http://primegrid.pytalhost.net/Mirror.htm Edit: Sorry for the off-topic. Last fiddled with by pinhodecarlos on 20120810 at 06:12 

20120811, 02:35  #3  
"Curtis"
Feb 2005
Riverside, CA
7×11×67 Posts 
Indeed, I am. See the k=5 thread for my results with LLRcuda. Create a thread, or PM me, if you have any questions. Threads are wise for things others might want answers to! In short, it's great for specific n ranges just below FFT jumps, but there are few FFT sizes, so wide ranges of n are inefficient. Also, the bigger the n, the more efficient CUDA is compared to CPU testing. Curtis 

20120811, 03:14  #4  
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 89<O<88
3·29·83 Posts 
Quote:
I can't help with actually doing it though, you'd have to ask msft for more details. 

20140914, 20:32  #5 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
2^{4}·313 Posts 
Is there any OpenCL LLR version?

20140914, 20:48  #6 
"Mr. Meeseeks"
Jan 2012
California, USA
3^{2}×241 Posts 

20140915, 03:12  #7 
Romulan Interpreter
"name field"
Jun 2011
Thailand
2×3×5×7×47 Posts 
Let's not say that it is inefficient, only that it plays well just with FFT lengths which are powers of two. So, if you happen to have to do an LL test for an exponent around (say) 37M, where P95 and cudaLucas would also use a 2^x FFT, then clLucas is as efficient as the other two. The same goes for other ranges where a 2^x FFT is theoretically optimal. But as long as you are LLing exponents for which P95 or cudaLucas would select a better, non-power-of-two FFT, clLucas will either be very slow with that non-power-of-two FFT, or you can force it to use the next higher 2^x FFT. That can be a little faster than the smaller non-2^x FFT, but it is still far from what P95 or cudaLucas could do, because the FFT is now larger.
Last fiddled with by LaurV on 20140915 at 03:28 
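To put rough numbers on the padding cost described in the post above, here is a minimal sketch. It assumes, purely for illustration, that a flexible program may pick any 7-smooth FFT length (only factors 2, 3, 5, 7), while a power-of-two-only program must round up to the next 2^x. The function names are hypothetical, and the FFT lengths that P95, cudaLucas or clLucas really support are not listed in this thread.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def is_smooth(n: int, primes=(2, 3, 5, 7)) -> bool:
    """True if n has no prime factors outside `primes`."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def next_smooth(n: int) -> int:
    """Smallest 7-smooth length >= n (assumed stand-in for a flexible FFT library)."""
    m = max(n, 1)
    while not is_smooth(m):
        m += 1
    return m


def padding_overhead(required: int):
    """Return (power-of-two length, smooth length, size ratio) for a required minimum length."""
    p2 = next_power_of_two(required)
    sm = next_smooth(required)
    return p2, sm, p2 / sm


if __name__ == "__main__":
    # A required length just past a power of two is the worst case for 2^x-only code.
    for required in (2048, 2049, 3000):
        print(required, padding_overhead(required))
```

When the required length sits exactly on a power of two the ratio is 1, which matches the "as efficient as the other two" case. Just past a power of two, the 2^x-only program works on a transform nearly twice as long.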
20140915, 08:23  #8 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
5008_{10} Posts 
I am not worried about the inefficiency because I will use the company's electricity. Can you guys point me to a link for the client? I intend to PRP test Riesel base 2 candidates for k=5 at n>5M.
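For readers new to the notation, the candidates in question have the form k·2^n − 1 with k=5. The sketch below builds such a number and runs a plain Fermat probable-prime check on it. This is only an illustration with hypothetical function names: it is not the test that LLR or LLRcuda runs, and at n>5M the arithmetic needs FFT-based multiplication rather than Python's built-in pow.

```python
def riesel_candidate(k: int, n: int) -> int:
    """Return k * 2**n - 1, the Riesel-form number for multiplier k and exponent n."""
    return k * (1 << n) - 1


def is_fermat_prp(k: int, n: int, base: int = 3) -> bool:
    """Fermat probable-prime check: base**(N-1) == 1 (mod N) for N = k*2**n - 1.

    A True result means 'probably prime' only; composites can occasionally pass.
    """
    candidate = riesel_candidate(k, n)
    if candidate < 4:
        # Handle tiny values directly (2 and 3 are prime, anything smaller is not).
        return candidate in (2, 3)
    return pow(base, candidate - 1, candidate) == 1


if __name__ == "__main__":
    # Small exponents only; real searches work at millions of bits.
    for n in range(1, 11):
        print(n, riesel_candidate(5, n), is_fermat_prp(5, n))
```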

20140916, 02:09  #9  
"Mr. Meeseeks"
Jan 2012
California, USA
3^{2}·241 Posts 
The main reason why: not worth it... Besides, with FMA3, CPUs nowadays usually beat or come very close to GPUs while using less power, and that gap is especially noticeable at lower FFT sizes, where there is no memory bottleneck (on the CPU)... 
