mersenneforum.org  

2011-02-10, 01:32   #408
msft

Jul 2009
Tokyo

2·5·61 Posts

Hi, aaronhaviland
You are completely right.

2011-02-11, 02:59   #409
Uncwilly
6809 > 6502

Aug 2003
101×103 Posts

821310 Posts

msft,

Can you pm me an article about this software for the wiki?
Or post one yourself.

2011-02-11, 03:03   #410
msft

Jul 2009
Tokyo

2·5·61 Posts

Hi, Uncwilly
Quote:
Originally Posted by Uncwilly
Can you pm me an article about this software for the wiki?
Or post one yourself.

Permission granted.

2011-02-11, 17:26   #411
Andrew Thall

Dec 2010

23 Posts

Quote:
Originally Posted by aaronhaviland
There seem to be a couple of upper limits to this right now. I tried running higher numbers, and got a couple of different errors:

#CUDALucas 151150000
err = 0.353794, increasing n from 8388608
CUDALucas.cu(534) : cufftSafeCall() CUFFT error.

I'm guessing it's because of: "The cuFFT manual states that 1-D ffts are supported for < 8 million elements."

The other is at exponents around 318750000, where I hit the memory limit on my 768 MB card. At 336000000, it wants over 1 GB.

Combined, these prevent it from being useful for the 100 million digit numbers. (I can't be the only one eyeing this as making that task feasible.)

Update your CUDA library and CUFFT. The most recent version no longer has the 8M element limit. It's also much more numerically accurate, particularly for the non-power-of-two transforms.

2011-02-12, 01:57   #412
aaronhaviland

Jan 2011
Dudley, MA, USA

1118 Posts

Quote:
Originally Posted by Andrew Thall
Update your CUDA library and CUFFT. The most recent version no longer has the 8M element limit. It's also much more numerically accurate, particularly for the non-power-of-two transforms.

These were with CUDA/CUFFT 3.2.16. Is there a newer version?

(Also, I'm running NVIDIA drivers 270.18, which say they support CUDA 4.0. Any word of a newer toolkit/SDK?)

2011-02-12, 16:12   #413
Brain

Dec 2009
Peine, Germany

331 Posts
CUDALucas thoughts

Quote:
Originally Posted by kjaget

Run times on my factory-overclocked GTX 275, along with some rough run times for current work assignments. I know these aren't the most efficient use of the code, but they're a good basis for comparison to a CPU.

8.96 msec/iter @ 2M FFT (~ 2.5 days for a 25M LL double check)
18.8 msec/iter @ 4M FFT (~ 11 days for a 47M LL first time run)

Not sure how that compares to Linux versions, but it's definitely fast enough to be useful.

- I timed 6.8 msec/iter @ 2M FFT (DC @ 26M) on my GTX 560 Ti with Win7 (GPU load @ 93%). Seems reasonable to me. Thanks for the build, kjaget. Stay tuned.
- I'd also like to see some Linux comparisons.
- I had trouble figuring out the checkpoint command, but Uncwilly will document this.
- Which CUDA version (CUFFT) is CUDALucas built with? Will the most current 3.2 bring a speedup?
- It seems that mfaktc gets a bigger "bang" out of the GPU, but PrimeNet has enough TF power. What kind of work do you prefer?
- Last question: Where is the turnover point to the 4M FFT?
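
As a rough cross-check of the run times quoted in this post, here is a minimal Python sketch that turns a per-iteration time into a wall-clock estimate. It assumes only that a Lucas-Lehmer test of exponent p needs p - 2 squaring iterations and that the msec/iter figure stays constant for the whole run. The function name `ll_days` is made up for this illustration and is not part of CUDALucas.

```python
SECONDS_PER_DAY = 86_400

def ll_days(exponent, msec_per_iter):
    """Estimated days for one Lucas-Lehmer test at a constant iteration time."""
    iterations = exponent - 2          # an LL test of M_p performs p - 2 squarings
    return iterations * msec_per_iter / 1000.0 / SECONDS_PER_DAY

# Figures quoted in this thread (exponents rounded as in the posts).
for label, p, ms in [
    ("GTX 275, 2M FFT, 25M double check", 25_000_000, 8.96),
    ("GTX 275, 4M FFT, 47M first-time test", 47_000_000, 18.8),
    ("GTX 560 Ti, 2M FFT, 26M double check", 26_000_000, 6.8),
]:
    print(f"{label}: about {ll_days(p, ms):.1f} days")
```

With these inputs the estimates come out at about 2.6, 10.2 and 2.0 days, which is in the same range as the rough figures quoted above (about 2.5 and 11 days for the GTX 275).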

2011-02-13, 10:41   #414
msft

Jul 2009
Tokyo

2·5·61 Posts

Hi, Brain
Quote:
Originally Posted by Brain
- Last question: Where is the turnover point to the 4M FFT?

At exponent 39800000.
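
To put that turnover point in perspective, the sketch below works out how many bits of the exponent each FFT element carries at the switch. It assumes that "2M" and "4M" mean 2·2^20 and 4·2^20 elements, and the helper names are hypothetical. CUDALucas itself may choose lengths differently, for example by watching the round-off error, as the "err = ..., increasing n" message quoted earlier in the thread suggests.

```python
TURNOVER_EXPONENT = 39_800_000      # from msft's answer above
FFT_2M = 2 * 2**20                  # assumed meaning of "2M": 2,097,152 elements
FFT_4M = 4 * 2**20                  # assumed meaning of "4M": 4,194,304 elements

def bits_per_element(exponent, fft_length):
    """Average number of bits of the Mersenne number stored per FFT element."""
    return exponent / fft_length

def fft_length_for(exponent):
    """Pick between the two lengths discussed here; other lengths are ignored."""
    # Assumption: exponents below the turnover point use 2M, the rest use 4M.
    return FFT_2M if exponent < TURNOVER_EXPONENT else FFT_4M

print(f"{bits_per_element(TURNOVER_EXPONENT, FFT_2M):.2f} bits/element at the 2M limit")
print(fft_length_for(26_000_000), fft_length_for(47_000_000))
```

Under these assumptions the 2M transform is pushed to just under 19 bits per element before the switch, and the 26M double check and 47M first-time test mentioned in the thread land on the 2M and 4M lengths respectively.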

2011-02-15, 10:25   #415
msft

Jul 2009
Tokyo

2×5×61 Posts

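This version adds support for selecting the CUDA device number (-D<n>). If the requested device does not exist, the program exits:

cudalucas.1.1$ ./CUDALucas -D1 216091
device_number >= device_count ... exiting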
Attached Files
File Type: bz2 CUDALucas.1.1.tar.bz2 (27.5 KB, 90 views)

2011-02-15, 19:34   #416
kjaget

Jun 2005

2018 Posts

Quote:
Originally Posted by Brain
- Which CUDA version (CUFFT) is CUDALucas built with? Will the most current 3.2 bring a speedup?

From memory, it's 3.1. I've seen mixed reviews of 3.2 for other projects, but have no idea what it will do for this one.

Quote:
- It seems that mfaktc gets a bigger "bang" out of the GPU, but PrimeNet has enough TF power. What kind of work do you prefer?

I prefer using CUDALucas since mfaktc gives good speed but also requires CPU core(s) when running. That hurts overall system throughput. I can either work on 5 LL tests, or 3 LL tests plus mfaktc on my 4-core system. The former seems more useful, especially with the TF wavefront moving faster than LL testing.

Not to take anything away from mfaktc, though. And I honestly haven't looked at the GHz-days/day comparison between the two scenarios, so it's more a rationalization than properly thought out at this point.

Kevin

2011-02-17, 12:53   #417
Svenie25

Aug 2008
Good old Germany

3·47 Posts

I am running the Windows version. My first test, a DC around 27M, is nearly complete. But which command do I use for checkpoints?

2011-02-17, 17:41   #418
Brain

Dec 2009
Peine, Germany

331 Posts

Quote:
Originally Posted by Svenie25
I am running the Windows version. My first test, a DC around 27M, is nearly complete. But which command do I use for checkpoints?

I had the same problem:
1. When you start an exponent for the first time:
Code:
CUDALucas.exe -c10000 <prime_expo>
2. The next time, use:
Code:
CUDALucas.exe -c10000 c<prime_expo>
The c<prime_expo> argument reads as "read checkpoint file"; you will find a file with that same name.
There's also a t<prime_expo> file. I had to use it because the c<prime_expo> file was corrupt after a forced shutdown via a timed script.
-c10000 means a checkpoint every 10000 iterations, which is about 70 seconds on my system.
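
The two steps above can be wrapped in a small launcher. The following is only a sketch in Python, assuming that the checkpoint file c<prime_expo> sits in the working directory and that its mere presence means a run can be resumed. The function name `cudalucas_args` and its parameters are invented for this illustration. It does not detect a corrupt checkpoint and does not handle the t<prime_expo> file, since the thread does not say how that file is passed to the program.

```python
import os

def cudalucas_args(exponent, workdir=".", interval=10000, exe="CUDALucas.exe"):
    """Return the argument list for starting or resuming a test of `exponent`."""
    checkpoint = f"c{exponent}"                 # checkpoint file name from the post
    args = [exe, f"-c{interval}"]               # checkpoint every `interval` iterations
    if os.path.exists(os.path.join(workdir, checkpoint)):
        args.append(checkpoint)                 # step 2: resume from c<prime_expo>
    else:
        args.append(str(exponent))              # step 1: fresh start
    return args

print(" ".join(cudalucas_args(216091)))
```

The returned list can be handed to `subprocess.run(..., cwd=workdir)` so that the program looks for its checkpoint file in the same directory the sketch inspected.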

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.