mersenneforum.org  

Old 2012-08-21, 18:25   #1541
TObject
 
Feb 2012

195₁₆ Posts

I upgraded from CUDALucas-2.04 Beta-4.1-sm_21-x64.exe to CUDALucas-2.04 Beta-4.2-sm_30-x64.exe and now I am getting the following error:

CUDALucas.cu(163) : cufftSafeCall() CUFFT error 6: CUFFT_EXEC_FAILED

I thought I had bad CUDA DLLs, but I downloaded fresh ones from the recommended site, and I still get the error.

Please advise.

Thank you.
Old 2012-08-22, 01:21   #1542
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts

What card are you using it on? If you're using the sm_30 build then you need at least a Kepler card. Use the build for the architecture your card belongs to: [45][78]0s are compute capability 2.0, [45][1-6]0s are 2.1, and everything up to and including the GTX 2xx series is 1.x.
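For anyone who wants to check a model number against those patterns, here is a minimal Python sketch. It only encodes the rule of thumb from this post; the function name `guess_compute_capability` is made up for illustration, the GTX 2xx pattern is my own reading of "everything <= GTX 2**", and cards outside these patterns (Kepler included) simply return `None`.

```python
import re

# Rule of thumb from the post above, written as (pattern, compute capability).
# Only GeForce 2xx-5xx model numbers are covered; this is not an official table.
_RULES = [
    (re.compile(r"[45][78]0"), "2.0"),   # e.g. 480, 570, 580
    (re.compile(r"[45][1-6]0"), "2.1"),  # e.g. 460, 550, 560
    (re.compile(r"2\d\d"), "1.x"),       # GTX 2xx series (assumed pattern)
]

def guess_compute_capability(model_number):
    """Return the capability suggested by the patterns above, or None if no rule applies."""
    for pattern, capability in _RULES:
        if pattern.fullmatch(model_number):
            return capability
    return None

print(guess_compute_capability("580"))
```

The authoritative answer is whatever the driver reports for the card; this only mirrors the shorthand above.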
Old 2012-08-22, 01:24   #1543
TObject
 
Feb 2012

3⁴·5 Posts

I use a GTX 580, but I am upgrading to 4.2 because the latest version of mfaktc uses 4.2.

I run CUDALucas and mfaktc side-by-side and I had a problem with mismatched CUDA versions.

Thank you.

Last fiddled with by TObject on 2012-08-22 at 01:32
Old 2012-08-22, 01:31   #1544
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by TObject
I use a GTX 580, but I am upgrading to 4.2 because the latest version of mfaktc uses 4.2.
Thank you.
Okay, but for a GTX 580 you'll still need an sm_20 build, i.e. a CUDA_4.2-sm_20 executable. As far as I know, flash doesn't compile those, so you have to choose an existing build with sm <= 20. If you want CUDA_4.2-sm_20, you'll have to compile it yourself or ask flash to do it.

Note that there probably won't be a performance increase from switching CUDA versions.
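As a rough illustration of the "sm <= 20" rule, the sketch below filters a list of build names by the sm number embedded in them. It assumes names shaped like the ones quoted in this thread (`...-sm_30-x64.exe`); the helper names `sm_of` and `usable_builds` are hypothetical, and the sm_20 file name in the example is made up to match that pattern. It ignores any PTX fallback a binary might carry.

```python
import re

def sm_of(name):
    """Extract the sm number from text such as 'sm_30' or 'sm_1.3' (None if absent)."""
    match = re.search(r"sm_(\d)\.?(\d)", name)
    return int(match.group(1) + match.group(2)) if match else None

def usable_builds(names, card_sm):
    """Keep builds whose sm number does not exceed the card's, per the rule of thumb above."""
    return [n for n in names if sm_of(n) is not None and sm_of(n) <= card_sm]

builds = [
    "CUDALucas-2.04 Beta-4.2-sm_30-x64.exe",  # name quoted earlier in the thread
    "CUDALucas-2.04 Beta-4.2-sm_20-x64.exe",  # hypothetical name following the same pattern
]
print(usable_builds(builds, 20))  # a GTX 580 is treated as sm 20 here
```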
Old 2012-08-22, 01:33   #1545
TObject
 
Feb 2012

3⁴·5 Posts

I see. Thank you.
Old 2012-08-22, 01:36   #1546
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

2143₈ Posts

I can compile just about any combination, but the problem is what Dubslow already pointed out: you won't see much of an improvement.

The version that is fastest on all of my 580s is CUDA_3.2 | sm_1.3. You should try that one and let us know how it works for you.
Old 2012-08-28, 05:02   #1547
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

New versions of 2.04 beta are uploaded here. They are based on r32 from SourceForge, which is the baseline update discussed earlier that included the filelocking. I had to make some minor adjustments to CUDALucas.cu and Parse.c to get it to compile, but I did not make any changes to the functions.

I haven't worked out exactly what went wrong yet, but while reviewing the changes between r32 and r37 to get it to compile, I found that the modified open_s function I added for MSVS, which used _sopen_s, was probably wrong and caused the problem... hopefully.

I have committed the changes to r38.

Everyone please test this build for the filelocking error. Thanks!


Quote:
Originally Posted by Dubslow
Okay, but you'll still need a CUDA_4.2-sm_20 executable. flash doesn't compile those AFAIK.
I compiled a 4.2 | sm_20 version, in case you need it.

Last fiddled with by flashjh on 2012-08-28 at 05:13
Old 2012-08-28, 09:36   #1548
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Hi,

Quote:
Originally Posted by TObject
I use a GTX 580, but I am upgrading to 4.2 because the latest version of mfaktc uses 4.2.

I run CUDALucas and mfaktc side-by-side and I had a problem with mismatched CUDA versions.

Thank you.
Quote:
Originally Posted by flashjh
I can compile just about any combination, but the problem is what Dubslow already pointed out: you won't see much of an improvement.

The version that is fastest on all of my 580s is CUDA_3.2 | sm_1.3. You should try that one and let us know how it works for you.
In theory all you need is a driver that supports CUDA 4.2 or newer. Then download and unpack mfaktc 0.19 into one directory; mfaktc has the correct runtime libs included in the download. Then download CUDALucas and put the right runtime libs into the CUDALucas directory. You don't need to install the CUDA toolkit.
There is only one point where I'm unsure: I don't know whether you can run both apps (with different CUDA versions) concurrently or not.
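To double-check the layout described above (each application in its own directory with its own runtime libs next to it), something like this small Python sketch could be used. It assumes the Windows runtime libraries are DLLs whose names start with `cudart` or `cufft`; the function name `runtime_libs` and the directory names in the usage lines are placeholders, not anything shipped with mfaktc or CUDALucas.

```python
from pathlib import Path

def runtime_libs(app_dir):
    """List CUDA runtime / cuFFT DLLs sitting next to an application (names only)."""
    names = []
    for path in Path(app_dir).iterdir():
        lower = path.name.lower()
        # Assumption: the runtime DLLs are named cudart*.dll and cufft*.dll.
        if lower.startswith(("cudart", "cufft")) and lower.endswith(".dll"):
            names.append(path.name)
    return sorted(names)

# Placeholder directory names; adjust to wherever the two programs were unpacked.
for app_dir in ("mfaktc-0.19", "CUDALucas"):
    if Path(app_dir).is_dir():
        print(app_dir, runtime_libs(app_dir))
```

If the two directories list DLLs from different CUDA versions, that is expected with this setup; the point is only that each program finds its own copies.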

Oliver
Old 2012-08-28, 18:09   #1549
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

16065₈ Posts

Quote:
Originally Posted by flashjh
New versions of 2.04 beta are uploaded here. They are based on r32 from SourceForge, which is the baseline update discussed earlier that included the filelocking. I had to make some minor adjustments to CUDALucas.cu and Parse.c to get it to compile, but I did not make any changes to the functions.

I haven't worked out exactly what went wrong yet, but while reviewing the changes between r32 and r37 to get it to compile, I found that the modified open_s function I added for MSVS, which used _sopen_s, was probably wrong and caused the problem... hopefully.

I have committed the changes to r38.

Everyone please test this build for the filelocking error. Thanks!


Edit: r33 and r37 had some changes, including updated FFT lengths. Those need to be reincorporated. I'll try and make those into an r39.

Last fiddled with by Dubslow on 2012-08-28 at 18:13
Old 2012-08-28, 19:41   #1550
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Okay, slight change of plans: I recall LaurV somewhere saying that a larger FFT length was faster than some smaller ones in CUDALucas' table, but I wasn't able to find that post again. I will also add the signal-handling fix discussed before to r39.

In the meantime, all Windows users should test flash's latest compile for the filelocking bug; note, however, that compared to earlier beta releases, some FFT lengths might not appear. If the bug is confirmed killed, then the final (non-beta) release of 2.04 will reincorporate the changes from the old binaries that were lost in the new ones (i.e., it will be r39). r39 will be committed when LaurV responds.
Old 2012-08-28, 23:04   #1551
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Dubslow
Edit: r33 and r37 had some changes, including updated FFT lengths. Those need to be reincorporated. I'll try and make those into an r39.
Quote:
Originally Posted by Dubslow
Okay, slight change of plans...
It had been so long that I couldn't remember what was done and what wasn't. I remember making the FFT table changes now. I can help reincorporate them, if you want, or just let me know when r39 is ready and I'll compile it.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.