mersenneforum.org  

2014-03-21, 16:44   #2157
LaurV
Romulan Interpreter

Jun 2011
Thailand

25BF₁₆ Posts

Quote:
Originally Posted by owftheevil View Post
How about backing up the old fft.txt file? (I mean instead of overwriting it) Other routines depend on fft.txt being in increasing order.
Perfect for me. Rename it like "(chip)fft_0.txt", "..._1.txt", along the same lines as here (second code box). Three copies are enough: if the guy does not realize after two times that the file is being renamed, then he is either stupid or he does not care. Then I/he can manually interleave and sort if I/he want(s).

(edit: the only idea is to not lose a LONG fft file with all reasonable sizes inside, without notification (as is happening now). Maybe I worked one full day to get that file and I don't have a backup! I would be very angry then! Luckily I had more folders with the same content, since I have several of the same cards, and I had copies of the file in those folders, but that may not always be the case.)
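
The renaming idea above can be sketched roughly as follows. This is only an illustration of the numbered-backup scheme, not code from CUDALucas: the function name `rotate_backups`, the `keep` parameter and the exact `_0`, `_1`, `_2` suffix handling are assumptions made for the example.

```python
import os

def rotate_backups(path, keep=3):
    """Rename `path` to a numbered backup instead of overwriting it.

    Backups are named <stem>_0<ext>, <stem>_1<ext>, ...; _0 is the newest.
    At most `keep` backups are retained; the oldest one is dropped.
    Returns the new name of the file, or None if `path` does not exist.
    (Hypothetical helper, not part of CUDALucas.)
    """
    if not os.path.exists(path):
        return None
    stem, ext = os.path.splitext(path)
    # Drop the oldest backup so that the shift below never overwrites silently.
    oldest = f"{stem}_{keep - 1}{ext}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift _1 -> _2, _0 -> _1, working from the oldest slot down.
    for i in range(keep - 2, -1, -1):
        src = f"{stem}_{i}{ext}"
        if os.path.exists(src):
            os.replace(src, f"{stem}_{i + 1}{ext}")
    newest = f"{stem}_0{ext}"
    os.replace(path, newest)
    return newest
```

Calling `rotate_backups("fft.txt")` before a new benchmark writes its results would leave the previous file as `fft_0.txt`, so the old entries can still be interleaved and sorted by hand afterwards.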

(edit 2: optimization of threads works very nicely, and faster than the older version. The only unchanged thing is that the work is saved at the end, which may result in trouble if there is a crash. But that is no problem here: this optimization is only done once in a lifetime, and it can be split into a few consecutive small jobs. I mean I don't need to use "-threadbench 1 20480 6 0" in one go, but can use 3-4 "splits". I was stupid enough not to think about that, and the job has taken since the last post. Fortunately it finished with success.)
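
The "splits" could be generated along these lines. This is a sketch under assumptions: it assumes the first two numbers after `-threadbench` are the start and end of the FFT range, passes the trailing "6 0" through unchanged, and uses `CUDALucas.exe` as a placeholder executable name. Whether neighbouring chunks should overlap is not covered here.

```python
def split_range(start, end, parts):
    """Split the inclusive range [start, end] into `parts` consecutive chunks."""
    total = end - start + 1
    base, extra = divmod(total, parts)
    chunks = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks get one additional element.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue
        hi = lo + size - 1
        chunks.append((lo, hi))
        lo = hi + 1
    return chunks

def threadbench_commands(exe, start, end, parts, tail="6 0"):
    """Build one command line per chunk; `tail` is copied verbatim."""
    return [f"{exe} -threadbench {lo} {hi} {tail}"
            for lo, hi in split_range(start, end, parts)]

for cmd in threadbench_commands("CUDALucas.exe", 1, 20480, 4):
    print(cmd)
```

Run one after another, the four commands cover the same range as the single long job, and a crash costs at most one chunk instead of the whole run.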

Last fiddled with by LaurV on 2014-03-21 at 16:56

2014-04-01, 20:29   #2158
pdazzl

Apr 2014

7·17 Posts
Happening to me with GTX 570

Thanks for the restart batch file.

I am getting the API runtime errors, even with the latest beta build r65 (running toolkit 5.0 and the latest 335.23 nvidia drivers). However, this is only happening on my GTX 570, not my 280. I have noticed that the 570 will run stable until I stop the job, go to mfaktc and then switch back to the LL job. It'll continue happening until I reboot my box. So far that seems to be what triggers the API errors for me. I have never seen this behavior on my 280, even when switching between CUDALucas and mfaktc.



Quote:
Originally Posted by flashjh View Post
r60 compiled and tested (still needs more). CUDA 4.2 up to 5.5 all working, release and debug. All posted to SourceForge

This version (and r57 and up) includes new rcb code from Prime95 that gives about a 1% speed improvement! Exciting for CUDALucas, but does need testing, please.

In my testing CUDA 5.5 and Win32 are slightly faster than earlier versions or x64 (but you may need a batch file to keep it going, see below)

What works:
-cufftbench
-r
-normal testing

What Doesn't:
-threadbench

Didn't test:
-memtest

For those experiencing stops: This is an nVidia driver issue. Here is some info, and I have included some workarounds.

Drivers <=306.97 work with x86/x64 CUDA 4.2 and CUDA 5.0 builds perfectly fine and produce no restarts (at least none from my testing over several days).

Drivers >=310.70 have resets no matter what platform/CUDA version, including 5.5 with >=320.18.

There are two workarounds for anyone experiencing a similar problem described by mognuts:

1) The best way to fix the error is to downgrade your driver to one of the versions <=306.97 as mentioned above.

CUDA Driver Versions:

Code:
CUDA 5.5:                  CUDA 5.0:                CUDA 4.2:
331.82  19-Nov-13          314.22   25-Mar-13       301.42  22-May-12
331.65  07-Nov-13          314.07   18-Feb-13       296.10  13-Mar-12
331.58  21-Oct-13          310.90   05-Jan-13       295.73  21-Feb-12
327.23  19-Sep-13          310.70   17-Dec-12       285.62  24-Oct-11
320.49  01-Jul-13          306.97   10-Oct-12       280.26  09-Aug-11
320.18  23-May-13          306.23   13-Sep-12       275.33  01-Jun-11
I did not actually test below 296.10 so I don't know where the CUDA changes over to < CUDA 4.2 but I figure most will be on 296.10 by now.

Windows CUDALucas from CUDA 4.0 up to 5.5, 32 or 64 bit are on SourceForge

Request: I need to know who else is having the *stop* issue and what driver and video card you have. I'm working with NVidia to try and get the drivers fixed, so it will be helpful to know what other cards have this issue.

2) The other 'fix' for this issue is to use a batch file similar to this:
Code:
@echo off
Set count=0
Set program=CUDALucas2.05Beta-CUDA5.0-Win32-r60
:loop
TITLE %program% Current Reset Count = %count%
Set /A count+=1
rem echo %count% >> log.txt
rem echo %count%
%program%.exe
GOTO loop
This will restart CUDALucas each time it stops and allow you to see how many resets have occurred, if you care.
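
For anyone not on Windows, the same restart-and-count idea might look like the sketch below. It is not a translation of the batch file: unlike the batch loop, it stops when the program exits with code 0, which assumes (unverified) that a clean finish returns 0, and `max_restarts` is an extra safety cap added for the example.

```python
import subprocess
import sys

def run_with_restarts(command, max_restarts=None):
    """Run `command` again each time it stops with a non-zero exit code.

    Returns the number of restarts performed. Stops on exit code 0
    (assumption: a clean finish returns 0) or once `max_restarts`
    restarts have been used up.
    """
    restarts = 0
    while True:
        result = subprocess.run(command)
        if result.returncode == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
        print(f"restart #{restarts} after exit code {result.returncode}",
              file=sys.stderr)
```

A call such as `run_with_restarts(["./CUDALucas"])` (the path is a placeholder) would keep the test going across driver resets and report the reset count at the end.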

I have not been able to thoroughly test speeds yet; I know that CUDA 5.5 is usually faster, but at the cost of having the driver lock up. Combined with the batch file, there really is no issue unless the restarts bother you, as I've run many good DCs with the batch file.

With <=306.97, you don't need the batch file and there are no restarts, but it could potentially be *slightly* slower. I would love to see actual test data from everyone. Also, if anyone does experience the *stop* while on <=306.97, please let me know ASAP so I can update this info and nVidia.

As for reliability, I have completed many successful tests with 2.05 Beta, CUDA 4.0 up to 5.5, 32 and 64 bit. Many with a lot of stops and restarts and forced FFT size changes for testing the code.
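
The driver ranges in the quoted post (<=306.97 no restarts, >=310.70 resets) can be turned into a small check. This is just a sketch of those two thresholds; the function names are made up for the example, and versions between the two bounds are reported as "unknown" because the post says nothing about them.

```python
def parse_version(text):
    """Turn '306.97' into (306, 97) so versions compare numerically."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)

LAST_STABLE = parse_version("306.97")      # "<=306.97 ... no restarts"
FIRST_RESETTING = parse_version("310.70")  # ">=310.70 have resets"

def classify_driver(version):
    """Classify a driver version string using only the quoted thresholds."""
    v = parse_version(version)
    if v <= LAST_STABLE:
        return "stable"
    if v >= FIRST_RESETTING:
        return "resets"
    return "unknown"

for version in ["301.42", "306.97", "310.70", "335.23"]:
    print(version, classify_driver(version))
```

By this rule the 335.23 driver mentioned at the top of this post falls in the range where resets were reported.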


2014-04-03, 20:16   #2159
MikeBerlin

Mar 2012
Germany

2×13 Posts
CUDALucas doesn't work anymore

Round off error at iteration = 21463800, err = 0.5 > 0.40, fft = 3584K.
Increasing fft and restarting from last checkpoint.

Using threads: square 128, splice 256.

Continuing M62494429 @ iteration 21460001 with fft length 4096K, 34.34% done

After a few more errors, the program stops.
If I restart it, it reports at the end (example):
Processing result: M( x )C, 0xy, offset = 6684, n = 4096K, CUDALucas v2.05 Beta, g_AID: A6ACCD2C719C7543871E42683998589C.
I think the result is bad.
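
To make the log above easier to read: in FFT-based multiplication the round-off error is usually measured as the largest distance between a computed value and the nearest integer, so 0.5 is the worst possible value. The sketch below shows that check and the "move to the next larger FFT" reaction. It is an illustration only, not CUDALucas code; the 0.40 limit and the 3584K/4096K lengths are taken from the log, while the list of available lengths is hypothetical.

```python
def max_roundoff(values):
    """Largest distance between any value and its nearest integer (0.0 to 0.5)."""
    return max((abs(v - round(v)) for v in values), default=0.0)

def next_fft_length(current_k, available_k):
    """Smallest available FFT length (in K) strictly larger than current_k."""
    larger = [n for n in sorted(available_k) if n > current_k]
    return larger[0] if larger else None

def check_iteration(values, current_k, available_k, limit=0.40):
    """Return (error, fft length to use next) for one iteration's output."""
    err = max_roundoff(values)
    if err > limit:
        # Too close to a rounding tie: the result cannot be trusted,
        # so retry from the last checkpoint with a larger transform.
        return err, next_fft_length(current_k, available_k)
    return err, current_k

# Hypothetical values: 7.5 sits exactly between two integers.
print(check_iteration([3.0, 7.5, 10.1], 3584, [2048, 3584, 4096]))
```

An error of exactly 0.5 means the rounding was a coin flip, which is why errors like this usually point at hardware (memory) problems rather than at the FFT length alone.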

2014-04-03, 20:51   #2160
owftheevil

"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

Can you recall what the "more errors" were?

The root problem is most likely memory; at least that's the only time I see a roundoff error like that. But I don't know what's going on with the apparent output of a result after the errors.

2014-04-09, 15:59   #2161
MikeBerlin

Mar 2012
Germany

2×13 Posts

Quote:
Originally Posted by owftheevil View Post
Can you recall what the "more errors" were?
No, I can't. The log file doesn't exist anymore. I tried the next one with the "savefile" option. Only if I reduce the memory speed (-500 MHz) and the "Power Limit" (57%) -> GPU = 692 MHz do I get fewer errors. But what does "g_AID" mean? (last result: M( 62494429 )C, 0x7191357b114a13__, offset = 31262106, n = 4096K, CUDALucas v2.05 Beta, g_AID: EEFBC9895C77C54B1AC676621FFA____)
How can I check the "Computer must be proven reliable" requirement for my computer?
Ohhhh, I forgot to log in to GIMPS and lost my result. 157 GHz-days!

Now I found the same exponent in my assignments as a double check!
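
For keeping track of result lines like the one above, the fields can be pulled apart with a regular expression. This is a sketch: the pattern is inferred only from the line quoted in this post (including the underscores the moderator used for masking), so other CUDALucas versions may print something different, and the field names in the dictionary are made up for the example.

```python
import re

RESULT_RE = re.compile(
    r"M\(\s*(?P<exponent>\d+)\s*\)(?P<status>[A-Z]),\s*"
    r"(?P<residue>0x[0-9a-fA-F_]+),\s*"
    r"offset\s*=\s*(?P<offset>\d+),\s*"
    r"n\s*=\s*(?P<fft_k>\d+)K,\s*"
    r"(?P<program>[^,]+),\s*"
    r"g_AID:\s*(?P<aid>[0-9A-Fa-f_]+)"
)

def parse_result(line):
    """Return the fields of a result line as a dict, or None if no match."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the purely numeric fields; the rest stay as strings.
    for key in ("exponent", "offset", "fft_k"):
        fields[key] = int(fields[key])
    fields["program"] = fields["program"].strip()
    return fields

line = ("M( 62494429 )C, 0x7191357b114a13__, offset = 31262106, "
        "n = 4096K, CUDALucas v2.05 Beta, g_AID: EEFBC9895C77C54B1AC676621FFA____")
print(parse_result(line))
```

Saving the parsed lines to a file of your own means a result is not lost if the log file disappears again.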

Last fiddled with by Batalov on 2014-04-09 at 17:04 Reason: masked parts of output

2014-04-09, 17:04   #2162
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

9,497 Posts

Quote:
Originally Posted by MikeBerlin View Post
No, I can't. The log file doesn't exist anymore. I tried the next one with the "savefile" option. Only if I reduce the memory speed (-500 MHz) and the "Power Limit" (57%) -> GPU = 692 MHz do I get fewer errors. But what does "g_AID" mean? (last result: M( 62494429 )C, 0x7191357b114a13__, offset = 31262106, n = 4096K, CUDALucas v2.05 Beta, g_AID: EEFBC9895C77C54B1AC676621FFA____)
How can I check the "Computer must be proven reliable" requirement for my computer?
Ohhhh, I forgot to log in to GIMPS and lost my result. 157 GHz-days!
PM user Prime95 and you will be helped.
Quote:
Now I found the same exponent in my assignments as a double check!
That's not useful to you. The second GPU result will be given no credit.

2014-04-09, 18:43   #2163
MikeBerlin

Mar 2012
Germany

1A₁₆ Posts

Quote:
Originally Posted by Batalov View Post
PM user Prime95 and you will be helped.
Thank you very much for this hint. Maybe he will solve my old problem with M332,224,379.
Quote:
That's not useful to you. The second GPU result will be given no credit.
Yes, and this one went away by itself.

Last fiddled with by MikeBerlin on 2014-04-09 at 18:43

2014-04-10, 15:42   #2164
MikeBerlin

Mar 2012
Germany

32₈ Posts

Quote:
Originally Posted by Batalov View Post
PM user Prime95 and you will be helped.
YES, he did it! Many thanks again.

2014-04-25, 10:13   #2165
diep

Sep 2006
The Netherlands

2D9₁₆ Posts

hi! Sorry to jump in on this thread.

How efficient is the code running?

https://developer.nvidia.com/cuFFT

I see there that, for somewhat larger transforms, cuFFT achieves under 100 Gflop/s on a Tesla M2090. I haven't checked out the code yet; I will soon. This Tesla delivers 666 Gflop/s. Not counting fused multiply-adds (I haven't checked yet whether their code uses them; I'm assuming not), that is 333 Gflop/s. So the efficiency is around 30%.
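
Spelled out, the estimate above is just achieved throughput divided by peak. A quick sketch with the numbers from this post (100 Gflop/s is an upper bound here, since the measured figure is "under 100"):

```python
def efficiency(achieved_gflops, peak_gflops):
    """Fraction of peak throughput actually achieved."""
    return achieved_gflops / peak_gflops

peak_with_fma = 666.0                 # quoted peak for the Tesla M2090
peak_without_fma = peak_with_fma / 2  # an FMA counts as two flops
achieved = 100.0                      # upper bound read off the cuFFT page

print(f"{efficiency(achieved, peak_without_fma):.1%}")  # 30.0%
print(f"{efficiency(achieved, peak_with_fma):.1%}")     # 15.0%
```

So the "around 30%" only holds if the code does not use fused multiply-adds; measured against the full 666 Gflop/s it is about 15%.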

How is the efficiency of CUDALucas for somewhat larger transforms?

Interested in gpgpu fft for Riesel :)

2014-04-28, 04:24   #2166
flashjh

"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

CUDALucas 2.05Beta r67 is posted for Windows. CUDA 4.2, 5.0, 5.5 and 6.0

CUDA 6.0 Libs are here

Quote:
r67, just uploaded, includes a facility for backing up fft.txt files with a timestamp. I've included a README and CUDALucas.ini with a few updates. The README has a rough draft of a new section on command line options and tuning. There are still many additions and other changes to be made...

2014-04-28, 06:31   #2167
firejuggler

Apr 2010
Over the rainbow

2·1,303 Posts

Code:
CUDALucas_205Beta_CUDA6.0-x64_r67.exe  -cufftbench 1 4096 1

------- DEVICE 0 -------
name                GeForce GTX 750 Ti
Compatibility       5.0
clockRate (MHz)     1110
memClockRate (MHz)  2700
totalGlobalMem      2147483648
totalConstMem       65536
l2CacheSize         2097152
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 5
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1

Using threads: square 256, splice 128.
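
If you want to compare several cards, the device block above can be read into a dictionary. This is a sketch that assumes the layout shown in this post (a "DEVICE" header line, one "key value" pair per line, a blank line at the end); the parser and its names are made up for the example.

```python
import re

# Key, optional unit in parentheses such as "(MHz)", then the value.
LINE_RE = re.compile(
    r"^(?P<key>\w+(?:\[\d+\])?)(?:\s\((?P<unit>\w+)\))?\s+(?P<value>.+)$"
)

def parse_device_info(text):
    """Parse the lines between the DEVICE header and the next blank line."""
    info = {}
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("-") and "DEVICE" in line:
            in_block = True
            continue
        if not in_block:
            continue
        if not line:
            break  # a blank line ends the device block
        m = LINE_RE.match(line)
        if m:
            value = m.group("value").strip()
            info[m.group("key")] = int(value) if value.isdigit() else value
    return info

sample = """------- DEVICE 0 -------
name                GeForce GTX 750 Ti
clockRate (MHz)     1110
totalGlobalMem      2147483648
multiProcessorCount 5

Using threads: square 256, splice 128.
"""
info = parse_device_info(sample)
print(info["name"], info["totalGlobalMem"] / 2**30, "GiB")
```

For this card, totalGlobalMem works out to exactly 2 GiB.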
Attached Files
File Type: txt GeForce GTX 750 Ti fft.txt (2.1 KB, 94 views)

Last fiddled with by firejuggler on 2014-04-28 at 06:36

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.