mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

Dubslow 2012-09-03 19:27

[QUOTE=flashjh;310180]How long does your 460 take for CL?[/QUOTE]

Mine needs ~40 hrs for a DC.

kladner 2012-09-03 19:38

[QUOTE=flashjh;310180]How long does your 460 take for CL?[/QUOTE]

Roughly 36-48 hrs for DC, about twice that for LL. (From memory. I don't have records.)

Here is a Google search for driver removal and cleanup. (Just passing it along. I found some helpful items in this lot.)
[URL]https://www.google.com/search?q=remove+nvidia+drivers+completely&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a[/URL]

And here are test utilities for CUDA and OpenCL from the Stanford Folding@Home site.
[URL]http://folding.stanford.edu/English/DownloadUtils#ntoc2[/URL]

@chalsall: Too damned funny!

Just finished 500 iterations of memtestG80 on the GTX 570:
[CODE]Final error count after 500 iterations over 1140 MiB of GPU memory: 0 errors[/CODE]

I think I'd be happier if I could get a clear error report from some of these torture programs. I guess CuLu is just a more brutal torturer.

jrk 2012-09-03 20:08

[QUOTE=kladner;310007][CODE]Iteration 10000 M( 46069867 )C, 0x21048490b7febb41, n = 2560K, CUDALucas v2.04 Beta err = 0.1172 (4:25 real, 26.4671 ms/iter, ETA 338:33:31)
Iteration 20000 M( 46069867 )C, 0xc9c576910a569076, n = 2560K, CUDALucas v2.04 Beta err = 0.1172 (4:24 real, 26.3610 ms/iter, ETA 337:07:41)
Iteration 30000 M( 46069867 )C, 0x1deb3580b15cb791, n = 2560K, CUDALucas v2.04 Beta err = 0.1172 (4:23 real, 26.3625 ms/iter, ETA 337:04:25)
SIGINT caught, writing checkpoint. Estimated time spent so far: 16:00
[/CODE][/QUOTE]
kladner, since you posted this, it appears that sometimes you are able to run CUDALucas without problem for a while (like here) and other times you cannot run it correctly at all.

You have two GPUs in your system, correct? Have you checked that your power supply is functioning correctly (voltages) and is rated for enough wattage to run two GPUs along with the other components? (i.e. at least 800W, depending on what else you have installed).

kladner 2012-09-03 21:14

[QUOTE=jrk;310194]kladner, since you posted this, it appears that sometimes you are able to run CUDALucas without problem for a while (like here) and other times you cannot run it correctly at all.

You have two GPUs in your system, correct? Have you checked that your power supply is functioning correctly (voltages) and is rated for enough wattage to run two GPUs along with the other components? (i.e. at least 800W, depending on what else you have installed).[/QUOTE]

I have two in the system now. But I just put the GTX 460 back in. At Jerry's suggestion I did a lot of testing of CUDALucas with just the 570 installed. When I got the 570 I did have to get a bigger PSU, though it is only rated at 750W. ATM, with 4x mfaktc on the 570, and CuLu on the 460, and 2x P-1 on the Phenom II x6 1090t, the total draw from the line is ~675W.
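A rough way to read those numbers: the 675W figure is measured at the wall, while a PSU rating normally refers to DC output, so the load on the supply itself is the wall draw times its efficiency. The sketch below is only arithmetic on the figures from this post. The function name `psu_load` and the 0.85 efficiency are assumptions for illustration, not measured values for this unit.

```python
def psu_load(wall_draw_w, rated_output_w, efficiency):
    """Estimate DC load and the fraction of the PSU rating in use.

    wall_draw_w    -- power measured at the wall (AC side), in watts
    rated_output_w -- PSU label rating (DC side), in watts
    efficiency     -- assumed conversion efficiency, 0 < efficiency <= 1
    """
    dc_load_w = wall_draw_w * efficiency
    return dc_load_w, dc_load_w / rated_output_w


# Figures from the post: ~675W at the wall, 750W rated PSU.
# Worst case (pretend the PSU is lossless): 90% of the rating.
worst_load, worst_fraction = psu_load(675, 750, 1.0)

# With an ASSUMED 85% efficiency the DC load is lower.
est_load, est_fraction = psu_load(675, 750, 0.85)

print(f"worst case: {worst_load:.0f}W, {worst_fraction:.0%} of rating")
print(f"assumed 85% efficient: {est_load:.0f}W, {est_fraction:.0%} of rating")
```

Either way the supply is within its rating on paper, though this says nothing about voltage stability on the individual rails.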

Even though I could not make the 570 run CuLu, even by itself, this exercise has worked out in some senses. The 570 is now in the primary PCIe slot and it is cooling much better than it did in the secondary slot. Conversely, the 460 is running somewhat hotter, but that's OK. The 570 cranks out a lot more heat so I'm glad to see it a bit cooler.

EDIT: The fact that the 570 would occasionally run CuLu correctly (until a restart) was particularly frustrating. Still, I'm having to accept that there is something funky about that card. I wish I could pin it down.

kladner 2012-09-03 23:32

I am exploring support options with Gigabyte. The card is less than 2 years old. The warranty is 3 years.

kladner 2012-09-05 15:25

@Flash-

Thanks for your suggestion of getting the 570 running by itself. Even though I have not been able to resolve the CuLu issue, I have gotten my display feeding off the 570. Desktop responsiveness is much better, and I can now run CuLu on the 460 with Polite=0 full time without crippling general usability.

flashjh 2012-09-05 15:53

Sure, no problem. Just wish we could get your card working. Hopefully Gigabyte will swap it for you.

kladner 2012-09-05 15:57

[QUOTE=flashjh;310405]Sure, no problem. Just wish we could get your card working. Hopefully Gigabyte will swap it for you.[/QUOTE]

Still waiting for a response to my query.

kladner 2012-09-06 21:47

I just spent another half day going through driver removal and re-installation. I have settled on the detailed instructions given here:

[url]http://www.evga.com/forums/tm.aspx?high=&m=1174372&mpage=1#1174372[/url]

Some key points the poster emphasizes:
[INDENT]1) Use Windows "Programs and Features" for uninstalling. Safe Mode is mostly not recommended, since the PhysX installer cannot run in that state. Use of Driver Sweeper/Driver Fusion is discouraged.
2) When uninstalling, always remove the display driver last.
3) Do NOT install anything but the display driver and PhysX. The 3D Vision drivers and the HD Audio driver are specifically discouraged as useless to most setups and as troublemakers. (He calls them, along with nVidia Update, bloatware.)
[/INDENT]
The forum post is quite long and goes into detail as to why he recommends these procedures for drivers from the 260 family to the present.

I tried drivers 301.42, which is the current WHQL version, and betas 304.79 and 306.02. I'm currently running the last.

I did get a response today from Gigabyte support. The responder did not seem to be aware of GPUs being used in GIMPS. I have answered with more details about that part of GIMPS. I also pointed out that my Gigabyte GTX 460 runs CuLu without problems, and that there are other users with 570s who don't have the same problems.

The upshot is that CuLu still fails on the 570, so I'm back to using it for mfaktc and the 460 for CuLu.

The ball's back in their court.

kladner 2012-09-25 14:33

CUDALucas errors out @ particular iteration
 
After quite a number of successful LL and DC runs on the GTX 460, this morning I discovered CuLu wasn't running. After several attempts I captured the following information.
[CODE]mkdir: cannot create directory `savefiles': File exists
Continuing work from a partial result of M27278xxx fft length = 1440K iteration = 24700001
Iteration 24800000 M( 27278xxx )C, 0xc411eff38b7892cb, n = 1440K, CUDALucas v2.04 Beta err = 0.2969 (8:42 real, 5.2142 ms/iter, ETA 3:28:34)
Iteration = 24873570 >= 1000 && err = 0.35938 >= 0.35, fft length = 1440K, writing checkpoint file (because -t is enabled) and exiting.


RESTART:
Continuing work from a partial result of M27278527 fft length = 1440K iteration = 24300001
Iteration 24400000 M( 27278xxx )C, 0x3fe0d26bf5ef3efd, n = 1440K, CUDALucas v2.04 Beta err = 0.2910 (8:44 real, 5.2432 ms/iter, ETA 4:04:40)
Iteration 24500000 M( 27278xxx )C, 0xd679e2c74c32a974, n = 1440K, CUDALucas v2.04 Beta err = 0.2969 (8:45 real, 5.2509 ms/iter, ETA 3:56:17)
Iteration 24600000 M( 27278xxx )C, 0x9b52631a0b698d53, n = 1440K, CUDALucas v2.04 Beta err = 0.2891 (8:45 real, 5.2488 ms/iter, ETA 3:47:26)
Iteration 24700000 M( 27278xxx )C, 0x46cee10be9356d7a, n = 1440K, CUDALucas v2.04 Beta err = 0.3008 (8:44 real, 5.2414 ms/iter, ETA 3:38:23)
Iteration 24800000 M( 27278xxx )C, 0xc411eff38b7892cb, n = 1440K, CUDALucas v2.04 Beta err = 0.2969 (8:44 real, 5.2433 ms/iter, ETA 3:29:44)
Iteration = 24873570 >= 1000 && err = 0.35938 >= 0.35, fft length = 1440K, writing checkpoint file (because -t is enabled) and exiting.[/CODE]
I would appreciate any suggestions. As you can see, this run was within a few hours of completion. The error level barely exceeded 0.35. Do I have to restart it with a higher FFT?

Sorry if this has been addressed before. I don't remember anything quite like it.

EDIT: If restarting the exponent is the only answer, I would really appreciate a suggested FFT.

EDIT2: I restarted with -t disabled and it has gone from It. 24873502 to 24900000 with the error reported as 0.2734.
It just reached It. 25000000 err = 0.2500.
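For anyone wanting to track how close a run sits to the limit, the progress lines in the log above can be parsed with a short script. This is a minimal sketch in Python: the regular expression is inferred only from the lines pasted in this thread, not from the CUDALucas source, and the helper names `parse_progress` and `max_error` are made up for illustration.

```python
import re

# Field layout inferred from the log lines in this thread (an assumption,
# other CUDALucas versions may print something different).
LINE_RE = re.compile(
    r"Iteration (?P<iteration>\d+) M\( (?P<exponent>\S+) \)C, "
    r"0x(?P<residue>[0-9a-fA-F]+), n = (?P<fft>\d+K), .*?"
    r"err = (?P<err>\d+\.\d+) \(.*?(?P<ms>\d+\.\d+) ms/iter"
)


def parse_progress(line):
    """Return a dict for a progress line, or None for any other line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "iteration": int(m["iteration"]),
        "exponent": m["exponent"],   # kept as text, it may be masked (xxx)
        "fft": m["fft"],
        "err": float(m["err"]),
        "ms_per_iter": float(m["ms"]),
    }


def max_error(lines):
    """Return the parsed progress line with the largest err, or None."""
    parsed = [p for p in map(parse_progress, lines) if p is not None]
    return max(parsed, key=lambda p: p["err"]) if parsed else None


log = [
    "Iteration 24700000 M( 27278xxx )C, 0x46cee10be9356d7a, n = 1440K, "
    "CUDALucas v2.04 Beta err = 0.3008 (8:44 real, 5.2414 ms/iter, ETA 3:38:23)",
    "Iteration 24800000 M( 27278xxx )C, 0xc411eff38b7892cb, n = 1440K, "
    "CUDALucas v2.04 Beta err = 0.2969 (8:44 real, 5.2433 ms/iter, ETA 3:29:44)",
]
worst = max_error(log)
print(worst["iteration"], worst["err"])
```

Note that these lines only show the error at each reporting interval, so a spike between two reports (like the one at iteration 24873570) will not appear in them.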

Dubslow 2012-09-25 14:51

It seems to be a reproducible error, like Prime95 sometimes shows. Unfortunately, CUDALucas doesn't have a built-in way to override the error check and keep going. (Note that the error has been north of 0.25 for the whole test, so this exponent is [i]right[/i] on the edge, and probably should have used a higher FFT from the start. Do you know what the average error from the initial roundoff test at the beginning was? If it was right on the edge, then I'll probably decrease the allowable error for the initial test.)

The workaround I can think of is to save/pause, turn off -t, and then relaunch it and hope the error doesn't get caught.
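To make the exit condition concrete, here is a minimal Python sketch of the test as it reads in the exit message (`Iteration = ... >= 1000 && err = ... >= 0.35`). It is reconstructed from that message only, not taken from the CUDALucas source. The names `ERR_LIMIT`, `MIN_ITERATION` and `exceeds_limit` are hypothetical, and it does not model when the program actually performs the check with or without -t.

```python
ERR_LIMIT = 0.35       # threshold quoted in the exit message
MIN_ITERATION = 1000   # earlier iterations are exempt, per the same message


def exceeds_limit(iteration, err):
    """Condition from the exit message: iteration >= 1000 && err >= 0.35."""
    return iteration >= MIN_ITERATION and err >= ERR_LIMIT


# Values taken from the log posted above.
print(exceeds_limit(24873570, 0.35938))  # the iteration that stopped the run
print(exceeds_limit(24800000, 0.2969))   # a normal progress report
print(exceeds_limit(25000000, 0.2500))   # after relaunching without -t

# How far over the limit the failing iteration was.
print(f"margin over limit: {0.35938 - ERR_LIMIT:.5f}")
```

The failing iteration was over the limit by less than 0.01, which fits the point above that this exponent sits right on the edge for a 1440K FFT.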


All times are UTC.