mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

TheJudger 2015-09-26 10:58

OK, 352.07 shows lots of N/A for me, while 352.41 shows fewer N/A (more information).

blip 2015-09-28 09:25

I updated to 355.11, still the same N/A's for my GTX590.

kladner 2015-09-28 11:39

[QUOTE=blip;411459]I updated to 355.11, still the same N/A's for my GTX590.[/QUOTE]

355.82 is no different on a 580.

TheJudger 2015-09-28 17:16

OK, highly depends on the GPU itself, on my GT 630 (GK208) the number of "N/A" remains the same, too.
So far I have seen fewer "N/A" on[LIST][*]cheap 750 (non-Ti)[*]ASUS 970 Strix OC[*]ref. GTX 980[*]Palit 980Ti OC[/LIST]

Oliver

henryzz 2015-09-30 17:25

[QUOTE=TheJudger;411491]OK, highly depends on the GPU itself, on my GT 630 (GK208) the number of "N/A" remains the same, too.
I saw less "N/A" so far on[LIST][*]cheap 750 (non-Ti)[*]ASUS 970 Strix OC[*]ref. GTX 980[*]Palit 980Ti OC[/LIST]
Oliver[/QUOTE]

All of which are Maxwell v1 or v2

kladner 2015-09-30 17:45

2 Attachment(s)
[QUOTE=wombatman;411047]This is good info. I have an EVGA, so that may be it, but I'll try your suggested tweaks and see if it does anything. Thanks![/QUOTE]

Well, so much for uninterrupted running of CuLu. I saw the display blink, and when I looked at CL, the restart count was at 5. I restarted the whole system, and CL restarted again within 20 minutes. This is with TDRDelay set at 128, which I saw someone mention somewhere on the forum. (I am also aware of your recommended setting of 10, WombatMan.:smile:)

The voltages and clock settings were the same as on a previous DC which got through 12+ hours without resetting.

I am starting to think that it is a waste of processing time to set TDRD that high. On the Afterburner monitor, that card showed declining temperature for a few minutes. This suggests that it has stopped doing real work and is unlikely to resume without a reset. I will set TDRDelay to 15 and see if I can screen-grab the pattern in the plot.
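
For anyone who wants to script the TDRDelay change mentioned above, here is a minimal sketch that builds the text of a .reg file. It assumes the usual location of the Windows timeout setting: a DWORD named `TdrDelay` (in seconds) under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`. The function name `tdr_delay_reg` is made up for illustration, and nothing here comes from CUDALucas itself.

```python
def tdr_delay_reg(seconds: int) -> str:
    """Return the text of a .reg file that sets TdrDelay to `seconds`.

    Assumption: TdrDelay is a REG_DWORD under the GraphicsDrivers key.
    This only builds the text; it does not touch the registry.
    """
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit in an unsigned 32-bit DWORD")
    key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key}]\r\n"
        # .reg files store DWORD values as eight hex digits.
        f'"TdrDelay"=dword:{seconds:08x}\r\n'
    )


if __name__ == "__main__":
    # The value 15 is the one proposed in the post above.
    print(tdr_delay_reg(15))
```

The output can be saved to a file and imported with regedit. A reboot is normally needed before a new TdrDelay value takes effect.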

EDIT: It sneaked one past me. TDRD was still 128, but there was no apparent long decline. The only thing I can guess is connected to the reset is the tiny blip in the attached plot.

Notice that CL runs the card at 99% usage. This is true regardless of what else is running on the system. On the other hand, with mfaktc, usage drops from 99% to 98% when P95 gets going. This does not happen with the Small FFT stress test, but it does with Large and Blend tests.

EDIT2: Reset again, just now. Still just the momentary dip in usage. I have seen, but failed to capture, ravine-like plots in temp preceding a reset.

I just made a simulation of such a plot using mfaktc. See below.

kladner 2015-09-30 22:32

Ooops! Wrong thread.

EDIT: Regarding the plots shown above, I have now concluded that the momentary dips are not associated with CuLu timing out. I have seen such dips when CL was running smoothly.

airsquirrels 2016-02-02 01:04

I have a system with a Titan Z, a 590, and a 690 in it.

Previously all three were running mfaktc on both GPUs without issue. I switched the Z GPUs over to LL and that has been very successful; however, the 590 and 690 both return all 0x000000000000 interim residues and 0.0 error rates. No errors that I can see.

Any idea what is causing this?

Brain 2016-02-02 20:45

When testing my first Titan, I couldn't run CL without getting wrong residues. Another user found that lowering the memory clock solves the problem. I've been running successfully for several years now at 2600 MHz instead of 3000 MHz. Give it a try.

airsquirrels 2016-02-02 21:05

Unfortunately my issue isn't with incorrect residues, rather no residues at all. It is as if something is failing moving the initial data to the card.

bgbeuning 2016-02-03 01:08

[QUOTE=airsquirrels;425008]Unfortunately my issue isn't with incorrect residues, rather no residues at all. It is as if something is failing moving the initial data to the card.[/QUOTE]

Me too. Every iteration says residue = 0.
This is my first time running CUDALucas so I did not know that was wrong.

I compiled CUDALucas myself to get it to work, so I could easily have done something
wrong. Maybe it could check for an all-zero residue and quit with a message saying something is broken.
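
The check suggested above could look roughly like the sketch below. It is not CUDALucas code: the helper names `is_zero_residue` and `should_abort`, and the threshold of three consecutive interim residues, are assumptions made for illustration. It applies only to interim residues, since a zero residue at the final Lucas-Lehmer iteration is exactly what a prime result looks like.

```python
from typing import Iterable


def is_zero_residue(residue: str) -> bool:
    """True if a hex residue string such as '0x000000000000' is all zeros.

    Raises ValueError if the string is not valid hexadecimal.
    """
    return int(residue.strip(), 16) == 0


def should_abort(interim_residues: Iterable[str], min_count: int = 3) -> bool:
    """True if at least `min_count` interim residues were seen and all are zero.

    Hypothetical policy: a run of all-zero interim residues almost certainly
    means no real data reached the card, so the run should stop with an error.
    """
    residues = list(interim_residues)
    return len(residues) >= min_count and all(
        is_zero_residue(r) for r in residues
    )


if __name__ == "__main__":
    seen = ["0x000000000000", "0x000000000000", "0x000000000000"]
    if should_abort(seen):
        print("All interim residues are zero; something is broken.")
```

Requiring several zero residues in a row, rather than one, is a design choice that keeps the check from firing on a single odd line of output.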


All times are UTC.