mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

nucleon 2013-04-21 08:38

I wouldn't be surprised if there is an issue in the sw stack "somewhere".

I haven't run CUDALucas on the 460. I don't have the resources to check atm - it's outside a PC and I'm short a PSU to run it.

IMHO the card to get was the GTX580. (avoid MSI Twin Frozr cards - common fan failure)

In saying that, I'm in love with my Titans. Even though they aren't beating the GTX580 in mfaktc, they do have better power consumption while running it.

-- Craig

lycorn 2013-04-21 13:09

I own a 560Ti from Gigabyte. Works rather well with mfaktc (plenty of factors found).
The two times I tried it on CUDALucas (exponents somewhere in the 25-26M range) I got mismatches that were later proved to be wrong results. In both cases, I got no warnings whatsoever during the runs - they were silent errors. Needless to say, I gave up running CL on it.
The temperatures have always been very reasonable - somewhere in the low 60s (C). Nothing hinted at any problem in the card, but well...

kracker 2013-04-21 14:30

I hear that most GTX 560s and 570s need to be downclocked to be stable. Ask kladner and Mini-Geek.

@chalsall: can't you temporarily run winblows and see what happens, and say, experiment on the memory clocks a bit? :smile:

chalsall 2013-04-21 16:14

[QUOTE=kracker;337790]@chalsall: can't you temporarily run winblows and see what happens, and say, experiment on the memory clocks a bit? :smile:[/QUOTE]

That would certainly be a worthwhile experiment to run, but unfortunately not something I can do easily. The machine is in my girl-friend's office that I don't often visit.

Also, all of my "real" work is Linux based, so that's where I need the stability. If nothing else, this whole experiment has been worthwhile in that I now know what cards not to buy (or rent) if my CV project needs to scale up.

kladner 2013-04-21 16:53

[QUOTE=kracker;337790]I hear that most GTX 560s and 570s need to be downclocked to be stable. Ask kladner and Mini-Geek.

@chalsall: can't you temporarily run winblows and see what happens, and say, experiment on the memory clocks a bit? :smile:[/QUOTE]

I have a Gigabyte 460 (GF104) and a Gigabyte 570 (GF110), with two fans and three fans respectively.

The 460 (CC 2.1) is nVidia rated at 675 MHz core, 1800 MHz VRAM. Gigabyte OC is 715 MHz core, 1800 MHz VRAM. I run it mostly at 830 MHz core, 1800 MHz VRAM, though I got away with running the VRAM up to 2000 MHz in mfaktc. It turned in many DCLLs with CuLu at 830/1800, but had errors at 850 MHz.

The 570 (CC 2.0) is base rated at 732/1900 MHz, factory OC of 780/1900. It will not successfully run CuLu with that memory speed. A successful CuLu combination is 823/1800, but I usually hold it at 810 MHz for the core clock. This card will do long successful runs (@ factory VRAM speed) of MemTest G80 with the highest RAM allocation I could get it to accept. It runs mfaktc 0.19 and 0.20 without (visible!) problems at the highest settings shown above. It has no problems in FurMark/OCCT at the higher settings.

The 570 is also an RMA replacement. My belief is that it blew one or more capacitors, based on the sound it made when it died, plus the fact that I did not smell smoke as might have been the case if a VRM were at fault. The 570 runs hotter, though part of that is due to it having to breathe some of the 460's exhaust.

firejuggler 2013-04-21 17:44

1 Attachment(s)
what can I do for you?

Batalov 2013-04-21 19:54

[QUOTE=kladner;337797]The 570 is also an RMA replacement. My belief is that it blew one or more capacitors, based on the sound it made when it died, plus the fact that I did not smell smoke as might have been the case if a VRM were at fault. [/QUOTE]
Our cards were siamese twins, remember?

Same here, with the cosmetic difference that I had four (!) RMAs. After the 3rd RMA (three blowups), the replacement NV-570-OC was no good either: it wouldn't die/blow up, but would "fall off the bus", literally, and set the fans to 100%; it was pretty ugly. The 4th RMA took a long time (they couldn't find a replacement and I did not agree to a 660Ti replacement*); I finally got an NV-570-SO. While it doesn't blow up, it is not CUDALucas stable (50/50 hit and miss), and because I am using it in the Linux computer, I am in the same boat as Chris.

I am not buying another consumer-level GPU card. I've figured that their QC parameters ("users will never see dead pixels if there are few enough") are not compatible with my use.

My other, older card is very good though. (EVGA 570-minus-[SUP]1[/SUP]/[SUB]15[/SUB], a.k.a "560Ti 448-core"). I ran a few CUDALucas runs on it (including the "penultimate not-M48" check).

__________
*If you want to know how their kitchen works, it is pretty convoluted. Some RMA replacements (when the original product is no longer available) have a sliding scale of replacement products. The 660Ti was offered for free; for the 670Ti they wanted me to pay. I negotiated for the 500 series upward from the original product. They didn't give me a 580. Pity! :-)

Additional free advice. Do not use webforms for the second and later RMAs. Use the phone that was included with the first RMA email correspondence ((626) 854-9338, option 4). Demand a prepaid postage label - and they will give it to you. Demand replacement, not a fix, after two RMAs.

owftheevil 2013-05-04 21:17

1 Attachment(s)
Like many of you, I have a card that finds factors with mfaktc and passes all memory tests, but gives round-off errors or mismatched residues while running CUDALucas or CUDAPm1. I wrote this simple GPU memory test to help me see where the errors are coming from. It is not very sophisticated, but has the advantage that it uses the same kind of data and similar memory use patterns as CUDALucas and CUDAPm1. My 560Ti, which cannot run CuLu or CuPm1, fails miserably. My 570, which handles CuLu and CuPm1 unerringly, also runs this without error. Give it a try if you are interested. I am curious to see your results.

chalsall 2013-05-04 21:55

1 Attachment(s)
[QUOTE=owftheevil;339265]Like many of you, I have a card that finds factors with mfaktc and passes all memory tests, but gives round-off errors or mismatched residues while running CUDALucas or CUDAPm1. I wrote this simple GPU memory test to help me see where the errors are coming from.[/QUOTE]

Coolness... The Scientific Method...

I will run the complete test with a redirection into a file, but this is what I was able to cut-and-paste from the command line. Looks good.

chalsall 2013-05-04 22:23

[QUOTE=chalsall;339269]Looks good.[/QUOTE]

Okay, Houston, we've had a problem here....

[CODE]Initializing test using 975MB of memory on device 1

Position 0, Iteration 10, Total Errors: read 0, write 0
Position 0, Iteration 20, Total Errors: read 0, write 0
Position 0, Iteration 30, Total Errors: read 0, write 0
...
Position 0, Iteration 290, Total Errors: read 0, write 0
Position 0, Iteration 300, Total Errors: read 0, write 0
Position 0, Iteration 310, Total Errors: read 1, write 0
Position 0, Iteration 320, Total Errors: read 1, write 0
...
Position 5, Iteration 710, Total Errors: read 1, write 0
Position 5, Iteration 720, Total Errors: read 1, write 0
Position 5, Iteration 730, Total Errors: read 1, write 0
....
[/CODE]

chalsall 2013-05-04 23:07

I will give you the full data-set once the run is complete, but all the errors appear to be on read:

[CODE]Position 6, Iteration 940, Total Errors: read 1, write 0
Position 6, Iteration 950, Total Errors: read 1, write 0
Position 6, Iteration 960, Total Errors: read 3, write 0
Position 6, Iteration 970, Total Errors: read 3, write 0
...
Position 11, Iteration 30, Total Errors: read 3, write 0
Position 11, Iteration 40, Total Errors: read 3, write 0
Position 11, Iteration 50, Total Errors: read 4, write 0
Position 11, Iteration 60, Total Errors: read 4, write 0
...
Position 14, Iteration 400, Total Errors: read 4, write 0
Position 14, Iteration 410, Total Errors: read 4, write 0
Position 14, Iteration 420, Total Errors: read 5, write 0
Position 14, Iteration 430, Total Errors: read 5, write 0
Position 14, Iteration 440, Total Errors: read 5, write 0
Position 14, Iteration 450, Total Errors: read 6, write 0
Position 14, Iteration 460, Total Errors: read 6, write 0
....
[/CODE]
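
For anyone who wants to see exactly where each new error shows up in output like the above, something along these lines could scan the redirected file. It is only a rough sketch: it assumes every progress line has the "Position ..., Iteration ..., Total Errors: read ..., write ..." form quoted above, and the function name find_error_jumps is made up for illustration (it is not part of the test program). Since the totals are only printed every 10 iterations, a jump only narrows the error down to the preceding reporting interval.

```python
import re

# Assumed line format, copied from the output quoted in this thread:
# "Position 6, Iteration 960, Total Errors: read 3, write 0"
LINE_RE = re.compile(
    r"Position (\d+), Iteration (\d+), Total Errors: read (\d+), write (\d+)"
)


def find_error_jumps(lines):
    """Return (position, iteration, new_read, new_write) for every parsed
    line whose cumulative totals are higher than on the previous parsed line.
    The first parsed line is used only as the baseline."""
    jumps = []
    prev = None  # (read_total, write_total) from the previous parsed line
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip "...", blank lines, the "Initializing" header
        pos, it, rd, wr = map(int, m.groups())
        if prev is not None and (rd > prev[0] or wr > prev[1]):
            jumps.append((pos, it, rd - prev[0], wr - prev[1]))
        prev = (rd, wr)
    return jumps


# A few lines taken from the output above, used as sample input.
sample = """\
Position 6, Iteration 950, Total Errors: read 1, write 0
Position 6, Iteration 960, Total Errors: read 3, write 0
Position 6, Iteration 970, Total Errors: read 3, write 0
...
Position 11, Iteration 40, Total Errors: read 3, write 0
Position 11, Iteration 50, Total Errors: read 4, write 0
""".splitlines()

for pos, it, new_rd, new_wr in find_error_jumps(sample):
    print(f"Position {pos}, Iteration {it}: +{new_rd} read, +{new_wr} write")
```

With a full log, replace the sample with the lines of the redirected file, e.g. find_error_jumps(open("memtest.log")), where memtest.log is just a placeholder name.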

