mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   Geforce GTX Titan 6GB (https://www.mersenneforum.org/showthread.php?t=17834)

chalsall 2013-03-13 17:45

[QUOTE=Dubslow;333206]You are wrong. The issue here is bitshift-edness, which non-testing versions of CuLu don't yet do. (Mostly my fault at the moment.)[/QUOTE]

Ah... Very good point! I didn't think of that.

ewmayer 2013-03-13 18:56

[QUOTE=chalsall;333207]I would argue that CUDALucas has proven itself to be as trustworthy as Prime95/mprime.[/QUOTE]

Based on all the tests done to date, what is the error rate?

chalsall 2013-03-13 19:16

[QUOTE=ewmayer;333214]Based on all the tests done to date, what is the error rate?[/QUOTE]

A good question. Without a massive amount of spidering, I can't speak to that.

I was basing my opinion on the fact that CUDALucas, and many GPUs, were used to confirm our recent find. To the best of my understanding, every run that didn't self-report errors confirmed the result.

As always, I'm happy to be proven wrong.

Batalov 2013-03-13 19:38

[in philosophical mood]
You know how negative results are [STRIKE]never[/STRIKE] rarely published?
Second to that, when people [I]per aspera ad astra[/I] do get to positive results, do they publish all their blood, sweat and tears? No, as a general rule, they do not. I happened to see blood, sweat and tears of CUDALucas (including the validation), but I will not elaborate.

Suffice it to say that someone who would like to run two simultaneous Tesla CUDALucas runs for 90 days on M332,xxx,xxx and match their residues along the way would do well to quit their day job for that. ([SIZE=1]Plus, of course, the shift does need to be implemented, with the RES64 retrieved from the proper place in the residue so that it is comparable. But this is no biggie. There's a program that will compare two randomly shifted equivalent savefiles, too, even though this is overkill. If there is a 1-bit mismatch in any given iteration, then after a few more iterations every bit will be completely scrambled, so comparing RES64s is totally adequate[/SIZE].)
[mood /][back-to-work mood]
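
To illustrate the shift idea mentioned above, here is a minimal sketch in Python. It is not CUDALucas code: it uses plain big-integer arithmetic instead of FFT multiplication, and the names ll_shifted, unshift and res64 are made up for this example. The assumption is the usual one, that the working value is stored rotated left by some number of bits, the shift doubles (mod p) with every squaring, and the RES64 is read only after rotating the value back.

```python
def ll_shifted(p, shift=0, iterations=None):
    """Run Lucas-Lehmer iterations for M_p = 2**p - 1 with the working
    value stored multiplied by 2**shift (mod M_p).
    Returns (stored_value, current_shift). Hypothetical helper."""
    m = (1 << p) - 1
    n = p - 2 if iterations is None else iterations
    shift %= p
    x = (4 << shift) % m                 # s_0 = 4, stored shifted
    for _ in range(n):
        shift = (2 * shift) % p          # squaring doubles the shift
        # The "-2" must be shifted by the same amount as x*x now is.
        x = (x * x - (2 << shift)) % m
    return x, shift


def unshift(x, shift, p):
    """Undo the shift: multiply by 2**(p - shift), which equals
    2**(-shift) mod M_p because 2**p == 1 (mod M_p)."""
    m = (1 << p) - 1
    return (x << (p - shift)) % m


def res64(p, shift=0, iterations=None):
    """Low 64 bits of the unshifted residue, comparable across shifts."""
    x, s = ll_shifted(p, shift, iterations)
    return unshift(x, s, p) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    # Two runs of the same exponent with different starting shifts
    # should report the same RES64 at the same iteration.
    for start in (0, 17, 99):
        print(start, hex(res64(521, start, iterations=100)))
```

Reading the low 64 bits of the stored value directly would give different numbers for different starting shifts, which is why the residue has to be taken from the proper place.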

chalsall 2013-03-13 19:54

[QUOTE=Batalov;333223][in philosophical mood]
You know how negative results are [STRIKE]never[/STRIKE] rarely published?][back-to-work mood][/QUOTE]

And then you have the truly brave and honest scientists.

[URL="http://en.wikipedia.org/wiki/Andrew_Lyne"]Andrew Lyne[/URL] comes to mind.

Batalov 2013-03-13 20:44

Rosie Redfield is another.
Eric Tippmann is yet another. There are quite a few.

nucleon 2013-03-13 22:28

Another Titan double check match.

Processing result: M( 29325073 )C, 0x23c2d7bc0d08e8d4, n = 1835008, CUDALucas v2.03
LL test successfully completes double-check of M29325073
CPU credit is 29.1622 GHz-days.

To me it looks like underclocking the RAM did the trick.

The hard part now is to work out the most efficient use of this card.

-- Craig

kracker 2013-03-13 23:30

I guess CuLu is *really* an intensive memory test(er).
I mean, meh, Titan is supposed to be a half-compute card....

TObject 2013-03-13 23:52

I wonder if Titans in the Titan supercomputer at Oak Ridge National Laboratory are running at full speed. Those do have ECC memory, but it must be throwing faults like crazy…

How does ECC memory on GPUs work, just silent correction?

chalsall 2013-03-14 00:01

[QUOTE=TObject;333255]How does ECC memory on GPUs work, just silent correction?[/QUOTE]

My understanding is they halt when an error is detected.

frmky 2013-03-14 00:38

The Titan cluster uses Tesla K20Xs, which are clocked at 732 MHz. Basically, they are the cream of the crop and clocked a little slower to make sure they remain stable. I'm sure they are fine.


All times are UTC.
