mersenneforum.org  

2018-03-22, 14:50   #1
MisterBitcoin
"Nuri, the dragon :P"
Jul 2016
Good old Germany
2³×3×37 Posts

Titan V may generate false results.

Different results for the same test (protein and enzyme simulation); well, that's not good.

Source.

2018-03-22, 15:06   #2
VictordeHolland
"Victor de Hollander"
Aug 2011
the Netherlands
3²·131 Posts

I wouldn't be surprised by this IF it was a consumer/gaming card, not a 2000-3000 €/$ card....

2018-03-22, 15:11   #3
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
6,793 Posts

Non-ECC memory, so such things should be expected. And it isn't just this card that will exhibit the problem; all systems without ECC memory are affected.

2018-03-23, 01:35   #4
ixfd64
Bemusing Prompter
"Danny"
Dec 2002
California
2³·313 Posts

Probably a dumb question, but does ECC memory reduce the error rate to near zero, or just by a portion?

2018-03-23, 01:59   #5
Madpoo
Serpentine Vermin Jar
Jul 2014
2×13×131 Posts

Quote:
Originally Posted by MisterBitcoin
Different results for the same test (protein and enzyme simulation); well, that's not good.

Source.
From the article:
Quote:
After repeated tests on four of the top-of-the-line GPUs, he found two gave numerical errors about 10 per cent of the time.
So essentially an error rate of 5% total for the 4 machines, which, hey, is about what we see here at GIMPS.
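
The 5% figure can be checked with a quick sketch. It assumes every card ran the same number of tests and that the two remaining cards produced no errors at all; the article quote does not state either point, and the function name is only illustrative.

```python
def pooled_error_rate(per_card_rates):
    """Average error rate across cards, assuming an equal number of runs per card."""
    return sum(per_card_rates) / len(per_card_rates)


# Two cards erred about 10% of the time; the other two are ASSUMED error-free.
rates = [0.10, 0.10, 0.0, 0.0]
print(f"pooled error rate: {pooled_error_rate(rates):.1%}")
```

If the cards ran unequal numbers of tests, the pooled rate would have to be weighted by run count instead.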

2018-03-23, 02:03   #6
Madpoo
Serpentine Vermin Jar
Jul 2014
6516₈ Posts

Quote:
Originally Posted by ixfd64
Probably a dumb question, but does ECC memory reduce the error rate to near zero, or just by a portion?
I'll go out on a limb and say it reduces it to near zero, at least for Prime95.

In all of my tests on ECC-enabled systems, I never had a bad result (and I only did double-checks, so... there you go).

There are also other users out there who use ECC systems almost exclusively (on AWS instances, or whatever else) and those also have an impeccable history.

Unfortunately PrimeNet isn't tracking whether or not ECC is in use on the systems (hey, feature request!) so this is anecdotal evidence.

2018-03-23, 02:07   #7
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1A89₁₆ Posts

Quote:
Originally Posted by ixfd64
Probably a dumb question, but does ECC memory reduce the error rate to near zero, or just by a portion?
The reduction ratio would be approximately this:

You go from a single error anywhere in the gigabytes of data killing the entire run, to needing three (or more) concurrent errors all clustered in a single 72-bit memory word (64 bits of data + 8 ECC check bits).
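
To make the "three or more" threshold concrete, here is a minimal sketch of a single-error-correcting, double-error-detecting (SECDED) code over a 72-bit word. It uses a textbook extended Hamming layout as an assumption; real memory controllers use their own check matrices, and the names `encode`, `decode` and `flip` are purely illustrative.

```python
# Textbook extended Hamming SECDED code over a 72-bit word (an assumed layout,
# not the matrix any particular memory controller uses).
CODE_BITS = 72                          # position 0: overall parity; 1..71: Hamming code
PARITY_POS = [1, 2, 4, 8, 16, 32, 64]   # 7 Hamming check bits (+1 overall = 8)
DATA_POS = [p for p in range(1, CODE_BITS) if p not in PARITY_POS]  # 64 data positions


def encode(data: int) -> list:
    """Spread 64 data bits over a 72-bit codeword and fill in the 8 check bits."""
    bits = [0] * CODE_BITS
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (data >> i) & 1
    for p in PARITY_POS:
        # Check bit p makes the parity of all positions containing bit p even.
        bits[p] = sum(bits[q] for q in range(1, CODE_BITS) if q & p) % 2
    bits[0] = sum(bits) % 2             # overall parity over the other 71 bits
    return bits


def decode(bits: list):
    """Return (status, data); data is None when an error is detected but not fixable."""
    syndrome = 0
    for q in range(1, CODE_BITS):
        if bits[q]:
            syndrome ^= q               # XOR of set positions; 0 for a valid codeword
    odd = sum(bits) % 2                 # 1 if an odd number of bits flipped
    fixed = list(bits)
    if odd:
        if syndrome >= CODE_BITS:
            return "uncorrectable", None
        fixed[syndrome] ^= 1            # treat it as a single-bit error and repair it
        status = "corrected"
    elif syndrome:
        return "uncorrectable", None    # even number of flips: double error detected
    else:
        status = "ok"
    data = sum(fixed[pos] << i for i, pos in enumerate(DATA_POS))
    return status, data


def flip(bits: list, *positions: int) -> list:
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(bits)
    for p in positions:
        out[p] ^= 1
    return out


if __name__ == "__main__":
    word = 0x0123456789ABCDEF
    cw = encode(word)
    print(decode(flip(cw, 17)))         # one flipped bit: repaired
    print(decode(flip(cw, 3, 5)))       # two flipped bits: detected, not repaired
    status, data = decode(flip(cw, 3, 5, 6))
    print(status, data == word)         # three flipped bits can slip through
```

In this sketch, flipping positions 3, 5 and 6 gives a syndrome of zero with odd overall parity, so the decoder believes only the overall parity bit was hit and hands back wrong data marked as corrected. That is the silent-corruption case, and it needs at least three flips inside the same word.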

2018-03-23, 02:19   #8
ixfd64
Bemusing Prompter
"Danny"
Dec 2002
California
9C8₁₆ Posts

I see. So that would make the chance of an error very, very low, then.

2018-03-23, 04:42   #9
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴·3·163 Posts

Quote:
Originally Posted by ixfd64
Probably a dumb question, but does ECC memory reduce the error rate to near zero, or just by a portion?
This reference seems to indicate memory dominates because it's most of the area of the active system devices, but transmission lines and logic circuits are also affected. Cosmic rays dominate. https://en.wikipedia.org/wiki/Soft_e...oft_error_rate

2018-03-23, 04:53   #10
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
6,793 Posts

Quote:
Originally Posted by kriesel
Cosmic rays dominate.
Do you have a reference for this? I think that is not actually correct. Most cosmic rays are stopped well before reaching ground level. I think most of the soft errors are actually caused by radioactive impurities in the plastic packaging.

2018-03-23, 04:59   #11
airsquirrels
"David"
Jul 2015
Ohio
205₁₆ Posts

Quote:
Originally Posted by retina
Do you have a reference for this? I think that is not actually correct. Most cosmic rays are stopped well before reaching ground level. I think most of the soft errors are actually caused by radioactive impurities in the plastic packaging.
The Wikipedia article disagrees, but I don’t know if that counts as a reference.

There’s a solution though. “Burying a system in a cave reduces the rate of cosmic-ray induced soft errors to a negligible level.”

Perhaps we should become GIMPS miners and move underground, since we’re losing Madpoo’s double-check powers.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔