[QUOTE=tha;419357][I]I posted a thread on [URL="https://communities.intel.com/thread/96157?forceNoRedirect=true"]Intel's hardware forum[/URL]. See if that gets us anything.[/I][/QUOTE]
That is another really impressive presentation on the problem. I like the 'hook' you devised, "How to freeze.....".
It is very good to see the Big Guys paying at least some attention to this issue. I wish I could help with pressuring them. I hope their interest can be increased and focused. Great Work, all of you! :tu: EDIT: For some reason, I could not turn off italic on my response. I guess that came from me pasting in tha's post[I]. Hmm. I was able to change what I had already written, but this has reverted. [/I] |
[QUOTE=Dubslow;419366]I don't disagree; my suggestion should only be considered, say, once we get another "base-touch" with MSI and ASRock (to see if the former have reproduced anything, and if the latter's corporate contacts have accomplished anything in the last week).[/QUOTE]
I'd probably want something along the lines of "proof of concept" code... something that narrows down exactly what's wrong and abstracts it from Prime95. In fact, I don't know what's involved but could a simple runtime be generated that does the same thing as the 768K FFT test with AVX only (no AVX2/FMA) that simply runs and checks the round off... something that might display the current roundoff every so many iterations so you know it's doing something, and then lets you know when it crossed the threshold?
I'm probably just a little leery of pointing a finger too strongly at hardware if there's any question that it's something else, or when they might point a finger back and say "Aw, that Prime95 program is buggy" or something. Don't think they wouldn't... until of course they're proven wrong.
I had a bad experience with Western Digital when <name of local telco inserted here> bought a large batch of systems with WD drives... if memory serves these were the largest desktop drives of the day, a whopping 4.3 GB or whatever. Anyway, as we're rolling these systems out in whatever city we were in at the time, large #'s of the drives started crashing... click of death stuff. I made the very salient point that although HP (they were HP Vectras) sourced drives from different manufacturers, it was only the WD drives that were dying. And at a point it wasn't if they'd click-of-death, it was when. HP was great and shipped out replacements, and me being the naive fool I was at the time (think of what happened at the same telco a few years later) went online and asked around if others had issues with WD drives and generally trying to find out how widespread this was. Western Digital gets a hold of someone at <telco company> and next thing I know I'm hauled into an office where I'm told WD was very upset I had defamed them online with my unproven allegations.
To their credit, my employer realized this was just WD doing a little CYA and that was basically the end of it, with my promise not to say anything bad about WD on Usenet. Sigh. Then of course WD finally gets back to HP and us and admits they had a manufacturing issue where metal shavings were getting inside the units, or something like that, or maybe it was the coating on the platters flaking off...whatever. Hard to remember details from 18 years ago or whatever. :smile: |
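For anyone wondering what the "proof of concept" runtime suggested in the post above might look like in outline, here is a rough sketch. It is not Prime95's code: it uses a tiny pure-Python FFT with no AVX, so it would not be expected to trigger the hang or the hardware error itself. It only shows the shape of the loop (square via FFT, measure how far each result is from an integer, report every so many iterations, stop when a threshold is crossed). The function names, the exponent 127, the 8-bit digit size and the 0.4 threshold are all assumptions chosen for illustration.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def to_digits(value, base, count):
    """Split a non-negative integer into `count` little-endian digits."""
    digits = []
    for _ in range(count):
        value, digit = divmod(value, base)
        digits.append(digit)
    return digits


def from_digits(digits, base):
    """Rebuild an integer from little-endian digits."""
    total = 0
    for digit in reversed(digits):
        total = total * base + digit
    return total


def square_with_roundoff(digits, base):
    """Square a digit list via FFT; return (result digits, max round-off)."""
    size = 1
    while size < 2 * len(digits):
        size *= 2
    padded = [complex(d) for d in digits] + [0j] * (size - len(digits))
    spectrum = fft(padded)
    product = fft([x * x for x in spectrum], invert=True)
    worst, carry, result = 0.0, 0, []
    for x in product:
        real = x.real / size            # inverse FFT scaling
        nearest = round(real)
        worst = max(worst, abs(real - nearest))
        carry, digit = divmod(nearest + carry, base)
        result.append(digit)
    return result, worst


def run_roundoff_check(exponent=127, iterations=50, report_every=10,
                       threshold=0.4, base=256):
    """Repeatedly square 3 modulo 2**exponent - 1, watching the round-off.

    Returns (final value, worst round-off seen, True if no check failed).
    The default threshold of 0.4 is an illustrative assumption.
    """
    modulus = (1 << exponent) - 1
    count = -(-exponent // 8)           # 8-bit digits needed for the residue
    x, worst_seen = 3, 0.0
    for i in range(1, iterations + 1):
        squared, roundoff = square_with_roundoff(to_digits(x, base, count), base)
        worst_seen = max(worst_seen, roundoff)
        result = from_digits(squared, base)
        if roundoff >= threshold or result != x * x:
            print(f"iteration {i}: FAILED, round-off {roundoff:.3e}")
            return x, worst_seen, False
        x = result % modulus
        if i % report_every == 0:
            print(f"iteration {i}: max round-off so far {worst_seen:.3e}")
    return x, worst_seen, True


if __name__ == "__main__":
    run_roundoff_check()
```

Because Python has exact big integers, the sketch can also compare each FFT result against `x * x` directly, which a real reproducer at 768K FFT size would not do this way. A version aimed at the actual problem would have to run the same AVX code path on all 8 hyperthreads, which is well beyond this outline.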
[QUOTE=tha;419446]Looking for the microcode version, BIOS version, and motherboard details. [U]Internally we have managed to reproduce the hang[/U] overnight by moving to an [U]older version of microcode[/U] but we want to ensure [U]that we understand all of the ramifications to catch any potential corner cases[/U].[/QUOTE]
This is big! Great work everyone! But we're not done yet. The good news is it is possible this can be (or perhaps already has been) fixed via microcode, rather than requiring a recall. |
:popcorn:
|
[QUOTE=chalsall;419464]The good news is it is possible this can be (or perhaps already has been) fixed via microcode, rather than requiring a recall.[/QUOTE]
Well, that is bad news, you mean they will not change my bad 6700k which I bought with the sweat of my hard work, into a new 6990X? (when that will appear, I mean, I know how long they can take to solve that issue...) :razz: |
[QUOTE=Batalov;419453]Suggestion: (register and) post in [URL="https://communities.intel.com/thread/96157?forceNoRedirect=true"]Henk's thread[/URL]. This will get Intel's attention; posting here - maybe not so fast.[/QUOTE]
I was trying to register to post a detailed explanation, but I'm stuck in the email verification process which does not seem to work. I did get an email but even if it says thank you for verifying email address, it still asks for verification again. |
[QUOTE=ATH;419496]I was trying to register to post a detailed explanation, but I'm stuck in the email verification process which does not seem to work.
[/QUOTE] I got troubles registering too, but eventually had it registered. |
[QUOTE=tha;419500]I got troubles registering too, but eventually had it registered.[/QUOTE]
Maybe you can write a detailed explanation to them? Something like this:
In order to replicate the error make sure hyperthreading is enabled on a Skylake 6700K.
Download Prime95 version 27.9, as the error seems to occur more frequently here: [url]ftp://mersenne.org/gimps/p95v279.win64.zip[/url]
The error also occurs in the newest version 28.7: [url]ftp://mersenne.org/gimps/p95v287.win64.zip[/url] but if you try this you need to create a file called "local.txt" in the directory with the line:
CpuSupportsFMA3=0
because version 28.7 uses AVX2/FMA3 by default and the error seems to occur only with AVX.
Start Prime95.exe and choose "Options" - "Torture test", and fill out the popup box like this: [url]http://www.bilder-hochladen.net/files/hb0a-9y-635a.jpeg[/url] except change the bottom one "Time to run each FFT size (in minutes)" to 120 minutes or more, however long you want to run the test.
In summary, the error occurs only with HT on and all 8 virtual threads running tests, and only with AVX. The error is currently only experienced with the 768K FFT (Fast Fourier transform) size in Prime95. |
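The "local.txt" step in the post above is easy to get wrong by hand (wrong directory, or a duplicate line), so here is a small sketch that does it with a script. The file name and the `CpuSupportsFMA3=0` line come from the post; the helper name `write_local_txt` and the idea of keeping any other existing lines in the file are assumptions made for this example.

```python
from pathlib import Path


def write_local_txt(prime95_dir):
    """Make sure local.txt in prime95_dir contains the line CpuSupportsFMA3=0.

    Other lines already in the file are kept; any earlier CpuSupportsFMA3
    setting is replaced. Returns the path of the file that was written.
    """
    path = Path(prime95_dir) / "local.txt"
    lines = path.read_text().splitlines() if path.exists() else []
    # Drop any previous setting so the option appears exactly once.
    lines = [ln for ln in lines if not ln.startswith("CpuSupportsFMA3=")]
    lines.append("CpuSupportsFMA3=0")
    path.write_text("\n".join(lines) + "\n")
    return path
```

Call it with the folder where the 28.7 zip was unpacked, for example `write_local_txt(r"C:\prime95")` (that path is only a placeholder), before starting Prime95.exe.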
I looked hard over there but did not find the excellent information from ATH's post on how to reproduce the error. I fear that they will say it is not reproducible, and once they say that ....
|
[QUOTE=ATH;419496]I was trying to register to post a detailed explanation, but I'm stuck in the email verification process which does not seem to work.
I did get an email but even if it says thank you for verifying email address, it still asks for verification again.[/QUOTE] I was able to register but I'm not able to log into the account: [QUOTE]Account confirmation incomplete. You must confirm your email address before we can activate your access to Intel Communities. Please click the email address confirmation link in the confirmation email you received when you registered for Intel Communities.[/QUOTE] |
[QUOTE=ATH;419502]Maybe you can write a detailed explanation to them?[/QUOTE]
Copied to the Intel forum, thanks! |