#540 |
|
Dec 2011
After milion nines:)
1,451 Posts |
This is an already-reported bug/feature:
[Worker #5 Jun 20 16:44:40] Worker starting
[Worker #5 Jun 20 16:44:40] Trying backup intermediate file: p8_1100198.write
[Worker #5 Jun 20 16:44:40] Error reading intermediate file: p8_1100198.write
[Worker #5 Jun 20 16:44:40] Renaming p8_1100198.write to p8_1100198.bad1
[Worker #5 Jun 20 16:44:40] All intermediate files bad. Temporarily abandoning work unit.
That would all be fine if the candidate were not erased from worktodo.txt.
Please add a check to the program: if all intermediate files are bad, don't delete the candidate from worktodo.txt.
Luckily I keep an XLS table, and when I sorted the results I saw that one was missing.
Last fiddled with by pepi37 on 2020-06-21 at 15:15 |
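The spreadsheet cross-check described above can be approximated with a few lines of Python. This is only a minimal sketch: the function name `missing_candidates` is hypothetical, the candidate names other than `p8_1100198` are made up for illustration, and it assumes you already have the submitted and reported candidates as plain lists of names (it does not parse worktodo.txt or any results file).

```python
def missing_candidates(submitted, reported):
    """Return the submitted candidates that have no reported result.

    The order of `submitted` is preserved so the output is easy to
    compare against the original work list.
    """
    done = set(reported)
    return [name for name in submitted if name not in done]


# Hypothetical example data: only p8_1100198 comes from the log above.
submitted = ["p8_1100196", "p8_1100198", "p8_1100200"]
reported = ["p8_1100200", "p8_1100196"]

print(missing_candidates(submitted, reported))
```

Running a check like this after each batch would flag a candidate that was dropped from the work list without producing a result.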
|
|
|
|
|
#541 | |
|
Jul 2020
South Florida
22 Posts |
Quote:
I have a similar problem with small FFTs. It is reproducible in terms of both which cores generate the errors and the time at which it happens. I have never seen a BSOD or any other sort of error indication, and the chip has handled literally everything else I've thrown at it. (No, not an argument for stability, or even a whine, just another data point.)
Given how consistently reproducible the issue was, I started looking into it and found that I could "move" the problem by varying the number of workers I used and how many cores I gave to each. However, within a configuration (say 4 workers each with 4 cores), the issue was 100% reproducible both in terms of time and which cores failed. Even though I've now "worked through" the issue, I'm not sure what I think about my conclusions, which is why I'm here. What I found was:
So, I went and made one tweak to the UEFI: I simply set LLC. (It was an incremental change, so don't get the impression I only did it once.) That is the ONLY tweak I made to the UEFI; everything else is 100% stock on the latest and greatest revision of the UEFI. And presto! With LLC set to "Low", small FFTs (or even Blend for that matter) have run for hours. The small FFTs run was just over 8 hours; I actually stopped it to try some other tests. "Low" was the smallest setting that would do the job. From what I can tell, "Auto", "Normal" and "Standard" are all the same; I tested them and they had no effect. The first setting that appears to be different is "Low", and it worked.
Caution: I don't consider LLC to be one of the "big" overclocking tools, but like any tool, it can still hurt you if you use it wrong. If you don't know what you're doing, approach the setting slowly, from the low end. Sorry, I don't mean to preach to the choir, but you never know who may read things like this.
Now comes the quandary: what the heck is the problem? I know what the "fix" is, but why does it work? Is it a bad chip? Is it a bad UEFI (or a bad implementation of "Auto"/"Normal"/"Standard")? That memory voltage thing? A bad VRM on the board? A bad power connector? Etc., etc.
The board in question is a Gigabyte X570 Aorus Pro (not the ITX version, which has a -I designator). No, it's not the highest-end board in the world, but it should be more than capable of running the 3950X at stock and doing all the things I need it to do. Sadly, I don't have another Ryzen 9 lying around to swap my chip out and try that test. I also did some testing with only one stick of memory (and swapping the stick I used), but those results all seemed indistinguishable from the tests with two sticks. The package temp can spike into the 80s when running Blend (I didn't look to see which test it was doing when that happened), but those spikes are momentary. For the most part the temps stay in the low 60s (it's air cooled).
I'd love to hear your feedback! The one person who responded in this thread saying they were stable at stock is the only report I've heard of someone with this chip who has not struggled to get it p95 stable. On the other hand, there doesn't seem to be a lot of discussion around this chip; people seem more interested in Cinebench benchmarks and stuff like that.
I have read forums in other places where people have described RMAing their chips (AMD seems quite accommodating on this at this point) only to get worse samples back. The one that stands out is the person who had an issue similar to mine (two "cores" consistently tossing errors at consistent times), so they RMA'd it and got a new sample that tossed errors on five "cores". That went back to AMD, and they sent another that failed on even more. I can only assume that person is still working with AMD on the issue. Sorry for the long ramble! |
|
|
|
|
|
|
#542 |
|
Dec 2016
1001001₂ Posts |
My 3950X has been running small FFTs on all 32 virtual cores for many weeks without any error or crash. I did not do any tweaking; it just worked out of the box.
By "small FFTs" I mean an FFT size of 1536 (not 1536K!), so everything fits into L1 cache, thus maximizing the load on the CPU. Not sure if that's what you meant by "small". My mainboard is an Asus Prime X570-Pro, btw. |
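To put rough numbers on the "fits into L1 cache" point, here is a back-of-the-envelope sketch. It assumes one 8-byte double per FFT element and a 32 KiB L1 data cache per core, and it ignores twiddle tables and any other working data, so treat it as an estimate rather than a statement about how Prime95 actually lays out memory. The name `fft_data_bytes` is made up for this example.

```python
BYTES_PER_DOUBLE = 8
L1D_BYTES = 32 * 1024  # assumption: 32 KiB L1 data cache per core


def fft_data_bytes(fft_length):
    """Bytes needed for the FFT data alone, one double per element (assumed)."""
    return fft_length * BYTES_PER_DOUBLE


small = fft_data_bytes(1536)         # FFT size 1536
large = fft_data_bytes(1536 * 1024)  # FFT size 1536K

print(small // 1024, "KiB for size 1536")
print(large // (1024 * 1024), "MiB for size 1536K")
# Two SMT threads sharing one core's L1 would each hold their own data.
print("two threads fit in L1:", 2 * small <= L1D_BYTES)
```

Under these assumptions a size-1536 transform needs about 12 KiB of data, so even two threads on one core stay within L1, while a 1536K transform needs about 12 MiB and cannot.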
|
|
|
|
|
#543 |
|
Dec 2011
After milion nines:)
1451₁₀ Posts |
I don't have a 3950X; I have a 3900X. And my motherboard is half the price of yours. It runs rock stable with a very small undervolt and a very small overclock (it is at 3.9 GHz).
I don't test the way you do, but so far I have processed over 50000 candidates from 96K to 768K. Many of them were base 2, and the Gerbicz check didn't catch any error, nor did I see any error in the mprime log file. |
|
|
|
|
|
#544 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts |
FYI: The mysterious "Error writing intermediate file" bug during a PRP test has been found and fixed in version 30.1. The bug is fairly harmless and happens about 1 in every 1000 save file creation attempts.
|
|
|
|
|
|
#545 |
|
Dec 2011
After milion nines:)
5AB₁₆ Posts |
|
|
|
|
|
|
#546 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
3×7×163 Posts |
http://mersenne.org/download/
As soon as George has finished working on it and releases it. Which isn't yet. |
|
|
|
|
|
#547 |
|
Dec 2011
After milion nines:)
5AB₁₆ Posts |
And George, could you also fix the bug where, when you start benchmarking (under Windows), the benchmark starts and exits immediately (and of course doesn't benchmark anything)?
In that case the bench.txt file is also empty (only the CPU topology is written). Last fiddled with by pepi37 on 2020-07-13 at 21:30 |
|
|
|
|
|
#548 |
|
Jul 2020
South Florida
22 Posts |
So by "small FFT" I mean the default option on the Torture Test menu (see attached screen shot). It almost seems like what you're describing is "smallest FFT". My chip has run those successfully for over 8 hours.
|
|
|
|
|
|
#549 |
|
Oct 2019
5·19 Posts |
Is there any option for Prime95 not to generate save files for small work units (for example, <1 GHz-day)? It's painful to delete a lot of such save files on Google Drive, or to adjust the DiskWriteTime option manually every time, when doing this kind of work.
Last fiddled with by Fan Ming on 2020-07-14 at 16:12 |
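Until such an option exists, the manual DiskWriteTime adjustment could be scripted. The following is a minimal sketch, not anything Prime95 provides: `set_option` is a hypothetical helper, and it assumes the settings file is plain `key=value` text with optional `[section]` headers, that DiskWriteTime is a global option given in minutes, and that Prime95 is not running while the file is edited. It works on a string, so reading and writing the actual file is left to the caller.

```python
def set_option(text, key, value):
    """Set a global key=value option in ini-style text.

    Only lines before the first [section] header are treated as global.
    If the key is absent there, it is inserted just before that header
    (or at the end when the text has no sections).
    """
    lines = text.splitlines()
    new_line = f"{key}={value}"
    # Index of the first section header, or end of text if there is none.
    end = next(
        (i for i, line in enumerate(lines) if line.lstrip().startswith("[")),
        len(lines),
    )
    for i in range(end):
        if lines[i].split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.insert(end, new_line)
    return "\n".join(lines) + "\n"


# Hypothetical file contents for illustration only.
before = "SomeOption=1\nDiskWriteTime=30\n"
print(set_option(before, "DiskWriteTime", 600), end="")
```

A long interval would not disable save files, but short work units should then usually finish before the first scheduled write.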
|
|
|
|
|
#550 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²·5·271 Posts |
Running dual-Xeon systems, I'm seeing iteration times differ by 2x to nearly 5x between workers that each use a whole Xeon or half of one, for modest differences in FFT length or exponent. These substantial timing differences survive a prime95 stop and restart.
Last fiddled with by kriesel on 2020-07-19 at 14:43 |
|
|
|