mersenneforum.org

Prime95 version 27.7 / 27.9 (https://www.mersenneforum.org/showthread.php?t=16779)

Prime95 2012-12-14 23:25

[QUOTE=henryzz;321696]In other words no errors in tests were created in this. It was just the error reporting.[/QUOTE]

Correct

petrw1 2012-12-15 18:12

[QUOTE=henryzz;321696]In other words no errors in tests were created in this. It was just the error reporting.[/QUOTE]

3 is the magic number...each time the progress updated it said something like "10 Sumout errors; 3 were not reproducible...."

Batalov 2012-12-23 22:31

1 Attachment(s)
I have a weird "[I]circadian[/I]"* kernel panic on my older Phenom940.
(It is quite bad because you have to restart the machine physically: it doesn't crash completely, and it freezes the internet connections, too. See attached screenshot /literally/)

I am not sure if it is mprime or a combination of it with mysterious daemons. Other programs don't panic: month-long sieving runs, msieve BL, pfgw, newpgen, memtest for a day... I have tried varying these possible interactions by replacing mprime 27.7 with 27.9, replacing OpenSUSE 12.1 with 12.2, and running the machine without X (i.e. init 3). George, do any of the functions mentioned in the panic dump make sense? Anyone else? TIA!

P.S. Just in case: the workload on which this occurs is a bunch of 831*2^n+1 PRPs on 4 threads (see the top of the snapshot for the last line of output)

________
*once a day

el15k 2013-01-07 17:36

Please help me!
When I test my PC for stability, prime95 crashes EVERY time after it passes the 25k test. It happens with the default blend test and with a custom test (max FFT size 1792, mem used 4096). ALL other stability software (lynx, occt) is stable in that time; only prime95 crashes, and only after the 25k test. The crashes after 25k stop when I increase the voltage much more. Does this mean my voltage is too low?

Win 7 x64, processor 3570k
Event Viewer:
[I]Faulting application name: prime95.exe, version: 27.7.1.0, time stamp: 0x4fb2d143
Faulting module name: prime95.exe, version: 27.7.1.0, time stamp: 0x4fb2d143
Exception code: 0xc0000005
Fault offset: 0x000000000014578d
Faulting process id: 0xc0c
Faulting application start time: 0x01cdecd998d17327
Faulting application path: C:\Test\p95v277.win64\prime95.exe
Faulting module path: C:\Test\p95v277.win64\prime95.exe
Report Id: c2391b26-58e2-11e2-b0fe-bc5ff447b886
[/I]
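
Two fields of an Event Viewer record like this one can be decoded mechanically. The sketch below assumes that the "Faulting application start time" value is a Windows FILETIME (100 ns ticks since 1601-01-01 UTC). The table of NTSTATUS codes is a small hand-picked sample, and the helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME counts 100-nanosecond ticks since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# A few NTSTATUS exception codes (hand-picked sample, not a complete list).
KNOWN_EXCEPTIONS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def filetime_to_utc(raw_hex: str) -> datetime:
    """Convert a hex FILETIME string (with or without 0x) to a UTC datetime."""
    ticks = int(raw_hex, 16)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def describe_exception(raw_hex: str) -> str:
    """Return the symbolic name of an exception code, if it is in the table."""
    return KNOWN_EXCEPTIONS.get(int(raw_hex, 16), "unknown")


if __name__ == "__main__":
    # Values copied from the event record above.
    print(describe_exception("0xc0000005"))
    print(filetime_to_utc("0x01cdecd998d17327").isoformat())
```

An access violation only says that the process touched memory it should not have. On its own it does not distinguish a program bug from unstable hardware.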

Prime95 2013-01-08 04:01

[QUOTE=el15k;323940]When I test my PC for stability, prime95 crashes EVERY time after it passes the 25k test. It happens with the default blend test and with a custom test (max FFT size 1792, mem used 4096).[/QUOTE]

That is suspicious (failing in the EXACT same spot EVERY time could be a program bug). Does it fail if you run the in-place test? Does it fail with custom tests using different memory allocations?

BTW, what is the FFT size after 25K?

el15k 2013-01-08 21:09

[QUOTE=Prime95;324014]That is suspicious (failing in the EXACT same spot EVERY time could be a program bug). Does it fail if you run the in-place test? Does it fail with custom tests using different memory allocations?

BTW, what is the FFT size after 25K?[/QUOTE]
It fails only after 25k; the in-place test passes OK.
It fails after 25k in standard blend mode (I don't remember the FFT size after 25k, something around ~800), and it fails after 25k in the custom test (max FFT size 1792, mem used 4096); in that test prime95 crashes after 25k, at FFT size 480.

Kyle 2013-01-12 12:48

Segmentation fault
 
Hi, mprime 27.9 build 1 Linux64 "crashed" this morning on my computer. It said:
[CODE]Segmentation fault[/CODE]and stopped working. Nothing in prime.log. Invoking dmesg, I got:
[CODE]mprime[35596]: segfault at 7fbb0f2c3938 ip 00007fbc12ad4771 sp 00007fafbbffd010 error 4 in libc-2.12.so[7fbc12a5d000+186000][/CODE]I don't know if you can find the problem with so little information. It can't be a hardware bug, since the computer memory has 2-bit ECC and temperatures never exceed 45°C.
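
The dmesg line carries a little more than it seems to. As a rough sketch, assuming the usual x86-64 Linux layout of this message (the function and field names below are illustrative, not from any tool), it can be split into the faulting address, the kind of access, and the offset of the faulting instruction inside the mapped library:

```python
import re

# Pattern for the x86-64 Linux "segfault at ..." kernel log line.
SEGFAULT_RE = re.compile(
    r"(?P<name>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

# The line reported in the post above.
LINE = ("mprime[35596]: segfault at 7fbb0f2c3938 ip 00007fbc12ad4771 "
        "sp 00007fafbbffd010 error 4 in libc-2.12.so[7fbc12a5d000+186000]")


def decode_segfault(line):
    """Split a kernel segfault line into its parts (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"], 16)
    info = {
        "process": m["name"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        # x86 page-fault error code: bit 0 = protection violation (else the
        # page was not present), bit 1 = write (else read), bit 2 = user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
        "module": m["module"],
        "module_offset": None,
    }
    if m["module"] is not None:
        # Offset of the faulting instruction from the start of the mapping.
        info["module_offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        print(key, hex(value) if isinstance(value, int) else value)
```

For the line above this gives a user-mode read of an unmapped page, at offset 0x77771 inside libc-2.12.so. With a matching libc build, that offset could be looked up to see which libc function was executing at the time.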

tcharron 2013-01-23 21:45

Prime95 behaviour question
 
I have some kind of gremlin in my system. My i7 processor was running 4 LL tests. They were at 35-50% completion with no errors.

I then did something -- 3 of the 4 threads reported hundreds of rounding errors. I might have bumped the PC, maybe there was a vibration.... I don't know what, but hundreds of rounding errors in a brief time.

I didn't notice until two days later. By this point, prime95 had written checkpoint files, and rotated them through the two configured backups. Everything saved in the prime95 directory was (I'd bet) junk.

I did have a backup of the prime95 directory, and restored that and restarted. It is now running smoothly, with no errors reported. This manual restore saved me a week or two of computing time.

What I'd like to see is for prime95 to allow this kind of recovery automatically. If I hadn't had my manually created backup, I would have lost a lot of time. P95 could either stop rotating the backups after a certain number of errors, or always keep a "last good" checkpoint file available. Thoughts?
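
To make the first suggestion concrete, here is a minimal sketch of a "stop rotating once errors appear" policy. It is not how prime95 handles its save files; the file names, the `.bak1`/`.bak2` suffixes, and the `max_errors` threshold are all assumptions made up for illustration.

```python
import os
import tempfile


def save_checkpoint(workdir, name, payload, total_errors, max_errors=0, keep=2):
    """Write `payload` as the current checkpoint (hypothetical policy).

    While the run's cumulative error count is within `max_errors`, the
    previous files are rotated: name -> name.bak1 -> name.bak2 ...
    Once the count exceeds the threshold, only the current file is
    overwritten, so the backups stay frozen at the last clean state.
    Returns True if the backups were eligible for rotation.
    """
    current = os.path.join(workdir, name)
    clean = total_errors <= max_errors
    if clean and os.path.exists(current):
        # Shift the older backups up by one slot, oldest first.
        for i in range(keep, 1, -1):
            older = f"{current}.bak{i - 1}"
            if os.path.exists(older):
                os.replace(older, f"{current}.bak{i}")
        os.replace(current, f"{current}.bak1")
    # Write to a temporary file first so a crash cannot leave a torn file.
    tmp = current + ".tmp"
    with open(tmp, "wb") as fh:
        fh.write(payload)
    os.replace(tmp, current)
    return clean


def demo():
    """Three clean saves, then three saves after a burst of errors."""
    saves = [(b"good1", 0), (b"good2", 0), (b"good3", 0),
             (b"bad1", 150), (b"bad2", 300), (b"bad3", 450)]
    with tempfile.TemporaryDirectory() as workdir:
        for payload, errors in saves:
            save_checkpoint(workdir, "work.chk", payload, errors)

        def read(suffix=""):
            with open(os.path.join(workdir, "work.chk" + suffix), "rb") as fh:
                return fh.read()

        return {"current": read(), "bak1": read(".bak1"), "bak2": read(".bak2")}


if __name__ == "__main__":
    print(demo())
```

In this sketch the most recent clean checkpoint is still overwritten by the first suspect save, so the rollback point is the newest backup rather than the newest save. A separate "last good" file, the second suggestion, would avoid that.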

kracker 2013-01-25 05:26

Hmm, just curious: are you overclocking?

LaurV 2013-01-25 06:15

Disable the HT! (or use a single thread for each worker, so if you have 4 physical cores, use 4 workers, each single threaded).

If you read through the forum, that was a problem with v27.7 when you run multithreaded workers or use HT. Other people (myself included) ran into it. That problem is solved by v27.9, which I suggest you download if you do not have it already.

Also, this thread should perhaps be closed, as it wrongly gives the impression that this is the latest version.

OTOH, the errors you got didn't affect the results. P95 is clever enough to try "slower" methods when the faster method fails. Since you continued to get rounding errors, it was not a reproducible error but a new one each time: the faster method failed, but the slower one passed (otherwise the test would have been terminated, and a larger FFT would need to be used).

You could use the files and finish your work with the new version, without the manual backup and without affecting the results. At the end, the result would be marked as having had a rounding error somewhere, which was eliminated by using a slower multiplication method.

But the final result would be - most probably - correct.
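
The retry behaviour described above can be sketched roughly as follows. This only illustrates the logic as stated in this post; it is not prime95's code, and every name in it (`checked_iteration`, `fast_step`, `slow_step`, `ErrorCounts`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    total: int = 0             # iterations where the fast method flagged an error
    not_reproducible: int = 0  # of those, how many the slow method got through


class ReproducibleError(Exception):
    """Both methods failed on the same input; a larger FFT size is needed."""


def checked_iteration(state, fast_step, slow_step, counts):
    """Run one iteration, falling back to the slow method on an error.

    Each step function takes the current state and returns a pair
    (new_state, ok), where ok is False if an error check tripped.
    """
    new_state, ok = fast_step(state)
    if ok:
        return new_state
    counts.total += 1
    # Redo the same iteration from the same input with the slower method.
    new_state, ok = slow_step(state)
    if ok:
        counts.not_reproducible += 1
        return new_state
    raise ReproducibleError("both methods failed; a larger FFT size is needed")


if __name__ == "__main__":
    # Toy stand-ins: the fast step "fails" whenever the state is a multiple
    # of 3, while the slow step always succeeds.
    fast = lambda s: (s + 1, s % 3 != 0)
    slow = lambda s: (s + 1, True)
    counts = ErrorCounts()
    state = 0
    for _ in range(9):
        state = checked_iteration(state, fast, slow, counts)
    print(state, counts)
```

Under this logic a run that keeps going while reporting errors is one where every flagged iteration was recovered by the slower path.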

flashjh 2013-01-25 14:23

I didn't know 27.9 was out :confused:


All times are UTC.