mersenneforum.org

dabler 2020-12-04 12:06

mprime crashed
 
I just encountered the following behavior:

[FONT="Courier New"][Work thread Dec 4 12:51] Iteration: 102190000 / 109258157 [93.53%], ms/iter: 5.847, ETA: 11:28:49
[Work thread Dec 4 12:51] Hardware errors have occurred during the test!
[Work thread Dec 4 12:51] 1 Gerbicz/double-check error.
[Work thread Dec 4 12:51] Confidence in final result is excellent.
Segmentation fault (SIGSEGV)
[/FONT]

and [FONT="Courier New"]dmesg[/FONT] reports:

[FONT="Courier New"][92055.665252] traps: mprime[12021] general protection fault ip:b8ec18 sp:7fcb7be36d20 error:0
[92055.665260] traps: mprime[12024] general protection fault ip:b8ec18 sp:7fcb87e4ed20 error:0
[92055.665265] in mprime[400000+229a000]
[92055.665265] in mprime[400000+229a000]
[92055.665299] traps: mprime[12023] general protection fault ip:b8ec18 sp:7fcb8864fd20 error:0
[/FONT]

There seems to be a bug in mprime.
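The dmesg lines above can be read mechanically. The following is only a minimal sketch, assuming the log text is exactly as quoted; the names `DMESG`, `TRAP_RE`, `MAP_RE` and `summarize` are made up for illustration. It extracts each thread's faulting instruction pointer and its offset inside the mprime mapping. It shows where the threads faulted; it does not by itself say whether the cause is hardware or software.

```python
import re

# Lines copied from the dmesg output quoted in this post.
DMESG = """\
[92055.665252] traps: mprime[12021] general protection fault ip:b8ec18 sp:7fcb7be36d20 error:0
[92055.665260] traps: mprime[12024] general protection fault ip:b8ec18 sp:7fcb87e4ed20 error:0
[92055.665265] in mprime[400000+229a000]
[92055.665265] in mprime[400000+229a000]
[92055.665299] traps: mprime[12023] general protection fault ip:b8ec18 sp:7fcb8864fd20 error:0
"""

# "traps: NAME[TID] ... ip:HEX sp:HEX"
TRAP_RE = re.compile(r"traps: (\w+)\[(\d+)\] .*? ip:([0-9a-f]+) sp:([0-9a-f]+)")
# "in NAME[BASE+SIZE]" gives the mapping that contains the faulting address
MAP_RE = re.compile(r"in (\w+)\[([0-9a-f]+)\+([0-9a-f]+)\]")


def summarize(text):
    """Return ([(thread_id, ip), ...], {(base, size), ...}) parsed from dmesg text."""
    traps = [(int(m.group(2)), int(m.group(3), 16)) for m in TRAP_RE.finditer(text)]
    maps = {(int(m.group(2), 16), int(m.group(3), 16)) for m in MAP_RE.finditer(text)}
    return traps, maps


traps, maps = summarize(DMESG)
base, size = next(iter(maps))  # the sample contains a single distinct mapping

for tid, ip in traps:
    inside = base <= ip < base + size
    print(f"thread {tid}: ip=0x{ip:x} offset=0x{ip - base:x} inside_mapping={inside}")
```

In this sample all three threads report the same instruction pointer, which lies inside the mprime executable's own mapping.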

VBCurtis 2020-12-04 17:20

When you say "bug in mprime", you mean "problem with my hardware."

That's why the message tells you "hardware error."

Most common is a buildup of dust in your case, leading to temperatures high enough to cause an occasional miscalculation. Faulty memory is also possible, but a bit less likely than a heat problem.

dabler 2020-12-04 17:31

I am not disputing the hardware errors. My problem is the SIGSEGV that I get once the test is past 90%.

kruoli 2020-12-04 17:41

Is the error reproducible, and does it always occur at the same iteration?

dabler 2020-12-04 18:33

I tried it several times, and it always crashed at around 92% or 93%. It never crashed below 90%.
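To compare crash points between runs more precisely than by percentage, the last progress line before each crash can be parsed. This is a minimal sketch, assuming the progress-line format shown in the first post and assuming that the last printed line approximates the crash point; `ITER_RE` and `progress` are hypothetical names.

```python
import re

# Progress line copied from the log in the first post.
LINE = ("[Work thread Dec 4 12:51] Iteration: 102190000 / 109258157 "
        "[93.53%], ms/iter: 5.847, ETA: 11:28:49")

ITER_RE = re.compile(r"Iteration: (\d+) / (\d+) \[([\d.]+)%\]")


def progress(line):
    """Return (iteration, total, percent) from a progress line, or None if no match."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    done, total = int(m.group(1)), int(m.group(2))
    return done, total, 100.0 * done / total


done, total, pct = progress(LINE)
print(f"last logged iteration: {done} of {total} ({pct:.2f}%)")

# Iteration numbers at the 92% and 93% marks, using integer arithmetic.
lo = total * 92 // 100
hi = total * 93 // 100
print(f"92% is iteration {lo}, 93% is iteration {hi}")
```

Collecting the last logged iteration from each crashed run would show whether the crashes cluster at one iteration or are merely in the same region.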

Prime95 2020-12-04 20:07

Try running a torture test.
Or try reducing the memory speed or overvolting it slightly.

chris2be8 2020-12-05 16:40

Exactly which version of mprime are you running? Where did you get it from? The output of [c]uname -a[/c] and [c]cat /etc/os-release[/c], showing which Linux distro and version you have, might also be useful.

Chris

dabler 2020-12-05 16:50

mprime version 30.3b6.

[FONT="Courier New"]uname -a[/FONT] gives
[FONT="Courier New"]Linux dabler 5.4.60-gentoo-9 #1 SMP Thu Aug 27 12:07:40 CEST 2020 x86_64 AMD Ryzen Threadripper 2990WX 32-Core Processor AuthenticAMD GNU/Linux[/FONT]

and [FONT="Courier New"]cat /etc/os-release[/FONT] gives
[FONT="Courier New"]NAME=Gentoo
ID=gentoo
PRETTY_NAME="Gentoo/Linux"
ANSI_COLOR="1;32"
HOME_URL="https://www.gentoo.org/"
SUPPORT_URL="https://www.gentoo.org/support/"
BUG_REPORT_URL="https://bugs.gentoo.org/"[/FONT]

retina 2020-12-06 08:46

[QUOTE=dabler;565227]Hardware errors have occurred during the test![/QUOTE]When you have a hardware problem, anything can happen (program crashes, wrong answers, reboots, HDD reformat, etc.). So there is no point in talking about the software, the OS, or any other "soft" problem [b]until you fix the hardware[/b].

So, fix your hardware first before doing anything else.

dabler 2020-12-06 16:36

I managed to get past the magic 92% to 93% mark and complete the test. The result seems to be correct: [URL="https://www.mersenne.org/report_exponent/?exp_lo=109258157&full=1"]https://www.mersenne.org/report_exponent/?exp_lo=109258157&full=1[/URL]


All times are UTC.