mersenneforum.org

gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

SELROC 2019-04-07 13:14

[QUOTE=SELROC;512945]It may be my negligence, I was running kernel 5.0.5 but with an outdated initramfs. After updating initramfs the error is not occurring.[/QUOTE]


I should note that I first saw this error (zero residue) with the Radeon VII.

preda 2019-04-07 20:14

[QUOTE=SELROC;512961]I must note that the error (zero residue) has appeared for the first time to me with the radeon VII.[/QUOTE]

Yes, I have no idea why it happens. I'm also using an R7 now, so I'll keep an eye on it and let you know if I see the same.

M344587487 2019-04-07 20:32

I've been running an R7 24/7 at 1.03V, 50mV under stock voltage, since testing it a few weeks ago. No crashes or suspicious results. The only issue was a checkpoint corrupted by a power cut, which forced a rollback to the latest 20M checkpoint.

SELROC 2019-04-08 05:44

[QUOTE=M344587487;513007]I've been running an R7 24/7 at 1.03V, 50mV under stock voltage since testing it a few weeks ago. No crashes or suspicious results, the only issue was a corrupted checkpoint after a power cut forced a rollback to the latest 20M checkpoint.[/QUOTE]


That is a different issue, and it has happened to me too: I hit Ctrl-C and cut the power without waiting for the program to exit. The result was an invalid, zero-length checkpoint.

SELROC 2019-04-08 05:45

[QUOTE=preda;513001]Yes, I have no idea why it happens. I'm also using an R7 now, I'll keep an eye on it and let you know if I see the same.[/QUOTE]


Thanks. I'll keep an eye out to see if it happens again.

preda 2019-04-08 10:33

[QUOTE=M344587487;513007]I've been running an R7 24/7 at 1.03V, 50mV under stock voltage since testing it a few weeks ago. No crashes or suspicious results, the only issue was a corrupted checkpoint after a power cut forced a rollback to the latest 20M checkpoint.[/QUOTE]

There is also a second checkpoint, NNNNNN-prev.owl, which is unlikely to be corrupted at the same time as the main checkpoint. But recovery from it is not automatic (i.e. you have to rename it manually).
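
For anyone who wants to script that manual step, here is a minimal sketch, not part of gpuOwL itself. It assumes the main checkpoint is named <exponent>.owl (the thread only names the -prev file), and the function name and the ".bad" backup suffix are made up for illustration. It copies the -prev file instead of renaming it, so the previous checkpoint stays untouched.

```python
import shutil
from pathlib import Path


def restore_previous_checkpoint(workdir, exponent):
    """Replace the main checkpoint with the "-prev" one, keeping a backup.

    Assumed naming: <exponent>.owl for the main checkpoint (an assumption,
    not stated in the thread) and <exponent>-prev.owl for the previous one.
    """
    main = Path(workdir) / f"{exponent}.owl"
    prev = Path(workdir) / f"{exponent}-prev.owl"

    # Refuse to proceed if the previous checkpoint is missing or zero-length.
    if not prev.is_file() or prev.stat().st_size == 0:
        raise FileNotFoundError(f"no usable previous checkpoint at {prev}")

    # Keep the suspect main checkpoint around for later inspection.
    if main.exists():
        main.replace(main.with_name(main.name + ".bad"))

    # Copy rather than rename, so the -prev file itself is left as it was.
    shutil.copy2(prev, main)
    return main
```

Run it only while the program is stopped, so nothing is writing checkpoints in the same directory.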

preda 2019-04-08 10:36

[QUOTE=SELROC;513036]That is another story, which happened to me also: hit ctrl-C and cut off the power without waiting. The result is a zero-length checkpoint, invalid.[/QUOTE]

In such a situation, maybe try recovering from the "previous" checkpoint, -prev.owl.

SELROC 2019-04-08 10:37

[QUOTE=preda;513055]Maybe recover from the "previous" checkpoint, -prev.owl, in such a situation.[/QUOTE]


Yes I did.

M344587487 2019-04-08 10:41

I tried it, and it did the same thing as the normal checkpoint. Sorry, I can't remember the exact details, but I think the initial restart printed half a dozen lines before erroring. Could that have been enough to overwrite a good prev.owl checkpoint with a bad one before I got to it?

preda 2019-04-08 11:45

[QUOTE=M344587487;513058]Tried it and it did the same thing as the normal checkpoint. Sorry I can't remember the exact details but think the initial restart printed out half a dozen lines before erroring, could that have been enough to overwrite a good prev.owl checkpoint with a bad one before I got to it?[/QUOTE]

Before writing a checkpoint, GPUowl always does a check, and only writes anything if the check succeeds. So in a bad state nothing should be written. At least that's the theory.

OTOH a check is also done on startup/load, so in a bad state it shouldn't even start.
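
To illustrate the two guards described above (check before write, check on load), here is a rough sketch. It is not GPUowl's code. The CRC32 trailer is a hypothetical stand-in for whatever check the program really performs, and all function names are invented for the example.

```python
import zlib
from pathlib import Path


def is_consistent(state):
    """Hypothetical stand-in check: the last 4 bytes must be the CRC32
    (little-endian) of everything before them."""
    if len(state) < 4:
        return False  # also rejects a zero-length checkpoint
    body, stored = state[:-4], state[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == stored


def save_checkpoint(path, state):
    """Write only if the state passes the check; otherwise touch nothing."""
    if not is_consistent(state):
        return False
    Path(path).write_bytes(state)
    return True


def load_checkpoint(path):
    """Refuse to start from a checkpoint that fails the check."""
    state = Path(path).read_bytes()
    if not is_consistent(state):
        raise ValueError(f"checkpoint {path} failed validation")
    return state
```

With guards like these, a bad in-memory state never reaches disk, and a truncated or empty file is rejected at startup instead of being used.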

preda 2019-04-09 21:39

News:
Added arguments for bypassing worktodo.txt:
-prp <exponent>
-pm1 <exponent>

Embedded gpuowl.cl in the executable; the .cl file next to the executable is no longer needed.
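
As a small sketch of how the -prp and -pm1 arguments above might be driven from a script: the helper below only builds the argument list. The executable path "./gpuowl" and the function name are assumptions; only the two flags come from the post.

```python
def gpuowl_command(mode, exponent, executable="./gpuowl"):
    """Build an argument list for a single test, bypassing worktodo.txt.

    mode is "prp" or "pm1", matching the -prp / -pm1 arguments.
    The executable path is an assumption; adjust it to your build.
    """
    if mode not in ("prp", "pm1"):
        raise ValueError(f"unknown mode: {mode!r}")
    if int(exponent) <= 0:
        raise ValueError("exponent must be a positive integer")
    return [executable, f"-{mode}", str(int(exponent))]
```

The resulting list can be passed to subprocess.run(..., check=True), for example gpuowl_command("prp", 82589933).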


All times are UTC.
