mersenneforum.org  

Old 2019-04-07, 13:14   #1046
SELROC

Quote:
Originally Posted by SELROC View Post
It may be my negligence: I was running kernel 5.0.5 but with an outdated initramfs. After updating the initramfs, the error no longer occurs.

I must note that the error (zero residue) appeared for the first time for me with the Radeon VII.
Old 2019-04-07, 20:14   #1047
preda
"Mihai Preda"
Apr 2015

Quote:
Originally Posted by SELROC View Post
I must note that the error (zero residue) appeared for the first time for me with the Radeon VII.
Yes, I have no idea why it happens. I'm also using an R7 now, I'll keep an eye on it and let you know if I see the same.
Old 2019-04-07, 20:32   #1048
M344587487
"Composite as Heck"
Oct 2017

I've been running an R7 24/7 at 1.03V, 50mV under stock voltage, since testing it a few weeks ago. No crashes or suspicious results; the only issue was a checkpoint corrupted by a power cut, which forced a rollback to the latest 20M checkpoint.
Old 2019-04-08, 05:44   #1049
SELROC

Quote:
Originally Posted by M344587487 View Post
I've been running an R7 24/7 at 1.03V, 50mV under stock voltage, since testing it a few weeks ago. No crashes or suspicious results; the only issue was a checkpoint corrupted by a power cut, which forced a rollback to the latest 20M checkpoint.

That is another story, which also happened to me: I hit Ctrl-C and cut the power without waiting. The result was a zero-length, invalid checkpoint.
Old 2019-04-08, 05:45   #1050
SELROC

Quote:
Originally Posted by preda View Post
Yes, I have no idea why it happens. I'm also using an R7 now, I'll keep an eye on it and let you know if I see the same.

Thanks. I'll keep an eye out to see if it happens again.
Old 2019-04-08, 10:33   #1051
preda

Quote:
Originally Posted by M344587487 View Post
I've been running an R7 24/7 at 1.03V, 50mV under stock voltage, since testing it a few weeks ago. No crashes or suspicious results; the only issue was a checkpoint corrupted by a power cut, which forced a rollback to the latest 20M checkpoint.
There is also another checkpoint, NNNNNN-prev.owl, which likely should not be corrupted at the same time as the main checkpoint. But the recovery from it is not automatic (i.e. you have to manually rename it).
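
The manual recovery described here amounts to putting the -prev.owl file in place of the main checkpoint. A minimal sketch in Python, under stated assumptions: the thread only names NNNNNN-prev.owl, so the main checkpoint name NNNNNN.owl, the .owl.bad suffix, and the helper name restore_prev_checkpoint are all hypothetical and not taken from GPUowl itself.

```python
from pathlib import Path
import shutil


def restore_prev_checkpoint(directory, exponent):
    """Put <exponent>-prev.owl in place of <exponent>.owl.

    Assumption: the main checkpoint is named <exponent>.owl. Only the
    -prev.owl name appears in the thread.
    """
    directory = Path(directory)
    main = directory / f"{exponent}.owl"
    prev = directory / f"{exponent}-prev.owl"

    # A missing or zero-length backup cannot be used for recovery.
    if not prev.is_file() or prev.stat().st_size == 0:
        raise FileNotFoundError(f"no usable previous checkpoint at {prev}")

    # Keep the suspect main checkpoint for inspection instead of deleting it.
    if main.exists():
        main.replace(directory / f"{exponent}.owl.bad")

    # Copy rather than rename, so the -prev.owl backup itself survives.
    shutil.copyfile(prev, main)
    return main
```

Copying instead of renaming is a deliberate choice in this sketch: if the restart goes wrong again, the backup is still there.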
Old 2019-04-08, 10:36   #1052
preda

Quote:
Originally Posted by SELROC View Post
That is another story, which also happened to me: I hit Ctrl-C and cut the power without waiting. The result was a zero-length, invalid checkpoint.
Maybe recover from the "previous" checkpoint, -prev.owl, in such a situation.
Old 2019-04-08, 10:37   #1053
SELROC

Quote:
Originally Posted by preda View Post
Maybe recover from the "previous" checkpoint, -prev.owl, in such a situation.

Yes, I did.
Old 2019-04-08, 10:41   #1054
M344587487

I tried it and it did the same thing as the normal checkpoint. Sorry, I can't remember the exact details, but I think the initial restart printed half a dozen lines before erroring. Could that have been enough to overwrite a good prev.owl checkpoint with a bad one before I got to it?
Old 2019-04-08, 11:45   #1055
preda

Quote:
Originally Posted by M344587487 View Post
I tried it and it did the same thing as the normal checkpoint. Sorry, I can't remember the exact details, but I think the initial restart printed half a dozen lines before erroring. Could that have been enough to overwrite a good prev.owl checkpoint with a bad one before I got to it?
Before writing a checkpoint, GPUowl always does a check, and only writes anything if the check succeeds. So in a bad state nothing should be written. At least that's the theory.

OTOH a check is also done on startup/load, so in a bad state it shouldn't even start.
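
The "check before write, check again on load" idea can be sketched as follows. This is an illustration in Python and not GPUowl's actual code: is_valid is a hypothetical stand-in for whatever check GPUowl performs, and the temp-file-plus-rename step is an extra safeguard added in this sketch (it is not described in the thread) so that an interrupted write is less likely to leave a truncated file under the real checkpoint name.

```python
import os
import tempfile


def save_checkpoint(path, data, is_valid):
    """Write data to path only if is_valid(data) passes. Returns True if written."""
    if not is_valid(data):
        return False  # bad state: leave the existing checkpoint untouched

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to disk
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return True


def load_checkpoint(path, is_valid):
    """Read a checkpoint and refuse to continue if it fails the check."""
    with open(path, "rb") as f:
        data = f.read()
    if not is_valid(data):
        raise ValueError(f"checkpoint {path} failed validation")
    return data
```

With a check such as `lambda b: len(b) > 0`, a zero-length file is rejected on load instead of being used as a starting point.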

Last fiddled with by preda on 2019-04-08 at 11:46
Old 2019-04-09, 21:39   #1056
preda

News:
Added arguments for bypassing worktodo.txt
-prp <exponent>
-pm1 <exponent>

Embedded gpuowl.cl in the executable; the .cl file next to the executable is no longer needed.
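
For anyone scripting runs, the two new arguments can be assembled like this. It is only a sketch: the -prp and -pm1 flags come from the news above, while the executable path ./gpuowl, the helper name gpuowl_command, and the exponent used in the comment are assumptions for illustration.

```python
def gpuowl_command(mode, exponent, exe="./gpuowl"):
    """Build an argument list for a single PRP or P-1 run.

    Assumption: the binary is reachable as ./gpuowl. Adjust exe as needed.
    """
    if mode not in ("prp", "pm1"):
        raise ValueError("mode must be 'prp' or 'pm1'")
    return [exe, f"-{mode}", str(int(exponent))]


# Example with an arbitrary placeholder exponent:
# gpuowl_command("prp", 12345) gives ["./gpuowl", "-prp", "12345"]
```

The resulting list can be passed to subprocess.run as is, which avoids shell quoting issues.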

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.