mersenneforum.org  

2017-05-27, 05:48   #155
axn
Jun 2003
2·3·7·11² Posts

IIRC the issue was when the shift was just a bit smaller than p (i.e. very close to p).

2017-05-27, 07:03   #156
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by ewmayer
[...] since all our shift arithmetic (specifically the shiftcount doubling on each iteration) is done (mod p). I could see not-properly-modded shifts *greater* than p being a problem, but in the normal course of the LL test the shift should always be in [0,p-1].

Yes, that's how I see "offset": a value in [0, Exponent - 1], randomly generated at the start of a new exponent unless forced with -offset.
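
To make the shift bookkeeping concrete, here is a minimal Python sketch of a Lucas-Lehmer test run on a shifted value. It is not gpuOwl's code (which works on FFT words on the GPU); the function name `ll_residue` and the use of Python big integers are assumptions made purely for illustration. It shows the two points discussed above: the shift is doubled (mod p) on every iteration, so it stays in [0, p-1], and the final residue is independent of the starting offset once the shift is undone.

```python
def ll_residue(p, shift=0):
    """Lucas-Lehmer test of M_p = 2**p - 1 on a value shifted left by `shift` bits.

    Returns (final residue with the shift removed, final shift).
    Illustrative only: plain big-integer arithmetic, no FFT.
    """
    assert 0 <= shift < p
    m = (1 << p) - 1
    x = (4 << shift) % m              # s_0 = 4, stored as 4 * 2**shift (mod M_p)
    for _ in range(p - 2):
        shift = (2 * shift) % p       # squaring doubles the shift; 2**p == 1 (mod M_p)
        x = (x * x - (2 << shift)) % m  # the "-2" must be shifted as well
    # Undo the shift: 2**(p - shift) is the inverse of 2**shift modulo M_p.
    unshifted = (x << ((p - shift) % p)) % m
    return unshifted, shift


if __name__ == "__main__":
    for offset in (0, 1, 5, 12):
        residue, final_shift = ll_residue(13, offset)
        print(offset, final_shift, residue == 0)   # M_13 is prime, so the residue is 0
```

A zero unshifted residue means M_p is prime; for a composite such as M_11 every offset gives the same nonzero residue.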

2017-05-27, 07:08   #157
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by LaurV
If it is more convenient for you to limit the initial shift to 16 or even 8 bits, then do so

Even when starting with a small offset, doubling on each iteration makes it grow large quickly, and I have to handle the full range anyhow, so there is not much benefit in limiting the starting value.
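
A quick way to see this is to print the shift sequence itself. The sketch below is illustrative only; the helper name `shift_sequence` is made up, and the exponent is simply one that appears later in this thread. Starting from offset 1 the shift no longer fits in 16 bits after 16 iterations and starts wrapping around p soon after, while offset 0 stays at 0 forever.

```python
def shift_sequence(p, start, iterations):
    """Return the list of shifts seen when the offset is doubled (mod p) each iteration."""
    shifts = [start]
    for _ in range(iterations):
        shifts.append((2 * shifts[-1]) % p)
    return shifts


if __name__ == "__main__":
    p = 40159129                      # exponent taken from the worktodo lines in this thread
    seq = shift_sequence(p, 1, 30)
    for i, s in enumerate(seq):
        print(i, s, s.bit_length())   # bit_length shows how fast a tiny offset fills the range
    print(shift_sequence(p, 0, 5))    # offset 0 never moves
```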

2017-05-27, 07:18   #158
GP2
Sep 2003
A19₁₆ Posts

Quote:
Originally Posted by Madpoo
I don't know the details, but there was another funny bug that showed up when I was doing (mostly) unnecessary triple-checks of previously verified work.

If the shift was smaller than the exponent (do I have that right?) it would cause a problem. It was rare, especially once the exponent sizes got larger, but we did find a few cases where it was an issue.

Specifically I noticed it when doing triple checks of every exponent below 3M or whatever, and the shift count in some cases was smaller than 3e6 so I was getting residues that didn't match.

It might not apply to your algorithm... I forget what the exact problem was (if I ever even knew the details)

See p. 30 of the Strategic Double Checking thread; it was discovered and discussed there by you and George.

The bug was triggered when the shift was greater than p − 64. Obviously, for p in the tens of millions it was extremely rare, but when you ran triple checks of very low exponents it was triggered a few times.

2017-05-27, 07:25   #159
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by kracker
I'm playing around with v0.3 atm.. i'm getting 5.15ms/iter compared to 5ms/iter from v0.2 without offset

3% sounds like a lot. What I'd like to achieve is performance parity with the no-offset version when the user sets -offset 0. (This is not the case now: the performance impact is the same whether the offset is 0 or something else.)

2017-05-27, 14:04   #160
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by GP2
See p. 30 of the Strategic Double Checking thread, it was discovered and discussed there by you and George.

The bug was triggered when the shift was greater than p − 64. Obviously, for p in the tens of millions it was extremely rare, but when you ran triple checks of very low exponents it was triggered a few times.

Thanks for pointing out that thread. Reading it, yes, such a bug makes sense, so I went and looked over my residue-computing code to make sure that the word index is correctly "rolled over" in the unlikely case that the 64 bits span the "end" boundary.
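
The boundary case can be sketched at the bit level. Modulo 2^p - 1, multiplying by 2^shift is a cyclic rotation of the p-bit value, so the low 64 bits of the true residue sit at bit positions shift .. shift+63 of the shifted value, taken mod p. The Python below is an illustration under stated assumptions, not code from gpuOwl or Prime95: both function names are hypothetical, it works on single bits rather than FFT words, and it assumes p >= 64. The `_naive` variant only shows how a read that does not wrap goes wrong once shift > p - 64.

```python
MASK64 = (1 << 64) - 1


def res64_from_shifted(x, p, shift):
    """Low 64 bits of the unshifted residue, read from the shifted p-bit value x."""
    out = 0
    for i in range(64):
        bit = (x >> ((shift + i) % p)) & 1   # index wraps past bit p-1 back to bit 0
        out |= bit << i
    return out


def res64_naive(x, p, shift):
    """Hypothetical non-wrapping read: drops the bits that rolled over to the low end."""
    return (x >> shift) & MASK64


if __name__ == "__main__":
    p = 89
    m = (1 << p) - 1
    r = 0x1234567890ABCDEF123456          # arbitrary residue below M_89
    for shift in (0, 10, p - 64, p - 63, p - 1):
        x = (r << shift) % m              # the value as stored with this shift
        ok = res64_from_shifted(x, p, shift) == r & MASK64
        naive_ok = res64_naive(x, p, shift) == r & MASK64
        print(shift, ok, naive_ok)
```

The wrapped read agrees with the true low 64 bits for every shift in [0, p-1]; the naive read is only safe while shift <= p - 64.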

2017-05-27, 16:46   #161
Madpoo
Serpentine Vermin Jar
Jul 2014
CF1₁₆ Posts

Quote:
Originally Posted by GP2
See p. 30 of the Strategic Double Checking thread, it was discovered and discussed there by you and George.

The bug was triggered when the shift was greater than p − 64. Obviously, for p in the tens of millions it was extremely rare, but when you ran triple checks of very low exponents it was triggered a few times.

Ah, thanks for the reminder. Yeah, I was pretty sure it was a rare condition and I just couldn't remember the details (and was too lazy to go back and look).

2017-05-28, 09:19   #162
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by preda
3% sounds like a lot. What I'd like to achieve is no-offset performance parity when the user sets -offset 0. (This is not the case now, the perf impact is the same regardless of the offset being 0 or something else).

I just submitted changes that avoid [most of] the performance penalty when the offset is zero (-offset 0 on exponent start). So even with v0.3, you should now see the same performance as with v0.2 when not using an offset.

2017-05-28, 11:18   #163
LaurV
Romulan Interpreter
Jun 2011
Thailand
3²·29·37 Posts

After endless struggle we succeeded in building our own owl. Or owl own. Not clear how.
The struggle was not due to the program, but due to our "excellent knowledge" of the tools. But, well, we succeeded at last in getting a partial selftest OK. Partial because it is still running.

We mainly followed Victor's guide, with a few deviations. The deviations were because we had to install cygwin too, needed for mingw and msys. Then, within msys, we couldn't run make, because... well... there is no make. There is a mingw32-make.exe, but we were not sure whether it is the same toy or not.

After some more search, we found that we need to run

> pacman -S msys/make

Then it asked us what the heck g++ is. So we also had to run

> pacman -S msys/gcc

Then the infamous "make" command produced an .exe file. Which is now self-testing. Up to now, everything ok.

We have a small issue however: it does not exit on ctrl+c (not ctrl+break). Most probably we missed a switch or a link or something.



Last fiddled with by LaurV on 2017-05-28 at 11:23

2017-05-28, 13:10   #164
LaurV
Romulan Interpreter
Jun 2011
Thailand
3²·29·37 Posts

Ok. Selftests passed, but since the worktodo.h separation every line of our worktodo gets skipped, no matter what we put there. What are we doing wrong? (Also note the harmless warning caused by an unused variable.)

Code:
e:\99 - Prime\gpuOwl>gpuowl -logstep 1000 -savestep 100000
gpuOwL v0.3 GPU Lucas-Lehmer primality checker; Sun May 28 20:07:11 2017
Config: -logstep 1000 -savestep 100000 -cl ""
32x1050MHz Tahiti; OpenCL 1.2 AMD-APP (2348.3)
OpenCL compilation log:
An invalid option was specified.

Falling back to CL1.x compilation (error -11)
OpenCL compilation log:
"C:\blahblah\gpuowl.cl", line 529: warning: variable
          "H" was declared but never referenced
    uint W = 1024, H = 2048, GW = W / 64;
                   ^

Compile       :  760 ms
General setup :  320 ms
worktodo.txt line 'DoubleCheck=blahblah,40159129,72,1' skipped
worktodo.txt line 'DoubleCheck=blahblah,40159153,72,1' skipped

Bye

e:\99 - Prime\gpuOwl>

Last fiddled with by LaurV on 2017-05-28 at 13:11

2017-05-28, 13:42   #165
preda
"Mihai Preda"
Apr 2015
3·457 Posts

Quote:
Originally Posted by LaurV
Ok. Selftests passed, but since the worktodo.h separation, it seems we can't avoid skipping all lines from the worktodo, no matter what we put there. What are we doing wrong? (also, the harmless warning, caused by unused variable)

Most likely they are skipped because of the bounds check on the exponent, 50M - 78M..? (Is that the reason?)

I updated the code to be explicit about that; you may git pull again.
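
For readers wondering what "explicit" could look like, here is a rough Python sketch of a worktodo line check that reports why a line is skipped. Everything in it is an assumption for illustration: gpuOwl itself is not written in Python, the function name `check_line` is invented, the line format is only inferred from the two lines in the log above, and the 50M - 78M limits are the tentative guess from this post, not confirmed values.

```python
# Assumed limits, taken from the "50M - 78M..?" guess above; not confirmed values.
MIN_EXPONENT = 50_000_000
MAX_EXPONENT = 78_000_000


def check_line(line):
    """Return (exponent, None) if the line is accepted, else (None, reason)."""
    kind, _, rest = line.strip().partition("=")
    fields = rest.split(",")
    # Only the 'DoubleCheck=<id>,<exponent>,...' shape seen in the log is recognized here.
    if kind != "DoubleCheck" or len(fields) < 2 or not fields[1].isdigit():
        return None, "unrecognized format"
    exponent = int(fields[1])
    if not MIN_EXPONENT <= exponent <= MAX_EXPONENT:
        return None, f"exponent {exponent} outside [{MIN_EXPONENT}, {MAX_EXPONENT}]"
    return exponent, None


if __name__ == "__main__":
    for line in ("DoubleCheck=blahblah,40159129,72,1",
                 "DoubleCheck=blahblah,40159153,72,1"):
        exponent, reason = check_line(line)
        if exponent is None:
            print(f"worktodo.txt line '{line}' skipped: {reason}")
```

With limits like these, both of LaurV's 40M exponents would be rejected for being below the lower bound, which matches the behavior in the log.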

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.