#23 |
"Victor de Hollander"
Aug 2011
the Netherlands
32·131 Posts |
Preda deserves all credit for the coding. I'm just trying to compile it for Win64 and reporting the errors I get.
|
#24 |
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts |
#25 | ||
"Victor de Hollander"
Aug 2011
the Netherlands
32·131 Posts |
Success here too, compiling it with MINGW64 for Windows. gpuOwL is faster and shows slightly lower error rates on my AMD HD7950 in the limited testing so far.
gpuOwL v0.1 Quote:
Quote:
|
||
#26 |
"Victor de Hollander"
Aug 2011
the Netherlands
32·131 Posts |
'Guide' for compiling it on Windows with msys64+MINGW64

1. This assumes you have installed msys64 with MINGW64. You'll also need a text editor (I prefer notepad++).
2. Download and install AMD SDK 3.0 from http://developer.amd.com/tools-and-s...ssing-app-sdk/
3. Download the gpuowl code from https://github.com/preda/gpuowl (easiest is to download the entire folder as a .zip).
4. Extract the .zip (preferably to somewhere you can navigate to easily with msys64, for instance msys64\home\gpuowl).
5. Open the Makefile (in the gpuowl folder) with the text editor (notepad++).
6. Edit the path after the "-L" argument so it points to the OpenCL SDK install path. The standard path is C:\Program Files (x86)\AMD APP SDK\3.0\lib\x86_64. I copied the folder contents to msys64\home\OpenCL\SDK3 for easier referencing, since I always get confused by paths with spaces and brackets. In my case the Makefile contains:

Code:

    gpuowl: gpuowl.cpp clwrap.h tinycl.h
    	g++ -O2 -std=c++14 gpuowl.cpp -ogpuowl -L\C\msys64\home\OpenCL\SDK3 -lOpenCL

6.1 [OPTIONAL] If you don't have an OpenCL 2.0 device, edit the clwrap.h file and on line 89 change "-cl-std=CL2.0" to "-cl-std=CL1.2".
7. Start the msys64/MINGW64 shell and navigate to the gpuowl folder. If you put the folder in the /home directory of msys64, you can get there easily by typing "cd .." to reach the home directory, and then "cd yourgpuowlfoldername".
8. Run 'make'. It should look something like this so far:

Code:

    MINGW64 ~
    $ cd ..
    MINGW64 /home
    $ cd gpuOwlv0.1/
    MINGW64 /home/gpuOwlv0.1
    $ make
    g++ -O2 -std=c++14 gpuowl.cpp -ogpuowl -L\C\msys64\home\OpenCL\SDK3 -lOpenCL
    MINGW64 /home/gpuOwlv0.1
    $

9.1 [OPTIONAL] If you wish to move the gpuowl directory somewhere else, you probably need to copy these 3 .dlls into the directory: libgcc_s_seh-1.dll, libstdc++-6.dll, libwinpthread-1.dll
10. Create a worktodo.txt and start testing.

Please note gpuOwl is not production ready.
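For step 9.1, a small check can confirm that the three runtime DLLs actually made it into the new directory before the executable is started from there. The following is only a sketch: the function name `missing_runtime_dlls` is made up for illustration, and the DLL list is simply the one given in the guide above, not something verified against a particular MinGW release.

```python
from pathlib import Path

# The three MinGW runtime DLLs named in step 9.1 of the guide.
RUNTIME_DLLS = ("libgcc_s_seh-1.dll", "libstdc++-6.dll", "libwinpthread-1.dll")


def missing_runtime_dlls(directory):
    """Return the DLLs from step 9.1 that are not present in `directory`.

    `missing_runtime_dlls` is a hypothetical helper, not part of gpuOwL.
    """
    folder = Path(directory)
    return [name for name in RUNTIME_DLLS if not (folder / name).is_file()]
```

Calling `missing_runtime_dlls(".")` from the directory that holds the gpuowl executable returns an empty list when nothing needs to be copied.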
#27 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
2×11×467 Posts |
Can someone post or PM me a windoze exe? [edit: x64]. We still have trouble compiling it (the same troubles as above - but the troubles are most probably related to our ignorance with the tools).
We are going to give it a test too - we own an old "XFX HD7970 GHz" here. P.S. we fully understand that it is not "production ready" yet, but if it is faster than clLucas and it is giving out the same DC residue, for sure we will report it to PrimeNet and get some fast credits!
Last fiddled with by LaurV on 2017-04-25 at 04:47
#28 | |
"Victor de Hollander"
Aug 2011
the Netherlands
22338 Posts |
![]() Quote:
I'll send it when I get home tonight. |
|
#29 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
2×11×467 Posts |
![]() Last fiddled with by LaurV on 2017-04-25 at 15:36 Reason: link |
#30 | |
"Victor de Hollander"
Aug 2011
the Netherlands
32·131 Posts |
![]() Quote:
![]() |
|
#31 |
"Mihai Preda"
Apr 2015
26448 Posts |
Thanks for the MinGW compilation, and the screenshots! The screenshots show there's an error in printing the residue (leading digits being 0) -- that's hopefully fixed now (not a big deal, the problem was just 'cosmetic').
A small update on where gpuOwL is, and what I'm planning next.

I was really worried by the results of some of my own testing: the LL was failing on known primes (24036583). So I decided to do some more serious testing to find the cause of the error. After all this investigation, my conclusion was that it's not a software bug, but the GPU very rarely producing an erroneous result. This is disconcerting, and I'd really like to have a way to detect such problems.

The LL involves two distinct computations. One is FFT-Square-IFFT, the second is "round-to-int + carry-propagation". An error can occur in either of these, and these are the detection mechanisms that I know of:

- Evaluating the "max rounding error" that occurs when rounding-to-int, after the IFFT and before the carry-propagation. This is cheap to compute on the GPU, so it is always on (and printed on every logstep). This rounding error brings two pieces of information: 1. whether the FFT size is big enough for the chosen exponent, and 2. whether something went completely wrong with the FFT/IFFT. I plan to add provisions in the code for detecting a sudden jump in the rounding error (which may indicate an FFT error), and to re-run the last batch in that situation to check for consistent results.
- Evaluating the SUMINP / SUMOUT of the FFT (which is done by the CPU prime95). This is not implemented, because it seems (to me) expensive to do on the GPU. This check would have provided very good detection for FFT/IFFT errors, but no protection against rounding and carry-propagation errors.
- Using "offset". This changes the values fed to the FFT/IFFT, and again protects against FFT errors (but without detecting them "in real time").

As is, there is no check that I know of that covers the carry-propagation. If an error takes place in that part, it would not be detected by either the max-rounding-error check or the SUMINP/SUMOUT check.
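To make the "max rounding error" idea concrete, here is a minimal CPU sketch in pure Python. It is not gpuOwL's GPU code: it squares a number with a plain zero-padded complex FFT instead of the transform gpuOwL uses, and all names (`fft`, `square_with_error`, `to_digits`, `from_digits`) are invented for this illustration. It measures how far each IFFT output lies from the nearest integer just before carry propagation, which is the quantity the post describes.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def to_digits(n, base):
    """Split a non-negative integer into little-endian digits."""
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    return digits or [0]


def from_digits(digits, base):
    """Rebuild an integer from little-endian digits."""
    return sum(d * base**i for i, d in enumerate(digits))


def square_with_error(digits, base):
    """Square a little-endian digit list via FFT-Square-IFFT.

    Returns (result_digits, max_rounding_error).
    """
    size = 1
    while size < 2 * len(digits):
        size *= 2
    padded = [complex(d) for d in digits] + [0j] * (size - len(digits))
    spectrum = fft(padded)
    squared = fft([x * x for x in spectrum], invert=True)

    max_err = 0.0
    carry = 0
    result = []
    for value in squared:
        real = value.real / size              # normalize the inverse FFT
        nearest = round(real)                 # round-to-int
        max_err = max(max_err, abs(real - nearest))
        carry, digit = divmod(nearest + carry, base)   # carry propagation
        result.append(digit)
    return result, max_err
```

For example, `square_with_error(to_digits(x, 256), 256)` returns the digits of `x * x` together with the largest rounding error seen. An error approaching 0.5 would mean the rounding can no longer be trusted, which corresponds to the "FFT size too small" case in the post. Note that a fault in the carry loop itself would leave `max_err` untouched, matching the gap the post points out.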
I would be interested in finding out about a GPU-cheap way to check that the integer digits of the modulo-convolution done by LL are not completely haywire.

Development plan:
- implement "offset", and measure the performance impact. If the impact is small, leave it always-on.
- check for sudden jumps in rounding-error, and automatically re-try in that situation. May help detect too-overclocked GPUs (but not always).
- add some simple self-test, which would run on known primes and compare residues with a pre-saved residue list, to detect obvious errors. (To detect more subtle errors, a good but expensive way is to run to completion on known primes and check for a 0 residue, or to double-check validated results.)

Still missing:
- ability to select a specific GPU in a multi-GPU system (right now, it simply uses the first GPU)
- get some ISA dumps (produced with "-cl -save-temps") and analyze them to investigate the low performance reported
- add the ability to dump the compiled binary (for OpenCLs that do not offer "-save-temps")
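The "offset" and self-test items of the plan can be illustrated on small exponents with Python's big integers. This is a sketch under stated assumptions, not gpuOwL's implementation: the function names and the exponent lists are chosen for the example, and the shift bookkeeping is the standard identity that squaring a value scaled by 2^k yields a value scaled by 2^(2k), with 2^p ≡ 1 (mod 2^p − 1).

```python
def lucas_lehmer_residue(p, shift=0):
    """Final Lucas-Lehmer residue for M_p = 2**p - 1, for an odd prime p.

    A result of 0 means M_p is prime. The working value is kept multiplied
    by 2**shift (mod M_p), so the numbers being squared differ from those
    of an unshifted run while the final residue stays the same.
    """
    m = (1 << p) - 1
    shift %= p
    value = (4 << shift) % m
    for _ in range(p - 2):
        shift = (2 * shift) % p                      # squaring doubles the shift
        value = (value * value - (2 << shift)) % m   # the "-2" is shifted too
    # Undo the shift: 2**(-shift) == 2**(p - shift) modulo M_p.
    return (value << ((p - shift) % p)) % m


# Small exponents with known outcomes, used as a toy self-test list.
KNOWN_PRIME_EXPONENTS = (3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127)
KNOWN_COMPOSITE_EXPONENTS = (11, 23, 29)


def self_test(shift=0):
    """Return True if every known exponent gives the expected kind of residue."""
    primes_ok = all(lucas_lehmer_residue(p, shift) == 0
                    for p in KNOWN_PRIME_EXPONENTS)
    composites_ok = all(lucas_lehmer_residue(p, shift) != 0
                        for p in KNOWN_COMPOSITE_EXPONENTS)
    return primes_ok and composites_ok
```

Running the same exponent with two different shifts and comparing the unshifted residues gives the kind of after-the-fact protection the post describes: an FFT fault is unlikely to corrupt both runs identically, but the mismatch only shows up when the residues are compared, not "in real time".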
#32 | |
"Forget I exist"
Jul 2009
Dartmouth NS
2·3·23·61 Posts |
![]() Quote:
Last fiddled with by science_man_88 on 2017-04-26 at 02:09 |
|
#33 |
"Mihai Preda"
Apr 2015
22×192 Posts |
No, it's not all-1s. I ran 24036583 twice, and the second time the result was correct (0). I tracked down the difference between the two runs by comparing the residues, and at some point around 13% the residues diverged. It means that in the first run an error occurred at that point. Given that the software is supposed to be deterministic (produce identical bits every time), this could be explained by the hardware behaving funny.
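The comparison described here can be sketched as a small helper that takes the residues logged by two runs and reports where they first disagree. The data layout (a dict from iteration number to residue string) and the name `first_divergence` are assumptions made for this example; gpuOwL's actual log format is not shown in the thread.

```python
def first_divergence(run_a, run_b):
    """Return the first logged iteration whose residue differs, or None.

    Each run is a dict mapping iteration number -> logged residue string.
    Only iterations logged by both runs are compared.
    """
    for iteration in sorted(run_a.keys() & run_b.keys()):
        if run_a[iteration] != run_b[iteration]:
            return iteration
    return None


# Hypothetical residues, invented purely to show the call.
run_1 = {10000: "a1b2", 20000: "c3d4", 30000: "ffff"}
run_2 = {10000: "a1b2", 20000: "c3d4", 30000: "e5f6"}
diverged_at = first_divergence(run_1, run_2)
```

With the made-up logs above the runs agree at iterations 10000 and 20000 and differ at 30000, so the fault must have happened somewhere between the last matching log line and the first mismatching one. Because LL is deterministic, any mismatch means at least one of the two runs hit an error.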
|