mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

SELROC 2019-05-07 10:57

[QUOTE=preda;516001]Thank you, so the boolean information on whether any P-1 has been done is there in the assignment in the form of "tests_saved".

Note though that in this particular case (
[URL]https://www.mersenne.org/report_exponent/?exp_lo=332252533&full=1[/URL] ), P-1 has been done before the PRP although to insufficient bounds (not by GpuOwl).

I can change GpuOwl to auto-trigger P-1 for a PRP without any P-1; but I still can't handle the case of P-1 done with insufficient bounds.[/QUOTE]


It seems the P-1 was done by MadPoo, so maybe ask him how it was done.

kriesel 2019-05-07 11:56

[QUOTE=preda;516001]Thank you, so the boolean information on whether any P-1 has been done is there in the assignment in the form of "tests_saved".

Note though that in this particular case (
[URL]https://www.mersenne.org/report_exponent/?exp_lo=332252533&full=1[/URL] ), P-1 has been done before the PRP although to insufficient bounds (not by GpuOwl).

I can change GpuOwl to auto-trigger P-1 for a PRP without any P-1; but I still can't handle the case of P-1 done with insufficient bounds.[/QUOTE]
CUDAPm1 and prime95 include code for determining the bounds that maximize the probable computing time savings. The source is available for you to repurpose in gpuowl. I think primenet will do the right thing and indicate that P-1 is needed if the bounds used previously are inadequate.
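The idea of choosing bounds to maximize the probable time savings can be sketched roughly as follows. This is only an illustration and not the code from CUDAPm1 or prime95: the function names, the candidate list, the factor probabilities and the P-1 costs are all invented, and the 14-day test cost is an assumed round figure.

```python
# Rough sketch: pick the P-1 effort level with the best expected net saving.
# All numbers below are hypothetical and chosen only to illustrate the trade-off.

def expected_saving(p_factor, tests_saved, test_cost, pm1_cost):
    """Expected net time saved by running P-1 first (same unit as the costs).

    A factor is found with probability p_factor and then saves
    tests_saved primality tests; the P-1 run itself always costs pm1_cost.
    """
    return p_factor * tests_saved * test_cost - pm1_cost


# (label, assumed chance of finding a factor, assumed P-1 cost in days)
CANDIDATES = [
    ("small bounds", 0.008, 0.05),
    ("medium bounds", 0.03, 0.40),
    ("large bounds", 0.05, 1.20),
]


def best_bounds(candidates, tests_saved, test_cost):
    """Return (label, saving) for the best candidate, or (None, 0.0)
    if no candidate has a positive expected saving."""
    scored = [
        (expected_saving(p, tests_saved, test_cost, cost), label)
        for label, p, cost in candidates
    ]
    saving, label = max(scored)
    if saving <= 0:
        return (None, 0.0)
    return (label, saving)


if __name__ == "__main__":
    for saved in (0, 1, 2):
        print(saved, best_bounds(CANDIDATES, saved, 14.0))
```

With these made-up numbers, the medium bounds win when two tests would be saved, the small bounds win when only one would be saved, and no P-1 is worthwhile when none would be saved. A real implementation would estimate the factor probability from the actual B1/B2 bounds and the trial factoring depth rather than from a fixed table.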

SELROC 2019-05-09 10:59

[QUOTE=SELROC;515936]I am currently using ROCm 2.3 which has a performance regression. I bet that with ROCm 2.4 (if they fix the issue) the ETA for 332M will be around 13-14 days.[/QUOTE]


Disappointingly, ROCm 2.4 has an even worse speed regression. I have scolded them on GitHub.


[url]https://github.com/RadeonOpenCompute/ROCm/issues/766#issuecomment-490484839[/url]

M344587487 2019-05-09 13:56

[QUOTE=SELROC;516214]Disappointingly, ROCm 2.4 has an even worse speed regression. I have scolded them on GitHub.


[URL]https://github.com/RadeonOpenCompute/ROCm/issues/766#issuecomment-490484839[/URL][/QUOTE]
Were they a child or pet I'm sure scolding would go a long way. In my experience everyone else tends to tell you to sling your hook. Now go to your room and think about what you've done ;)

SELROC 2019-05-09 14:11

[QUOTE=M344587487;516232]Were they a child or pet I'm sure scolding would go a long way. In my experience everyone else tends to tell you to sling your hook. Now go to your room and think about what you've done ;)[/QUOTE]


I hope this gives them a sense of our dissatisfaction :-)

kriesel 2019-05-12 17:55

A little more effort toward NVIDIA support?
 
Happened to check the gpuowl github repository today, and saw this as the latest commit, relating to an attempt for the NVIDIA RTX2070.

Does it work on NVIDIA OpenCL yet? [URL]https://github.com/preda/gpuowl/commit/c48d46fdbcba6c490c439aa9b07eb4c40bcacae0[/URL]

kriesel 2019-05-12 18:27

[QUOTE=SELROC;516005]It seems the P-1 was done by MadPoo, so maybe ask him how it was done.[/QUOTE]
MadPoo did a large batch of P-1 factoring attempts with very small bounds, each giving a quick ~0.8% chance of finding a factor (on 100Mdigit exponents with primality tests completed or assigned and no prior P-1 factoring). See
[url]https://www.mersenneforum.org/showpost.php?p=513719&postcount=3[/url]

[url]https://www.mersenneforum.org/showpost.php?p=513803&postcount=8[/url]

chengsun 2019-05-12 21:31

[QUOTE=kriesel;516553]Happened to check the gpuowl github repository today, and saw this as the latest commit, relating to an attempt for the NVIDIA RTX2070.

Does it work on NVIDIA OpenCL yet? [URL]https://github.com/preda/gpuowl/commit/c48d46fdbcba6c490c439aa9b07eb4c40bcacae0[/URL][/QUOTE]


It now works on my Nvidia RTX 2070, as of that commit.


EDIT: should clarify that I have only tested PRP, not P-1.




(Let me know if people would be interested in benchmarks.)

kriesel 2019-05-13 00:04

[QUOTE=chengsun;516563]It now works on my Nvidia RTX 2070, as of that commit.

EDIT: should clarify that I have only tested PRP, not P-1.

(Let me know if people would be interested in benchmarks.)[/QUOTE]
Always! I'm sure iteration times for specific exponents will be of interest to RTX20xx owners, or those considering buying one, for comparison to CUDALucas on the same model. And congratulations on getting it to run.

kriesel 2019-05-13 01:28

[QUOTE=preda;516001]Thank you, so the boolean information on whether any P-1 has been done is there in the assignment in the form of "tests_saved".
[/QUOTE]
Not boolean; integer. Tests_saved is 0, 1 or 2, as issued by primenet. A value of 2 means a P-1 factor would save both the first primality test and the double check, so larger bounds are justified to increase the odds of finding a factor. A value of 1 means a factor would save only the double check, so lesser bounds are justified. A value of 0 means don't bother with any P-1, because it has already been done (adequately, I think). CUDAPm1 (and I think prime95/mprime) can, however, be pushed to still higher bounds by using values from 3 to 9.
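As a rough sketch of how a client might read that integer from an assignment: the snippet below assumes a worktodo line laid out as `PRP=AID,k,b,n,c,how_far_factored,tests_saved`, which is an assumption here and not something confirmed in this thread. The function names and the sample assignment ID are hypothetical.

```python
# Sketch only: the field layout is an assumed worktodo format,
# and all names below are made up for illustration.

TESTS_SAVED_MEANING = {
    0: "no P-1 wanted; it has already been done",
    1: "a P-1 factor would save the double check only",
    2: "a P-1 factor would save the first test and the double check",
}


def parse_prp_assignment(line):
    """Parse 'PRP=AID,k,b,n,c,how_far_factored,tests_saved' (assumed layout)."""
    kind, _, rest = line.strip().partition("=")
    if kind != "PRP":
        raise ValueError("not a PRP assignment: " + line)
    fields = rest.split(",")
    if len(fields) < 7:
        raise ValueError("assignment carries no tests_saved field: " + line)
    aid, _k, _b, n, _c, tf_bits, tests_saved = fields[:7]
    return {
        "aid": aid,
        "exponent": int(n),
        "tf_bits": int(tf_bits),
        "tests_saved": int(tests_saved),  # an integer, not a boolean
    }


def needs_pm1(tests_saved):
    """P-1 is worth running first whenever at least one test could be saved."""
    return tests_saved > 0


def describe_tests_saved(tests_saved):
    """Human-readable meaning; values above 2 are not issued by the server."""
    return TESTS_SAVED_MEANING.get(
        tests_saved, "non-standard value; may push some programs to higher bounds"
    )


if __name__ == "__main__":
    # Placeholder assignment ID and trial factoring depth.
    sample = "PRP=0123456789ABCDEF0123456789ABCDEF,1,2,332252533,-1,81,2"
    info = parse_prp_assignment(sample)
    print(info["exponent"], needs_pm1(info["tests_saved"]),
          describe_tests_saved(info["tests_saved"]))
```

Keeping the value as an integer instead of collapsing it to true/false lets the bounds selection scale its effort with the number of tests that a factor would save.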

xx005fs 2019-05-13 05:26

Windows Binary
 
Is there a prebuilt Windows binary for the newest commit? I am very keen to try out PRP on NVIDIA hardware.

