mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

preda 2019-05-13 07:52

[QUOTE=kriesel;516569]Not boolean; integer. Tests-saved is 0, 1 or 2, as issued by primenet. Two for saving both the first primality test and double check if a P-1 factor is found, so larger bounds are justified to increase the odds of finding a factor, or one for saving only the double check if a factor is found, so lesser bounds are justified, or 0 for don't bother with any P-1, it's already been (adequately I think) done. Although CUDAPm1 (and I think prime95/mprime) can be influenced to use higher yet bounds by going to higher values 3 to 9.[/QUOTE]

So for a first-time PRP with no previous P-1, tests-saved is 2, OK.

I would like to know what happens if the same first-time PRP had P-1 done with B1=B2=100'000. Is tests-saved 2 or 0 in this case?

i.e. is an "insufficient bounds" P-1 treated the same as "no P-1" when handing out the PRP assignment?

SELROC 2019-05-13 07:59

[QUOTE=preda;516575]So for a first-time PRP with no previous P-1, tests-saved is 2, OK.

I would like to know what happens if the same first-time PRP had P-1 done with B1=B2=100'000. Is tests-saved 2 or 0 in this case?

i.e. is an "insufficient bounds" P-1 treated the same as "no P-1" when handing out the PRP assignment?[/QUOTE]


Hi Mihai,
I am doing TF up to 77 bits. TF from 76 to 77 takes 3 hours 40 minutes on the RX580.
How does that performance compare to P-1?

preda 2019-05-13 11:22

[QUOTE=SELROC;516577]Hi Mihai,
I am doing TF up to 77 bits. TF from 76 to 77 takes 3 hours 40 minutes on the RX580.
How does that performance compare to P-1?[/QUOTE]

What exponent size?

P-1 duration and the probability of finding a factor both depend on the bounds. The time is often split roughly half-and-half between the first and second stages (B1, B2). The first stage is mostly linear in B1. Thus, running with B1=100K and B2=2M would take about one-tenth the time of B1=1M and B2=20M.

To get an idea of the duration of the first stage, take the duration of N iterations of PRP, where
N = B1 * 1.44 * 1.2 (approx.)
and double that to include the second stage.

One more thing: the second stage goes a bit faster when there's plenty of memory. I personally prefer 16GB for the second stage on GPU; if the GPU has only 8GB, I would probably use a lower B2/B1 ratio, e.g. 10.
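
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the estimate as stated in the post, not anything gpuOwL itself does. The function name and the 5 ms per PRP iteration figure are made up for the example; use the iteration time your own GPU reports for the exponent in question. The 1.44 factor is close to log2(e), which is roughly how many bits the stage-1 exponent has per unit of B1.

```python
def p1_time_estimate(b1, prp_iter_seconds, bits_per_b1=1.44, overhead=1.2):
    """Rough total P-1 time in seconds, per the rule of thumb in the post.

    b1               -- stage-1 bound
    prp_iter_seconds -- measured time of one PRP iteration (assumed input)
    """
    # Stage 1 costs about as much as N PRP iterations, N = B1 * 1.44 * 1.2.
    stage1_iterations = b1 * bits_per_b1 * overhead
    stage1_seconds = stage1_iterations * prp_iter_seconds
    # Stage 2 is assumed to take about as long as stage 1, so double it.
    return 2 * stage1_seconds


# Hypothetical timing: 5 ms per PRP iteration, B1 = 1M.
estimate = p1_time_estimate(1_000_000, 0.005)
print(f"about {estimate / 3600:.1f} hours")
```

Since the estimate is linear in B1, dividing B1 by ten while keeping the same B2/B1 ratio divides the result by ten as well.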

SELROC 2019-05-13 11:40

[QUOTE=preda;516586]What exponent size?

P-1 duration and the probability of finding a factor both depend on the bounds. The time is often split roughly half-and-half between the first and second stages (B1, B2). The first stage is mostly linear in B1. Thus, running with B1=100K and B2=2M would take about one-tenth the time of B1=1M and B2=20M.

To get an idea of the duration of the first stage, take the duration of N iterations of PRP, where
N = B1 * 1.44 * 1.2 (approx.)
and double that to include the second stage.

One more thing: the second stage goes a bit faster when there's plenty of memory. I personally prefer 16GB for the second stage on GPU; if the GPU has only 8GB, I would probably use a lower B2/B1 ratio, e.g. 10.[/QUOTE]


The current TF exponents are between 170M and 200M.

preda 2019-05-13 12:21

[QUOTE=SELROC;516587]The current TF exponents are between 170M and 200M.[/QUOTE]

[QUOTE]
I am doing TF up to 77 bits. TF from 76 to 77 takes 3 hours 40 minutes on the RX580.
How does that performance compare to P-1?
[/QUOTE]

So, the question might be what is better to do for those exponents now, TF or P-1? My intuition would say P-1, but somebody should probably work out the numbers. Maybe there is not a huge difference between the order of (TF, P-1) for a few more bits of TF.

For P-1, the second question is what bounds? I also don't know. The P-1 calculator can be used as a starting point.

I think you do get more credit for TF than P-1 per unit of time, though.

Prime95 2019-05-13 13:42

[QUOTE=preda;516575]
I would like to know what happens if the same first-time PRP had P-1 done with B1=B2=100'000. Is tests-saved 2 or 0 in this case?

i.e. is an "insufficient bounds" P-1 treated the same as "no P-1" when handing out the PRP assignment?[/QUOTE]

Yes, insufficient is treated as "no P-1". I may need to tweak the server's rules for what constitutes sufficient.

kriesel 2019-05-13 15:57

[QUOTE=preda;516591]So, the question might be what is better to do for those exponents now, TF or P-1? My intuition would say P-1, but somebody should probably work out the numbers. Maybe there is not a huge difference between the order of (TF, P-1) for a few more bits of TF.

For P-1, the second question is what bounds? I also don't know. The P-1 calculator can be used as a starting point.

I think you do get more credit for TF than P-1 per unit of time, though.[/QUOTE]It's best to do some of both, and using mersenne.ca's representation of gpu72 bounds for each is not bad.
In mfaktc and mfakto, and probably elsewhere, TF effort is proportional to 2[SUP]bitlevel[/SUP] and inversely proportional to the exponent. Two slight effects mostly offset each other: longer code sequences in the kernels for the higher bit levels, and fewer primes per linear interval at higher magnitudes. Details at [URL="https://www.mersenneforum.org/showpost.php?p=508523&postcount=6"]https://www.mersenneforum.org/showpost.php?p=508523&postcount=6[/URL]; relative runtime scaling data at [URL="https://www.mersenneforum.org/showpost.php?p=488519&postcount=2"]https://www.mersenneforum.org/showpost.php?p=488519&postcount=2[/URL], which shows that the variation of (time/bitlevel * exponent / 2[SUP]bitlevel[/SUP]) is only about 5% over 7 bits of TF for a given exponent, or under 12% for 7 bits and a 4:1 exponent variation.
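
As a rough illustration of that scaling, here is a minimal Python sketch that rescales one measured TF time to another bit level or exponent. It assumes only the proportionality stated above (time per bit level proportional to 2^bitlevel / exponent) and ignores the few-percent deviations. The function names are made up for the example, and the 180M exponent is an assumed value inside the 170M to 200M range mentioned earlier in the thread.

```python
def tf_level_effort(exponent, bitlevel):
    """Relative effort of the TF step that ends at `bitlevel` (e.g. 76 -> 77 is 77)."""
    return 2.0 ** bitlevel / exponent


def scale_tf_time(known_seconds, known_exponent, known_bitlevel, exponent, bitlevel):
    """Rescale a measured TF time to a different exponent and/or bit level."""
    ratio = tf_level_effort(exponent, bitlevel) / tf_level_effort(known_exponent, known_bitlevel)
    return known_seconds * ratio


# Measured in the thread: 76 -> 77 bits takes 3 h 40 min on an RX580.
known = 3 * 3600 + 40 * 60
# Assumed exponent of 180M; one more bit at the same exponent doubles the time.
next_bit = scale_tf_time(known, 180_000_000, 77, 180_000_000, 78)
# Same bit level at half the exponent also doubles the time.
half_exp = scale_tf_time(known, 180_000_000, 77, 90_000_000, 77)
print(f"77 -> 78 at 180M: {next_bit / 3600:.2f} h")
print(f"76 -> 77 at 90M:  {half_exp / 3600:.2f} h")
```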

On GPUs, one gets far more computing credit per day doing TF than anything else (P-1, LL, PRP), because of the SP/DP or integer/DP-float performance ratios.
The TF/other credit ratio is in most cases ~8:1 on AMD, 11-16:1 on NVIDIA, and up to 40:1 on RTX 20xx, while on CPUs I've seen it range from 0.7 to 1.3 or so. Not sure about the Radeon VII. One can generally get a rough sense of the ratio from Heinrich's TF and LL benchmark data for the same GPU if both are listed. Unfortunately, they don't seem consistent with timings I've seen posted for the Radeon VII. Owners of newer cards, please submit benchmarks. [URL]https://www.mersenne.ca/cudalucas.php[/URL] [URL]https://www.mersenne.ca/mfaktc.php[/URL]

kriesel 2019-05-13 16:40

[QUOTE=SELROC;516587]The current TF exponents are between 170M and 200M.[/QUOTE]By using [url]https://www.mersenne.org/manual[/url][B]_gpu[/B]_assignment/

instead of [url]https://www.mersenne.org/manual_assignment/[/url]
we can get TF assignments down to about 93M in preparation for first-time tests, and help keep ahead of the P-1 and primality test wavefronts.

SELROC 2019-05-13 18:16

[QUOTE=kriesel;516623]By using [URL]https://www.mersenne.org/manual[/URL][B]_gpu[/B]_assignment/

instead of [URL]https://www.mersenne.org/manual_assignment/[/URL]
we can get TF assignments down to about 93M in preparation for first-time tests, and help keep ahead of the P-1 and primality test wavefronts.[/QUOTE]


I don't know how to instruct mfloop.py to request such exponents.

kriesel 2019-05-13 19:07

[QUOTE=SELROC;516639]I don't know how to instruct mfloop.py to request such exponents.[/QUOTE]I believe you could figure it out. Or ask teknohog to add it.

Probably alter this section:[CODE]def primenet_fetch(num_to_get):
    if not primenet_login:
        return []

    # Manual assignment settings; trial factoring = 2
    assignment = {"cores": "1",
                  "num_to_get": str(num_to_get),
                  "pref": "2",
                  "exp_lo": "",
                  "exp_hi": "",
                  }

    try:
        # The "manual_assignment" path below is the part to change
        # (to "manual_gpu_assignment") for GPU TF assignments.
        r = primenet.open(primenet_baseurl + "manual_assignment/?" + ass_generate(assignment) + "B1=Get+Assignments")
        return exp_increase(greplike(workpattern, r.readlines()), int(options.max_exp))
    except urllib2.URLError:
        debug_print("URL open error at primenet_fetch")
        return []
[/CODE]

SELROC 2019-05-13 19:26

[QUOTE=kriesel;516646]I believe you could figure it out. Or ask teknohog to add it.

Probably alter this section:[CODE]def primenet_fetch(num_to_get):
    if not primenet_login:
        return []

    # Manual assignment settings; trial factoring = 2
    assignment = {"cores": "1",
                  "num_to_get": str(num_to_get),
                  "pref": "2",
                  "exp_lo": "",
                  "exp_hi": "",
                  }

    try:
        # The "manual_assignment" path below is the part to change
        # (to "manual_gpu_assignment") for GPU TF assignments.
        r = primenet.open(primenet_baseurl + "manual_assignment/?" + ass_generate(assignment) + "B1=Get+Assignments")
        return exp_increase(greplike(workpattern, r.readlines()), int(options.max_exp))
    except urllib2.URLError:
        debug_print("URL open error at primenet_fetch")
        return []
[/CODE][/QUOTE]


The two pages have different form fields, so a change is needed in one of the Python functions. I have filed a pull request with teknohog for mfloop.py; it is still waiting. Meanwhile, Mark Rose has merged it into his own fork.
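
For illustration only, here is a minimal Python 3 sketch of how the request URL could be built with the page name and form fields passed in as parameters. It is not the change submitted to mfloop.py. The helper name `assignment_url` is hypothetical, the base URL is assumed from the links posted above, and the fields shown are the ones from the quoted `manual_assignment` code. The field names for `manual_gpu_assignment` are not given in this thread, so they would have to be read off that page's form.

```python
from urllib.parse import urlencode

# Assumed from the links in this thread; mfloop.py keeps its own primenet_baseurl.
PRIMENET_BASEURL = "https://www.mersenne.org/"


def assignment_url(page, fields, baseurl=PRIMENET_BASEURL):
    """Build the GET URL for a manual assignment page from a dict of form fields."""
    query = dict(fields)
    # Submit button, as in the quoted code; urlencode turns the space into '+'.
    query["B1"] = "Get Assignments"
    return baseurl + page + "/?" + urlencode(query)


# Fields taken from the quoted primenet_fetch(); trial factoring = 2.
manual_fields = {"cores": "1", "num_to_get": "1", "pref": "2", "exp_lo": "", "exp_hi": ""}
print(assignment_url("manual_assignment", manual_fields))
```

Keeping the page name and the field dict separate means the GPU page can be supported by passing a different pair, without touching the fetch logic.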

