mersenneforum.org  

mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing

Old 2019-04-08, 15:37   #100
SELROC
 


Quote:
Originally Posted by preda
With its 16GB of RAM, R7 is also quite good at P-1.
I'm doing P-1(B1=300K,B2=9M) on 91M exponents in about 17 min/test.
(and I found this 118-bit factor: 420168247365933163207630527781851871 )

I noted that the Radeon VII utilization rate never crosses 75%, even for a large exponent like 332M.
Old 2019-04-08, 19:06   #101
M344587487
 
Quote:
Originally Posted by preda
Yes, assignments like
PFactor=AID,1,2,91157513,-1,77,2

are easily generated/submitted with e.g.:
gpuowl/primenet.py -u user -p passwd --dirs workdir -w PM1 --tasks 40

and for B1/B2 I pass to gpuowl e.g.:
./gpuowl -B1 300000
./gpuowl -B1 1000000 -rB2 25

Bounds (B1,B2) can also be specified per-exponent, by prefixing the worktodo line with:
B1=x;line
B1=x,B2=y;line
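As a side note, the per-exponent bounds syntax quoted above lends itself to scripting. Here is a minimal sketch; the function name is my own and AID is left as the literal placeholder from the example, only the `B1=x;` / `B1=x,B2=y;` prefix format comes from the post:

```python
def worktodo_line(exponent, b1=None, b2=None):
    """Build a gpuowl worktodo entry, optionally prefixed with
    per-exponent P-1 bounds using the B1=x; or B1=x,B2=y; syntax."""
    base = f"PFactor=AID,1,2,{exponent},-1,77,2"
    if b1 is not None and b2 is not None:
        return f"B1={b1},B2={b2};{base}"
    if b1 is not None:
        return f"B1={b1};{base}"
    return base
```

For example, `worktodo_line(91157513, b1=300000)` yields `B1=300000;PFactor=AID,1,2,91157513,-1,77,2`.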
There's a minor bug when finishing a P-1 job as the last entry in the worktodo file: it doesn't generate a result. It was probably introduced by the latest commit that overlaps the start and end of tests; gpuowl quits before generating the result. Normally primenet would keep the worktodo file filled, so the problem isn't encountered. There also looks to be a minor off-by-one bug in the stage 2 round naming. Admittedly it is quite amusing to see it nearly finish crunching, say bye, then just bugger off:
Code:
2019-04-08 18:40:30 Round 35 of 36: init 2.25 s; 5.59 ms/mul; 21285 muls
2019-04-08 18:40:30 Bye
Quote:
Originally Posted by SELROC
I noted that the Radeon VII utilization rate never crosses 75%, even for a large exponent like 332M.
The utilisation granularity, at least in rocm-smi, is 25%: when testing TF performance it kept bouncing between 50 and 75 with nothing in between. I wouldn't trust it; 99% utilisation could show as 75% depending on how they implemented it.
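To illustrate the granularity point: if the tool only reports in fixed 25% steps, a near-full load can display as 75%. Whether rocm-smi floors or rounds is an assumption here; this sketch just shows both behaviours:

```python
import math

def readout(true_util_pct, step=25, floor=True):
    """Simulate a utilisation readout that only reports in fixed
    steps (e.g. 25%, as observed with rocm-smi)."""
    if floor:
        return step * math.floor(true_util_pct / step)
    return step * round(true_util_pct / step)
```

With flooring, a true 99% shows as 75%; with rounding it would show as 100%. Either way, the displayed value alone can't distinguish modest load from near-full load within a step.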
Old 2019-04-08, 19:12   #102
SELROC
 


Quote:
Originally Posted by M344587487
There's a minor bug when finishing a P-1 job as the last entry in the worktodo file: it doesn't generate a result. It was probably introduced by the latest commit that overlaps the start and end of tests; gpuowl quits before generating the result. Normally primenet would keep the worktodo file filled, so the problem isn't encountered. There also looks to be a minor off-by-one bug in the stage 2 round naming. Admittedly it is quite amusing to see it nearly finish crunching, say bye, then just bugger off:
Code:
2019-04-08 18:40:30 Round 35 of 36: init 2.25 s; 5.59 ms/mul; 21285 muls
2019-04-08 18:40:30 Bye
The utilisation granularity, at least in rocm-smi, is 25%: when testing TF performance it kept bouncing between 50 and 75 with nothing in between. I wouldn't trust it; 99% utilisation could show as 75% depending on how they implemented it.

You are right; I should try amdcovc, another GPU monitoring program.
Old 2019-04-08, 20:20   #103
preda
 

Quote:
Originally Posted by M344587487
There's a minor bug when finishing a P-1 job as the last entry in the worktodo file: it doesn't generate a result. It was probably introduced by the latest commit that overlaps the start and end of tests; gpuowl quits before generating the result. Normally primenet would keep the worktodo file filled, so the problem isn't encountered. There also looks to be a minor off-by-one bug in the stage 2 round naming. Admittedly it is quite amusing to see it nearly finish crunching, say bye, then just bugger off:
Code:
2019-04-08 18:40:30 Round 35 of 36: init 2.25 s; 5.59 ms/mul; 21285 muls
2019-04-08 18:40:30 Bye
Yes I'm aware of this bug. I did introduce it in the last commit, where the final stage-2 GCD is done in the background (i.e. overlapped with the next test), which is a useful optimization for quick P-1 tests; but yep the bug is annoying, I'll think about a fix.
Old 2019-04-08, 20:29   #104
preda
 

Quote:
Originally Posted by kriesel
For such an exponent, B1=300K and B2=9M look low to me.
See for example https://www.mersenne.org/report_expo...1140901&full=1
B1=745,000
B2=17,135,000
run in CUDAPm1 v0.20 on a GTX1070 (8GB)

Or M90888739, B1=870000, B2=18052500, e=12, in CUDAPm1 v0.22 on a GTX1080Ti (11GB)
The theoretical probability of finding a factor with those bounds (according to mersenne.ca) is 2.16%; empirically I see 3/110. At this rate, and time per test, the P-1 is worth it against even *one* PRP. The bounds would be different (higher) if you consider saving two LL tests instead of one PRP.

Also, in general, lower TF levels make higher P-1 bounds worthwhile, which is somewhat counter-intuitive (i.e. the higher the TF level, the lower the P-1 bounds that pay off).

Edit: the empirical count is now 4/111 :)
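The "worth it" comparison can be made explicit with a simple expected-value check. The 2.16% probability and 17 min/test figures come from this thread; the PRP test time below is an illustrative assumption, not a measured number:

```python
def p1_worthwhile(p_factor, p1_minutes, test_minutes, tests_saved=1.0):
    """P-1 pays off when the expected primality-test time saved by a
    found factor exceeds the time spent on the P-1 attempt itself."""
    return p_factor * tests_saved * test_minutes > p1_minutes
```

For example, with p_factor=0.0216, a 17-minute P-1, and an assumed 1500-minute PRP, the expected saving is about 32 minutes, so the P-1 pays off even against a single PRP; with tests_saved=2.04 (the LL case) the break-even bounds move higher still.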

Old 2019-04-08, 20:48   #105
kriesel
 

Quote:
Originally Posted by preda
The theoretical probability of finding a factor with those bounds (according to mersenne.ca) is 2.16%; empirically I see 3/110. At this rate, and time per test, the P-1 is worth it against even *one* PRP. The bounds would be different (higher) if you consider saving two LL tests instead of one PRP.

Also, in general, lower TF levels make higher P-1 bounds worthwhile, which is somewhat counter-intuitive (i.e. the higher the TF level, the lower the P-1 bounds that pay off).

Edit: the empirical count is now 4/111 :)
P-1 should be done to bounds that maximize the total run-time savings over TWO primality tests (~2.04 for LL; 2 for PRP), since both LL and PRP results are subjected to verification by double-check.

Old 2019-04-09, 07:21   #106
SELROC
 


Quote:
Originally Posted by SELROC
You are right; I should try amdcovc, another GPU monitoring program.

But amdcovc under ROCm doesn't report a utilization percentage. Retrying with radeontop...
Old 2019-04-09, 10:46   #107
preda
 

Quote:
Originally Posted by kriesel
P-1 should be done to bounds that maximize the total run-time savings over TWO primality tests (~2.04 for LL; 2 for PRP), since both LL and PRP results are subjected to verification by double-check.
I'm not convinced of the need for 2 tests/exponent for PRP with the GEC. I think the result is extremely reliable, and the double-check is needed only to detect intentional fakes. From my own point of view, I know that I'm not producing intentional fakes, so there is very little need to double-check my own PRP results.

Another factor is *when* the double-check should be done. E.g. if we chose to double-check all the PRP results, but with a delay of, let's say, 20 years or more -- I could agree to that (given the large delay).

As I get more results over the following days, I'll report the empirical factor ratio with the bounds I chose (to see whether, empirically, the bounds are "useful").
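As a sanity check on such small counts, a binomial tail probability says how surprising 4 factors in 111 attempts would be if the true rate were the theoretical 2.16%. This is my own back-of-envelope sketch, not something from the thread:

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k
    factors in n attempts at per-attempt probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance of seeing 4 or more factors in 111 attempts at a 2.16% rate:
surprise = binom_tail(111, 4, 0.0216)  # roughly 0.22
```

A tail of roughly 0.22 means 4/111 is entirely consistent with the theoretical rate; far more attempts would be needed to empirically distinguish one choice of bounds from another.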
Old 2019-04-09, 12:07   #108
preda
 

Quote:
Originally Posted by preda
Yes I'm aware of this bug. I did introduce it in the last commit, where the final stage-2 GCD is done in the background (i.e. overlapped with the next test), which is a useful optimization for quick P-1 tests; but yep the bug is annoying, I'll think about a fix.
The bug should be fixed now -- GPUowl will wait before exiting when a background GCD is pending.
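For the curious, the fix amounts to the standard pattern of joining a background worker before process exit. A minimal Python sketch of the idea follows (gpuowl itself is C++; `BackgroundGCD` is a hypothetical stand-in, with `math.gcd` in place of the big-number stage-2 GCD):

```python
import math
import threading

class BackgroundGCD:
    """Run a final-GCD-style computation off the main thread, and make
    shutdown wait for any pending result so it is never dropped."""

    def __init__(self):
        self._thread = None
        self.result = None

    def start(self, a, b):
        def work():
            self.result = math.gcd(a, b)  # stand-in for the stage-2 GCD
        self._thread = threading.Thread(target=work)
        self._thread.start()

    def finish(self):
        # Called before exit: block until the background GCD completes,
        # instead of quitting with the result still pending.
        if self._thread is not None:
            self._thread.join()
        return self.result
```

The overlap optimization is kept (the GCD still runs while the next test starts); only the exit path now waits.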
Old 2019-04-09, 12:10   #109
kriesel
 

Quote:
Originally Posted by preda
I'm not convinced of the need for 2 tests/exponent for PRP with the GEC. I think the result is extremely reliable, and the double-check is needed only to detect intentional fakes. From my own point of view, I know that I'm not producing intentional fakes, so there is very little need to double-check my own PRP results.

Another factor is *when* the double-check should be done. E.g. if we chose to double-check all the PRP results, but with a delay of, let's say, 20 years or more -- I could agree to that (given the large delay).

As I get more results over the following days, I'll report the empirical factor ratio with the bounds I chose (to see whether, empirically, the bounds are "useful").
The persons setting GIMPS standards seem convinced of the need for double-checking. Bad PRP/GC results have occurred and been detected. Software bugs happen; it's unlikely they've all been found in any given GIMPS application, and some of them occur outside the PRP/GC check's capability.

I'd like to see the multiyear gap between first and second primality test reduced, not increased.
Old 2019-04-09, 12:23   #110
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

22×3×112 Posts
Default

Quote:
Originally Posted by kriesel
The persons setting GIMPS standards seem convinced of the need for double-checking. Bad PRP/GC results have occurred and been detected. Software bugs happen; it's unlikely they've all been found in any given GIMPS application, and some of them occur outside the PRP/GC check's capability.
Yes I agree in general about the need to check and find errors, but requiring a full 2x double-check seems overkill for PRP.

Quote:
I'd like to see the multiyear gap between first and second primality test reduced, not increased.
In the abstract, yes, but at what cost? If given the option of using the next year either for finding the next Mersenne prime or for closing the DC gap, the choice is not so obvious anymore.