mersenneforum.org  

Old 2023-01-02, 21:36   #45
Magellan3s
 
Mar 2022
Earth

2×3²×7 Posts

Quote:
Originally Posted by kriesel View Post
My bad (memory). Retested, on 3 Radeon VII (not pro) GPUs, ~630 (good) - 640 (failing fan) usec/iter, gpuowl-win V6.11-38x, if the GPU VRAM is Hynix not Samsung. Set to -20% power, & VRAM as fast as it will reliably run, which varies from one GPU to the next, with "AMD Radeon Software", launched from the Windows start menu, using its tuning tab. That software can be installed along with the driver during initial installation from the same download package, or during a driver update.
George has reported being able to run VRAM at 1200 MHz; my Radeon VII GPUs mostly range up to 1170 with Hynix, ~870 with Samsung (yes, considerably below nominal clock for Samsung, and it's still too error prone to run P-1 or LLDC).
Typically gpuowl V6.11-38x is several percent faster than v7.x on the same work, hardware and tune, on Radeon VII & Windows. I have no Linux data.
If reliable enough, use a large block size which reduces GEC checking overhead and lifts performance slightly.
Same everything, including FFT length, ~74M runs 580 usec/iter in LLDC or PRP.
For comparison, RX6900XT: 705 usec/iter at -10% power, +7% VRAM (the allowed minimum and maximum, respectively).
I ran the benchmark on a new linux install and got the following.

Code:
jesus@Magellan:~/gpuowl$ ./gpuowl -maxAlloc 16G  -prp 77936867
20230102 15:34:00  GpuOwl VERSION v7.2-131-gca22dce
20230102 15:34:00  GpuOwl VERSION v7.2-131-gca22dce
20230102 15:34:00  Note: not found 'config.txt'
20230102 15:34:00  config: -maxAlloc 16G -prp 77936867 
20230102 15:34:00  device 0, unique id 'd78a416172e17d3c'
20230102 15:34:00 d78a416172e17d3c 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
20230102 15:34:00 d78a416172e17d3c 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,}  -cl-std=CL2.0 -cl-finite-math-only "
20230102 15:34:01 d78a416172e17d3c 77936867 OpenCL compilation in 0.87 s
20230102 15:34:01 d78a416172e17d3c 77936867 PRP starting from beginning
20230102 15:34:02 d78a416172e17d3c 77936867 OK         0 on-load: blockSize 400, 0000000000000003
20230102 15:34:02 d78a416172e17d3c 77936867 validating proof residues for power 8
20230102 15:34:02 d78a416172e17d3c 77936867 Proof using power 8
20230102 15:34:07 d78a416172e17d3c 77936867     10000 fc4f135f7cf4ad29  530
20230102 15:34:12 d78a416172e17d3c 77936867     20000 3cd1bd9d5e09cbc5  530
20230102 15:34:18 d78a416172e17d3c 77936867     30000 c4e0ff35e3290d98  531
20230102 15:34:23 d78a416172e17d3c 77936867     40000 dffe1b1b0d748128  531
20230102 15:34:28 d78a416172e17d3c 77936867     50000 52e286945371ed29  532
20230102 15:34:34 d78a416172e17d3c 77936867     60000 0945da4dc08bdd95  532
20230102 15:34:39 d78a416172e17d3c 77936867     70000 7131fa4eb77f4bb2  533
20230102 15:34:44 d78a416172e17d3c 77936867     80000 8d76071d27ee4221  533
20230102 15:34:50 d78a416172e17d3c 77936867     90000 0bacff453b2f470e  533
20230102 15:34:54 d78a416172e17d3c 77936867 Stopping, please wait..
20230102 15:34:55 d78a416172e17d3c 77936867 OK     99200   0.13% f9443cde8dee98f4  533 us/it + check 0.25s + save 0.09s; ETA 11:32
20230102 15:34:55 d78a416172e17d3c  Exiting because "stop requested"
20230102 15:34:55 d78a416172e17d3c  Bye
jesus@Magellan:~/gpuowl$
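
A minimal Python sketch of how the progress lines in the log above can be read: it extracts the exponent, iteration count and us/it figure from each line and estimates the time left, assuming a PRP test needs roughly as many squarings as the exponent. The regular expression and the names `parse_progress` and `remaining_hours` are illustrative assumptions, not part of gpuowl.

```python
import re

# Two progress lines copied from the log above. Fields: date, time,
# device id, exponent, iteration, 64-bit residue, microseconds/iteration.
LOG = """\
20230102 15:34:07 d78a416172e17d3c 77936867     10000 fc4f135f7cf4ad29  530
20230102 15:34:50 d78a416172e17d3c 77936867     90000 0bacff453b2f470e  533
"""

# Hypothetical pattern fitted to the lines above; other gpuowl versions
# may format their progress lines differently.
PROGRESS = re.compile(
    r"^\d{8} \d{2}:\d{2}:\d{2} \w+ (\d+)\s+(\d+) [0-9a-f]{16}\s+(\d+)$"
)


def parse_progress(text):
    """Return (exponent, iteration, us_per_iter) for each progress line."""
    rows = []
    for line in text.splitlines():
        match = PROGRESS.match(line.strip())
        if match:
            rows.append(tuple(int(g) for g in match.groups()))
    return rows


def remaining_hours(exponent, iteration, us_per_iter):
    """Estimate hours left, assuming about `exponent` iterations in total."""
    return (exponent - iteration) * us_per_iter * 1e-6 / 3600


exponent, iteration, us = parse_progress(LOG)[-1]
print(f"{remaining_hours(exponent, iteration, us):.1f} h remaining")
```

For the last progress line (iteration 90000 at 533 us/it) this comes to roughly 11.5 hours, in line with the ETA the log prints.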
Old 2023-01-02, 23:05   #46
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7823₁₀ Posts

Quote:
Originally Posted by Magellan3s View Post
I ran the benchmark on a new linux install and got the following.

Code:
jesus@Magellan:~/gpuowl$ ./gpuowl -maxAlloc 16G  -prp 77936867
20230102 15:34:00  GpuOwl VERSION v7.2-131-gca22dce
20230102 15:34:00  GpuOwl VERSION v7.2-131-gca22dce
20230102 15:34:00  Note: not found 'config.txt'
...
20230102 15:34:55 d78a416172e17d3c 77936867 OK     99200   0.13% f9443cde8dee98f4 533 us/it + check 0.25s + save 0.09s; ETA 11:32
20230102 15:34:55 d78a416172e17d3c  Exiting because "stop requested"
20230102 15:34:55 d78a416172e17d3c  Bye
jesus@Magellan:~/gpuowl$
For which GPU model? (I mentioned at least two) At what sclk settings, GPU power level, clocking details?
Old 2023-01-03, 00:16   #47
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

17·487 Posts

Quote:
Originally Posted by kriesel View Post
Typically gpuowl V6.11-38x is several percent faster than v7.x on the same work, hardware and tune, on Radeon VII & Windows. I have no Linux data.
I've tuned all my Radeon VIIs for maximum energy efficiency. Here is the report from one:

Temp: 71.0 °C
Power: 117.0 W (probably 130+ W at the wall; I think rocm-smi's reported watts are a bit low)
Sclk: 1157 MHz, 725 mV
Mclk: 1201 MHz

Timings at 4M FFT: 546 us (actually 2 Linux instances at 1091 or 1092 us each, gpuowl v6.11-351-ge930f9e)
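
The arithmetic behind the 546 us figure, and what it implies for energy per test, can be sketched in Python as follows. This is an illustration under stated assumptions: the exponent 77936867 is borrowed from the benchmark earlier in the thread (the post only says "4M FFT"), the 117 W figure is the rocm-smi reading rather than wall power, and the function names are made up for this example.

```python
def combined_us_per_iter(per_instance_us):
    """Effective us/iter of one GPU running several instances at once.

    Throughputs (iterations per microsecond) add, so the combined time
    per iteration is the reciprocal of the summed rates.
    """
    return 1.0 / sum(1.0 / t for t in per_instance_us)


def energy_per_test_kwh(watts, us_per_iter, exponent):
    """Energy for one test, assuming about `exponent` iterations."""
    seconds = exponent * us_per_iter * 1e-6
    return watts * seconds / 3.6e6  # joules -> kWh


# Two instances at 1091 and 1092 us/iter, as reported in the post.
effective = combined_us_per_iter([1091, 1092])

# Assumed exponent (thread benchmark) and reported GPU power.
kwh = energy_per_test_kwh(117.0, effective, 77936867)

print(f"effective: {effective:.2f} us/iter")
print(f"energy:    {kwh:.2f} kWh per test at 117 W")
```

Two instances at 1091 and 1092 us combine to just under 546 us per iteration, and at 117 W that works out to roughly 1.4 kWh for an exponent of that size; substituting the 130 W wall estimate raises the figure proportionally.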
Old 2023-01-03, 05:54   #48
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

2×3×5 Posts
Disappointed, I sold the XTX

Hello, I sold the XTX at auction. It sold for 174,000 yen including postage. One month cost me 15,000 yen.
I had stood in line for a long time, 2 hours, to get a queue ticket.
I had some trouble with the setup.
In the end I found that I cannot rely on AMD's advertising blurb (FLOPS, FP64, etc.).
If I buy an AMD GPU again, I must check the GpuOwl benchmarks first.

Thank you for advising me on installing Ubuntu with GpuOwl support.
Old 2023-01-03, 09:07   #49
moebius
 
 
Jul 2009
Germany

2·353 Posts

174,000 yen = 1,330.44 US dollars = 1,266.89 euros
Old 2023-01-03, 10:18   #50
M344587487
 
 
"Composite as Heck"
Oct 2017

2×5²×19 Posts

Hope you had fun benchmarking, thank you for putting the effort in.
Old 2023-01-05, 02:07   #51
Xyzzy
 
 
Aug 2002

8658₁₀ Posts

https://old.reddit.com/r/hardware/co...ps_vs_61tflop/

Old 2023-01-06, 01:56   #52
Magellan3s
 
Mar 2022
Earth

126₁₀ Posts

Quote:
Originally Posted by kriesel View Post
For which GPU model? (I mentioned at least two) At what sclk settings, GPU power level, clocking details?
Radeon PRO VII - I haven't touched any sclk, power level, or clock settings. I've actually been running GpuOwl on a Windows build for the majority of the work I've submitted to GIMPS with this particular card. I like Windows a little better than Ubuntu.

Last fiddled with by Magellan3s on 2023-01-06 at 01:57
Old 2023-01-06, 12:03   #53
PhilF
 
 
"6800 descendent"
Feb 2005
Colorado

3²×83 Posts

Quote:
Originally Posted by Magellan3s View Post
Radeon PRO VII - I haven't touched any sclk, power level, or clock settings. I've actually been running GpuOwl on a Windows build for the majority of the work I've submitted to GIMPS with this particular card. I like Windows a little better than Ubuntu.
If you've not done any tuning you are almost certainly using considerably more power than necessary (and generating more heat too).
Old 2023-01-07, 01:25   #54
Magellan3s
 
Mar 2022
Earth

126₁₀ Posts

Quote:
Originally Posted by PhilF View Post
If you've not done any tuning you are almost certainly using considerably more power than necessary (and generating more heat too).
The software I downloaded from the AMD website doesn't allow me to change those values. Is there a program anyone recommends?

Last fiddled with by Magellan3s on 2023-01-07 at 01:26
Old 2023-01-09, 17:22   #55
preda
 
 
"Mihai Preda"
Apr 2015

1452₁₀ Posts

Quote:
Originally Posted by moebius View Post
That table is very nice!

The exponent used for benchmarking (77936867) is a bit dated though, we could start using another exponent, e.g. around 115M, as that's what is relevant at the PRP wavefront now.

115M would be towards the upper limit of 6M FFT; 120M would be towards the upper limit of 6.5M FFT.
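
To put the proposed exponents next to the current benchmark, the bits-per-word (bpw) figure that gpuowl logs can be reproduced as exponent divided by FFT length in words. The short Python sketch below assumes that "4M" means 4 × 2^20 words, which matches the logged "4M 1K:8:256 (18.58 bpw)" when the three factors are multiplied together and doubled; the doubling, the helper name `bits_per_word` and the round exponents 115,000,000 and 120,000,000 are assumptions for illustration.

```python
M = 1 << 20  # assumed meaning of "M" in gpuowl FFT sizes: 2**20 words


def bits_per_word(exponent, fft_words):
    """Average number of exponent bits carried by each FFT word."""
    return exponent / fft_words


# "1K:8:256" from the log: width * middle * small height, times 2
# (the factor of 2 is an assumption that reproduces the "4M" label).
assert 1024 * 8 * 256 * 2 == 4 * M

cases = [
    ("77936867 at 4M", 77936867, 4 * M),
    ("~115M at 6M", 115_000_000, 6 * M),
    ("~120M at 6.5M", 120_000_000, int(6.5 * M)),
]

for label, exponent, words in cases:
    print(f"{label:>16}: {bits_per_word(exponent, words):.2f} bpw")
```

The first case reproduces the 18.58 bpw from the log; the two proposed exponents come out near 18.3 and 17.6 bpw at their respective FFT lengths. Where the actual upper limit sits for each FFT length depends on gpuowl itself and is not derived here.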

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
