|
|
#23 |
|
"Composite as Heck"
Oct 2017
950₁₀ Posts |
Definitely a baptism of fire, mainly because the hardware is so new that fixed-release distros haven't caught up, and a fixed distro that ROCm supports is the path of least resistance that gives a novice a fighting chance. Ubuntu 19 is too old for RDNA3; there are two choices IMO, both with downsides:
Normally "it just works"; the reason for this pain is that the hardware is so new. In a year most of the above steps won't need to exist. Doing the above takes the fixed 22.04 and pollutes it, in a way, by patching modern things on top. That's not ideal, but IMO it's the more likely way to get it working. You can try the pro drivers first, then fall back to the drivers in the latest kernel if the pro drivers are problematic (preferably by wiping and starting from scratch, but you'd at least be familiar with the process by then).
The other way is to use Arch or another rolling distro, install bleeding-edge everything and hope. It looks like even a rolling distro might require you to jump in the deep end by going fully bleeding edge for now (using *-git packages etc.); if you think the above is bad, that would really be the way of pain.
Who knows, maybe one day rusticl (a new OpenCL implementation in Mesa that aims to support Intel/AMD/NVIDIA) might remove the need for ROCm entirely and make the process as simple as rolling_distro+compile_gpuowl+use_gpuowl. Maybe it already does, but AFAIK it's untested (I might test over Xmas, albeit with very limited hardware: at best an old Vega card, at worst an Intel/AMD iGPU).
Rufus is an easy way to make a bootable USB stick, or use whatever you're familiar with: https://rufus.ie/en/ |
|
|
|
|
|
#24 |
|
Mar 2022
Earth
2×3²×7 Posts |
I'm interested in an update from the OP. I can post a step-by-step GPUOWL tutorial for Ubuntu, but the problem is that my card is a Radeon VII PRO and I'm not sure the process would be the same for him.
Last fiddled with by Magellan3s on 2022-12-23 at 08:58 |
|
|
|
|
|
#25 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17×487 Posts |
|
|
|
|
|
|
#26 | |
|
Jul 2009
Germany
1302₈ Posts |
Quote:
I asked a user on the guru3d forum to benchmark his Sapphire RX 7900 XTX. The result is that GPUowl V6 and V7 for Windows definitely won't work with the current Windows drivers. Hopefully Linux works! The card could outperform many current scientific graphics cards.
Last fiddled with by moebius on 2022-12-23 at 23:07 |
|
|
|
|
|
|
#27 |
|
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref
11110₂ Posts |
I am being transferred for my job, so I need to move to another place. Right now I am cleaning my room.
On Dec 27 or 28 I will try to install Ubuntu 22.04 (the latest) in the new apartment. Having decided to install it, I am pretty nervous about lower performance from the AMD 7900 XTX. I think the OpenCL and driver installation will be difficult. On Japanese sites it is too complex to understand. I think I may give up on running gpuowl. Windows is still good for my job (MS Word, Excel and PowerPoint). Win 10 is still a good OS. Note that from Dec 29 to Jan 7 I will be visiting my parents. During that time I can't post, but I will still read this thread. |
|
|
|
|
|
#28 |
|
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref
2×3×5 Posts |
Installed Ubuntu 22.04
but can't run gpuowl. I installed the 7900 XTX driver, but could not run it. It's hard for me; Linux is difficult for me. I attached the log file.
What I did:
1. Installed Radeon™ Software for Linux® version 22.40 for Ubuntu 22.04.1.
2. Downloaded and ran v7.2-70-g212618e tar.gz (Linux format).
3. It failed.
Can anyone solve this issue? |
|
|
|
|
|
#29 |
|
Jul 2009
Germany
2×353 Posts |
I believe you have to install rocm-opencl first. Read this thread, especially the posts from preda; he is the author of gpuOwl. I think he knows what he is doing.
https://mersenneforum.org/showthread.php?t=25601&page=6
Last fiddled with by moebius on 2022-12-27 at 10:58 |
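A quick way to see whether any OpenCL implementation is registered at all, as a minimal sketch: on Linux the OpenCL ICD loader conventionally looks for `.icd` files in `/etc/OpenCL/vendors`. The function name is made up for illustration, and which file the ROCm OpenCL package places there is an assumption this thread does not confirm.

```python
from pathlib import Path

def registered_icds(vendors_dir="/etc/OpenCL/vendors"):
    """List the .icd files the OpenCL ICD loader would conventionally see."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.icd"))

icds = registered_icds()
if icds:
    print("OpenCL ICD files found:", ", ".join(icds))
else:
    print("No OpenCL ICD files found; gpuowl will likely see no OpenCL device.")
```

An empty result would fit the "driver installed but gpuowl won't run" symptom, although a present `.icd` file does not by itself prove the runtime works.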
|
|
|
|
|
#30 |
|
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref
2×3×5 Posts |
I installed the ROCm AMD driver.
I think it works. However, it is slower than an RTX 4090. Here is the log.txt I copied:
2022-12-28 01:30:43 GpuOwl VERSION v7.2-70-g212618e
2022-12-28 01:30:43 config: log 1000
2022-12-28 01:30:43 config:
2022-12-28 01:30:43 device 0, unique id ''
2022-12-28 01:30:43 gfx1100-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-12-28 01:30:43 gfx1100-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,} -cl-std=CL2.0 -cl-finite-math-only "
2022-12-28 01:30:44 gfx1100-0 77936867 OpenCL compilation in 1.07 s
2022-12-28 01:30:44 gfx1100-0 77936867 trig table : 65 points, cos 73.86 bits, sin 73.34 bits
2022-12-28 01:30:44 gfx1100-0 77936867 trig table : 257 points, cos 72.90 bits, sin 73.11 bits
2022-12-28 01:30:45 gfx1100-0 77936867 trig table : 262145 points, cos 72.03 bits, sin 72.56 bits
2022-12-28 01:30:45 gfx1100-0 77936867 maxAlloc: 0.0 GB
2022-12-28 01:30:45 gfx1100-0 77936867 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
2022-12-28 01:30:45 gfx1100-0 77936867 P1(0) 0 bits
2022-12-28 01:30:45 gfx1100-0 77936867 PRP starting from beginning
2022-12-28 01:30:45 gfx1100-0 77936867 OK 0 on-load: blockSize 400, 0000000000000003
2022-12-28 01:30:45 gfx1100-0 77936867 validating proof residues for power 8
2022-12-28 01:30:45 gfx1100-0 77936867 Proof using power 8
2022-12-28 01:30:46 gfx1100-0 77936867 OK 800 0.00% 1579c241dc63eca6 784 us/it + check 0.36s + save 0.11s; ETA 16:58
2022-12-28 01:30:54 gfx1100-0 77936867 10000 0.01% fc4f135f7cf4ad29 785 us/it
2022-12-28 01:31:02 gfx1100-0 77936867 20000 0.03% 3cd1bd9d5e09cbc5 788 us/it
2022-12-28 01:31:09 gfx1100-0 77936867 30000 0.04% c4e0ff35e3290d98 791 us/it
2022-12-28 01:31:17 gfx1100-0 77936867 40000 0.05% dffe1b1b0d748128 793 us/it
2022-12-28 01:31:25 gfx1100-0 77936867 50000 0.06% 52e286945371ed29 793 us/it
2022-12-28 01:31:33 gfx1100-0 77936867 60000 0.08% 0945da4dc08bdd95 795 us/it
2022-12-28 01:31:41 gfx1100-0 77936867 70000 0.09% 7131fa4eb77f4bb2 795 us/it
~~~~~~~~~~~~~~~~~~~~~~
2022-12-28 01:32:53 gfx1100-0 77936867 160000 0.21% 25b7b6206fc6f085 800 us/it
2022-12-28 01:33:01 gfx1100-0 77936867 170000 0.22% 416816b0d9f4bba8 801 us/it
2022-12-28 01:33:09 gfx1100-0 77936867 180000 0.23% 6bee5d054f770861 804 us/it
2022-12-28 01:33:17 gfx1100-0 77936867 190000 0.24% f37f068f014b18a0 805 us/it
2022-12-28 01:33:26 gfx1100-0 77936867 OK 200000 0.26% f0b04b45b0855bd2 805 us/it + check 0.39s + save 0.12s; ETA 17:23
2022-12-28 01:33:34 gfx1100-0 77936867 210000 0.27% 43eb2fc2424d8aac 806 us/it
2022-12-28 01:33:42 gfx1100-0 77936867 220000 0.28% a1081c6dc6a7689f 805 us/it
2022-12-28 01:33:50 gfx1100-0 77936867 230000 0.30% 2387818d3d3d0d01 806 us/it
2022-12-28 01:33:58 gfx1100-0 77936867 240000 0.31% a9deae45055e5216 807 us/it
2022-12-28 01:34:06 gfx1100-0 77936867 250000 0.32% 89fcab15218f7cac 807 us/it
ROCm installation details:
~$ /opt/rocm/bin/rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
  Name: AMD Ryzen 9 5900X 12-Core Processor
  Uuid: CPU-XX
  Marketing Name: AMD Ryzen 9 5900X 12-Core Processor
  Vendor Name: CPU
  Feature: None specified
  Profile: FULL_PROFILE
  Float Round Mode: NEAR
  Max Queue Number: 0(0x0)
  Queue Min Size: 0(0x0)
  Queue Max Size: 0(0x0)
  Queue Type: MULTI
  Node: 0
  Device Type: CPU
  Cache Info:
    L1: 32768(0x8000) KB
  Chip ID: 0(0x0)
  ASIC Revision: 0(0x0)
  Cacheline Size: 64(0x40)
  Max Clock Freq. (MHz): 3700
  BDFID: 0
  Internal Node ID: 0
  Compute Unit: 24
  SIMDs per CU: 0
  Shader Engines: 0
  Shader Arrs. per Eng.: 0
  WatchPts on Addr. Ranges:1
  Features: None
  Pool Info:
    Pool 1
      Segment: GLOBAL; FLAGS: FINE GRAINED
      Size: 65771360(0x3eb9760) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
    Pool 2
      Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size: 65771360(0x3eb9760) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
    Pool 3
      Segment: GLOBAL; FLAGS: COARSE GRAINED
      Size: 65771360(0x3eb9760) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
  ISA Info:
*******
Agent 2
*******
  Name: gfx1100
  Uuid: GPU-XX
  Marketing Name: Radeon RX 7900 XTX
  Vendor Name: AMD
  Feature: KERNEL_DISPATCH
  Profile: BASE_PROFILE
  Float Round Mode: NEAR
  Max Queue Number: 128(0x80)
  Queue Min Size: 64(0x40)
  Queue Max Size: 131072(0x20000)
  Queue Type: MULTI
  Node: 1
  Device Type: GPU
  Cache Info:
    L1: 32(0x20) KB
    L2: 6144(0x1800) KB
    L3: 98304(0x18000) KB
  Chip ID: 29772(0x744c)
  ASIC Revision: 0(0x0)
  Cacheline Size: 64(0x40)
  Max Clock Freq. (MHz): 3220
  BDFID: 2816
  Internal Node ID: 1
  Compute Unit: 96
  SIMDs per CU: 2
  Shader Engines: 12
  Shader Arrs. per Eng.: 2
  WatchPts on Addr. Ranges:4
  Features: KERNEL_DISPATCH
  Fast F16 Operation: TRUE
  Wavefront Size: 32(0x20)
  Workgroup Max Size: 1024(0x400)
  Workgroup Max Size per Dimension:
    x 1024(0x400)
    y 1024(0x400)
    z 1024(0x400)
  Max Waves Per CU: 32(0x20)
  Max Work-item Per CU: 1024(0x400)
  Grid Max Size: 4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x 4294967295(0xffffffff)
    y 4294967295(0xffffffff)
    z 4294967295(0xffffffff)
  Max fbarriers/Workgrp: 32
  Pool Info:
    Pool 1
      Segment: GLOBAL; FLAGS: COARSE GRAINED
      Size: 25149440(0x17fc000) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: FALSE
    Pool 2
      Segment: GROUP
      Size: 64(0x40) KB
      Allocatable: FALSE
      Alloc Granule: 0KB
      Alloc Alignment: 0KB
      Accessible by all: FALSE
  ISA Info:
    ISA 1
      Name: amdgcn-amd-amdhsa--gfx1100
      Machine Models: HSA_MACHINE_MODEL_LARGE
      Profiles: HSA_PROFILE_BASE
      Default Rounding Mode: NEAR
      Default Rounding Mode: NEAR
      Fast f16: TRUE
      Workgroup Max Size: 1024(0x400)
      Workgroup Max Size per Dimension:
        x 1024(0x400)
        y 1024(0x400)
        z 1024(0x400)
      Grid Max Size: 4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x 4294967295(0xffffffff)
        y 4294967295(0xffffffff)
        z 4294967295(0xffffffff)
      FBarrier Max Size: 32
*** Done ***
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
My XTX is slower than the gpuOwl benchmarks. What is the problem? |
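As a rough sanity check on the log above, the ETA that gpuowl prints is consistent with simply multiplying the remaining iterations by the time per iteration. A minimal Python sketch, assuming a PRP test needs about one iteration per bit of the exponent and that the ETA field reads as hours:minutes (the function name is made up for illustration):

```python
def eta_hours_minutes(exponent, done, us_per_it):
    """Estimate remaining time as (hours, minutes), rounded to the nearest minute."""
    remaining_seconds = (exponent - done) * us_per_it * 1e-6
    total_minutes = round(remaining_seconds / 60)
    return divmod(total_minutes, 60)

# Values taken from the two "OK" lines in the log above.
print(eta_hours_minutes(77936867, 800, 784))     # log shows ETA 16:58
print(eta_hours_minutes(77936867, 200000, 805))  # log shows ETA 17:23
```

So at roughly 800 us/it, a 77.9M exponent takes about 17 hours on this setup. The estimate ignores the periodic check and save time, so expect it to agree with gpuowl only to within a minute or so.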
|
|
|
|
|
#31 |
|
Jul 2009
Germany
1302₈ Posts |
|
|
|
|
|
|
#32 |
|
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref
2×3×5 Posts |
2022-12-28 10:00:45 GpuOwl VERSION v7.2-70-g212618e
2022-12-28 10:00:45 config: log 10000
2022-12-28 10:00:45 config: -maxAlloc 17000MB
2022-12-28 10:00:45 device 0, unique id ''
2022-12-28 10:00:45 gfx1100-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-12-28 10:00:45 gfx1100-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,} -cl-std=CL2.0 -cl-finite-math-only "
2022-12-28 10:00:46 gfx1100-0 77936867 OpenCL compilation in 1.06 s
2022-12-28 10:00:46 gfx1100-0 77936867 trig table : 65 points, cos 73.86 bits, sin 73.34 bits
2022-12-28 10:00:46 gfx1100-0 77936867 trig table : 257 points, cos 72.90 bits, sin 73.11 bits
2022-12-28 10:00:46 gfx1100-0 77936867 trig table : 262145 points, cos 72.03 bits, sin 72.56 bits
2022-12-28 10:00:47 gfx1100-0 77936867 maxAlloc: 16.6 GB
2022-12-28 10:00:47 gfx1100-0 77936867 P1(0) 0 bits
2022-12-28 10:00:47 gfx1100-0 77936867 OK 1002400 on-load: blockSize 400, 3a4e0a72d1015e77
2022-12-28 10:00:47 gfx1100-0 77936867 validating proof residues for power 8
2022-12-28 10:00:47 gfx1100-0 77936867 Proof using power 8
2022-12-28 10:00:48 gfx1100-0 77936867 OK 1003200 1.29% 881820f695d02726 774 us/it + check 0.36s + save 0.11s; ETA 16:33
2022-12-28 10:00:54 gfx1100-0 77936867 1010000 1.30% 8d0952243041a3ad 782 us/it
2022-12-28 10:01:01 gfx1100-0 77936867 1020000 1.31% d581368f3b9cb0c0 783 us/it
2022-12-28 10:01:09 gfx1100-0 77936867 1030000 1.32% 1336214462fb09c5 782 us/it
2022-12-28 10:01:17 gfx1100-0 77936867 1040000 1.33% a9014b6061f27269 783 us/it
2022-12-28 10:01:25 gfx1100-0 77936867 1050000 1.35% 479f7a58dd17f802 783 us/it
2022-12-28 10:01:33 gfx1100-0 77936867 1060000 1.36% aaf2776230e8c12d 785 us/it
2022-12-28 10:01:41 gfx1100-0 77936867 1070000 1.37% 0c70f062e4cfc2d3 786 us/it
~~~
2022-12-28 10:25:30 gfx1100-0 77936867 OK 1600000 2.05% afe41c2d268041c0 810 us/it + check 0.41s + save 0.11s; ETA 17:10
2022-12-28 10:25:38 gfx1100-0 77936867 1610000 2.07% befe6e60482d91f0 813 us/it
2022-12-28 10:25:47 gfx1100-0 77936867 1620000 2.08% 6129b953ed987eab 813 us/it
2022-12-28 10:25:55 gfx1100-0 77936867 1630000 2.09% 0755d6ec0609b799 813 us/it
2022-12-28 10:26:03 gfx1100-0 77936867 1640000 2.10% 32e3cc5de3d5a561 813 us/it
2022-12-28 10:26:11 gfx1100-0 77936867 1650000 2.12% 9eb157fe2143be5c 812 us/it
2022-12-28 10:26:19 gfx1100-0 77936867 1660000 2.13% 055ac0306ac4357b 811 us/it
2022-12-28 10:26:27 gfx1100-0 77936867 1670000 2.14% f1dd789bab022842 809 us/it
2022-12-28 10:26:35 gfx1100-0 77936867 1680000 2.16% 819da594d517a4b7 808 us/it
2022-12-28 10:26:43 gfx1100-0 77936867 1690000 2.17% e729f3e6a641f297 811 us/it
2022-12-28 10:26:51 gfx1100-0 77936867 1700000 2.18% da025233b421b2f5 809 us/it
2022-12-28 10:26:59 gfx1100-0 77936867 1710000 2.19% 1b7ceb6145ea37f1 809 us/it
The iteration speed is still slow, about 800 us/iter. Are the version and "power 8" also OK? Help me! I can't cope with these problems. |
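The timings in this log creep upward over the run, from 774 us/it at the start to about 813 us/it some 25 minutes later. A small Python sketch for pulling the `us/it` figures out of a gpuowl log and measuring that drift; the helper names are made up for illustration, and the sample is three lines copied from the log above:

```python
import re

US_PER_IT = re.compile(r"(\d+) us/it")

def iteration_times(log_text):
    """Return every 'NNN us/it' value found in a gpuowl log, in order."""
    return [int(m.group(1)) for m in US_PER_IT.finditer(log_text)]

def slowdown_percent(times):
    """Percentage increase from the first sample to the last."""
    return 100.0 * (times[-1] - times[0]) / times[0]

sample = """\
2022-12-28 10:00:48 gfx1100-0 77936867 OK 1003200 1.29% 881820f695d02726 774 us/it + check 0.36s + save 0.11s; ETA 16:33
2022-12-28 10:00:54 gfx1100-0 77936867 1010000 1.30% 8d0952243041a3ad 782 us/it
2022-12-28 10:25:38 gfx1100-0 77936867 1610000 2.07% befe6e60482d91f0 813 us/it
"""
times = iteration_times(sample)
print(times, f"{slowdown_percent(times):.1f}% slower")
```

A slowdown of about 5% as the run warms up would be consistent with clocks dropping under sustained load, though the log alone cannot confirm that.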
|
|
|
|
|
#33 |
|
"Eric"
Jan 2018
USA
DF₁₆ Posts |
It does seem a bit slow, but at the same time, if the AIDA64 results from babeltechreview are accurate, it is not far-fetched. FP64 performance is 1594 GFLOPS on the 6900 XT and 1213 GFLOPS on the 7900 XTX, which is quite a bit slower. You can check the clock speed of the 7900 XTX to confirm it's running at the advertised boost clock while GPUOWL is running.
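One way to check the clock on Linux, as a hedged sketch: the amdgpu kernel driver exposes the shader clock levels in a sysfs file, with the currently active level marked by `*`. The path below assumes the GPU is `card0`, the helper name is made up, and the sample text is illustrative rather than output from a real 7900 XTX.

```python
import re
from pathlib import Path

# Assumption: the GPU is card0; the index may differ on your system.
SCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_sclk")

def active_clock_mhz(text):
    """Return the MHz value of the level marked with '*', or None."""
    for line in text.splitlines():
        if line.rstrip().endswith("*"):
            match = re.search(r"(\d+)\s*mhz", line, re.IGNORECASE)
            if match:
                return int(match.group(1))
    return None

# Illustrative sample in the "level: clock [*]" layout (made-up values).
sample = "0: 500Mhz\n1: 2304Mhz *\n"
print(active_clock_mhz(sample))

if SCLK_PATH.exists():
    print("current shader clock:", active_clock_mhz(SCLK_PATH.read_text()), "MHz")
```

Reading the file a few times while GPUOWL is running, and comparing against the card's advertised boost clock, would show whether it is holding its clocks.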
|
|
|
|