Has anyone been able to benchmark the 7900 xtx?
|
[QUOTE=Magellan3s;620591]Has anyone been able to benchmark the 7900 xtx?[/QUOTE]If there are any [url=https://www.mersenne.ca/mfaktc.php?filter=7900+xt]mfakto[/url] or [url=https://www.mersenne.ca/cudalucas.php?filter=7900+xt]gpuowl[/url] benchmarks for the 7900 (XT or XTX, I don't care), I would desperately like to see them; the numbers on my charts are more or less made up.
|
[url]https://old.reddit.com/r/Amd/comments/zt95bg/all_of_the_internal_things_that_the_7xxx_series/[/url]
:mike: |
I don't know what to make of that thread. This quote in particular seems misguided:
[quote]...dual SIMD is useless for some (most) applications since the added second SIMD per CU doesn't support integer ops...[/quote]
given that most applications are interested in fp, AFAIK.
The locked PP sounds concerning, but from what I recall we could do as we pleased with Vega 10/20, and this quote seems to contradict that:
[quote]There is some small sliver of hope that AMD will eventually unlock the PPtables, but looking at Vega10/20, that doesn't seem likely.[/quote]
If they're wrong about that, I'm not convinced they know what they're talking about.
[quote]...Also, indications are that they've moved instruction pipeline responsibilities to software, meaning you now need to carefully reorder instructions to not get pipeline stalls and/or provide hints (there's a new instruction for this specific purpose, s_delay_alu). Since many software kernels are hand-rolled in raw assembly, this is a potentially a huge pain point for developers - since this platform needs specific instructions that no other platform does....[/quote]
Does this apply to gpuowl or mfakto? I don't think .cl files are assembly, or that assembly is used at all, but I could be wrong. |
[QUOTE=James Heinrich;620616]If there are any [url=https://www.mersenne.ca/mfaktc.php?filter=7900+xt]mfakto[/url] or [url=https://www.mersenne.ca/cudalucas.php?filter=7900+xt]gpuowl[/url] benchmarks for the 7900 (XT or XTX, I don't care), I would desperately like to see them; the numbers on my charts are more or less made up.[/QUOTE]
7900 xtx
[Code]2022-12-28 01:30:43 GpuOwl VERSION v7.2-70-g212618e
2022-12-28 01:30:43 config: log 1000
2022-12-28 01:30:43 config:
2022-12-28 01:30:43 device 0, unique id ''
2022-12-28 01:30:43 gfx1100-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-12-28 01:30:43 gfx1100-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,} -cl-std=CL2.0 -cl-finite-math-only "
2022-12-28 01:30:44 gfx1100-0 77936867 OpenCL compilation in 1.07 s
2022-12-28 01:30:44 gfx1100-0 77936867 trig table : 65 points, cos 73.86 bits, sin 73.34 bits
2022-12-28 01:30:44 gfx1100-0 77936867 trig table : 257 points, cos 72.90 bits, sin 73.11 bits
2022-12-28 01:30:45 gfx1100-0 77936867 trig table : 262145 points, cos 72.03 bits, sin 72.56 bits
2022-12-28 01:30:45 gfx1100-0 77936867 maxAlloc: 0.0 GB
2022-12-28 01:30:45 gfx1100-0 77936867 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
2022-12-28 01:30:45 gfx1100-0 77936867 P1(0) 0 bits
2022-12-28 01:30:45 gfx1100-0 77936867 PRP starting from beginning
2022-12-28 01:30:45 gfx1100-0 77936867 OK 0 on-load: blockSize 400, 0000000000000003
2022-12-28 01:30:45 gfx1100-0 77936867 validating proof residues for power 8
2022-12-28 01:30:45 gfx1100-0 77936867 Proof using power 8
2022-12-28 01:30:46 gfx1100-0 77936867 OK 800 0.00% 1579c241dc63eca6 784 us/it + check 0.36s + save 0.11s; ETA 16:58
2022-12-28 01:30:54 gfx1100-0 77936867 10000 0.01% fc4f135f7cf4ad29 785 us/it
2022-12-28 01:31:02 gfx1100-0 77936867 20000 0.03% 3cd1bd9d5e09cbc5 788 us/it
2022-12-28 01:31:09 gfx1100-0 77936867 30000 0.04% c4e0ff35e3290d98 791 us/it
2022-12-28 01:31:17 gfx1100-0 77936867 40000 0.05% dffe1b1b0d748128 793 us/it
2022-12-28 01:31:25 gfx1100-0 77936867 50000 0.06% 52e286945371ed29 793 us/it
2022-12-28 01:31:33 gfx1100-0 77936867 60000 0.08% 0945da4dc08bdd95 795 us/it
2022-12-28 01:31:41 gfx1100-0 77936867 70000 0.09% 7131fa4eb77f4bb2 795 us/it[/code] |
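For anyone who wants to turn a log like the one above into a single number, here is a rough Python sketch. It pulls the `us/it` readings out of the text and estimates the total run time, assuming a PRP test needs roughly one iteration per bit of the exponent. The function names and the regex are my own, not part of gpuowl, and the sample is just three lines copied from the log above.

```python
import re

# Matches the per-iteration timing in progress lines, e.g. "785 us/it".
US_PER_IT = re.compile(r"(\d+(?:\.\d+)?) us/it")


def mean_us_per_it(log_text: str) -> float:
    """Average of all 'us/it' readings found in the log text."""
    values = [float(v) for v in US_PER_IT.findall(log_text)]
    if not values:
        raise ValueError("no 'us/it' readings found")
    return sum(values) / len(values)


def estimated_hours(exponent: int, us_per_it: float) -> float:
    """Rough run time, assuming about one iteration per bit of the exponent."""
    return exponent * us_per_it / 1e6 / 3600


# Three progress lines copied from the 7900 xtx log above.
sample = """\
2022-12-28 01:30:46 gfx1100-0 77936867 OK 800 0.00% 1579c241dc63eca6 784 us/it + check 0.36s + save 0.11s; ETA 16:58
2022-12-28 01:30:54 gfx1100-0 77936867 10000 0.01% fc4f135f7cf4ad29 785 us/it
2022-12-28 01:31:02 gfx1100-0 77936867 20000 0.03% 3cd1bd9d5e09cbc5 788 us/it
"""

if __name__ == "__main__":
    print(f"mean: {mean_us_per_it(sample):.1f} us/it")
    print(f"estimate at 784 us/it: {estimated_hours(77936867, 784):.2f} h")
```

At 784 us/it the estimate comes out just under 17 hours, which lines up with the `ETA 16:58` in the log if that field is read as hours and minutes. |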
[QUOTE=Magellan3s;621144]7900 xtx
[Code]2022-12-28 01:30:43 GpuOwl VERSION v7.2-70-g212618e[/code][/QUOTE] Can you try running 2 parallel instances of gpuowl (you can use 77936923 for the second instance)? Would like to see what, if any, thruput gains we can get. |
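To judge whether two instances actually help, the per-instance `us/it` figures have to be converted to iterations per second and summed. A minimal sketch of that arithmetic follows; the helper names are hypothetical, and the only measured value used is the 784 us/it single-instance reading from the log quoted above. The 1568 us/it figure is just the break-even point (twice 784), not a measurement.

```python
def iterations_per_second(us_per_it: float) -> float:
    """Convert microseconds per iteration to iterations per second."""
    return 1e6 / us_per_it


def parallel_gain(single_us_per_it: float, parallel_us_per_it: list[float]) -> float:
    """Combined throughput of the parallel instances relative to one instance.

    A result above 1.0 means running in parallel gives more total throughput.
    """
    single = iterations_per_second(single_us_per_it)
    combined = sum(iterations_per_second(t) for t in parallel_us_per_it)
    return combined / single


if __name__ == "__main__":
    # Break-even case: two instances, each exactly half as fast as one alone.
    print(parallel_gain(784, [1568, 1568]))
```

So with a 784 us/it baseline, two instances only pay off if each one runs faster than 1568 us/it. |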
[QUOTE=axn;621149]Can you try running 2 parallel instances of gpuowl (you can use 77936923 for the second instance)? Would like to see what, if any, thruput gains we can get.[/QUOTE]
The results are from here: [url]https://mersenneforum.org/showthread.php?t=28303[/url] |
Chips&Cheese benchmarking 7900xtx:
[url]https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/[/url] |
The ISA has been published, a nice light read at 600 pages: [url]https://gpuopen.com/rdna3-isa-guide-now-available/[/url]
|