mersenneforum.org RDNA2 / Big Navi

2020-11-19, 04:03   #23
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·7·563 Posts

Apologies for sidetracking the thread ---

Quote:
 Originally Posted by Viliam Furik My Radeon VII gives about 750 GFLOPS because it's heavily underclocked (1000 mV core voltage, -5% power limit, 1600 MHz core clock, 1000 MHz memory clock - MSI Afterburner settings).
Those settings look nothing like my optimized Radeon VII settings. All cards are set to ~1040mV (actual is ~800mV), 1373MHz, memory ~1170MHz. Your core clock is way too high compared to your memory clock for that to be an efficient setting.
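
One way to make the clock comparison concrete is to look at the core-to-memory clock ratio of each configuration. The sketch below uses only the figures quoted in this post. Treating that ratio as a rough proxy for how balanced a setting is, is an assumption made for illustration, not a rule taken from gpuowl, and the function name is hypothetical.

```python
def core_to_memory_ratio(core_mhz: float, memory_mhz: float) -> float:
    """Return the core clock divided by the memory clock."""
    return core_mhz / memory_mhz


# Figures from the quoted MSI Afterburner settings.
quoted = core_to_memory_ratio(1600, 1000)
# Figures from the optimized settings described in this post.
tuned = core_to_memory_ratio(1373, 1170)

print(f"quoted: {quoted:.2f}, tuned: {tuned:.2f}")
```

With these inputs the quoted settings come out at about 1.60 and the optimized ones at about 1.17, which is the imbalance the post is pointing at.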

Are you using Windows? Use Linux instead: much better OpenCL support and an easy 10+% performance gain.

Last fiddled with by Prime95 on 2020-11-19 at 04:16 Reason: Fixed voltage info

2020-11-19, 09:53   #24
moebius

"CharlesgubOB"
Jul 2009
Germany

3³×23 Posts

Quote:
 Originally Posted by Viliam Furik If the gpuOwl could use the Infinity cache effectively, with its ~1,5 TB/s of memory bandwidth, even the RX 6800 could outperform the Radeon VII, but the RX 6900XT would most probably crush it. But that's only if the Infinity cache can be used.
Based on the data available to me, my estimate for the 6800 XT is 662 us/it at PRP for the exponent 77936867, which corresponds approximately to a Tesla P100.
Real benchmarks are of course highly desirable.
https://mersenneforum.org/showpost.p...postcount=2603

2020-11-19, 18:48   #25
Xyzzy

Aug 2002

2×5²×13² Posts

Quote:
 Originally Posted by Mark Rose Also, the attachment feature of the forum doesn't like svg files, so here is a link:
This should be fixed now if you want to test it.

The image engine that creates thumbnails says it can't process SVG files, but you should still be able to attach them.

2020-11-19, 20:00   #26
tServo

"Marv"
May 2009
near the Tannhäuser Gate

1100001011₂ Posts

Quote:
 Originally Posted by moebius Based on the data available to me, my estimate for the 6800 XT is 662 us/it at PRP for the exponent 77936867, which corresponds approximately to a Tesla P-100. Real Benchmarks are of course highly desirable https://mersenneforum.org/showpost.p...postcount=2603
I applaud your enthusiasm but I don't think the RX 6000 cards are going to be as fast as you think, purely based on the difference in FP64 flops. That mystery cache would have to go a long way to make up for only 1:16 FP64 cores.

Unfortunately, the early reviews will focus only on gaming because, after all, these are cards for gamers. We won't know GIMPS performance until some lucky GIMPster is able to get one of these and posts the results. It seems the launch was pretty much like Nvidia's, with everybody sold out in less than 60 seconds.

BTW, I hope you're right, as Nvidia could use some competition (which means I'm completely dismissing Intel's imminent foray into this market).

2020-11-19, 20:25   #27
moebius

"CharlesgubOB"
Jul 2009
Germany

3³×23 Posts

Quote:
 Originally Posted by tServo I applaud your enthusiasm but I don't think the RX 6000 cards are going to be as fast as you think, purely based on the difference in FP64 flops. That mystery cache would have to go a long way to make up for only 1:16 FP 64 cores.
I based my calculations on the 5700 XT, which also has a 1:16 FP64 rate (0.609 TFLOPS) and performs amazingly well in gpuowl. If the Navi technology doesn't get worse, my estimate is quite possible!

2020-11-20, 03:02   #28
tServo

"Marv"
May 2009
near the Tannhäuser Gate

30B₁₆ Posts

Quote:
 Originally Posted by moebius I have based my calculations on the 5700 XT which also has a 1:16 FP 64 cores performance with 0,609 TFLOPs which is amazing at gpuowl. If the NAVI technology doesn't get worse it would be quite possible!
To what results are you referring?

2020-11-20, 04:27   #29
moebius

"CharlesgubOB"
Jul 2009
Germany

621₁₀ Posts

Quote:
 Originally Posted by tServo To what results are you referring?
https://openbenchmarking.org/embed.p...3bc7b2db04&p=2 and
Code:
2020-10-29 16:19:40 GpuOwl VERSION v7.1-11-g97cfbd2
2020-10-29 16:19:40 config: -iters 200000 -prp 77936867
2020-10-29 16:19:40 device 0, unique id ''
2020-10-29 16:19:40 gfx1010-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-10-29 16:19:41 gfx1010-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-29 16:19:44 gfx1010-0 77936867 OpenCL compilation in 3.53 s
2020-10-29 16:19:44 gfx1010-0 77936867 maxAlloc: 0.0 GB
2020-10-29 16:19:44 gfx1010-0 77936867 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
2020-10-29 16:19:44 gfx1010-0 77936867 P1(0) 0 bits
2020-10-29 16:19:44 gfx1010-0 77936867 PRP starting from beginning
2020-10-29 16:19:45 gfx1010-0 77936867 OK 0 on-load: blockSize 400, 0000000000000003
2020-10-29 16:19:45 gfx1010-0 77936867 validating proof residues for power 8
2020-10-29 16:19:45 gfx1010-0 77936867 Proof using power 8
2020-10-29 16:19:47 gfx1010-0 77936867 OK 800 0.00% 1579c241dc63eca6 1526 us/it + check 0.70s + save 0.10s; ETA 1d 09:02
2020-10-29 16:20:07 gfx1010-0 77936867 10000 0.01% fc4f135f7cf4ad29 2145 us/it
2020-10-29 16:20:22 gfx1010-0 77936867 20000 0.03% 3cd1bd9d5e09cbc5 1523 us/it
2020-10-29 16:20:37 gfx1010-0 77936867 30000 0.04% c4e0ff35e3290d98 1521 us/it
2020-10-29 16:20:52 gfx1010-0 77936867 40000 0.05% dffe1b1b0d748128 1520 us/it
2020-10-29 16:21:08 gfx1010-0 77936867 50000 0.06% 52e286945371ed29 1520 us/it
2020-10-29 16:21:23 gfx1010-0 77936867 60000 0.08% 0945da4dc08bdd95 1519 us/it
2020-10-29 16:21:38 gfx1010-0 77936867 70000 0.09% 7131fa4eb77f4bb2 1519 us/it
2020-10-29 16:21:53 gfx1010-0 77936867 80000 0.10% 8d76071d27ee4221 1521 us/it
2020-10-29 16:22:08 gfx1010-0 77936867 90000 0.12% 0bacff453b2f470e 1519 us/it
2020-10-29 16:22:24 gfx1010-0 77936867 100000 0.13% 6d7296b9e2830f50 1519 us/it
2020-10-29 16:22:39 gfx1010-0 77936867 110000 0.14% 8cbfd4435622bda7 1519 us/it
2020-10-29 16:23:09 gfx1010-0 77936867 130000 0.17% 50c97bcbf876231f 1520 us/it
2020-10-29 16:23:24 gfx1010-0 77936867 140000 0.18% e1db15f897271496 1520 us/it
2020-10-29 16:23:40 gfx1010-0 77936867 150000 0.19% 127631386c6a9b17 1520 us/it
2020-10-29 16:23:55 gfx1010-0 77936867 160000 0.21% 25b7b6206fc6f085 1519 us/it
2020-10-29 16:24:10 gfx1010-0 77936867 170000 0.22% 416816b0d9f4bba8 1520 us/it
2020-10-29 16:24:25 gfx1010-0 77936867 180000 0.23% 6bee5d054f770861 1521 us/it
2020-10-29 16:24:40 gfx1010-0 77936867 190000 0.24% f37f068f014b18a0 1520 us/it
2020-10-29 16:24:56 gfx1010-0 77936867 Stopping, please wait..
2020-10-29 16:24:56 gfx1010-0 77936867 OK 200000 0.26% f0b04b45b0855bd2 1520 us/it + check 0.67s + save 0.09s; ETA 1d 08:49
2020-10-29 16:24:56 gfx1010-0 Exiting because "stop requested"
2020-10-29 16:24:56 gfx1010-0 Bye
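
The ETA values in the log can be checked with a little arithmetic. The sketch below is a simple model, not gpuowl's actual ETA code (which is not shown in the thread): it assumes a PRP test of exponent p needs about p iterations, multiplies the remaining iterations by the reported us/it, and truncates to whole minutes. The regular expression and both function names are hypothetical helpers written for this log format.

```python
import re

# Matches progress lines such as
# "... 20000 0.03% 3cd1bd9d5e09cbc5 1523 us/it"
PROGRESS = re.compile(r"(\d+)\s+[\d.]+%\s+[0-9a-f]{16}\s+(\d+) us/it")


def parse_progress(line: str):
    """Return (iteration, us_per_iteration), or None for other lines."""
    m = PROGRESS.search(line)
    return (int(m.group(1)), int(m.group(2))) if m else None


def eta(exponent: int, iteration: int, us_per_it: int) -> str:
    """Format the remaining time as 'Nd HH:MM', truncated to whole minutes.

    Assumption: the test needs roughly `exponent` iterations in total.
    """
    remaining_us = (exponent - iteration) * us_per_it
    minutes = remaining_us // 60_000_000
    days, rest = divmod(minutes, 24 * 60)
    hours, mins = divmod(rest, 60)
    return f"{days}d {hours:02d}:{mins:02d}"


line = ("2020-10-29 16:24:56 gfx1010-0 77936867 OK 200000 0.26% "
        "f0b04b45b0855bd2 1520 us/it + check 0.67s + save 0.09s; ETA 1d 08:49")
iteration, us_per_it = parse_progress(line)
print(eta(77936867, iteration, us_per_it))  # prints "1d 08:49"
```

Under this model both ETA lines in the log are reproduced (1d 09:02 at iteration 800 with 1526 us/it, and 1d 08:49 at iteration 200000 with 1520 us/it), so 1520 us/it corresponds to a little under 33 hours for the full exponent on this card.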

Last fiddled with by moebius on 2020-11-20 at 04:33

2020-11-20, 14:42   #30
M344587487

"Composite as Heck"
Oct 2017

19×47 Posts

As Linux support is reportedly looking good, I'll try to get one, but I don't expect to actually succeed. AIB models are released on the 25th. Under normal circumstances you'd expect them to cost some tens of dollars more than a standard card to pay for the custom design; however, AIBs could easily raise prices or, more likely, release only their top-end models in quantity, because they'll sell out of whatever they offer. I suggest that everyone who is even on the fence tries to get one on the 25th if they can. It increases the odds of getting benchmarks in hand and isn't much of a risk, since the card can easily be returned or resold if it isn't a good fit.

Quote:
 Originally Posted by moebius Based on the data available to me, my estimate for the 6800 XT is 662 us/it at PRP for the exponent 77936867, which corresponds approximately to a Tesla P-100. Real Benchmarks are of course highly desirable https://mersenneforum.org/showpost.p...postcount=2603
Assuming perfect scaling with DP throughput we get to 774 us/it, and even that I think is pushing the boundaries of optimism, as it assumes that Infinity Cache can perfectly make up the raw bandwidth deficit. Your estimate must be assuming that bandwidth, not DP, is the main limiter, and that IC is a miracle from the chip gods. I hope you're right, but I'd rather be pessimistic and proven wrong than optimistic and proven wrong. YMMV.
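
To see what each estimate implies, the linear-scaling argument can be run backward. The sketch below is illustrative only: it assumes iteration time is inversely proportional to FP64 throughput, and it takes as its baseline the 5700 XT figures posted earlier in the thread (1520 us/it from the gpuowl log, 0.609 TFLOPS FP64). The thread does not say which 6800 XT throughput figure either estimate used, so the code derives the implied value instead of asserting one. Function and constant names are hypothetical.

```python
def scaled_us_per_it(base_us: float, base_tflops: float,
                     target_tflops: float) -> float:
    """Iteration time if speed scaled linearly with FP64 throughput."""
    return base_us * base_tflops / target_tflops


def implied_tflops(base_us: float, base_tflops: float,
                   target_us: float) -> float:
    """FP64 throughput a card would need to reach target_us per iteration."""
    return base_us * base_tflops / target_us


BASE_US = 1520       # 5700 XT, from the gpuowl log earlier in the thread
BASE_TFLOPS = 0.609  # 5700 XT FP64 figure quoted earlier in the thread

for estimate_us in (774, 662):
    needed = implied_tflops(BASE_US, BASE_TFLOPS, estimate_us)
    print(f"{estimate_us} us/it needs about {needed:.3f} TFLOPS FP64")
```

Under these assumptions, 774 us/it corresponds to roughly 1.20 TFLOPS of FP64 and 662 us/it to roughly 1.40 TFLOPS, so the gap between the two estimates is a gap in the assumed double-precision rate.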

2020-11-20, 16:46   #31
moebius

"CharlesgubOB"
Jul 2009
Germany

1001101101₂ Posts

Quote:
 Originally Posted by M344587487 Assuming perfect scaling with DP we get to 774 us/it, even that I think is pushing the boundaries of optimism as it assumes that Infinity Cache can perfectly make up the raw bandwidth deficit.
Given that I was calculating with the (possibly overclocked) values at openbenchmarking.org, your gpuowl value could of course also fit. The double-precision values there are slightly higher than the manufacturer's specifications.

2020-11-20, 16:59   #32
tServo

"Marv"
May 2009
near the Tannhäuser Gate

30B₁₆ Posts

I finally found a reviewer that bothered to include compute stats in their results and the results are just so-so. The 6800xt performs even with the rtx 2080 ti and rtx 3070 but far behind the rtx 3080 and 3090. https://www.legitreviews.com/amd-rad...view_223774/13

2020-11-20, 17:17   #33
moebius

"CharlesgubOB"
Jul 2009
Germany

3³×23 Posts

Quote:
 Originally Posted by tServo The 6800xt performs even with the rtx 2080 ti and rtx 3070 but far behind the rtx 3080 and 3090.
Do you work at Nvidia? This may even be true in certain computer games or maybe TF, but I'm more concerned with PRP performance, and there, for example, GA102 and Navi 10 XT graphics cards are just on par.

Last fiddled with by moebius on 2020-11-20 at 17:32

All times are UTC.