[QUOTE=moebius;561448]Good idea, but now it's a new link. The table is now in ODF (.ods) format. I hope you can read it well in the browser regarding the resolution.
[URL]https://drive.google.com/file/d/1Timk3v5lhipf21w2FogXn8vl57X_RWO0/view[/URL][/QUOTE]Please use wrap text for the column headings and make the columns as narrow as the wrapping allows, so the whole table can be viewed at once without a tiny font. It probably also ought to indicate which version of gpuowl was used for each timing. Finally, please sort by model. |
AMD Windows driver speed influence
Windows AMD Adrenalin driver difference, Windows 10 Pro x64, XFX Radeon VII and XFX 5700XT
[CODE]Radeon VII (power limited to ~1670 MHz GPU clock for temperature control):
Exponent   FFT length     Gpuowl version      us/it PRP         delta, 20.4.2 to 20.10.1
Mersenne   M words                            20.4.2  20.10.1   us/it   %
642589933  36M 4K:9:512   v6.11-364-g36f4e2a    6864     6944     +80   +1.17
843112609  48M 4K:12:512  v7.0-35-gf06bc5b     10063    10433    +370   +3.68

5700XT (free-running, not power limited):
852348659  48M 4K:12:512  v6.11-364-g36f4e2a   21829    21319    -510   -2.34[/CODE]
This suggests segregating GPU models onto separate systems, so that the Radeon VIIs can stay on the older, faster driver. I was contemplating that anyway, since with the April driver, running the 5700XT caused enough driver and system instability to deter me from running it. I have previously seen as much as a 5% speed penalty from a newer major driver version on older AMD GPUs. The % deltas are given with excess digits to avoid adding rounding error, and are perhaps significant to one full decimal digit. Early indications after ~12 hours are that stability is better with 20.10.1; no issues yet. |
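The delta columns in the table above can be recomputed from the two timing columns. The following is a minimal sketch, assuming the percentage is taken relative to the older driver (20.4.2); the function name `driver_delta` is made up for illustration and is not part of gpuowl.

```python
def driver_delta(old_us_per_it, new_us_per_it):
    """Return (absolute delta in us/it, percent change relative to the old driver)."""
    delta = new_us_per_it - old_us_per_it
    percent = 100.0 * delta / old_us_per_it
    return delta, percent


# (GPU, exponent, us/it with 20.4.2, us/it with 20.10.1), copied from the table above
timings = [
    ("Radeon VII", 642589933, 6864, 6944),
    ("Radeon VII", 843112609, 10063, 10433),
    ("5700XT", 852348659, 21829, 21319),
]

for gpu, exponent, old, new in timings:
    delta, percent = driver_delta(old, new)
    # A positive delta means the newer driver is slower.
    print(f"{gpu:10s} {exponent} {delta:+5d} us/it {percent:+.2f}%")
```

Rounded to two decimals, this gives the same +1.17, +3.68 and -2.34 as posted.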
I still use Adrenalin 19.11.3, Windows 10 Pro x64 1909, and v6.11-364-g36f4e2a with an RX Vega 64.
107868373 FFT: 6M 1K:12:256 1775 us/it PRP. Why update if everything is stable? |
[QUOTE=moebius;561567]I still use Adrenalin 19.11.3, Windows 10 Pro x64 1909, and v6.11-364-g36f4e2a with an RX Vega 64.
107868373 FFT: 6M 1K:12:256 1775 us/it PRP. Why update if everything is stable?[/QUOTE]Back at gpuowl V1.9, a driver update was necessary to get V2.0 to work at all, and as I recall it cost 5.1% in performance on the RX480 and RX550 in V1.9 for driver v19.x vs. v18.y. |
Does anyone have a Radeon RX 590 or below to compare? This card should perform reasonably well for a consumer card, as it can do 0.445 TFLOPS FP64 and 7.119 TFLOPS FP32.
|
1 Attachment(s)
[QUOTE=kracker;483209]As requested... instructions on how to compile on windows (I use msys2.. and also there are probably better ways to do it but it's just how I did it)
1) Download, install and follow the instructions for updating MSYS2 here: [URL]https://www.msys2.org/[/URL]
2) Download and install AMD APP SDK (make sure you use the 64-bit version) for Windows: [URL]https://developer.amd.com/amd-accelerated-parallel-processing-app-sdk/[/URL]
3) Copy the contents of C:\Program Files (x86)\AMD APP SDK\3.0\lib\x86_64 to C:\msys64\mingw64\lib and C:\Program Files (x86)\AMD APP SDK\3.0\include to C:\msys64\mingw64\include
4) Install gcc (pacman -S mingw-w64-x86_64-gcc)
5) Download gpuowl sources and drop them somewhere (to /home/username/ is probably easiest)
6) Run MSYS2 from mingw64.exe and cd to the directory you extracted the source to
7) Compile by:
g++ -c gpuowl.cpp
g++ -o gpuowl.exe gpuowl.o -lOpenCL -static
strip gpuowl.exe[/QUOTE]
I just tried out a few alternative Windows OpenCL SDKs, since AMD's isn't supported anymore, and they all worked great as drop-in replacements with no changes needed to the source or makefile. This was all tested against commit 30b0117f5829ac0b3782e613bad62a88c3a0ea03 of GpuOwl.
1) GPUOpen OpenCL SDK: [url]https://github.com/GPUOpen-LibrariesAndSDKs/OCL-SDK/releases[/url] (3.0 tested)
2) Intel OpenCL SDK: [url]https://software.intel.com/content/www/us/en/develop/tools/opencl-sdk.html[/url] (2020 Update 3 tested)
3) NVIDIA OpenCL from the CUDA Toolkit: [url]https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64[/url] (11.1 Update 1 tested)
I ran prp 1000003 with each build on a 1080ti as a quick check and the residue looked fine. I'm attaching binaries in case anyone wants to test these more thoroughly or on different hardware. |
Please read this thread regarding v7.1
[URL="https://mersenneforum.org/showthread.php?t=26152"]https://mersenneforum.org/showthread.php?t=26152[/URL] |
NVIDIA GeForce RTX 3080, from a forum user
gpuowl-win.exe -iters 200000 -prp 77936867
2020-11-01 01:30:36 gpuowl v6.11-364-g36f4e2a
2020-11-01 01:30:36 Note: not found 'config.txt'
2020-11-01 01:30:36 config: -iters 200000 -prp 77936867
2020-11-01 01:30:36 device 0, unique id ''
2020-11-01 01:30:36 GeForce RTX 3080-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-11-01 01:30:36 GeForce RTX 3080-0 Expected maximum carry32: 583B0000
2020-11-01 01:30:36 GeForce RTX 3080-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-01 01:30:36 GeForce RTX 3080-0
2020-11-01 01:30:36 GeForce RTX 3080-0 OpenCL compilation in 0.01 s
2020-11-01 01:30:37 GeForce RTX 3080-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-11-01 01:30:37 GeForce RTX 3080-0 validating proof residues for power 8
2020-11-01 01:30:37 GeForce RTX 3080-0 Proof using power 8
2020-11-01 01:30:40 GeForce RTX 3080-0 77936867 OK 800 0.00%; 1948 us/it; ETA 1d 18:11; 1579c241dc63eca6 (check 0.84s)
2020-11-01 01:37:16 GeForce RTX 3080-0 Stopping, please wait..
2020-11-01 01:37:17 GeForce RTX 3080-0 77936867 OK 200000 0.26%; 1991 us/it; ETA 1d 18:59; f0b04b45b0855bd2 (check 0.86s)
2020-11-01 01:37:17 GeForce RTX 3080-0 Exiting because "stop requested"
2020-11-01 01:37:17 GeForce RTX 3080-0 Bye |
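For anyone collecting timings from logs like the one above, here is a rough sketch of pulling the exponent, iteration count and us/it out of a progress line and estimating the time remaining. The regular expression is fitted only to the progress lines shown in this thread, and the ETA formula (remaining iterations times the current us/it) is an assumption rather than something taken from the gpuowl source; the helper names are hypothetical.

```python
import re

# Fitted to progress lines such as:
# "... 77936867 OK 200000 0.26%; 1991 us/it; ETA 1d 18:59; f0b0... (check 0.86s)"
PROGRESS = re.compile(r"(\d+) OK\s+(\d+)\s+[\d.]+%;\s+(\d+) us/it;")


def parse_progress(line):
    """Return (exponent, iteration, us_per_it), or None if this is not a progress line."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    exponent, iteration, us_per_it = (int(group) for group in match.groups())
    return exponent, iteration, us_per_it


def eta_seconds(exponent, iteration, us_per_it):
    """Assumed estimate: iterations still to do times microseconds per iteration."""
    return (exponent - iteration) * us_per_it / 1_000_000


def format_eta(seconds):
    """Format seconds as 'Nd HH:MM', truncating to whole minutes."""
    minutes = int(seconds // 60)
    days, remainder = divmod(minutes, 24 * 60)
    hours, mins = divmod(remainder, 60)
    return f"{days}d {hours:02d}:{mins:02d}"


line = ("2020-11-01 01:37:17 GeForce RTX 3080-0 77936867 OK 200000 0.26%; "
        "1991 us/it; ETA 1d 18:59; f0b04b45b0855bd2 (check 0.86s)")
parsed = parse_progress(line)
if parsed is not None:
    print(format_eta(eta_seconds(*parsed)))
```

For the final RTX 3080 line this prints 1d 18:59, the same as the logged ETA. For other lines the estimate can be off by a minute or so, since the us/it figure in the log is already rounded.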
some Quick&Dirty benchmarks:
[LIST=1]
[*]A100 PCIe, reported clock rate and power consumption during run: 1215 MHz, 250W:
[CODE]# ./gpuowl.exe -iters 200000 -prp 77936867
2020-11-01 14:30:43 gpuowl v6.11-380-g79ea0cc
2020-11-01 14:30:43 Note: not found 'config.txt'
2020-11-01 14:30:43 config: -iters 200000 -prp 77936867
2020-11-01 14:30:43 device 0, unique id ''
2020-11-01 14:30:43 A100-PCIE-40GB-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-11-01 14:30:43 A100-PCIE-40GB-0 Expected maximum carry32: 583B0000
2020-11-01 14:30:44 A100-PCIE-40GB-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.5885a1af9d807p-2 -DIWEIGHT_STEP_MINUS_1=-0x1.01ca19102fbbfp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-01 14:30:48 A100-PCIE-40GB-0
2020-11-01 14:30:48 A100-PCIE-40GB-0 OpenCL compilation in 3.98 s
2020-11-01 14:30:49 A100-PCIE-40GB-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-11-01 14:30:49 A100-PCIE-40GB-0 validating proof residues for power 8
2020-11-01 14:30:49 A100-PCIE-40GB-0 Proof using power 8
2020-11-01 14:30:49 A100-PCIE-40GB-0 77936867 OK 800 0.00%; 291 us/it; ETA 0d 06:18; 1579c241dc63eca6 (check 0.22s)
2020-11-01 14:31:49 A100-PCIE-40GB-0 Stopping, please wait..
2020-11-01 14:31:49 A100-PCIE-40GB-0 77936867 OK 200000 0.26%; 301 us/it; ETA 0d 06:31; f0b04b45b0855bd2 (check 0.19s)
2020-11-01 14:31:49 A100-PCIE-40GB-0 Exiting because "stop requested"
2020-11-01 14:31:49 A100-PCIE-40GB-0 Bye
[/CODE]
[*]Quadro RTX 8000, reported clock rate and power consumption during run: 1920 MHz, 200W:
[CODE]# ./gpuowl.exe -iters 200000 -prp 77936867
2020-11-01 14:21:08 gpuowl v6.11-380-g79ea0cc
2020-11-01 14:21:08 Note: not found 'config.txt'
2020-11-01 14:21:08 config: -iters 200000 -prp 77936867
2020-11-01 14:21:08 device 0, unique id ''
2020-11-01 14:21:08 Quadro RTX 8000-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-11-01 14:21:08 Quadro RTX 8000-0 Expected maximum carry32: 583B0000
2020-11-01 14:21:09 Quadro RTX 8000-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.5885a1af9d807p-2 -DIWEIGHT_STEP_MINUS_1=-0x1.01ca19102fbbfp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-01 14:21:11 Quadro RTX 8000-0
2020-11-01 14:21:11 Quadro RTX 8000-0 OpenCL compilation in 1.63 s
2020-11-01 14:21:11 Quadro RTX 8000-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-11-01 14:21:11 Quadro RTX 8000-0 validating proof residues for power 8
2020-11-01 14:21:11 Quadro RTX 8000-0 Proof using power 8
2020-11-01 14:21:14 Quadro RTX 8000-0 77936867 OK 800 0.00%; 1812 us/it; ETA 1d 15:14; 1579c241dc63eca6 (check 0.77s)
2020-11-01 14:27:25 Quadro RTX 8000-0 Stopping, please wait..
2020-11-01 14:27:26 Quadro RTX 8000-0 77936867 OK 200000 0.26%; 1864 us/it; ETA 1d 16:15; f0b04b45b0855bd2 (check 0.80s)
2020-11-01 14:27:26 Quadro RTX 8000-0 Exiting because "stop requested"
2020-11-01 14:27:26 Quadro RTX 8000-0 Bye
[/CODE]
[*]Geforce RTX 3090, reported clock rate and power consumption during run: 1935 MHz, 320W:
[CODE]# ./gpuowl.exe -iters 200000 -prp 77936867
2020-11-01 14:30:27 gpuowl v6.11-380-g79ea0cc
2020-11-01 14:30:27 Note: not found 'config.txt'
2020-11-01 14:30:27 config: -iters 200000 -prp 77936867
2020-11-01 14:30:27 device 0, unique id ''
2020-11-01 14:30:27 GeForce RTX 3090-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-11-01 14:30:27 GeForce RTX 3090-0 Expected maximum carry32: 583B0000
2020-11-01 14:30:27 GeForce RTX 3090-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.5885a1af9d807p-2 -DIWEIGHT_STEP_MINUS_1=-0x1.01ca19102fbbfp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-01 14:30:29 GeForce RTX 3090-0
2020-11-01 14:30:29 GeForce RTX 3090-0 OpenCL compilation in 1.78 s
2020-11-01 14:30:30 GeForce RTX 3090-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-11-01 14:30:30 GeForce RTX 3090-0 validating proof residues for power 8
2020-11-01 14:30:30 GeForce RTX 3090-0 Proof using power 8
2020-11-01 14:30:32 GeForce RTX 3090-0 77936867 OK 800 0.00%; 1527 us/it; ETA 1d 09:03; 1579c241dc63eca6 (check 0.66s)
2020-11-01 14:35:44 GeForce RTX 3090-0 Stopping, please wait..
2020-11-01 14:35:45 GeForce RTX 3090-0 77936867 OK 200000 0.26%; 1572 us/it; ETA 1d 09:56; f0b04b45b0855bd2 (check 0.68s)
2020-11-01 14:35:45 GeForce RTX 3090-0 Exiting because "stop requested"
2020-11-01 14:35:45 GeForce RTX 3090-0 Bye
[/CODE]
[/LIST] |
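To put the three runs above side by side, the sketch below derives throughput and a rough energy cost per iteration from the final us/it figure and the reported power draw. It assumes the reported wattage held steady for the whole run, which a single reading cannot confirm, so the joule figures are only indicative. The function names are made up for illustration.

```python
# (us/it at iteration 200000, reported power in watts), copied from the logs above
benchmarks = {
    "A100 PCIe": (301, 250),
    "Quadro RTX 8000": (1864, 200),
    "GeForce RTX 3090": (1572, 320),
}


def iterations_per_second(us_per_it):
    """Throughput implied by a us/it timing."""
    return 1_000_000 / us_per_it


def joules_per_iteration(us_per_it, watts):
    """Energy per iteration, assuming constant power draw (an assumption)."""
    return watts * us_per_it / 1_000_000


for name, (us_per_it, watts) in benchmarks.items():
    print(f"{name:18s} {iterations_per_second(us_per_it):7.0f} it/s "
          f"{joules_per_iteration(us_per_it, watts):.3f} J/it")
```

On these numbers the A100 comes out roughly 5x faster than the 3090 per iteration (1572 / 301) at a lower reported power draw.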
Thank you for the benchmark numbers. Very impressive performance from the A100; it scales almost 1:1 with Volta when comparing their memory bandwidth. I can't imagine the performance if the memory were overclocked.
OTOH the 3090 is honestly a big disappointment: it's slower than a tuned Vega 64 (which draws a lot less power) and not much faster than the Turing RTX 8000. Looking forward to the performance of 6900xt for sure but I highly doubt it'll best the Radeon VII. |
[QUOTE=TheJudger;561796]some Quick&Dirty benchmarks:
[LIST=1]
[*]A100 PCIe, reported clock rate and power consumption during run: 1215 MHz, 250W:
[CODE]# ./gpuowl.exe -iters 200000 -prp 77936867
2020-11-01 14:30:43 gpuowl v6.11-380-g79ea0cc
2020-11-01 14:30:43 Note: not found 'config.txt'
2020-11-01 14:30:43 config: -iters 200000 -prp 77936867
2020-11-01 14:30:43 device 0, unique id ''
2020-11-01 14:30:43 A100-PCIE-40GB-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-11-01 14:30:43 A100-PCIE-40GB-0 Expected maximum carry32: 583B0000
2020-11-01 14:30:44 A100-PCIE-40GB-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.5885a1af9d807p-2 -DIWEIGHT_STEP_MINUS_1=-0x1.01ca19102fbbfp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-01 14:30:48 A100-PCIE-40GB-0
2020-11-01 14:30:48 A100-PCIE-40GB-0 OpenCL compilation in 3.98 s
2020-11-01 14:30:49 A100-PCIE-40GB-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-11-01 14:30:49 A100-PCIE-40GB-0 validating proof residues for power 8
2020-11-01 14:30:49 A100-PCIE-40GB-0 Proof using power 8
2020-11-01 14:30:49 A100-PCIE-40GB-0 77936867 OK 800 0.00%; 291 us/it; ETA 0d 06:18; 1579c241dc63eca6 (check 0.22s)
2020-11-01 14:31:49 A100-PCIE-40GB-0 Stopping, please wait..
2020-11-01 14:31:49 A100-PCIE-40GB-0 77936867 OK 200000 0.26%; 301 us/it; ETA 0d 06:31; f0b04b45b0855bd2 (check 0.19s)
2020-11-01 14:31:49 A100-PCIE-40GB-0 Exiting because "stop requested"
2020-11-01 14:31:49 A100-PCIE-40GB-0 Bye
[/CODE]
[/LIST][/QUOTE]
Holy ****. That's fast. That's like, what, 750 GHzDay/day?
EDIT:- Nope, more like 900 !!!
[QUOTE=xx005fs;561836]Looking forward to the performance of 6900xt for sure but I highly doubt it'll best the Radeon VII.[/QUOTE]
I'm hoping that it will achieve 90%+ performance of R VII. Of course, at $999, it is still too expensive but 6800 & 6800XT might be good value. All pure speculation currently, obviously. |