[QUOTE=kriesel;532826]exponent ~90M / 5M FFT? On Colab? Any indication of gpu clock rate?[/QUOTE]
99M, 5632K. I really need to benchmark and look at it more thoroughly... the T2_shuffle options (all except width; middle gave the most speedup) handily overcome the drop between versions. |
[QUOTE=preda;532783]Hi, on Linux, I used to run with an old version of ROCm because it was faster. But today I started trying out 2.10, and I see that it uses 100% CPU per instance of GpuOwl -- it seems to be doing busy wait similarly to what CUDA is doing by default. Do others confirm this observation? (or is it something peculiar on my system)
Filed [url]https://github.com/RadeonOpenCompute/ROCm/issues/963[/url] maybe I'm dreaming.[/QUOTE] The 100% CPU issue that I see seems to affect ROCm starting with 2.6. So my alert "to not update to ROCm 2.10" was overblown, as probably everybody is on some version between 2.6 - 2.10 already. This raises the question why it's only me seeing the 100% CPU, maybe something specific to my system. Anyway, feel free to upgrade to 2.10. |
[QUOTE=preda;532962]The 100% CPU issue that I see seems to affect ROCm starting with 2.6. So my alert "to not update to ROCm 2.10" was overblown, as probably everybody is on some version between 2.6 - 2.10 already. This raises the question why it's only me seeing the 100% CPU, maybe something specific to my system. Anyway, feel free to upgrade to 2.10.[/QUOTE]
Is it 100% of only one core? What Linux flavour are you using? How many cores does your system have? Is it hyper-threaded? Do you use all cores? |
[QUOTE=paulunderwood;532964]Is it 100% of only one core? What Linux flavour are you using? How many cores does your system have? Is it hyper-threaded? Do you use all cores?[/QUOTE]
Every GpuOwl process has one thread that uses 100% of one (hyperthreaded) CPU core. That's according to top, and correlates with CPU power usage/temperature. The CPU is i7-5820K, 6core/12threads. I don't much use the CPU otherwise. So e.g. with 2 GpuOwl instances, I see 2 (out of 12) cores at 100%, allocated to GpuOwl of course. |
[QUOTE=preda;532967]Every GpuOwl process has one thread that uses 100% of one (hyperthreaded) CPU core. That's according to top, and correlates with CPU power usage/temperature. The CPU is i7-5820K, 6core/12threads. I don't much use the CPU otherwise. So e.g. with 2 GpuOwl instances, I see 2 (out of 12) cores at 100%, allocated to GpuOwl of course.[/QUOTE]
I ssh into this machine, so I am not using the GPU for a desktop. I wonder if that makes a difference. [CODE]uname -a
Linux honeypot9 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux
[/CODE] top shows 0.7% CPU usage. I presume that you start GpuOwl manually from the command line, using a configuration file. |
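To compare the two reports (100% of one hardware thread versus 0.7% in top) with the same yardstick, the per-process figure can be sampled directly from /proc. This is only a minimal Linux-only sketch: the function names are made up for illustration, and it assumes the standard `/proc/<pid>/stat` layout in which fields 14 and 15 are user and system time in clock ticks.

```python
import os
import time


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (clock ticks) from one /proc/<pid>/stat line.

    The command name (field 2) is wrapped in parentheses and may contain
    spaces, so split after the last ')' before counting fields.
    """
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 14 is index 11, field 15 is index 12
    return int(after_comm[11]) + int(after_comm[12])


def cpu_percent(ticks_before, ticks_after, interval_s, clk_tck):
    """CPU use as a percentage of ONE hardware thread over the interval."""
    return (ticks_after - ticks_before) / clk_tck / interval_s * 100.0


def sample_cpu_percent(pid, interval_s=5.0):
    """Sample a running process twice and report its CPU use (Linux only)."""
    clk_tck = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks_from_stat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks_from_stat(f.read())
    return cpu_percent(before, after, interval_s, clk_tck)
```

Calling `sample_cpu_percent(pid)` with the PID of a GpuOwl instance should give a figure near 100 on a system showing the busy wait and near 0 otherwise. The total covers all threads of the process, so it can exceed 100.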
Let's race
Thanks to George for the extensive set of additive speed-ups!
Now my favorite R7 does FFT 5120K in:
802 us/it @1373Mhz, 142W (setsclk 3)
745 us/it @1547Mhz, 175W (setsclk 4)
709 us/it @1684Mhz, 221W (setsclk 5)
(memory 1180MHz, ROCm 2.10)
[QUOTE=Prime95;532810]766 us (my two best cards) sclk=4 1547MHz. rocm-smi says 167 and 174 watts. Memory overclocked to 1190 and 1200 respectively. [/QUOTE] |
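The three sclk settings above trade speed against power. One way to compare them is energy per iteration, which is simply time per iteration multiplied by power. The sketch below uses only the figures from the post; it assumes the wattage is the GPU-only reading and ignores the rest of the system.

```python
# (us/it, watts) for FFT 5120K, keyed by sclk level, copied from the post above
MEASUREMENTS = {3: (802, 142), 4: (745, 175), 5: (709, 221)}


def joules_per_iteration(us_per_it, watts):
    """Energy per iteration: seconds per iteration times watts."""
    return us_per_it * 1e-6 * watts


for level, (us, watts) in sorted(MEASUREMENTS.items()):
    mj = joules_per_iteration(us, watts) * 1000.0
    print(f"sclk {level}: {us} us/it at {watts} W -> {mj:.1f} mJ/it")
```

By this measure sclk 3 is the most efficient setting and sclk 5 the fastest; going from 3 to 5 saves about 12% in time per iteration for roughly 38% more energy per iteration.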
[QUOTE=preda;532962]Anyway, feel free to upgrade to 2.10.[/QUOTE]
Due to the upgrade to 2.10, my iteration timing went from 832us to 828us. I wish I were confident about overclocking the memory. EDIT: I took the plunge, overclocked the RAM by 15% and turned up the fan. The power draw is about 235W. The timing has gone from 828us to 779us for FFT 5632K. |
stock car vs. Indy
[QUOTE=preda;532970]709 us/it @1684Mhz, 221W (setsclk 5)
(memory 1180MHz, ROCm 2.10)[/QUOTE] 5M fft, exponent 89796247, XFX Radeon VII, Win 10 Pro, gpuowl v6.11-83-ge270393 PRP3
-use NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT3,T2_SHUFFLE_MIDDLE
GPU MHz  mem MHz  us/it  watts  hot spot C  notes
1397     1050     929    130    82
1394     1100     915    134    90
1396     1150     905    137    87
1398     1175                               error on load
1398     1160     904    138    88
1398     1165     903    139    90
1398     1170                               error
1470     1165     867    152    89          -20% power limit
1590     1165     822    182    92          nom power limit
1682     1165     791    214    100         fan 99%
1760     1165     773    252    110         power limited, clock max at 1800
Haven't tried any undervolting yet. |
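To put the us/it column in perspective, a PRP test needs roughly one squaring per bit of the exponent, so the wall-clock time is about exponent × time per iteration. The sketch below applies that to two rows of the table; it is an estimate only and ignores error-check and checkpoint overhead.

```python
EXPONENT = 89796247  # exponent from the post above


def prp_hours(exponent, us_per_it):
    """Approximate PRP wall-clock hours, assuming ~one iteration per exponent bit."""
    return exponent * us_per_it * 1e-6 / 3600.0


# (GPU MHz, us/it) rows taken from the table above
for gpu_mhz, us in [(1398, 903), (1760, 773)]:
    print(f"{gpu_mhz} MHz: {us} us/it -> about {prp_hours(EXPONENT, us):.1f} hours")
```

On these figures the 1398 MHz row comes to about 22.5 hours per test and the power-limited 1760 MHz row to about 19.3 hours.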
[QUOTE=kriesel;533006]
haven't tried any undervolting yet.[/QUOTE] How do I under-volt with rocm-smi? |
make failed on google colab
v6.11-88-gb9f0be7 failed as follows [CODE]echo Version: `cat version.inc`
Version: "v6.11-88-gb9f0be7-dirty"
g++ -MT Pm1Plan.o -MMD -MP -MF .d/Pm1Plan.Td -Wall -O2 -std=c++17 -c -o Pm1Plan.o Pm1Plan.cpp
g++ -MT GmpUtil.o -MMD -MP -MF .d/GmpUtil.Td -Wall -O2 -std=c++17 -c -o GmpUtil.o GmpUtil.cpp
g++ -MT Worktodo.o -MMD -MP -MF .d/Worktodo.Td -Wall -O2 -std=c++17 -c -o Worktodo.o Worktodo.cpp
In file included from Worktodo.cpp:6:0:
File.h:10:10: fatal error: filesystem: No such file or directory
 #include <filesystem>
          ^~~~~~~~~~~~
compilation terminated.
Makefile:30: recipe for target 'Worktodo.o' failed
make: *** [Worktodo.o] Error 1
[/CODE] what invoked make gpuowl is the following Colab code section: [CODE]#draft Notebook to set up a gpuowl Google drive folder for a future Colab session
import os.path
from google.colab import drive
import sys
if not os.path.exists('/content/drive/My Drive'):
  drive.mount('/content/drive')
%cd '/content/drive/My Drive//'
!chmod +w '/content/drive/My Drive'
if not os.path.exists('/content/drive/My Drive/gpuowl'):
  !mkdir '/content/drive/My Drive/gpuowl'
%cd '/content/drive/My Drive/gpuowl//'
!git clone https://github.com/preda/gpuowl
%cd '/content/drive/My Drive/gpuowl/gpuowl//'
!apt install libgmp-dev
!make gpuowl
[/CODE] |
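A likely cause of the `filesystem: No such file or directory` error is an older compiler: the C++17 `<filesystem>` header ships with GCC from version 8 onward, while GCC 7 only provides `<experimental/filesystem>`. The log does not show which g++ the Colab image had, so this diagnosis is an assumption. A small sketch to check it from a notebook cell (the helper names are made up for illustration):

```python
import re
import subprocess


def gcc_major(version_text):
    """Extract the major version from `g++ -dumpversion` output such as '7.4.0' or '9'."""
    match = re.match(r"\s*(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_text!r}")
    return int(match.group(1))


def has_std_filesystem(version_text):
    """GCC provides the <filesystem> header starting with version 8."""
    return gcc_major(version_text) >= 8


def installed_gxx_version(compiler="g++"):
    """Ask the installed compiler for its version (requires g++ on PATH)."""
    result = subprocess.run([compiler, "-dumpversion"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

If `has_std_filesystem(installed_gxx_version())` returns False, the build needs a newer g++ than the default one before `make gpuowl` can succeed. Note that GCC 8 also needs `-lstdc++fs` at link time; from GCC 9 that is no longer required.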
comatose session
A gpuowl v6.11-83 session on Windows 10 continued to show gpu activity in gpu-z until I terminated it with ctrl-c, 6 hours after it had stopped showing activity at the console, writing to gpuowl.log, or saving periodic checkpoint files. [CODE]2019-12-15 20:22:10 roa/radeonvii-f2 500001041 P2 2376/2880: 83377 primes; setup 1.80 s, 7.479 ms/prime
2019-12-15 20:32:36 roa/radeonvii-f2 500001041 P2 2430/2880: 83398 primes; setup 1.70 s, 7.481 ms/prime
2019-12-15 20:43:03 roa/radeonvii-f2 500001041 P2 2484/2880: 83620 primes; setup 1.79 s, 7.480 ms/prime
[/CODE] This was the end of the log file when the process was terminated at 3am on 12/16/19. |
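A stall like this could be caught sooner by a watchdog that compares the timestamp of the last log line with the current time. The sketch below assumes only what the excerpt shows, namely that each gpuowl.log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp. The 30-minute threshold is an arbitrary choice, picked because the P2 lines above arrive about every 10.5 minutes.

```python
from datetime import datetime, timedelta


def last_log_time(lines):
    """Return the timestamp of the last line that starts with 'YYYY-MM-DD HH:MM:SS'."""
    for line in reversed(lines):
        try:
            return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # blank or non-timestamped line; keep looking further back
    return None


def is_stalled(lines, now, max_silence=timedelta(minutes=30)):
    """True if the newest timestamped log line is older than max_silence."""
    last = last_log_time(lines)
    return last is not None and now - last > max_silence
```

A watchdog could read gpuowl.log every few minutes, call `is_stalled(lines, datetime.now())`, and raise an alert or restart the process when it returns True.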