[QUOTE=nomead;532699] I'd hardly call gpuowl "minimal cpu use", even with -yield it takes about 80% of one core on my Linux machine, but luckily it's happy with a hyperthreaded core, so it doesn't affect mprime.[/QUOTE]Are you using ROCm? version? [url]https://github.com/RadeonOpenCompute/ROCm/issues/963[/url]
|
[QUOTE=kriesel;532799]Are you using ROCm? version? [url]https://github.com/RadeonOpenCompute/ROCm/issues/963[/url][/QUOTE]
No, why would I run ROCm on Nvidia hardware? |
[QUOTE=preda;532782]Excellent!
at what frequency, and what power, is that timing?[/QUOTE]
766 us (my two best cards) at sclk=4, 1547 MHz. rocm-smi says 167 and 174 watts. Memory overclocked to 1190 and 1200 respectively.
[QUOTE=preda;532785]Warning: maybe it'd be a good idea to not upgrade to ROCm 2.10 if not already there.[/QUOTE]
I'm at 2.9. I think I'll stay there. |
Spinner, utilization
In v6.11-83 gpuowl, the spinner appears during PRP, but not during P-1, even for exponents and bounds for which time between console outputs is several minutes or longer on Radeon VII.
During P-1 stage 2 on a 200M exponent (also v6.11-83, Radeon VII), power fluctuates from 22 to 130 W and GPU clock from 21 to 1400 MHz, with a period of seconds, per GPU-Z. That seems like underutilized capacity to me. |
[QUOTE=nomead;532807]No, why would I run ROCm on Nvidia hardware?[/QUOTE]Sorry, forgot you were running an RTX2080.
|
[QUOTE=kracker;532772]Tried UNROLL_ALL on P100: expected error?
[/QUOTE]
[QUOTE=nomead;532781]I get these errors on UNROLL_NONE (the same total 10 pcs) and exactly half (5 pcs) on either UNROLL_WIDTH or UNROLL_HEIGHT... while UNROLL_ALL runs fine. Weird, isn't it?[/QUOTE]
Any way to see the output of the preprocessor? In my setup, I can use -dump.
You might try the following: where UNROLL_WIDTH_CONTROL and UNROLL_HEIGHT_CONTROL are #defined to be nothing, change that to a semicolon or another C statement that does nothing. |
Tested P-1 on P100... seems to have some regression for WORKINGIN
[code]
new/current commit e928d82
929 none
947 WORKINGIN1
936 WORKINGIN1A
933 WORKINGIN2
938 WORKINGIN3
933 WORKINGIN4
929 WORKINGIN5

db9ce44
924 none
930 WORKINGIN1
929 WORKINGIN1A
930 WORKINGIN2
924 WORKINGIN3
918 WORKINGIN4
916 WORKINGIN5
[/code]
Everything with NO_ASM and MERGED_MIDDLE... haven't tested anything else (yet) |
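To put a number on the regression, here is a minimal Python sketch that computes the percent change per option between the two commits, using only the figures posted above. It assumes db9ce44 is the earlier commit (the post labels e928d82 as new/current) and that lower values are better. The unit of the timings is not stated in the post, so the values are treated as plain numbers. The names `TIMINGS`, `percent_change` and `compare` are made up for this illustration.

```python
# Timings copied from the post above; the unit is not stated there.
TIMINGS = {
    "e928d82": {  # labelled "new/current" in the post
        "none": 929, "WORKINGIN1": 947, "WORKINGIN1A": 936,
        "WORKINGIN2": 933, "WORKINGIN3": 938, "WORKINGIN4": 933,
        "WORKINGIN5": 929,
    },
    "db9ce44": {  # assumed here to be the earlier commit
        "none": 924, "WORKINGIN1": 930, "WORKINGIN1A": 929,
        "WORKINGIN2": 930, "WORKINGIN3": 924, "WORKINGIN4": 918,
        "WORKINGIN5": 916,
    },
}


def percent_change(old, new):
    """Relative change from old to new, in percent (positive = slower)."""
    return 100.0 * (new - old) / old


def compare(old_commit, new_commit, timings=TIMINGS):
    """Percent change for every option, rounded to two decimals."""
    old, new = timings[old_commit], timings[new_commit]
    return {opt: round(percent_change(old[opt], new[opt]), 2) for opt in old}


if __name__ == "__main__":
    for opt, pct in compare("db9ce44", "e928d82").items():
        print(f"{opt:12s} {pct:+.2f}%")
```

On these figures every option comes out slower on e928d82, by roughly 0.3% to 1.8%, which may still be within run-to-run noise on a shared machine.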
[QUOTE=kracker;532825]Tested P-1 on P100... seems to have some regression for WORKINGIN[/QUOTE]exponent ~90M / 5M FFT? On Colab? Any indication of gpu clock rate?
|
[QUOTE=Prime95;532814]Any way to see the output of the preprocessor? In my setup, I can use -dump.
You might try the following: where UNROLL_WIDTH_CONTROL and UNROLL_HEIGHT_CONTROL are #defined to be nothing, change that to a semicolon or another C statement that does nothing.[/QUOTE]
-dump apparently does nothing on NoVideo OpenCL.
-dump with a parameter (presumably a folder name?) gives an error:
[CODE]
2019-12-13 22:01:06 Error in processing command line: Don't understand command line argument "-save-temps=foo/5M"!
2019-12-13 22:01:06 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:234 build
[/CODE]
Changing the #define to a semi-colon changed nothing. But changing that [C]__attribute__((opencl_unroll_hint(1)))[/C] to a semicolon removed the errors :smile: Maybe that just doesn't work under Nvidia OpenCL, it is after all 1.2 and maybe a bit broken at that?

Now a completely unrelated suggestion for the Makefile, regarding gpuowl-wrap.cpp. That file is generated as needed when gpuowl.cl is modified. However, if I copy an older version of gpuowl.cl (stashed away without some changes I wanted to test quickly) over it, make doesn't know it has changed. That is still expected behaviour. So I run [C]make clean[/C] to be sure that everything is compiled and generated from scratch. Except... gpuowl-wrap.cpp isn't. So I propose adding it to the files to be deleted under clean:
[CODE]
clean:
	rm -f ${OBJS} gpuowl gpuowl-win gpuowl-wrap.cpp
[/CODE] |
[QUOTE=nomead;532850]Changing the #define to a semi-colon changed nothing. But changing that [C]__attribute__((opencl_unroll_hint(1)))[/C] to a semicolon removed the errors :smile:
Maybe that just doesn't work under Nvidia OpenCL, it is after all 1.2 and maybe a bit broken at that?[/QUOTE] I was not too worried about your case. UNROLL_ALL is the default on nVidia GPUs and there really is no need to change it. This option is all about bypassing ROCm optimizer problems. Though it is good to know that opencl_unroll_hint is not supported in your situation. |
[QUOTE=Prime95;532877]I was not too worried about your case. UNROLL_ALL is the default on nVidia GPUs and there really is no need to change it. This option is all about bypassing ROCm optimizer problems.[/QUOTE]
Me neither. I think it is better to concentrate efforts on where it really makes a difference i.e. Radeon VII. I'm really just along for the ride, benchmarking these things for fun :smile: If it sometimes manages to catch something that breaks compatibility with Nvidia drivers, that's a bonus of course. |