mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

kriesel 2019-12-13 15:33

[QUOTE=nomead;532699] I'd hardly call gpuowl "minimal cpu use", even with -yield it takes about 80% of one core on my Linux machine, but luckily it's happy with a hyperthreaded core, so it doesn't affect mprime.[/QUOTE]Are you using ROCm? If so, which version? [url]https://github.com/RadeonOpenCompute/ROCm/issues/963[/url]

nomead 2019-12-13 16:32

[QUOTE=kriesel;532799]Are you using ROCm? If so, which version? [url]https://github.com/RadeonOpenCompute/ROCm/issues/963[/url][/QUOTE]

No, why would I run ROCm on Nvidia hardware?

Prime95 2019-12-13 16:38

[QUOTE=preda;532782]Excellent!
at what frequency, and what power, is that timing?[/QUOTE]

766 us (my two best cards) sclk=4 1547MHz. rocm-smi says 167 and 174 watts. Memory overclocked to 1190 and 1200 respectively.

[QUOTE=preda;532785]Warning: maybe it'd be a good idea to not upgrade to ROCm 2.10 if not already there.[/QUOTE]

I'm at 2.9. I think I'll stay there.

kriesel 2019-12-13 16:39

Spinner, utilization
 
In gpuowl v6.11-83, the spinner appears during PRP but not during P-1, even for exponents and bounds where the time between console outputs is several minutes or longer on a Radeon VII.

On a 200M exponent in P-1 stage 2 (also v6.11-83, Radeon VII), GPU-Z shows power fluctuating from 22 to 130 W and the GPU clock from 21 to 1400 MHz, with a period of seconds. That looks like underutilized capacity to me.

kriesel 2019-12-13 16:40

[QUOTE=nomead;532807]No, why would I run ROCm on Nvidia hardware?[/QUOTE]Sorry, forgot you were running an RTX 2080.

Prime95 2019-12-13 16:53

[QUOTE=kracker;532772]Tried UNROLL_ALL on P100: expected error?
[/QUOTE]

[QUOTE=nomead;532781]I get these errors on UNROLL_NONE (the same total 10 pcs) and exactly half (5 pcs) on either UNROLL_WIDTH or UNROLL_HEIGHT... while UNROLL_ALL runs fine. Weird, isn't it?[/QUOTE]

Any way to see the output of the preprocessor? In my setup, I can use -dump.

You might try the following: where UNROLL_WIDTH_CONTROL and UNROLL_HEIGHT_CONTROL are #defined to be nothing, change the definition to a semicolon or another C statement that does nothing.
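
As a rough illustration of that suggestion (not taken from gpuowl.cl: the loop and the function names are made up, only the macro name comes from the thread), here is a plain C sketch of a macro that expands to nothing versus one that expands to a lone semicolon. On a conforming compiler both forms should leave the loop behaving the same:

```c
#include <assert.h>

/* Variant 1: the macro expands to nothing at all. */
#define UNROLL_WIDTH_CONTROL

/* Hypothetical loop: sums 0..n-1 with the macro placed in front of it. */
int sum_with_empty_macro(int n) {
    assert(n >= 0);
    int total = 0;
    UNROLL_WIDTH_CONTROL
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* Variant 2: the macro expands to an empty statement (a lone semicolon). */
#undef UNROLL_WIDTH_CONTROL
#define UNROLL_WIDTH_CONTROL ;

/* Same hypothetical loop, now preceded by an empty statement. */
int sum_with_semicolon_macro(int n) {
    assert(n >= 0);
    int total = 0;
    UNROLL_WIDTH_CONTROL
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}
```

For example, `sum_with_empty_macro(5)` and `sum_with_semicolon_macro(5)` both return 10. The semicolon form only puts an empty statement where the macro sits, so any difference in build errors would point at how the OpenCL compiler handles that spot.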

kracker 2019-12-13 18:14

Tested P-1 on P100... seems to have some regression for WORKINGIN
[code]
new/current commit e928d82
929 none
947 WORKINGIN1
936 WORKINGIN1A
933 WORKINGIN2
938 WORKINGIN3
933 WORKINGIN4
929 WORKINGIN5

db9ce44
924 none
930 WORKINGIN1
929 WORKINGIN1A
930 WORKINGIN2
924 WORKINGIN3
918 WORKINGIN4
916 WORKINGIN5
[/code]

Everything was run with NO_ASM and MERGED_MIDDLE... haven't tested anything else (yet)
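
To put numbers on the apparent regression, here is a small C sketch that computes the percent change per variant between db9ce44 and e928d82. It uses only the values from the table above and assumes that lower figures are better, which the post does not state. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the table above; units are as posted (not stated). */
static const int TIMES_E928D82[] = {929, 947, 936, 933, 938, 933, 929};
static const int TIMES_DB9CE44[] = {924, 930, 929, 930, 924, 918, 916};
static const char *const VARIANT_NAMES[] = {
    "none", "WORKINGIN1", "WORKINGIN1A", "WORKINGIN2",
    "WORKINGIN3", "WORKINGIN4", "WORKINGIN5"};
#define VARIANT_COUNT (sizeof TIMES_E928D82 / sizeof TIMES_E928D82[0])

/* Percent change from old to new; positive means the new figure is larger. */
double percent_change(int old_value, int new_value) {
    assert(old_value > 0);
    return 100.0 * (double)(new_value - old_value) / (double)old_value;
}

/* Index of the variant whose figure grew the most between the two commits. */
size_t largest_increase_index(void) {
    size_t worst = 0;
    for (size_t i = 1; i < VARIANT_COUNT; i++) {
        if (percent_change(TIMES_DB9CE44[i], TIMES_E928D82[i]) >
            percent_change(TIMES_DB9CE44[worst], TIMES_E928D82[worst])) {
            worst = i;
        }
    }
    return worst;
}

/* Name of that variant, taken from the table's row labels. */
const char *largest_increase_name(void) {
    return VARIANT_NAMES[largest_increase_index()];
}
```

With the table's numbers, every variant comes out higher on e928d82, by roughly 0.3% to 1.8%, and WORKINGIN1 shows the largest increase. The post gives no run-to-run variance, so differences this small could partly be noise.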

kriesel 2019-12-13 18:16

[QUOTE=kracker;532825]Tested P-1 on P100... seems to have some regression for WORKINGIN[/QUOTE]Was that an exponent around 90M with a 5M FFT? On Colab? Any indication of the GPU clock rate?

nomead 2019-12-13 20:20

[QUOTE=Prime95;532814]Any way to see the output of the preprocessor? In my setup, I can use -dump.

You might try the following: where UNROLL_WIDTH_CONTROL and UNROLL_HEIGHT_CONTROL are #defined to be nothing, change the definition to a semicolon or another C statement that does nothing.[/QUOTE]

-dump apparently does nothing on NoVideo OpenCL
-dump with a parameter (presumably folder name?) gives an error:
[CODE]2019-12-13 22:01:06 Error in processing command line: Don't understand command line argument "-save-temps=foo/5M"!
2019-12-13 22:01:06 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:234 build
[/CODE]

Changing the #define to a semi-colon changed nothing. But changing that [C]__attribute__((opencl_unroll_hint(1)))[/C] to a semicolon removed the errors :smile:

Maybe that just doesn't work under Nvidia OpenCL, it is after all 1.2 and maybe a bit broken at that?


Now a completely unrelated suggestion for the Makefile, regarding gpuowl-wrap.cpp. That file is generated as needed, when gpuowl.cl is modified. However, if I copy an older version of gpuowl.cl over it (one stashed away without some changes I wanted to test quickly), make doesn't know it has changed. That is still expected behaviour. So I run [C]make clean[/C] to be sure that everything is compiled and generated from scratch. Except... gpuowl-wrap.cpp isn't.

So I propose adding it to the files to be deleted under clean:
[CODE]clean:
	rm -f ${OBJS} gpuowl gpuowl-win gpuowl-wrap.cpp
[/CODE]

Prime95 2019-12-14 01:27

[QUOTE=nomead;532850]Changing the #define to a semi-colon changed nothing. But changing that [C]__attribute__((opencl_unroll_hint(1)))[/C] to a semicolon removed the errors :smile:

Maybe that just doesn't work under Nvidia OpenCL, it is after all 1.2 and maybe a bit broken at that?[/QUOTE]

I was not too worried about your case. UNROLL_ALL is the default on nVidia GPUs and there really is no need to change it. This option is all about bypassing ROCm optimizer problems.

Though it is good to know that opencl_unroll_hint is not supported in your situation.

nomead 2019-12-14 02:58

[QUOTE=Prime95;532877]I was not too worried about your case. UNROLL_ALL is the default on nVidia GPUs and there really is no need to change it. This option is all about bypassing ROCm optimizer problems.[/QUOTE]

Me neither. I think it is better to concentrate efforts on where it really makes a difference i.e. Radeon VII. I'm really just along for the ride, benchmarking these things for fun :smile: If it sometimes manages to catch something that breaks compatibility with Nvidia drivers, that's a bonus of course.

