Try running it with "-use NO_ASM"
(if it works, you can put that option in config.txt) The __asm() errors are because you're running with a driver (Adrenalin/Windows) that does not support assembly. ROCm/Linux works fine with __asm(). Anyway, assembly support is not mandatory; you just need to disable it with -use NO_ASM. I haven't found a way yet (in OpenCL) to automatically detect __asm support. [QUOTE=ATH;521950]So I got my Radeon VII but I'm a bit lost. It has been many, many years since I had an AMD card, well before GPUs were used for any calculations, and I'm also new to gpuowl. I installed the newest drivers: Adrenalin 2019 19.7.2. I had "gpuowl-win7-x64-v6.5-c48d46f.7z" from [URL="https://mersenneforum.org/showpost.php?p=516704&postcount=1171"]post #1171[/URL] on my hard drive already from 2 months ago; I think I got it to confirm that OpenCL really worked on my RTX 2080, which it did. Now when I run it with -device 1 (Radeon VII) it only writes the first few lines but never gets to the "OpenCL compilation in ..." line and it never starts running. When I use -device 0 it works fine and runs on my RTX 2080. I tried downloading "gpuowl-win-v6.5-84-g30c0508.7z" from [URL="https://mersenneforum.org/showpost.php?p=521225&postcount=1274"]post #1274[/URL] but it does not start at all on either card: [CODE]
2019-07-20 00:05:56 config: -device 1
2019-07-20 00:05:56 80293033 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.02 bits/word
2019-07-20 00:05:56 using short carry kernels
2019-07-20 00:05:56 OpenCL args "-DEXP=80293033u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xf.d1f3073e091p-3 -DIWEIGHT_STEP=0x8.17498299a4db8p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-07-20 00:05:56 OpenCL compilation error -11 (args -DEXP=80293033u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xf.d1f3073e091p-3 -DIWEIGHT_STEP=0x8.17498299a4db8p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-07-20 00:05:56 C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:197:3: error: implicit declaration of function '__asm' is invalid in C99
  X2(u[0], u[2]);
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:174:2: note: expanded from macro 'X2'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:197:3: error: expected ')'
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:174:35: note: expanded from macro 'X2'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:197:3: note: to match this '('
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:174:7: note: expanded from macro 'X2'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:197:3: error: expected ')'
  X2(u[0], u[2]);
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:175:35: note: expanded from macro 'X2'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.y) : "v" (t.y), "v" (b.y)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:197:3: note: to match this '('
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:175:7: note: expanded from macro 'X2'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.y) : "v" (t.y), "v" (b.y)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:198:3: error: expected ')'
  X2_mul_t4(u[1], u[3]);
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:180:35: note: expanded from macro 'X2_mul_t4'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (t.x) : "v" (b.x), "v" (t.x)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:198:3: note: to match this '('
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:180:7: note: expanded from macro 'X2_mul_t4'
  __asm( "v_add_f64 %0, %1, -%2\n" : "=v" (t.x) : "v" (b.x), "v" (t.x)); \
  ^
C:\Users\ATH\AppData\Local\Temp\\OCL7076T0.cl:198
2019-07-20 00:05:56 Exception 9gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:215 build
2019-07-20 00:05:56 Bye
[/CODE] Are there any more Windows executables collected somewhere?[/QUOTE] |
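Until automatic detection exists, one possible workaround is a small wrapper that looks at the build log and retries with -use NO_ASM. This is only a sketch: it assumes the compiler error text matches the log quoted above, and `needs_no_asm` and `retry_args` are hypothetical helper names, not part of gpuowl.

```python
# Sketch only: decide from a failed build log whether to retry with -use NO_ASM.
# ASM_ERROR is copied from the compiler output quoted in this thread; other
# drivers may word the error differently (assumption).
ASM_ERROR = "implicit declaration of function '__asm'"


def needs_no_asm(build_log: str) -> bool:
    """Return True if the build log shows the __asm compile failure."""
    return ASM_ERROR in build_log


def retry_args(args, build_log):
    """Return the argument list to use for a retry.

    Appends "-use NO_ASM" only when the log shows the __asm error and the
    option is not already present; otherwise the arguments are unchanged.
    """
    if needs_no_asm(build_log) and "NO_ASM" not in args:
        return list(args) + ["-use", "NO_ASM"]
    return list(args)
```

A wrapper could capture gpuowl's output on the first run, call `retry_args`, and start it again with the returned list.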
Thank you, this is useful!
[QUOTE=ewmayer;521952]No, the FFT is inherently cyclic-convolutional ... the IBDWT weighting allows us to use a prime-length "bit folding" boundary in conjunction with an underlying polynomial-multiply which most naturally lends itself to a bitness which is highly composite, by way of being a multiple of the transform length. As George noted, for (mod 2^p+1) you need 2 distinct weightings: the IBDWT one to allow for a prime-length bit-folding, and the standard acyclic-effecting weighting, which for a length-n transform uses the first n complex (2*n)th roots of unity. That needs a complex FFT algorithm; for a length-n real input vector you can use a length-(n/2) complex FFT. Noting that the [j]th and [j+n/2]th acyclic weights (call them 'awt') are related by awt[j+n/2] = I*awt[j], you can see that in this context it makes sense to group pairs of real inputs together not via the usual (x[j],x[j+1])-treated-as-a-complex-datum scheme but rather in (x[j],x[j+n/2]) pairs, since applying the acyclic-weights turns those 2 reals into (awt[j]*x[j],I*awt[j]*x[j+n/2]), i.e. we can pull out the shared complex acyclic-multiplier awt[j] = exp(2*Pi*I*j/(2*n)) to get a weighted complex input awt[j]*(x[j] + I*x[j+n/2]). This is the so-called "right-angle transform" trick. Crandall & Fagin recapped it (since it wasn't new) in the Fermat-mod section of the same 1994 paper where they introduced the Mersenne-mod IBDWT.[/QUOTE] |
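The (x[j], x[j+n/2]) pairing in the quote can be checked numerically on a small case. The sketch below is an illustration, not gpuowl's code: it assumes the acyclic weights are awt[j] = exp(I*Pi*j/n) (the first n of the (2*n)th roots of unity), uses a naive O(n^2) DFT in place of a real FFT, and leaves out the IBDWT weighting entirely. All function names are made up for the example.

```python
import cmath


def dft(v, sign):
    """Naive DFT of a complex list; sign=-1 is forward, sign=+1 is unscaled inverse."""
    n = len(v)
    return [sum(v[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


def negacyclic_direct(x, y):
    """Reference acyclic (negacyclic) convolution: polynomial product mod t^n + 1."""
    n = len(x)
    z = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                z[i + j] += x[i] * y[j]
            else:
                z[i + j - n] -= x[i] * y[j]  # wrap-around terms pick up a minus sign
    return z


def negacyclic_right_angle(x, y):
    """Same result via a length-(n/2) complex cyclic convolution (n must be even)."""
    n = len(x)
    m = n // 2
    awt = [cmath.exp(1j * cmath.pi * j / n) for j in range(m)]  # assumed weights
    # Pair x[j] with x[j+n/2] and apply the shared weight awt[j].
    cx = [awt[j] * complex(x[j], x[j + m]) for j in range(m)]
    cy = [awt[j] * complex(y[j], y[j + m]) for j in range(m)]
    fz = [a * b for a, b in zip(dft(cx, -1), dft(cy, -1))]
    cz = [v / m for v in dft(fz, +1)]
    # Undo the weighting; real parts are z[0..m-1], imaginary parts z[m..n-1].
    w = [cz[j] / awt[j] for j in range(m)]
    return [v.real for v in w] + [v.imag for v in w]
```

For x = [1, 2, 3, 4] and y = [5, 6, 7, 8] both functions give [-56, -36, 2, 60] up to rounding.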
Thanks. I assume there is no Windows driver where it works?
Now there are no errors but it does not actually start calculating; the card is not being used at all. [CODE]
gpuowl-win.exe -device 1 -use NO_ASM
2019-07-20 03:01:43 gpuowl v6.5-84-g30c0508
2019-07-20 03:01:43 Note: no config.txt file found
2019-07-20 03:01:43 config: -device 1 -use NO_ASM
2019-07-20 03:01:43 80293033 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.02 bits/word
2019-07-20 03:01:43 using short carry kernels
2019-07-20 03:01:43 OpenCL args "-DEXP=80293033u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xf.d1f3073e091p-3 -DIWEIGHT_STEP=0x8.17498299a4db8p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DNO_ASM=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
[/CODE] I was afraid I was being too optimistic trying to run Nvidia and AMD cards in the same computer. Does anyone else have any Windows binaries? Only Kriesel has posted binaries of the latest versions in this thread. |
[QUOTE=preda;521955]Try running it with "-use NO_ASM"
(if it works, you can put that option in config.txt ) The __asm() errors are because you're running with a driver (Adrenalin/windows) that does not support assembly. ROCm/linux works fine with __asm(). Anyway, assembly support is not mandatory, you just need to disable it with -use NO_ASM . I haven't found a way yet (in OpenCL) to automatically detect __asm support.[/QUOTE] FWIW, the example run of gpuowl v6.5-84-g30c0508 in [URL]https://www.mersenneforum.org/showpost.php?p=521225&postcount=1274[/URL] was on an RX480, Win7 x64, Adrenalin 18.10.2 driver with -use ORIG_X2, after Prime95's advice to -use FMA_X2 at [URL]https://www.mersenneforum.org/showpost.php?p=517932&postcount=1213[/URL], plus subsequent experimentation for performance at [URL]https://www.mersenneforum.org/showpost.php?p=517961&postcount=1217[/URL]. No data on Radeon VII here yet. I hadn't seen NO_ASM back when I made the -use list at [URL]https://www.mersenneforum.org/showpost.php?p=517999&postcount=1222[/URL], but I see it in the gpuowl.cl code of v6.5-76-g1ca08e2-dirty. |
[QUOTE=preda;521955]Try running it with "-use NO_ASM"
(if it works, you can put that option in config.txt ) The __asm() errors are because you're running with a driver (Adrenalin/windows) that does not support assembly. ROCm/linux works fine with __asm(). Anyway, assembly support is not mandatory, you just need to disable it with -use NO_ASM . I haven't found a way yet (in OpenCL) to automatically detect __asm support.[/QUOTE] There should be a way to detect which driver is in use. amdgpu-pro doesn't support __asm(), so I set -use NO_ASM. |
Normally the next log line would be something like:
"OpenCL compilation in 2195 ms" So it seems in your case it's stuck at the OpenCL compilation step. I'm sorry but I don't really know why, and unfortunately I can't repro. (I would be happy to have a fix if the problem is on gpuowl's side.) [QUOTE=ATH;521958]Thanks. I assume there is no Windows driver where it works? Now there are no errors but it does not actually start calculating, the card is not being used at all. [CODE]
gpuowl-win.exe -device 1 -use NO_ASM
2019-07-20 03:01:43 gpuowl v6.5-84-g30c0508
2019-07-20 03:01:43 Note: no config.txt file found
2019-07-20 03:01:43 config: -device 1 -use NO_ASM
2019-07-20 03:01:43 80293033 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.02 bits/word
2019-07-20 03:01:43 using short carry kernels
2019-07-20 03:01:43 OpenCL args "-DEXP=80293033u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xf.d1f3073e091p-3 -DIWEIGHT_STEP=0x8.17498299a4db8p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DNO_ASM=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
[/CODE] I was afraid I was being too optimistic trying to run Nvidia and AMD card in the same computer. Anyone else have any Windows binaries? Only Kriesel posted binaries in this thread from the latests versions.[/QUOTE] |
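When triaging reports like this one, it can help to classify a captured log by how far it got. The sketch below is an assumption-laden helper, not a gpuowl feature: `compile_status` is a hypothetical name, and the matched strings are taken only from the log lines quoted in this thread.

```python
import re


def compile_status(log: str) -> str:
    """Classify a captured gpuowl log by how far the OpenCL build got.

    Returns one of: "error", "compiled", "stalled", "not started".
    The marker strings come from the logs in this thread (assumption: other
    versions may word these lines differently).
    """
    if "OpenCL compilation error" in log:
        return "error"
    if re.search(r"OpenCL compilation in \d+ ms", log):
        return "compiled"
    if "OpenCL args" in log:
        # Build arguments were printed, but no result line followed.
        return "stalled"
    return "not started"
```

On the log quoted above, which ends at the "OpenCL args" line, this returns "stalled".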
[QUOTE=SELROC;521978]there should be a way to detect which driver is in use, amdgpu-pro doesn't support __asm() and I set -use NO_ASM.[/QUOTE]
It is possible with a script: the "vendor" field should change accordingly for Nvidia, and the "configuration: driver=amdgpu latency=0" line should change as well:
# lshw -class video
  *-display
       description: VGA compatible controller
       product: Ellesmere [Radeon RX 470/480]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:01:00.0
       version: e7
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: iomemory:220-21f iomemory:210-20f irq:126 memory:2200000000-23ffffffff memory:2100000000-21001fffff ioport:e000(size=256) memory:f7e00000-f7e3ffff memory:f7e40000-f7e5ffff
  *-display
       description: VGA compatible controller
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: iomemory:2f0-2ef iomemory:2f0-2ef irq:125 memory:2ffe000000-2ffeffffff memory:2fe0000000-2fefffffff ioport:f000(size=64) memory:c0000-dffff |
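A script along those lines could pull the vendor and kernel driver out of the lshw text. This is a minimal sketch, assuming lshw prints one field per line as it normally does on a terminal; `parse_lshw_video` is a hypothetical name. Note that the kernel driver name may not distinguish between userspace OpenCL stacks, so this alone might not settle whether __asm() is supported.

```python
def parse_lshw_video(text):
    """Extract vendor and kernel driver for each display in `lshw -class video` output.

    Assumes one "key: value" field per line and a "*-display" line starting
    each device. Returns a list of dicts with "vendor" and/or "driver" keys.
    """
    devices = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("*-display"):
            devices.append({})  # start a new device record
        elif devices and ": " in line:
            key, value = line.split(": ", 1)
            if key == "vendor":
                devices[-1]["vendor"] = value
            elif key == "configuration":
                # e.g. "driver=amdgpu latency=0"
                for item in value.split():
                    if item.startswith("driver="):
                        devices[-1]["driver"] = item[len("driver="):]
    return devices
```

Fed the output shown above, it would report driver "amdgpu" for the RX 470/480 and "i915" for the Intel device.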
[QUOTE=ATH;521958]
I was afraid I was being too optimistic trying to run Nvidia and AMD card in the same computer.[/QUOTE] AH! Divide and conquer. Use multiple OpenCL test and info utilities and Device Manager to check how functional the AMD OpenCL and driver installation is. Sometimes one will claim all's fine and others will show issues. I've seen one vendor's OpenCL install hose another's. (In that case it was an NVIDIA or AMD SDK install disabling the OpenCL use of the Intel IGP, until the SDKs were removed and the Intel OpenCL reinstalled.) You could try a temporary complete removal of the NVIDIA driver followed by removal and reinstall of the AMD driver. Also, sometimes a second reboot is needed after a graphics driver install. |
[QUOTE=preda;521955]Try running it with "-use NO_ASM"
(if it works, you can put that option in config.txt ) The __asm() errors are because you're running with a driver (Adrenalin/windows) that does not support assembly. ROCm/linux works fine with __asm(). Anyway, assembly support is not mandatory, you just need to disable it with -use NO_ASM . I haven't found a way yet (in OpenCL) to automatically detect __asm support.[/QUOTE] Hi Mihai, I found a C++ library: [url]https://github.com/ThePhD/infoware[/url] Example: [url]https://github.com/ThePhD/infoware/blob/master/examples/gpu.cpp[/url] reactions? |
[QUOTE=preda;521955]Try running it with "-use NO_ASM"
(if it works, you can put that option in config.txt ) The __asm() errors are because you're running with a driver (Adrenalin/windows) that does not support assembly. ROCm/linux works fine with __asm(). Anyway, assembly support is not mandatory, you just need to disable it with -use NO_ASM . I haven't found a way yet (in OpenCL) to automatically detect __asm support.[/QUOTE] This header file detects platform: [url]https://github.com/hendrix2897/platform-detect/blob/master/PlatformDetect.h[/url] |
Hello and thank you for the program.
How much of it depends on CPU performance? Will it be significantly slower running a Radeon VII with a Pentium II, i5 2500 or R7 1800X (or 3600)? I am about to compile on Linux soon. I have an 1800X, and will set up the Radeon VII for gpuowl and an Nvidia 1050 Ti for the lesser stuff. Any experience in balancing load between two GPUs is appreciated! [url=https://betrig.com/]betrig[/url] |