[QUOTE=SELROC;493557]I got an OpenCL error:
[CODE]Usage: ./tf <exponent> <start bit> [<end bit>]
LLVM ERROR: Cannot select: 0x560d6354d550: i64,glue = sube 0x560d6337e1f0, 0x560d6354c910, 0x560d635621b0:1[/CODE][/QUOTE]
Yep, that means that the LLVM GCN backend does not know how to translate some operation to GCN. In this case it may be about a 128-bit SUB.
Are you using ROCm or amdgpu-pro? If ROCm, which version? If you're on the most recent ROCm (i.e. 1.8.2), we should let them know ("ROCm issues") about it. |
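For readers unfamiliar with the `sube` node in the error above: in LLVM's instruction selection it is a subtract that also consumes the borrow from an earlier subtract, which is how a subtraction wider than one register gets split into pieces. Below is a minimal Python sketch of that decomposition for a 128-bit value held as two 64-bit halves. The name `sub128` and the (low, high) layout are illustrative assumptions, not code taken from the TF program.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def sub128(a_lo, a_hi, b_lo, b_hi):
    """Compute (a - b) mod 2**128, with a and b given as (low, high) 64-bit halves.

    Hypothetical helper for illustration: the low halves are subtracted first,
    and the borrow they produce is fed into the high-half subtraction.
    """
    lo = (a_lo - b_lo) & MASK64
    borrow = 1 if a_lo < b_lo else 0        # low half wrapped around
    hi = (a_hi - b_hi - borrow) & MASK64    # the "subtract with borrow" step
    return lo, hi


def to_int(lo, hi):
    """Join two 64-bit halves back into one Python integer."""
    return (hi << 64) | lo
```

The second subtraction is the one that needs the borrow input; a backend that cannot select that operation for 64-bit halves would fail on any 128-bit subtract in the kernel.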
[QUOTE=preda;493560]Yep, that means that the LLVM GCN backend does not know how to translate some operation to GCN. In this case it may be about a 128-bit SUB.
Are you using ROCm or amdgpu-pro? If ROCm, which version? If you're on the most recent ROCm (i.e. 1.8.2), we should let them know ("ROCm issues") about it.[/QUOTE]
Nope, this is amdgpu-pro 18.20 again. |
[QUOTE=SELROC;493561]Nope, this is again amdgpu-pro 18.20[/QUOTE]
OK -- it appears that amdgpu-pro is a bit behind on the LLVM compiler relative to ROCm. The TF code does use "non-standard" OpenCL, and I only tested on ROCm 1.8.2. |
[QUOTE=preda;493562]OK -- it appears that amdgpu-pro is a bit behind on the LLVM compiler relative to ROCm. The TF code does use "non-standard" OpenCL, and I only tested on ROCm 1.8.2.[/QUOTE]
What is your plan for TF and automatic work fetch? |
[QUOTE=SELROC;493555]This should be a missing compiler argument, like -fconcepts[/QUOTE]
Hi, thanks for responding. Not only do I not know what that is, what it would do, where it would go, or what if anything should follow it, it's also not present in [URL]https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Option-Index.html#Option-Index[/URL] or [URL]https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Keyword-Index.html#Keyword-Index[/URL]
I tried
$ g++ -O2 -DREV="bc4a29f" -std=c++14 OpenGpu.cpp Gpu.cpp common.cpp gpuowl.cpp -o openowl -lOpenCL -L/c/Windows/System32
and got
[CODE]C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: d000035.o:(.idata$5+0x0): multiple definition of `__imp___C_specific_handler'; d000029.o:(.idata$5+0x0): first defined here
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `pre_c_init':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:136: undefined reference to `__p__fmode'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `__tmainCRTStartup':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:280: undefined reference to `_set_invalid_parameter_handler'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:289: undefined reference to `__p__acmdln'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccZCkCSW.o:common.cpp:(.text+0x13d): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccyP7B46.o:gpuowl.cpp:(.text$_Z6printfPKcz[_Z6printfPKcz]+0x29): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-merr.o): in function `_matherr':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/merr.c:72: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-pseudo-reloc.o): in function `__report_error':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:149: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:150: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-wassert.o): in function `_wassert':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/misc/wassert.c:35: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-mingw_vfprintf.o): in function `__mingw_vfprintf':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:53: undefined reference to `_lock_file'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:55: undefined reference to `_unlock_file'
collect2.exe: error: ld returned 1 exit status[/CODE] |
@kriesel, I'm sorry but I don't know what's causing the last batch of errors you report (the ones with ld.exe). They seem linker related.
The previous batch can be split into two groups:
- the warnings about "%llu" in printf(): just ignore them.
- the missing return in modInv(): that function is not used and can be deleted completely. |
[QUOTE=SELROC;493568]What is your plan for TF and automatic work fetch?[/QUOTE]
I'm short on time now, so I won't start anything related to automatic fetch, not right now.
gpuowl seems to be reasonably efficient on 100M-digit exponents (332M) (8.32ms/it), so I may do some PRP on such myself. Such exponents are still huge chunks of work, taking on the order of 30 days per exponent (this somewhat alleviates the need for automatic fetch&report).
Looking, I see that such big exponents (332M) are trial-factored to 78 or 79 bits (with no P-1 done). Thus, before I invest 30+ days of PRP into such an exponent, I'd like to TF it a bit more.
And so here comes TF :) I wrote a nice tight (and very limited) OpenCL TF+sieve, all in well under 1K LOC. It does use some non-standard OpenCL such as 128-bit ints and 1024 group size, but it works fine with ROCm.
Of course the question is: why not use mfakto? I did use both mfaktc and mfakto as references for my learning. They both seem very capable and solid, and offer a large set of options. The main drawback is that they are also huge and complex (IMO), and this makes modifying them hard. And... it turns out this tiny TF is also 30% faster than mfakto.
I'm still finalizing the TF, and thinking about the best way to integrate it with gpuowl. In my personal usage, I plan to TF the 332M exponents to 81 bits, and afterwards PRP. |
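To make the TF step above concrete, here is a minimal CPU sketch in Python. It is not the OpenCL code described in the post and has none of its sieving or 128-bit arithmetic. It relies only on two standard facts: for an odd prime p, any factor q of 2^p - 1 has the form q = 2kp + 1, and q is congruent to 1 or 7 modulo 8. The function name `trial_factor` and its arguments are hypothetical, chosen to mirror the `<exponent> <start bit> <end bit>` usage line seen earlier in the thread.

```python
def trial_factor(p, start_bit, end_bit):
    """Return the smallest q with 2**start_bit <= q < 2**end_bit dividing 2**p - 1.

    Hypothetical illustration only. Returns None if no factor is in range.
    Candidates are q = 2*k*p + 1; those not congruent to 1 or 7 mod 8
    cannot be factors and are skipped before the modular exponentiation.
    """
    lo, hi = 1 << start_bit, 1 << end_bit
    # Smallest k >= 1 such that 2*k*p + 1 >= lo.
    k = max(1, (lo + 2 * p - 2) // (2 * p))
    q = 2 * k * p + 1
    while q < hi:
        # q divides 2**p - 1 exactly when 2**p is congruent to 1 modulo q.
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        q += 2 * p
    return None
```

For example, `trial_factor(11, 0, 8)` finds 23, the smaller factor of 2^11 - 1 = 2047 = 23 × 89. Each extra bit level doubles the range of candidates, so taking an exponent from 79 to 81 bits covers about three times as many candidates as everything below 79 bits combined, which is why TF depth is chosen against the cost of the PRP test it might save.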
[QUOTE=preda;493585]I'm short on time now, so I won't start anything related to automatic fetch, not right now.
gpuowl seems to be reasonably efficient on 100M-digit exponents (332M) (8.32ms/it), so I may do some PRP on such myself. Such exponents are still huge chunks of work, taking on the order of 30 days per exponent (this somewhat alleviates the need for automatic fetch&report).[/QUOTE]
I am doing a 332M exponent; on my hardware the ETA is 50+ days.
[QUOTE]Looking, I see that such big exponents (332M) are trial-factored to 78 or 79 bits (with no P-1 done). Thus, before I invest 30+ days of PRP into such an exponent, I'd like to TF it a bit more.
And so here comes TF :) I wrote a nice tight (and very limited) OpenCL TF+sieve, all in well under 1K LOC. It does use some non-standard OpenCL such as 128-bit ints and 1024 group size, but it works fine with ROCm.[/QUOTE]
Personally I have been unable to use ROCm so far, but it still looks promising.
[QUOTE]Of course the question is: why not use mfakto? I did use both mfaktc and mfakto as references for my learning. They both seem very capable and solid, and offer a large set of options. The main drawback is that they are also huge and complex (IMO), and this makes modifying them hard. And... it turns out this tiny TF is also 30% faster than mfakto.[/QUOTE]
I have been unable to compile mfakto; it gives a lot of errors.
[QUOTE]I'm still finalizing the TF, and thinking about the best way to integrate it with gpuowl. In my personal usage, I plan to TF the 332M exponents to 81 bits, and afterwards PRP.[/QUOTE]
Can you try to adapt TF for OpenCL / amdgpu-pro? |
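The two run-time figures quoted in this exchange (about 30 days at 8.32 ms/it, and 50+ days on other hardware) can be related with simple arithmetic, since a PRP test on exponent p needs roughly p squaring iterations. The sketch below assumes "332M" means 332,000,000 iterations and ignores checkpointing and error-check overhead. The helper names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def prp_days(exponent, ms_per_iter):
    """Approximate PRP run time in days, assuming about `exponent` iterations."""
    return exponent * ms_per_iter / 1000 / SECONDS_PER_DAY


def ms_per_iter_for(exponent, days):
    """Per-iteration time in milliseconds implied by a total run time in days."""
    return days * SECONDS_PER_DAY * 1000 / exponent


if __name__ == "__main__":
    print(f"{prp_days(332_000_000, 8.32):.1f} days at 8.32 ms/it")
    print(f"{ms_per_iter_for(332_000_000, 50):.1f} ms/it for a 50-day ETA")
```

Under these assumptions, 8.32 ms/it works out to about 32 days, and a 50-day ETA corresponds to roughly 13 ms/it.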
[QUOTE=SELROC;493587]I am doing a 332M exponent; on my hardware the ETA is 50+ days.[/QUOTE]
The best GPU I have is a Vega64 air.
[QUOTE]Personally I have been unable to use ROCm so far, but it still looks promising.[/QUOTE]
Recently 1.8.2 started to work on Ubuntu 18.04 with kernel 4.15.
[QUOTE]I have been unable to compile mfakto; it gives a lot of errors.[/QUOTE]
I submitted a pull request to mfakto on github that fixes those. You may try to clone my copy with the fixes: [url]https://github.com/preda/mfakto[/url]
[QUOTE]Can you try to adapt TF for OpenCL / amdgpu-pro?[/QUOTE]
Yes, I can try. In a few days, we'll see how much needs to be fixed. |
[QUOTE=kriesel;493578][CODE]C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: d000035.o:(.idata$5+0x0): multiple definition of `__imp___C_specific_handler'; d000029.o:(.idata$5+0x0): first defined here
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `pre_c_init':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:136: undefined [/CODE][/QUOTE]
Maybe try adding "-lm" to link in the math library?
BTW, you know that you can end all this pain by switching to Linux, right? (just joking, sorry, I couldn't help myself) |
[QUOTE=preda;493588]The best GPU I have is a Vega64 air.
Recently 1.8.2 started to work on Ubuntu 18.04 with kernel 4.15.
I submitted a pull request to mfakto on github that fixes those. You may try to clone my copy with the fixes: [URL]https://github.com/preda/mfakto[/URL]
Yes, I can try. In a few days, we'll see how much needs to be fixed.[/QUOTE]
I just tried out mfakto compilation on Debian:
[CODE]g++ sieve.o timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o filelocking.o output.o mfakto.o gpusieve.o perftest.o menu.o kbhit.o -m64 -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -L/home/sel/AMDAPPSDK-3.0/lib/x86_64 -L/opt/rocm/opencl/lib/x86_64 -lOpenCL -o ../mfakto
lto1: fatal error: bytecode stream in file ‘sieve.o’ generated with LTO version 5.2 instead of the expected 7.0
compilation terminated.
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:83: ../mfakto] Error 1[/CODE] |