[QUOTE=preda;493591]Maybe try adding "-lm" to link in the math library?
BTW, you know that you can end all this pain by switching to Linux, right? (just joking, sorry I couldn't hold myself)[/QUOTE]
Normally compiling on Linux is a joy, compiling on Windows is like having your brains smashed out by a slice of lemon wrapped around a large gold brick. |
[QUOTE=preda;493582]@kriesel, I'm sorry but I don't know what's causing the last batch of errors you report. (the ones with ld.exe). Seems linker related.
The previous batch can be split into two groups:
- the warnings about "%llu" in printf(), just ignore them.
- the missing return in modInv() -- that function can be deleted completely, not used.[/QUOTE]
Removing -Wall took the %llu messages out. That option is described as flagging what's "legal but dubious".
modInv deleted.
The undefined errors beginning C:/repo/ are particularly interesting. I have no idea where that path is coming from. There's no repo folder in the root of drive c, nor should there be AFAIK.
Re the other errors, I'm guessing I've misinterpreted the makefile somehow. I'll go over that again and try some more things.
No joy on the -lm suggestion.
Re the suggestion of switching to linux: it would be much simpler for me if you'd switch to Windows 7 x64. (Join the dark side, Mihai! or, Resistance is futile, you will be assimilated.) Or become OS-bilingual, and make and post compiled executables for both Win and lin.
As someone who's run DOS and Windows for 36 years, and RSX, and VMS including as system admin, for more than 15 years, and assorted minicomputers before that, I've tried linux on Intel, repeatedly over the years. My experience has been that it is an overwhelmingly unfamiliar alien cryptic environment, with seemingly infinite recursion of knowledge gaps posing obstacles to getting anything significant done in a time frame shorter than my memory retention or other alternatives. That said, lsusb did a fine job of sleuthing out some key clues to what driver should be used with an unlabeled unidentified usb wireless network adapter, so linux remains as a possible last resort when the rest of the toolbox doesn't work well enough, and I keep a VM around for occasional experiments.
On a bare-metal install, I tried getting an NVIDIA gpu driver installed on linux, and the default linux driver for it put up such a valiant fight to remain that, after 4 different approaches found online to overcome it failed, I gave up and put the OEM Vista back on the system. By that point linux had been so stripped of video drivers that the display could not show the portion of the reconfiguration I needed to use to get it over to using the NVIDIA driver. Maybe not a brick wall, but more of a thorny hedge than even I had patience for. |
v3.5-457601f also no go on msys2/mingw
deleted modInv function
installed make: pacman -Syu make
modified makefile: hardcoded commit number in place of Preda's git magic (git not installed on this system); removed -Wall -Werror
[CODE]# Choose one of: openowl (OpenCL) or cudaowl (CUDA).
#all: openowl cudaowl

HEADERS = args.h clwrap.h common.h kernel.h state.h stats.h timeutil.h tinycl.h worktodo.h Gpu.h LowGpu.h
SRCS = Gpu.cpp common.cpp gpuowl.cpp

# Edit the path in -L below if needed, to the folder containing OpenCL.dll on Windows or libOpenCL.so on UNIX.
# The included lib paths are for ROCm, AMDGPU-pro/Linux or MSYS-2/Windows.
LIBPATH = -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32

openowl: ${HEADERS} ${SRCS} OpenGpu.h OpenGpu.cpp
# g++ -O2 -DREV=\"`git rev-parse --short HEAD``git diff-files --quiet || echo -mod`\" -Wall -Werror -std=c++14 OpenGpu.cpp ${SRCS} -o openowl -lOpenCL ${LIBPATH}
	g++ -O2 -DREV=\"457601f\" -std=c++14 OpenGpu.cpp ${SRCS} -o openowl -lOpenCL ${LIBPATH}

cudaowl: ${HEADERS} ${SRCS} CudaGpu.h CudaGpu.cu
	nvcc -O2 -DREV=\"`git rev-parse --short HEAD``git diff-files --quiet || echo -mod`\" -o cudaowl CudaGpu.cu ${SRCS} -lcufft

fftbench: fftbench.cu
	nvcc -O2 -o fftbench fftbench.cu -lcufft

tf: ${HEADERS} tf.cpp
	g++-8 -O2 tf.cpp common.cpp -otf -lOpenCL ${LIBPATH}
[/CODE]
$ make openowl
[CODE]g++ -O2 -DREV=\"457601f\" -std=c++14 OpenGpu.cpp Gpu.cpp common.cpp gpuowl.cpp -o openowl -lOpenCL -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: d000035.o:(.idata$5+0x0): multiple definition of `__imp___C_specific_handler'; d000029.o:(.idata$5+0x0): first defined here
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `pre_c_init':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:136: undefined reference to `__p__fmode'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `__tmainCRTStartup':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:280: undefined reference to `_set_invalid_parameter_handler'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:289: undefined reference to `__p__acmdln'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccDrgNx7.o:common.cpp:(.text+0x145): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccsRXy7k.o:gpuowl.cpp:(.text$_Z6printfPKcz[_Z6printfPKcz]+0x29): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-merr.o): in function `_matherr':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/merr.c:72: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-pseudo-reloc.o): in function `__report_error':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:149: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:150: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-wassert.o): in function `_wassert':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/misc/wassert.c:35: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-mingw_vfprintf.o): in function `__mingw_vfprintf':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:53: undefined reference to `_lock_file'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:55: undefined reference to `_unlock_file'
collect2.exe: error: ld returned 1 exit status
make: *** [Makefile:13: openowl] Error 1
[/CODE]
modify LIBPATH: same result, as expected.
[CODE]#LIBPATH = -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32
LIBPATH = -L/c/Windows/System32
[/CODE]
add -lm, same result. |
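A side note on the hardcoded `-DREV=\"457601f\"` in the makefile above: the escaped quotes matter because the macro has to expand to a string literal, not a bare token. A minimal sketch of how such a macro is typically consumed follows. The `versionString` helper, the `"3.5-"` prefix and the `"unknown"` fallback are illustrative assumptions, not gpuowl's actual code.

```cpp
#include <cassert>
#include <string>

// REV normally arrives from the compiler command line, e.g. -DREV=\"457601f\".
// The fallback is an assumption so that this sketch also builds without the flag.
#ifndef REV
#define REV "unknown"
#endif

// Hypothetical helper: adjacent string literals are concatenated at compile
// time, which only works when REV expands to a quoted string.
std::string versionString() {
  return "3.5-" REV;
}
```

If the quotes were not escaped, the shell would strip them and `REV` would expand to a bare token, so a concatenation like the one above would fail to compile.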
[QUOTE=M344587487;493606]Normally compiling on Linux is a joy, compiling on Windows is like having your brains smashed out by a slice of lemon wrapped around a large gold brick.[/QUOTE]
Normally compiling on Linux is a joy, if you have the right versions of every library and dependency that the program needs. And it is common to adjust the Makefile with the real paths of installed tools. |
[QUOTE=M344587487;493606]Normally compiling on Linux is a joy, compiling on Windows is like having your brains smashed out by a slice of lemon wrapped around a large gold brick.[/QUOTE]
Says a user if not partisan, of one camp, to a user if not partisan, of another camp. There are reasons why the market shares are what they are, despite the initial price difference. Feel free to cross compile for Win7 X64 on linux and post executables. Trying to navigate linux is like I imagine it would be to be blind, aphasic, on foot, completely unfamiliar with the area, and asking directions of mutes in a foreign country, with no language in common. And with only guessing at proper search terms to substitute for a good bilingual dictionary. I'm glad that linux is useful for you and others. The ease of learning a new language is generally strongly related to youth; much easier before age 20. I think that also applies to computing environments. Linux came along decades too late for me for that prime learning window. Compiler/linker puzzles are frustrating regardless of how familiar the environment may be. |
Fresh V2.0 build, after also failing to build via make on msys2/mingw
1 Attachment(s)
As a baseline, I decided to go back and try to build gpuOwL v2.0 again. I think there is something off about the current version of msys2/mingw. Or very likely operator error/ignorance.
Modify makefile to hardcode commit since there's no git installed, modify gpuowl.cpp with a comment line so make will not insist it's up to date, make gpuowl
[CODE]g++ -DREV=\"dbc5a01\" -O2 -Wall -Werror -std=c++14 gpuowl.cpp -ogpuowl -lOpenCL -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32
In file included from args.h:4,
                 from gpuowl.cpp:6:
clwrap.h:247:41: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 void setArg(cl_kernel k, int pos, const auto &value) { CHECK(clSetKernelArg(k, pos, sizeof(value), &value)); }
                                         ^~~~
In file included from gpuowl.cpp:7:
kernel.h:76:46: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 void setArg(const std::string &name, const auto &arg) {
                                            ^~~~
In file included from checkpoint.h:6,
                 from gpuowl.cpp:9:
state.h:113:16: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 bool isAllZero(auto const it, auto const end) { return std::all_of(it, end, [](auto e) { return e == 0; }); }
                ^~~~
state.h:113:31: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 bool isAllZero(auto const it, auto const end) { return std::all_of(it, end, [](auto e) { return e == 0; }); }
                               ^~~~
In file included from gpuowl.cpp:9:
checkpoint.h:45:37: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 static bool write(FILE *fo, const auto &vect) { return fwrite(&vect[0], vect.size() * sizeof(vect[0]), 1, fo); }
                                   ^~~~
checkpoint.h:46:38: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 static bool read(FILE *fi, int n, auto &vect) {
                                   ^~~~
gpuowl.cpp:390:69: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 bool *outIsPrime, u64 *outResidue, int *outNErrors, auto modSqLoop, auto modMul,
                                                     ^~~~
gpuowl.cpp:390:85: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 bool *outIsPrime, u64 *outResidue, int *outNErrors, auto modSqLoop, auto modMul,
                                                                     ^~~~
gpuowl.cpp:519:13: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 void append(auto &vect, auto what) { vect.insert(vect.end(), what); }
             ^~~~
gpuowl.cpp:519:25: error: use of 'auto' in parameter declaration only available with -fconcepts [-Werror]
 void append(auto &vect, auto what) { vect.insert(vect.end(), what); }
                         ^~~~
cc1plus.exe: all warnings being treated as errors
make: *** [Makefile:9: gpuowl] Error 1
[/CODE]
retry after deleting -Wall -Werror: same, plus the usual 18 issues seen with V3.3 or 3.5.
Various combinations of -Wall -fconcepts -Werror fail.
Then, I recall, I didn't use make before, but kracker's step by step instructions for V2. Modified this time, from
$ g++ -c gpuowl.cpp
to
$ g++ -DREV="dbc5a01" -O2 -c gpuowl.cpp
(reproduces the above warnings about auto)
$ g++ -o gpuowl.exe gpuowl.o -lOpenCL -static
links without error.
$ strip gpuowl.exe
$ rename gpuowl.exe gpuowl-v20-dbc5a01.exe
Initial test looks ok here.
So here's an executable for Win 64-bit Opencl on AMD gpus. As usual, user assumes all risks of downloaded software. And as usual it requires some separate files to run. |
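For context on the `auto` diagnostics above: `auto` as an ordinary function parameter type is not part of standard C++14, GCC 8 only warns about it as an extension, and -Werror turns that warning into a hard error. A minimal sketch of a standard-C++14 spelling for two of the flagged helpers is shown below. The function names come from the error log, but the template parameter names and the choice of a single iterator type are assumptions for illustration, not the author's fix.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Standard C++14 form of: bool isAllZero(auto const it, auto const end)
// Assumes both arguments are the same iterator type, as std::all_of requires.
template <typename It>
bool isAllZero(It it, It end) {
  // A generic lambda ([](auto e)) is valid C++14, unlike auto function parameters.
  return std::all_of(it, end, [](auto e) { return e == 0; });
}

// Standard C++14 form of: void append(auto &vect, auto what)
template <typename Vec, typename T>
void append(Vec &vect, T what) {
  vect.insert(vect.end(), what);
}
```

Written this way the code should compile under -std=c++14 -Wall -Werror without needing -fconcepts, at the cost of slightly more verbose declarations.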
[QUOTE=kriesel;493618]Says a user if not partisan, of one camp, to a user if not partisan, of another camp.
There are reasons why the market shares are what they are, despite the initial price difference. Feel free to cross compile for Win7 X64 on linux and post executables. Trying to navigate linux is like I imagine it would be to be blind, aphasic, on foot, completely unfamiliar with the area, and asking directions of mutes in a foreign country, with no language in common. And with only guessing at proper search terms to substitute for a good bilingual dictionary. I'm glad that linux is useful for you and others. The ease of learning a new language is generally strongly related to youth; much easier before age 20. I think that also applies to computing environments. Linux came along decades too late for me for that prime learning window. Compiler/linker puzzles are frustrating regardless of how familiar the environment may be.[/QUOTE] Linux is just great for things like gpuowl, well maintained git projects that simply require you to clone the repo and make. Funnily enough gpuowl has been the trickiest thing I've installed recently by virtue of requiring an OpenCL environment, which you install by googling rocm and following this, basically install dependencies add a repo and make sure you're on a supported kernel: [url]https://github.com/RadeonOpenCompute/ROCm[/url] It's not about being hard to learn something new, I think it's mainly that Linux is great at providing a common environment for compiling such that idiots like myself can type a few commands and things tend to work. On windows you have many options and normally come unstuck just by not setting up the environment quite right. Mingw as great an option as it can be can also be a major pain in the arse when things go wrong. Sure things can go wrong with Linux, I find it tends to be when you're off the beaten path and/or dealing with spottily supported hardware. I switched to Linux partly because the programming support was not good on Windows, not good being a massive understatement. 
If everyone programmed I wouldn't be surprised if the market shares were reversed. |
OpenOwL V3.3-bc4a29f Win X64 build
1 Attachment(s)
Forget git. Forget make. In msys2/mingw64:
[CODE]$ g++ -DREV=\"bc4a29f\" -O2 -c gpuowl.cpp -o gpuowl.o
$ g++ -O2 -c OpenGpu.cpp -o OpenGpu.o
$ g++ -O2 -c common.cpp -o common.o
$ g++ -O2 -c Gpu.cpp -o Gpu.o
$ g++ -o openowl-V33-bc4a29f-W64.exe OpenGpu.o common.o Gpu.o gpuowl.o -lOpenCL -static[/CODE]
Possible pitfalls:
- Trying to run in the same directory as another version of gpuOwL. Make a v33test folder.
- Wrong version of, or missing file: Gpuowl.cl or shared.h
- no worktodo

Sample output:
[QUOTE]C:\msys64\home\ken\v33test>openowl-V33-bc4a29f-W64.exe
gpuowl-OpenCL 3.3-bc4a29f
FFT 5M: Width 1024 (256x4), Height 512 (64x8), Middle 5; 15.11 bits/word
Note: using short carry kernels
Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
OpenCL compilation in 3698 ms, with " -DEXP=79212169u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DRATIO=5u -I. -cl-fast-relaxed-math "
[2018-08-10 16:50:58 Central Daylight Time] PRP M(79212169), FFT 5120K, 15.11 bits/word
OK loaded: 0/79212169, blockSize 400, 0000000000000003
OK initial check: 0000000000000003
OK 2018-08-10 16:51:10 Ellesmere-36x1266-@28:0.0 800/79212169 [ 0.00%], 4.40 ms/it [4.40, 4.41]; ETA 4d 00:55; f276b04c15091c (check 2.36s) (saved)
2018-08-10 16:51:51 Ellesmere-36x1266-@28:0.0 10000/79212169 [ 0.01%], 4.44 ms/it [4.42, 4.61]; ETA 4d 01:47; ade333e89300ed
2018-08-10 16:52:35 Ellesmere-36x1266-@28:0.0 20000/79212169 [ 0.03%], 4.45 ms/it [4.43, 4.67]; ETA 4d 02:00; 9dffd45fe2ea59
2018-08-10 16:53:20 Ellesmere-36x1266-@28:0.0 30000/79212169 [ 0.04%], 4.47 ms/it [4.44, 4.76]; ETA 4d 02:19; c832f16c4a476c
Stopping, please wait..
OK 2018-08-10 16:53:31 Ellesmere-36x1266-@28:0.0 32000/79212169 [ 0.04%], 4.45 ms/it [4.44, 4.50]; ETA 4d 01:55; ec03287eed26b4 (check 2.38s) (saved)
Bye[/QUOTE]
Some observations:
- V3.3 seems to be ~10% faster than v2.0 on a very similar fft length and same gpu and exponent and gpu driver version and OS patch state. (V3.3 4.44 ms/it on 5M, V2.0 4.88 ms/it on 5000K)
- Log format is different V3.3 vs 2.0, and V3.3 log frequency is lower than screen output. (That's unfortunate. Some console output is lost. But tastes vary.)
- Log format change means gpuOwL section of a helper application is version-dependent or some versions are incompatible.
- The right two hex digits are missing from the res64 output on screen and in the log file in V3.3.
- Because of different logging rates and other differences between gpuOwL versions, and different user or system induced stop and resume points, comparing residues between runs of the same exponent is not straightforward.
- -O2 seems to make very little difference in execution speed on V2.0. It did shrink the size of the executable noticeably (276 vs 328KB). No corresponding data for V3.3, since I only did -O2; linked size was 1338KB (too big to post as an .exe file).

As always, the end user assumes the entire risk of a downloaded executable. |
[QUOTE=M344587487;493633]... add a repo...[/QUOTE]
Repo!? I paid cash for my car! :smile::piggie: |
V3.5-457601f build for Win 64
1 Attachment(s)
In msys2/mingw64:
[CODE]$ g++ -DREV=\"457601f\" -O2 -c gpuowl.cpp -o gpuowl.o
$ g++ -O2 -c OpenGpu.cpp -o OpenGpu.o
$ g++ -O2 -c common.cpp -o common.o
$ g++ -O2 -c Gpu.cpp -o Gpu.o
$ g++ -o openowl-V35-457601f-W64.exe OpenGpu.o common.o Gpu.o gpuowl.o -lOpenCL -static[/CODE]
Brief sample run:
[CODE]gpuowl-OpenCL 3.5-457601f
FFT 4608K: Width 512 (64x8), Height 512 (64x8), Middle 9; 16.79 bits/word
Note: using short carry kernels
Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
OpenCL compilation in 4254 ms, with " -DEXP=79212169u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
[2018-08-10 19:17:08 Central Daylight Time] PRP M(79212169), FFT 4608K, 16.79 bits/word
OK loaded: 0/79212169, blockSize 400, 0000000000000003
OK initial check: 0000000000000003
OK 2018-08-10 19:17:19 Ellesmere-36x1266-@28:0.0 800/79212169 [ 0.00%], 3.82 ms/it [3.82, 3.82]; ETA 3d 12:10; f276b04c15091c (check 2.08s) (saved)
OK 2018-08-10 19:27:36 Ellesmere-36x1266-@28:0.0 160000/79212169 [ 0.20%], 3.87 ms/it [3.86, 3.92]; ETA 3d 12:56; fc490c2e8f21ee (check 2.08s) (saved)
[/CODE]
The above V3.5 test run's fc49 0c2e 8f21 ee__ matches that of V3.3's test run at the same 160000 iterations from start, including lacking the right two hex characters.
It's faster by ~15.7% in V3.5 than V3.3, same exponent on same gpu, with shorter 4608K fft than V3.3's 5120K seemingly not accounting for all the speedup (5120/4608=1.111...).
As usual, the end user assumes the risk of downloaded software. Again, because this is 1344K in size, the executable is in a .7z compressed archive file to fit within file posting size limits. |
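The speedup arithmetic in the post above can be sketched as follows. This uses two ms/it figures taken from the sample lines quoted in the thread (4.44 for V3.3, 3.82 for V3.5), which give roughly 16% rather than exactly the ~15.7% quoted, since the exact figure depends on which log lines are compared. The function names are illustrative, and the assumption that time per iteration scales linearly with FFT length is a deliberate simplification.

```cpp
#include <cassert>
#include <cmath>

// Ratio of per-iteration times; a value above 1 means the newer build is faster.
double speedup(double oldMsPerIt, double newMsPerIt) {
  return oldMsPerIt / newMsPerIt;
}

// Portion of the speedup left over after dividing out the FFT length ratio.
// Crude assumption: iteration time is proportional to FFT length.
double residualSpeedup(double oldMsPerIt, double newMsPerIt,
                       double oldFftK, double newFftK) {
  return speedup(oldMsPerIt, newMsPerIt) / (oldFftK / newFftK);
}
```

With `speedup(4.44, 3.82)` and FFT lengths of 5120K and 4608K, the residual comes out a little above 1, which is consistent with the observation that the shorter FFT alone does not account for all of the gain.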
[QUOTE=kriesel;493646]The above V3.5 test run's fc49 0c2e 8f21 ee__ matches that of V3.3's test run at the same 160000 iterations from start, including lacking the right two hex characters.
It's faster by ~15.7% in V3.5 than V3.3, same exponent on same gpu, with shorter 4608K fft than V3.3's 5120K seemingly not accounting for all the speedup (5120/4608=1.111...). [/QUOTE]
@kriesel, I'm happy the compilation worked! (also it's good news it didn't get slower). And thanks for the feedback too:
- 14 chars hex residue: a genuine (and silly) bug. Fix incoming in the very next commit.
- log frequency, different between terminal and file. Let me explain what's happening here. The self-check, which results in OK or EE, is done not very often -- as you see by default every 160K (400^2) iterations. These "checked" lines form the real log, and they go to both terminal and file. But because these "real" lines happen rarely, the user may want to see some confirmation that it's still moving, and some timing info. And these are the lines that go only to the terminal. They are not checked, and mostly a comment. (they occur every 10K its).
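The logging schedule described above can be sketched as a small decision function. This is a simplified model, not gpuowl's implementation: the names `LogTarget` and `logTargetFor` are hypothetical, and it ignores the extra checked lines visible in the sample runs (the early check at 800 iterations and the check performed when stopping).

```cpp
#include <cassert>

enum class LogTarget { None, TerminalOnly, TerminalAndFile };

// Simplified model: checked lines every blockSize^2 iterations go to terminal
// and file; unchecked progress lines every progressEvery iterations go to the
// terminal only. Defaults follow the figures given in the post (400, 10K).
LogTarget logTargetFor(long long iteration, int blockSize = 400,
                       int progressEvery = 10000) {
  const long long checkEvery =
      static_cast<long long>(blockSize) * blockSize;  // 160000 by default
  if (iteration <= 0) return LogTarget::None;
  if (iteration % checkEvery == 0) return LogTarget::TerminalAndFile;  // checked OK/EE line
  if (iteration % progressEvery == 0) return LogTarget::TerminalOnly;  // unchecked progress line
  return LogTarget::None;
}
```

Under this model a log file would contain one line per 160000 iterations while the terminal shows sixteen times as many, which is why the file looks sparser than the console output.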