mersenneforum.org  

Old 2018-08-10, 06:50   #540
preda
 
 
"Mihai Preda"
Apr 2015

3·457 Posts

Quote:
Originally Posted by SELROC View Post
I got an OpenCL error:
Code:
Usage: ./tf <exponent> <start bit> [<end bit>]
LLVM ERROR: Cannot select: 0x560d6354d550: i64,glue = sube 0x560d6337e1f0, 0x560d6354c910, 0x560d635621b0:1
Yep, that means that the LLVM GCN backend does not know how to translate some operation to GCN. In this case it may be about a 128-bit SUB.

Are you using ROCm or amdgpu-pro? If ROCm, which version?

If you're on the most recent ROCm (i.e. 1.8.2), we should let them know ("ROCm issues") about it.
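
For context, `sube` is LLVM's subtract-with-borrow node, which typically appears when a subtraction wider than the native register is split into 64-bit pieces. The sketch below only illustrates that split, in Python rather than OpenCL. `sub128` and the (low, high) layout are made-up names for illustration, not code from the TF kernel or from LLVM.

```python
MASK64 = (1 << 64) - 1  # keeps the low 64 bits of a value


def sub128(a_lo, a_hi, b_lo, b_hi):
    """Subtract two 128-bit values given as (low, high) 64-bit halves.

    Hypothetical helper for illustration; the result wraps modulo 2**128.
    """
    lo = (a_lo - b_lo) & MASK64
    borrow = 1 if a_lo < b_lo else 0        # the low half underflowed
    hi = (a_hi - b_hi - borrow) & MASK64    # the "subtract with borrow" step
    return lo, hi


if __name__ == "__main__":
    a = (5 << 64) | 3
    b = (2 << 64) | 7
    lo, hi = sub128(a & MASK64, a >> 64, b & MASK64, b >> 64)
    # Recombine the halves and compare against Python's big-integer result.
    assert (hi << 64) | lo == a - b
    print(hex(hi), hex(lo))
```

A backend that cannot lower the borrow step for a given type would fail at exactly this kind of operation, which fits the error above.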
Old 2018-08-10, 06:56   #541
SELROC
 

43·101 Posts

Quote:
Originally Posted by preda View Post
Yep, that means that the LLVM GCN backend does not know how to translate some operation to GCN. In this case it may be about a 128-bit SUB.

Are you using ROCm or amdgpu-pro? If ROCm, which version?

If you're on the most recent ROCm (i.e. 1.8.2), we should let them know ("ROCm issues") about it.

Nope, this is amdgpu-pro 18.20 again.
Old 2018-08-10, 07:09   #542
preda
 
 
"Mihai Preda"
Apr 2015

3×457 Posts

Quote:
Originally Posted by SELROC View Post
Nope, this is amdgpu-pro 18.20 again.
OK, it appears that amdgpu-pro is a bit behind ROCm on the LLVM compiler. The TF code does use "non-standard" OpenCL, and I have only tested it on ROCm 1.8.2.
Old 2018-08-10, 08:01   #543
SELROC
 

4009₁₀ Posts

Quote:
Originally Posted by preda View Post
OK, it appears that amdgpu-pro is a bit behind ROCm on the LLVM compiler. The TF code does use "non-standard" OpenCL, and I have only tested it on ROCm 1.8.2.

What is your plan for TF and automatic work fetch?
Old 2018-08-10, 12:52   #544
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

152B₁₆ Posts

Quote:
Originally Posted by SELROC View Post
This should be a missing compiler argument, like -fconcepts
Hi, thanks for responding.

I don't know what that option is, what it would do, where it would go, or what (if anything) should follow it. It is also not listed in

https://gcc.gnu.org/onlinedocs/gcc-4...l#Option-Index or https://gcc.gnu.org/onlinedocs/gcc-4...#Keyword-Index


I tried

$ g++ -O2 -DREV="bc4a29f" -std=c++14 OpenGpu.cpp Gpu.cpp common.cpp gpuowl.cpp -o openowl -lOpenCL -L/c/Windows /System32
and got
Code:
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: d000035.o:(.idata$5+0x0): multiple definition of `__imp___C_specific_handler'; d000029.o:(.idata$5+0x0): first defined here
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `pre_c_init':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:136: undefined reference to `__p__fmode'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `__tmainCRTStartup':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:280: undefined reference to `_set_invalid_parameter_handler'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:289: undefined reference to `__p__acmdln'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccZCkCSW.o:common.cpp:(.text+0x13d): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\msys64\tmp\ccyP7B46.o:gpuowl.cpp:(.text$_Z6printfPKcz[_Z6printfPKcz]+0x29): undefined reference to `__imp___acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-merr.o): in function `_matherr':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/merr.c:72: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingw32.a(lib64_libmingw32_a-pseudo-reloc.o): in function `__report_error':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:149: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:150: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-wassert.o): in function `_wassert':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/misc/wassert.c:35: undefined reference to `__acrt_iob_func'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libmingwex.a(lib64_libmingwex_a-mingw_vfprintf.o): in function `__mingw_vfprintf':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:53: undefined reference to `_lock_file'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/stdio/mingw_vfprintf.c:55: undefined reference to `_unlock_file'
collect2.exe: error: ld returned 1 exit status

Last fiddled with by kriesel on 2018-08-10 at 13:09
Old 2018-08-10, 13:31   #545
preda
 
 
"Mihai Preda"
Apr 2015

3·457 Posts

@kriesel, I'm sorry, but I don't know what's causing the last batch of errors you report (the ones with ld.exe). They seem linker-related.

The previous batch can be split into two groups:
- the warnings about "%llu" in printf(): just ignore them.
- the missing return in modInv(): that function is unused and can be deleted completely.
Old 2018-08-10, 13:57   #546
preda
 
 
"Mihai Preda"
Apr 2015

10101011011₂ Posts

Quote:
Originally Posted by SELROC View Post
What is your plan for TF and automatic work fetch?
I'm short on time, so I won't start anything related to automatic fetch right now.

gpuowl seems to be reasonably efficient on 100M-digit exponents (around 332M), at 8.32 ms/it, so I may do some PRP on such exponents myself. They are still huge chunks of work, taking on the order of 30 days per exponent, which somewhat alleviates the need for automatic fetch and report.
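
The "order of 30 days" figure can be checked with quick arithmetic, assuming a PRP test needs roughly one iteration per bit of the exponent. The function name below is made up for illustration, and 332,000,000 is a rounded stand-in for a real 100M-digit exponent.

```python
SECONDS_PER_DAY = 86_400


def prp_eta_days(exponent, ms_per_iteration):
    """Rough PRP run time in days, assuming about one iteration per exponent bit."""
    return exponent * ms_per_iteration / 1000 / SECONDS_PER_DAY


if __name__ == "__main__":
    # 332M iterations at 8.32 ms each, the figures quoted in the post.
    print(f"{prp_eta_days(332_000_000, 8.32):.1f} days")
```

With those inputs the estimate is just under 32 days of uninterrupted running, consistent with the figure above.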

Looking around, I see that such big exponents (332M) have been trial-factored to 78 or 79 bits, with no P-1 done. Thus, before I invest 30+ days of PRP in such an exponent, I'd like to TF it a bit more.

And so here comes TF :)
I wrote a nice, tight (and very limited) OpenCL TF+sieve, all in well under 1K LOC. It does use some non-standard OpenCL, such as 128-bit ints and a group size of 1024, but it works fine with ROCm.
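
For readers unfamiliar with TF (trial factoring): any factor q of 2^p - 1, with p an odd prime, has the form q = 2kp + 1 and is 1 or 7 mod 8, and q divides 2^p - 1 exactly when 2^p mod q = 1. The Python sketch below shows only that idea. It is not the OpenCL code described here, it has no real sieve beyond the mod 8 filter, and `trial_factor` and its parameters are names made up for illustration. Python's arbitrary-precision integers stand in for the wide integer arithmetic a GPU kernel would need.

```python
def trial_factor(p, start_bit, end_bit):
    """Return the first q = 2*k*p + 1 in [2**start_bit, 2**end_bit) that
    divides 2**p - 1, or None if there is none.

    Illustrative only: assumes p is an odd prime; names are hypothetical.
    """
    step = 2 * p
    # Smallest k >= 1 with 2*k*p + 1 >= 2**start_bit (ceiling division).
    k = max(1, -(-((1 << start_bit) - 1) // step))
    q = k * step + 1
    limit = 1 << end_bit
    while q < limit:
        # Factors of 2**p - 1 are always 1 or 7 mod 8, so skip other candidates.
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        q += step
    return None


if __name__ == "__main__":
    print(trial_factor(11, 0, 8))   # 2**11 - 1 = 2047 = 23 * 89
    print(trial_factor(13, 0, 12))  # 2**13 - 1 = 8191 is prime
```

The argument order mirrors the `./tf <exponent> <start bit> [<end bit>]` usage line quoted earlier in the thread.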

Of course the question is: why not use mfakto? I used both mfaktc and mfakto as references while learning. Both seem very capable and solid, and offer a large set of options. Their main drawback is that they are also huge and complex (IMO), which makes modifying them hard. And... it turns out this tiny TF is also 30% faster than mfakto.

I'm still finalizing the TF, and thinking about the best way to integrate it with gpuowl. In my personal usage, I plan to TF the 332M exponents to 81 bits, and then PRP them.
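
As a back-of-the-envelope look at what two extra bit levels involve: the number of candidates q = 2kp + 1 roughly doubles with each bit level. A commonly quoted GIMPS rule of thumb puts the chance of a factor between 2^x and 2^(x+1) at about 1/x; that rule is an assumption here, not something stated in this thread. The function names are made up for illustration.

```python
def candidates_in_bit_level(p, bit):
    """Approximate count of k with 2**bit <= 2*k*p + 1 < 2**(bit + 1)."""
    return (1 << bit) // (2 * p)


def factor_chance(from_bit, to_bit):
    """Rule-of-thumb chance of a factor between 2**from_bit and 2**to_bit,
    summing 1/x over each bit level (an assumed heuristic, not exact)."""
    return sum(1.0 / bit for bit in range(from_bit, to_bit))


if __name__ == "__main__":
    p = 332_000_000  # rounded stand-in for a 100M-digit exponent
    for bit in (79, 80):
        print(bit, candidates_in_bit_level(p, bit))
    print(f"{factor_chance(79, 81):.2%}")
```

Under that heuristic, going from 79 to 81 bits gives roughly a 2.5% chance of finding a factor and skipping the month-long PRP test.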

Last fiddled with by preda on 2018-08-10 at 13:58
Old 2018-08-10, 14:24   #547
SELROC
 

23×509 Posts

Quote:
Originally Posted by preda View Post
I'm short on time, so I won't start anything related to automatic fetch right now.

gpuowl seems to be reasonably efficient on 100M-digit exponents (around 332M), at 8.32 ms/it, so I may do some PRP on such exponents myself. They are still huge chunks of work, taking on the order of 30 days per exponent, which somewhat alleviates the need for automatic fetch and report.

I am doing a 332M exponent; on my hardware the ETA is 50+ days.




Quote:
Looking around, I see that such big exponents (332M) have been trial-factored to 78 or 79 bits, with no P-1 done. Thus, before I invest 30+ days of PRP in such an exponent, I'd like to TF it a bit more.

And so here comes TF :)
I wrote a nice, tight (and very limited) OpenCL TF+sieve, all in well under 1K LOC. It does use some non-standard OpenCL, such as 128-bit ints and a group size of 1024, but it works fine with ROCm.
Personally, I have been unable to use ROCm so far, but it still looks promising.



Quote:
Of course the question is: why not use mfakto? I used both mfaktc and mfakto as references while learning. Both seem very capable and solid, and offer a large set of options. Their main drawback is that they are also huge and complex (IMO), which makes modifying them hard. And... it turns out this tiny TF is also 30% faster than mfakto.
I have been unable to compile mfakto; it gives a lot of errors.



Quote:
I'm still finalizing the TF, and thinking about the best way to integrate it with gpuowl. In my personal usage, I plan to TF the 332M exponents to 81 bits, and then PRP them.
Can you try to adapt TF for OpenCL / amdgpu-pro?
Old 2018-08-10, 14:31   #548
preda
 
 
"Mihai Preda"
Apr 2015

2533₈ Posts

Quote:
Originally Posted by SELROC View Post
I am doing a 332M exponent; on my hardware the ETA is 50+ days.
The best GPU I have is a Vega64 air.

Quote:
Personally, I have been unable to use ROCm so far, but it still looks promising.
ROCm 1.8.2 recently started working on Ubuntu 18.04 with kernel 4.15.

Quote:
I have been unable to compile mfakto; it gives a lot of errors.
I submitted a pull request to mfakto on GitHub that fixes those. You may try cloning my copy with the fixes: https://github.com/preda/mfakto

Quote:
Can you try to adapt TF for OpenCL / amdgpu-pro?
Yes, I can try. In a few days, we'll see how much needs fixing.
Old 2018-08-10, 14:41   #549
preda
 
 
"Mihai Preda"
Apr 2015

3×457 Posts

Quote:
Originally Posted by kriesel View Post
Code:
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: d000035.o:(.idata$5+0x0): multiple definition of `__imp___C_specific_handler'; d000029.o:(.idata$5+0x0): first defined here
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o: in function `pre_c_init':
C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:136: undefined
Maybe try adding "-lm" to link in the math library?

BTW, you know that you can end all this pain by switching to Linux, right? (Just joking, sorry, I couldn't help myself.)
Old 2018-08-10, 15:11   #550
SELROC
 

2·3·179 Posts

Quote:
Originally Posted by preda View Post
The best GPU I have is a Vega64 air.

ROCm 1.8.2 recently started working on Ubuntu 18.04 with kernel 4.15.

I submitted a pull request to mfakto on GitHub that fixes those. You may try cloning my copy with the fixes: https://github.com/preda/mfakto

Yes, I can try. In a few days, we'll see how much needs fixing.

I just tried compiling mfakto on Debian:
Code:
g++ sieve.o timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o filelocking.o output.o mfakto.o gpusieve.o perftest.o menu.o kbhit.o -m64  -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -L/home/sel/AMDAPPSDK-3.0/lib/x86_64 -L/opt/rocm/opencl/lib/x86_64 -lOpenCL -o ../mfakto
lto1: fatal error: bytecode stream in file ‘sieve.o’ generated with LTO version 5.2 instead of the expected 7.0
compilation terminated.
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:83: ../mfakto] Error 1

Last fiddled with by SELROC on 2018-08-10 at 15:11

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.