mersenneforum.org  

Old 2018-04-04, 18:40   #2663
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3²×5×7² Posts

Quote:
Originally Posted by Lexicographer View Post
Hello!

I'm not sure this is the correct place to ask, but I'm running into a problem while trying to compile the latest CUDALucas under Linux.

The problem is:

It's the same if I try different versions of compute/sm.

I have CUDA Toolkit 9.1 installed.

Any suggestions, please?
Won't the existing Linux executables do for your purposes? (The existing Linux executables, from 2015, were built for CUDA versions 4.2, 5.0, 5.5, 6.0, and 6.5; all are 64-bit.)

Have you looked at the NVIDIA developer support or download sites for nvml.h, and for the Linux equivalent of the nvml.dll required on Windows? The note from flashjh at https://sourceforge.net/p/cudalucas/code/102/ says: "Updates for GPU UUID from NVML. nvml.dll is required and included with the .7z file. Have not compiled or tested linux yet. Also included code to exclude 1.x GPUs for CUDA >7.0 and future code to exclude 2.x when CUDA does not support."

https://download.mersenne.ca/ appears not to have any linux CUDALucas executables.

https://sourceforge.net/projects/cudalucas/files/ has a tgz file of precompiled V2.05.1 for linux, which is from before the added code requiring nvml.

Or see https://stackoverflow.com/questions/...r-file-missing from some years back, about the effects of a missing nvml.h.

Good luck!
Attached image: cudalucas nvml.dll details.png (27.3 KB)

Last fiddled with by kriesel on 2018-04-04 at 18:46
Old 2018-04-05, 17:57   #2664
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4410₁₀ Posts

Quote:
Originally Posted by LaurV View Post
The other way, displaying the ETA as the number of remaining iterations multiplied by the iteration time, will give you an immediate result when you move it to a faster toy, but the ETA will be very, VERY jumpy, because iteration time varies a lot with how busy your computer is. Some of us use our computers for other activities too. So it is not "reliable". Some kind of averaging with past values (either SMA or EMA) needs to be done to avoid the jumpy ETA, and you will still see "no effect" when you move it until the MA (moving average) main period passes. Of course, it would be nice to have an option in the ini file, for example, to choose an averaging period: something like 255 would be the current method (just an example), and something like 0 would be "no averaging" (jumpy). But I feel we request too much already.
I appreciate your input. I've read enough of the thread posts to know you've been around here a while.

It appears to me that different apps handle the jumpy-versus-averaged ETA differently. Prime95 appears to take both forks: the worker window ETA fluctuates with the current time per iteration and with user behavior, while Status appears to use long-term averaging over the history of the exponent's run. In my experience this often results in actual progress beating the Status prediction to completion by days or sometimes weeks. I find the more responsive indication more useful. Others may not.
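
To make the two ETA styles concrete, here is a minimal Python sketch, not code from Prime95 or CUDALucas. The class and function names are hypothetical, and the `period` parameter only imitates the ini-file option suggested in the quoted post (0 for no averaging, larger values for more smoothing); the exponential moving average is one possible choice of averaging.

```python
def eta_seconds(remaining_iters, ms_per_iter):
    """Instantaneous ETA: remaining iterations times an iteration time in ms."""
    return remaining_iters * ms_per_iter / 1000.0


class SmoothedEta:
    """ETA based on an exponential moving average (EMA) of iteration time.

    Hypothetical helper: period == 0 means no averaging (the jumpy ETA),
    larger periods react more slowly to changes in iteration time.
    """

    def __init__(self, period):
        if period < 0:
            raise ValueError("period must be >= 0")
        # Usual EMA weight 2/(period+1); period 0 degenerates to alpha = 1.
        self.alpha = 2.0 / (period + 1) if period > 0 else 1.0
        self.avg_ms = None

    def update(self, ms_per_iter, remaining_iters):
        """Feed the latest iteration time and return the ETA in seconds."""
        if self.avg_ms is None:
            self.avg_ms = ms_per_iter
        else:
            self.avg_ms += self.alpha * (ms_per_iter - self.avg_ms)
        return eta_seconds(remaining_iters, self.avg_ms)
```

With `SmoothedEta(0)` a doubling of the iteration time doubles the ETA immediately, while `SmoothedEta(255)` barely moves on a single slow sample, which is the trade-off between responsiveness and stability described in the posts.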

As far as requesting too much, I prefer to think of the lists I've made as representing abundant opportunity. Any would-be code author or maintainer can choose how to spend his time, what his personal priorities are, and what to tackle. Any progress that's made and shared benefits all the users of a software title and the project.
Old 2018-04-06, 06:00   #2665
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Right, the subsequent posts about "generic he" have been moved to a more appropriate location.
Old 2018-04-06, 06:07   #2666
SELROC
 

2²×2,081 Posts

Quote:
Originally Posted by kriesel View Post
I appreciate your input. I've read enough of the thread posts to know you've been around here a while.

It appears to me that different apps handle the jumpy-versus-averaged ETA differently. Prime95 appears to take both forks: the worker window ETA fluctuates with the current time per iteration and with user behavior, while Status appears to use long-term averaging over the history of the exponent's run. In my experience this often results in actual progress beating the Status prediction to completion by days or sometimes weeks. I find the more responsive indication more useful. Others may not.

As far as requesting too much, I prefer to think of the lists I've made as representing abundant opportunity. Any would-be code author or maintainer can choose how to spend his time, what his personal priorities are, and what to tackle. Any progress that's made and shared benefits all the users of a software title and the project.

With mprime I have noticed that even on a dedicated system with no user input, the ETA fluctuates with system load, evidently because system daemons periodically occupy the CPU. In theory, even with maximum priority set for mprime, the ETA should still fluctuate.
Old 2018-04-07, 03:21   #2667
Lexicographer
 
Mar 2018
Shenzhen, China

2×3² Posts
Finally compiled on Kubuntu

Quote:
Originally Posted by kriesel View Post
V2.05.1 for linux, which is from before the added code requiring nvml.
Thanks for your help!

I downloaded the CUDALucas.cu file from version 2.05.1 (r94), and the source finally compiled. I have already confirmed that the executable works well.

As for the precompiled executables, they just won't run on my machine:

Quote:
$ ./CUDALucas-2.05.1-CUDA6.5-linux-x86_64
bash: ./CUDALucas-2.05.1-CUDA6.5-linux-x86_64: No such file or directory
Old 2018-04-07, 18:13   #2668
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3²×5×7² Posts

Quote:
Originally Posted by Lexicographer View Post
Thanks for your help!

I downloaded the CUDALucas.cu file from version 2.05.1 (r94), and the source finally compiled. I have already confirmed that the executable works well.

As for the precompiled executables, they just won't run on my machine:
You're welcome.

Reviewing my offline notes, I see there was a stretch of posts 2642-2648 about the same error message you encountered.

Note that success on one fft length, or one run, while a good sign, does not ensure success on another. Running the available residue checks on the fft lengths you're likely to use is recommended. (See the -r option.) Running a separate gpu memory test program, or thorough coverage with CUDALucas's -memtest option, is also a useful precaution.

There are certain known-bad residues that sometimes appear, indicating a problem.
Continuing an exponent run after any of them appears is a waste of processing time.
Restarting from an interim save file written before they appeared is recommended.
For CUDALucas, these known problem residues are:
0x0000000000000000, 0x0000000000000002, 0xfffffffffffffffd
0x0 is OK as a final residue if it does not occur before then; appearing only as the final residue, it is the sign of a rare prime result. Appearing earlier, it is a sign of something gone wrong, perhaps a failed host-to-GPU or GPU-to-host memory copy.

Residues that repeat from one iteration to the next would also be a symptom of a problem.
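
The residue checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CUDALucas's own code: the function name is hypothetical, the input is assumed to be the interim 64-bit residues in the order they were reported (the final residue excluded, since 0x0 is legitimate there), and a repeat is taken to mean two consecutive reported residues being equal.

```python
# The three problem residues listed in the post above.
KNOWN_BAD_RESIDUES = {
    0x0000000000000000,
    0x0000000000000002,
    0xFFFFFFFFFFFFFFFD,
}


def check_interim_residues(residues):
    """Return (index, reason) for the first suspicious interim residue, or None.

    Hypothetical helper: 'residues' holds interim residues only, in order.
    """
    previous = None
    for i, r in enumerate(residues):
        if r in KNOWN_BAD_RESIDUES:
            return i, "known-bad residue 0x%016x" % r
        if previous is not None and r == previous:
            return i, "residue repeats the previous one"
        previous = r
    return None
```

The returned index marks the first bad checkpoint, so a restart would use a save file written before it.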

As I recall, the April 18, 2017 2.06beta did not include the nvml requirement, but did include at least some checks for some of those known-bad residues. I suggest you try compiling that version for a better chance of valid results.

Please consider making your compiled executable(s) available to James Heinrich for posting on his software mirror. http://download.mersenne.ca/CUDALucas/

For performance tuning, if you haven't yet, you may want to run -fftbench and -threadbench for the range of fft lengths you're likely to use. I would use at least 3 passes on a fast card like yours, and expand the range outward to the powers of two that enclose the intended usage. For example, if intending to use 4608k-7168k, benchmark fft and threads over 4096k-8192k.
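
The range expansion in that example is simple arithmetic; a small Python sketch of it follows. The function name is hypothetical, and it only computes the bounds, it does not invoke CUDALucas.

```python
def enclosing_power_of_two_range(lo_k, hi_k):
    """Expand [lo_k, hi_k] (FFT lengths in K) outward to powers of two."""
    if lo_k < 1 or hi_k < lo_k:
        raise ValueError("need 1 <= lo_k <= hi_k")
    low = 1 << (lo_k.bit_length() - 1)    # largest power of two <= lo_k
    high = 1 << (hi_k - 1).bit_length()   # smallest power of two >= hi_k
    return low, high


print(enclosing_power_of_two_range(4608, 7168))  # (4096, 8192)
```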

You may find skimming through the bug and wish list useful; last posted http://www.mersenneforum.org/showpos...postcount=2658
Please note any other bugs you might find, or features that would be useful, and I'll add them to the list.

Also a run-time scaling guide is available at post 2659.

Last fiddled with by kriesel on 2018-04-07 at 18:32
Old 2018-04-08, 15:34   #2669
chris2be8
 
 
Sep 2009

2·3³·5·7 Posts

Quote:
Originally Posted by Lexicographer View Post
As for the precompiled executables, they just won't run on my machine:
Quote:
$ ./CUDALucas-2.05.1-CUDA6.5-linux-x86_64
bash: ./CUDALucas-2.05.1-CUDA6.5-linux-x86_64: No such file or directory
Please post output from:
Code:
ls -l CUDALucas-2.05.1-CUDA6.5-linux-x86_64
file CUDALucas-2.05.1-CUDA6.5-linux-x86_64
ldd CUDALucas-2.05.1-CUDA6.5-linux-x86_64
uname -a
That might give a clue as to what's wrong.

Chris
Old 2018-04-08, 18:52   #2670
janfrode
 
Apr 2018

1 Posts
Built latest CUDALucas from source..

Just a tiny FYI on what was needed to build the latest CUDALucas from source on RHEL7.4 with CUDA 9.1.

I had to add "#include <nvml.h>" to CUDALucas.cu, and it also needed to be linked with /usr/local/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so, so the final build command was:

gcc CUDALucas.o parse.o -O1 -Wall -fPIC -L/usr/local/cuda/lib64 -L/usr/local/cuda/targets/x86_64-linux/lib/stubs/ -lcufft -lcudart -lnvidia-ml -lm -o CUDALucas

and for ppc64le it was approximately the same command:

gcc CUDALucas.o parse.o -O1 -Wall -fPIC -L/usr/local/cuda/lib64 -L/usr/local/cuda/targets/x86_64-linux/lib/stubs/ -lcufft -lcudart -lnvidia-ml -lm -o CUDALucas

oh.. and in the Makefile I also specified "--generate-code arch=compute_50,code=sm_50", which matches the compute capability of my 940MX, and compute_60,sm_60 for the Tesla P100.

On ppc64le I am seeing some strange behavior: it crashes on -threadbench, or if I don't specify "-threads". But it seems to be working fine for LL.
Old 2018-04-09, 03:51   #2671
Lexicographer
 
Mar 2018
Shenzhen, China

2·3² Posts
libcufft and libcudart are not found for CUDALucas

Quote:
Originally Posted by chris2be8 View Post
Please post output from:
Code:
ls -l CUDALucas-2.05.1-CUDA6.5-linux-x86_64
file CUDALucas-2.05.1-CUDA6.5-linux-x86_64
ldd CUDALucas-2.05.1-CUDA6.5-linux-x86_64
uname -a
Results:

Quote:
$ ls -l CUDALucas-2.05.1-CUDA6.5-linux-x86_64
-rwxr-xr-x 1 andriy andriy 753136 2月 12 2015 CUDALucas-2.05.1-CUDA6.5-linux-x86_64

$ file CUDALucas-2.05.1-CUDA6.5-linux-x86_64
CUDALucas-2.05.1-CUDA6.5-linux-x86_64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=a8b4728865a4f5a480dd218c33fd85728a4914c3, not stripped

$ ldd CUDALucas-2.05.1-CUDA6.5-linux-x86_64
linux-vdso.so.1 => (0x00007ffce48d6000)
/lib/$LIB/liblsp.so => /lib/lib/x86_64-linux-gnu/liblsp.so (0x00007f1f9f438000)
libcufft.so.6.5 => not found
libcudart.so.6.5 => not found
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1f9f0e2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1f9ed02000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1f9eafe000)
/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f1f9f63d000)
$ uname -a
Linux DeepDragon 4.13.0-38-generic #43-Ubuntu SMP Wed Mar 14 15:20:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
If I run `locate libcufft`, some of the results are:
Quote:
/usr/lib/x86_64-linux-gnu/libcufft.so.8.0
/usr/local/cuda-9.1/lib64/libcufft.so
/usr/local/cuda-9.1/lib64/libcufft.so.9.1
/usr/local/cuda-9.1/lib64/libcufft.so.9.1.85
And it's similar for libcudart:
Quote:
/usr/lib/x86_64-linux-gnu/libcudart.so.8.0
/usr/lib/x86_64-linux-gnu/libcudart.so.8.0.61
/usr/local/cuda-9.1/lib64/libcudart.so
/usr/local/cuda-9.1/lib64/libcudart.so.9.1
/usr/local/cuda-9.1/lib64/libcudart.so.9.1.85
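
The ldd output above shows the precompiled binary asking for the 6.5 versions of libcufft and libcudart, while locate finds only the 8.0 and 9.1 versions. A small Python sketch for pulling the unresolved libraries out of ldd output is below; the function name is hypothetical and the sample text is an excerpt of the output quoted in this post.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


# Excerpt of the ldd output quoted above.
sample = """\
linux-vdso.so.1 => (0x00007ffce48d6000)
libcufft.so.6.5 => not found
libcudart.so.6.5 => not found
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1f9f0e2000)
"""

print(missing_libraries(sample))  # ['libcufft.so.6.5', 'libcudart.so.6.5']
```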
Old 2018-04-09, 04:45   #2672
Lexicographer
 
Mar 2018
Shenzhen, China

2×3² Posts
Also compiled for 1080 Ti with CUDA 9.1

Quote:
Originally Posted by janfrode View Post
I had to add "#include <nvml.h>" to CUDALucas.cu and also it needed to be linked with libnvidia-ml.so
Thanks for your help! I tried that and it worked with the latest code (2.06beta, r102).

Since the NVIDIA Management Library (nvidia-ml, nvml) was in a different location on my machine, my build commands were:
Quote:
/usr/local/cuda/bin/nvcc -O1 --generate-code arch=compute_61,code=compute_61 --compiler-options=-Wall -I/usr/local/cuda/include -c CUDALucas.cu
gcc CUDALucas.o parse.o -O1 -Wall -fPIC -L/usr/local/cuda/lib64 -L/usr/lib/nvidia-390/ -lcufft -lcudart -lm -lnvidia-ml -o CUDALucas
So I linked against /usr/lib/nvidia-390/ instead of /usr/local/cuda/targets/x86_64-linux/lib/stubs/, and used compute_61 for my GTX 1080 Ti.

Last fiddled with by Lexicographer on 2018-04-09 at 04:55
Old 2018-05-27, 16:58   #2673
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·3²·5·7² Posts
Improved recovery from Windows TDRs on old gpus

A recovery method tested over the last few days with CUDAPm1 on a GTX480 may also apply here. See the detailed writeup at http://www.mersenneforum.org/showpos...8&postcount=37
All times are UTC. The time now is 16:59.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.