mersenneforum.org  

Old 2017-08-18, 18:23   #2718
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by kriesel View Post
The plot thickens; the -append option is not present in command line tee help on Win7, but is in Win10. I think I'm well off the thread topic now so won't go into it any further.
Still, this is an interesting subject...

Like I said, I don't use Winblows (though several of my clients do). Does the Winblows shell have the "man" command? For example, under Linux you can type "man tee" in a shell and get documentation. The man pages cover everything from userspace commands down to deep system functions for programmers, such as printf() or fork().

One thing to note: at least under Linux (and the non-free Unixes of the past), the options for appending with tee are either "-a" or "--append". Please note the double dash for the latter.
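
For anyone curious what the append option changes, here is a minimal sketch in C of a tee-like copy loop. It is not the real tee source; the function name tee_copy and its append parameter are made up for illustration. The only difference between the two modes is whether the log file is opened with "a" (append) or "w" (truncate).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not the real tee: copies everything from `in`
 * to both `out` and the file at `path`.
 * append != 0 mimics "tee -a" (keep existing file contents),
 * append == 0 mimics plain "tee" (truncate the file first).
 * Returns the number of bytes copied, or -1 on error. */
static long tee_copy(FILE *in, FILE *out, const char *path, int append)
{
    assert(in != NULL && out != NULL && path != NULL);

    FILE *log = fopen(path, append ? "a" : "w");
    if (log == NULL)
        return -1;

    char buf[4096];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* Each chunk goes to both destinations before the next read. */
        if (fwrite(buf, 1, n, out) != n || fwrite(buf, 1, n, log) != n) {
            fclose(log);
            return -1;
        }
        total += (long)n;
    }

    if (fclose(log) != 0)
        return -1;
    return total;
}
```

Calling tee_copy(stdin, stdout, "mfa.txt", 1) from a small main would behave roughly like "tee -a mfa.txt", under the assumptions above.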

I have to say I find it a bit amusing that Winblows is finally catching up with Unix for those who script.
Old 2017-08-18, 18:28   #2719
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by James Heinrich View Post
That will send the output to mfa.txt and then to the screen once mfaktc is finished. Not simultaneously.
That depends on how much data is written to STDOUT (and the associated buffer size), and whether the program calls fflush on the stream.
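
A minimal sketch of that buffering effect in C. The helper names write_status and file_size are made up for illustration, and this is not mfaktc's code. With a fully buffered stream, a short line stays in the stdio buffer until the buffer fills or fflush is called, which is why redirected output can show up late.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the current on-disk size of `path`
 * in bytes, or -1 on error. */
static long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long n = ftell(f);
    fclose(f);
    return n;
}

/* Hypothetical helper: writes one status line to `out`.
 * If flush_now is non-zero, the stdio buffer is pushed to the OS
 * immediately; otherwise the text may sit in the buffer until it
 * fills up or the stream is closed. Returns 0 on success, -1 on error. */
static int write_status(FILE *out, const char *line, int flush_now)
{
    assert(out != NULL && line != NULL);
    if (fputs(line, out) == EOF)
        return -1;
    if (flush_now && fflush(out) == EOF)
        return -1;
    return 0;
}
```

If a stream is opened on a file and switched to full buffering with setvbuf(out, NULL, _IOFBF, 4096), a call like write_status(out, "class 3387 done\n", 0) typically leaves the file empty on disk, and the same call with flush_now set to 1 makes the text visible to a reader such as tee right away.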
Old 2017-09-12, 01:14   #2720
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
mfaktc failing to run on Geforce GTX1070

The Windows 64-bit CUDA 6.5 build of mfaktc v0.21 (Feb-5-2015), or the v0.20 equivalent, produces an error early in the self-test.

mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
Checkpoints enabled
CheckpointDelay 300s
WorkFileAddDelay 600s
Stages enabled
StopAfterFactor bitlevel
PrintMode full
V5UserID Kriesel
ComputerID (none)
AllowSleep yes
TimeStampInResults no

CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 8.0

CUDA device info
name GeForce GTX 1070
compute capability 6.1
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 15
clock rate (CUDA cores) 1708MHz
memory clock rate: 4004MHz
memory bus width: 256 bit

Automatic parameters
threads per grid 983040
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

########## testcase 1/2867 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59 GHz-days)
Using GPU kernel "75bit_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Sep 11 19:36 | 3387 0.1% | 0.001 n.a. | n.a. 82485 n.a.%
ERROR: cudaGetLastError() returned 8: invalid device function

CUDALucas and CUDAPm1 run fine on the same gpu.
Another GPU (a GTX480) runs mfaktc 0.20 just fine.

Ideas?

Last fiddled with by kriesel on 2017-09-12 at 01:17
Old 2017-09-12, 05:52   #2721
MrRepunit
 
Mar 2011
Germany

1011101₂ Posts

Quote:
Originally Posted by kriesel View Post
CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 8.0
You need to compile mfaktc for CUDA 8 by adding
Code:
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.x GPUs will use this code
to the makefile and install the CUDA 8 SDK.
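
To illustrate why a missing architecture can end in "invalid device function", here is a rough sketch in C of the usual CUDA compatibility rule. Machine code built for sm_XY runs only on devices with the same major version and an equal or higher minor version, while embedded PTX for compute_XY can be JIT-compiled for any device of equal or higher capability. The struct and function names are made up for illustration, the rule is simplified, and which architectures the CUDA 6.5 mfaktc binary actually contains is an assumption, not something checked here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one code image embedded in a binary.
 * is_ptx == 0: machine code (nvcc code=sm_XY)
 * is_ptx == 1: PTX intermediate code (nvcc code=compute_XY) */
struct code_image {
    int major;
    int minor;
    int is_ptx;
};

/* Simplified rule: does this single image work on the given device? */
static int image_runs_on(struct code_image img, int dev_major, int dev_minor)
{
    if (img.is_ptx) {
        /* PTX can be JIT-compiled for equal or newer devices. */
        return dev_major > img.major ||
               (dev_major == img.major && dev_minor >= img.minor);
    }
    /* Machine code needs the same major and a device minor >= image minor. */
    return dev_major == img.major && dev_minor >= img.minor;
}

/* A binary is usable if at least one embedded image fits the device. */
static int binary_runs_on(const struct code_image *imgs, size_t count,
                          int dev_major, int dev_minor)
{
    assert(imgs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (image_runs_on(imgs[i], dev_major, dev_minor))
            return 1;
    }
    return 0;
}
```

Under this model, a build that carries only machine code for older architectures and no PTX has nothing that fits a compute capability 6.1 card, while the same build can still fit an older card such as a GTX 480 (compute capability 2.0). Adding the sm_61 line gives the GTX 1070 a matching image, and that target needs the CUDA 8 toolkit because older nvcc versions do not know it.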

Edit: I cannot provide Windows binaries (only Linux), but somebody has probably uploaded one in this thread.

Hope this helps.

Last fiddled with by MrRepunit on 2017-09-12 at 05:55 Reason: Added hint for binary
Old 2017-09-12, 16:21   #2722
storm5510
Random Account
 
Aug 2009

2²×3×163 Posts

Quote:
Originally Posted by kriesel View Post
The Windows 64-bit CUDA 6.5 build of mfaktc v0.21 (Feb-5-2015), or the v0.20 equivalent, produces an error early in the self-test.

mfaktc v0.21 (64bit built)
Is there a special reason you are using CUDA 6.5 instead of 8?
Old 2017-09-12, 18:20   #2723
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010100101011₂ Posts

Quote:
Originally Posted by MrRepunit View Post
You need to compile mfaktc for CUDA 8 by adding
Code:
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.x GPUs will use this code
to the makefile and install the CUDA 8 SDK.

Edit: I cannot provide Windows binaries (only Linux), but somebody has probably uploaded one in this thread.

Hope this helps.
Thanks for the responses.

There are a number of precompiled versions for CUDA 4.2, 6.5, or 8.0, available for Mfaktc at http://www.mersennewiki.org/index.php/Mfaktc#Resources

It's my understanding that a CUDA 8 capable driver can support software built against many earlier CUDA versions, as well as cards of lower compute capability.

If I had the reverse situation, a CUDA 6.5 capable driver and software compiled to require at least a CUDA 8 driver, I would need to upgrade the driver.
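
A small sketch of that driver-versus-runtime rule in C. The CUDA runtime calls cudaDriverGetVersion and cudaRuntimeGetVersion report versions as integers of the form 1000*major + 10*minor (8000 for 8.0, 6050 for 6.5); the two helper functions below are made up for illustration and only model the comparison, without calling CUDA.

```c
#include <assert.h>

/* Hypothetical helper: builds a CUDA-style version code,
 * e.g. (8, 0) -> 8000 and (6, 5) -> 6050. */
static int cuda_version_code(int major, int minor)
{
    assert(major >= 0 && minor >= 0 && minor < 100);
    return 1000 * major + 10 * minor;
}

/* Hypothetical helper: a driver can host a program whose CUDA runtime
 * is the same version or older; a newer runtime needs a driver upgrade.
 * Returns 1 if the combination is expected to load, 0 otherwise. */
static int driver_supports_runtime(int driver_version, int runtime_version)
{
    return driver_version >= runtime_version;
}
```

So driver 8.0 with runtime 6.5 passes this check, and driver 6.5 with runtime 8.0 does not. Passing the check only means the runtime can load; it says nothing about whether the binary contains code the GPU can execute.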

With other software, I have run DLLs and programs as old as V4.0 with CUDA 8 capable drivers on this GPU and other GPUs. Generally, an exact match is not required; backward compatibility is provided over a wide range. For example, a CUDA 5.0 version of CUDAPm1 runs fine on the same GPU and CUDA 8.0 capable driver:
CUDAPm1 v0.20
Warning: Couldn't parse ini file option UnusedMem; using default.
------- DEVICE 0 -------
name GeForce GTX 1070
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 8589934592
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1

CUDA reports 7991M of 8192M GPU memory free.
Index 73
Using threads: norm1 32, mult 32, norm2 32.
Using up to 4360M GPU memory.
Selected B1=1010000, B2=32572500, 5.78% chance of finding a factor
Starting stage 1 P-1, M91001161, B1 = 1010000, B2 = 32572500, fft length = 5120K

Both the 32-bit CUDA 5.5 and 64-bit CUDA 6.0 builds of CUDALucas run on it too. (In fact, I've benchmarked it on all 17 flavors of the May 5 2017 2.06beta.)

CUDALucas v2.06beta 32-bit build, compiled May 5 2017 @ 12:32:36

binary compiled for CUDA 5.50
CUDA runtime version 5.50
CUDA driver version 8.0

------- DEVICE 0 -------
name GeForce GTX 1070
UUID **64-bit only on Windows**
ECC Support? Disabled
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 4294967295
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1
pciDeviceID 0
pciBusID 3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M79341173 fft length = 4608K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Jun 12 21:43:10 | M79341173 50000 0x5670ca9237d7c904 | 4608K 0.05273 5.6778 283.89s | 5:05:03:23 0.06% |
| Jun 12 21:48:08 | M79341173 100000 0xc7deb1ca3091a0ff | 4608K 0.04785 5.9326 296.63s | 5:07:46:55 0.12% |



batch wrapper reports CUDALucas2.06beta-CUDA6.0-Windows-x64 -d 0(re)launch at Sat 09/02/2017 13:12:35.10

CUDALucas v2.06beta 64-bit build, compiled May 5 2017 @ 12:59:32

binary compiled for CUDA 6.0
CUDA runtime version 6.0
CUDA driver version 8.0

------- DEVICE 0 -------
name GeForce GTX 1070
UUID GPU-9b15b648-ccfe-f878-b7cb-2bba3cffd5b1
ECC Support? Disabled
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 8589934592
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1
pciDeviceID 0
pciBusID 3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M75316289 fft length = 4096K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Sep 02 13:20:44 | M75316289 100000 0xef47fad89747c3f4 | 4096K 0.23438 4.8794 487.94s | 4:05:56:54 0.13% |
| Sep 02 13:28:52 | M75316289 200000 0x26966af002b3846b | 4096K 0.21875 4.8795 487.95s | 4:05:48:51 0.26% |
| Sep 02 13:37:00 | M75316289 300000 0x94eeb2ce0af176ef | 4096K 0.21875 4.8800 488.00s | 4:05:40:55 0.39% |
I can and will, though, try a CUDA 8 version of Mfaktc on this setup.

I usually run Mersenne code built for about CUDA 6.5, because on most of my GPUs that is faster most of the time.

Last fiddled with by kriesel on 2017-09-12 at 18:29
Old 2017-09-17, 15:45   #2724
storm5510
Random Account
 
Aug 2009

2²×3×163 Posts

I had the errors below occur over a 30 minute period yesterday evening:

Code:
ERROR: cudaGetLastError() returned 4: unspecified launch failure
ERROR: cudaGetLastError() returned 30: unspecified launch failure
Not knowing the exact source, I restarted the machine and then updated the drivers. This 'appears' to have solved the problem. I have been running mfaktc for ten months and this is the first issue to arise. The hardware is a GTX-480 with Windows 10 Pro, x64.

Does anyone have any ideas regarding the cause?
Old 2017-09-18, 13:12   #2725
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

152B₁₆ Posts

Quote:
Originally Posted by storm5510 View Post
I had the errors below occur over a 30 minute period yesterday evening:

Code:
ERROR: cudaGetLastError() returned 4: unspecified launch failure
ERROR: cudaGetLastError() returned 30: unspecified launch failure
Not knowing the exact source, I restarted the machine and then updated the drivers. This 'appears' to have solved the problem. I have been running mfaktc for ten months and this is the first issue to arise. The hardware is a GTX-480 with Windows 10 Pro, x64.

Does anyone have any ideas regarding the cause?
What version were you running when these occurred?
Old 2017-09-18, 16:03   #2726
storm5510
Random Account
 
Aug 2009

2²×3×163 Posts

Quote:
Originally Posted by kriesel View Post
What version were you running when these occurred?
v0.21 running an exponent in the 149M range.
Old 2017-09-18, 19:13   #2727
monst
 
Mar 2007

263₈ Posts
Request for new binaries

Are there Windows binaries of mfaktc available for CUDA 9.0?

Disregard this request. I had a hiccup while upgrading an NVIDIA driver and it wiped out the CUDA runtime. A clean install of the driver got me back to CUDA 8.0.

Last fiddled with by monst on 2017-09-18 at 20:08 Reason: there was an installation issue
Old 2017-09-19, 05:08   #2728
storm5510
Random Account
 
Aug 2009

2²·3·163 Posts

Quote:
Originally Posted by monst View Post
Are there Windows binaries of mfaktc available for CUDA 9.0?

Disregard this request. I had a hiccup while upgrading an NVIDIA driver and it wiped out the CUDA runtime. A clean install of the driver got me back to CUDA 8.0.
This is a snip from a startup.
[Attached image: Capture.JPG, 18.6 KB]
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.