mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

chalsall 2017-08-18 18:23

[QUOTE=kriesel;465817]The plot thickens; the -append option is not present in command line tee help on Win7, but is in Win10. I think I'm well off the thread topic now so won't go into it any further.[/QUOTE]

Still, this is an interesting subject...

Like I said, I don't use Winblows (several of my clients do). Does the Winblows shell have the "man" command? For example, under Linux in a shell you can type "man tee" and get [URL="http://man7.org/linux/man-pages/man1/tee.1.html"]documentation[/URL]. The man pages cover everything from userspace commands down to deep system functions for programmers: [URL="http://man7.org/linux/man-pages/man3/printf.3.html"]printf()[/URL] or [URL="http://man7.org/linux/man-pages/man2/fork.2.html"]fork()[/URL], for example.

One thing to note: at least under Linux (and the non-free Unixes of the past) the options for appending with tee are either "-a" or "--append". Please note the double dash for the latter.

I have to say I find it a bit amusing that Winblows is finally catching up with Unix for those who script.

chalsall 2017-08-18 18:28

[QUOTE=James Heinrich;465876]That will send the output to mfa.txt and then to the screen once mfaktc is finished. Not simultaneously.[/QUOTE]

That depends on how much data is written to STDOUT (and the associated buffer size), and if the program uses [URL="http://man7.org/linux/man-pages/man3/fflush.3.html"]fflush[/URL] on the stream. :wink:
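
To illustrate the buffering point, here is a minimal Python sketch (not mfaktc code; the names `report_progress` and `capture_progress` are made up for this example). It flushes the stream after every line, which is what allows a downstream tee to show each line as soon as it is written instead of when a buffer fills or the program exits.

```python
import io
import sys


def report_progress(steps, stream=sys.stdout):
    """Write one line per step, flushing after each so a pipe reader sees it promptly."""
    for i in range(1, steps + 1):
        stream.write(f"step {i}/{steps}\n")
        # Without this flush, output sent to a pipe or file is typically
        # block-buffered and may only appear in large chunks.
        stream.flush()
    return steps


def capture_progress(steps):
    """Hypothetical helper: run report_progress against an in-memory stream."""
    buf = io.StringIO()
    report_progress(steps, stream=buf)
    return buf.getvalue()


if __name__ == "__main__":
    report_progress(3)
```

If this were saved under a hypothetical name such as progress.py, piping it through "tee mfa.txt" should show the lines on screen as they are produced. Whether a real program behaves the same way depends on whether it flushes its own output, which is the point made above.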

kriesel 2017-09-12 01:14

mfaktc failing to run on Geforce GTX1070
 
The Windows 64-bit CUDA 6.5 v0.21 (Feb-5-2015) version, or the v0.20 equivalent, produces an error early in the self-test.

mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
Checkpoints enabled
CheckpointDelay 300s
WorkFileAddDelay 600s
Stages enabled
StopAfterFactor bitlevel
PrintMode full
V5UserID Kriesel
ComputerID (none)
AllowSleep yes
TimeStampInResults no

CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 8.0

CUDA device info
name GeForce GTX 1070
compute capability 6.1
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 15
clock rate (CUDA cores) 1708MHz
memory clock rate: 4004MHz
memory bus width: 256 bit

Automatic parameters
threads per grid 983040
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

########## testcase 1/2867 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59 GHz-days)
Using GPU kernel "75bit_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Sep 11 19:36 | 3387 0.1% | 0.001 n.a. | n.a. 82485 n.a.%
ERROR: cudaGetLastError() returned 8: invalid device function

CUDALucas and CUDAPm1 run fine on the same gpu.
Another GPU (a GTX480) runs mfaktc 0.20 just fine.

Ideas?

MrRepunit 2017-09-12 05:52

[QUOTE=kriesel;467596]
CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 8.0
[/QUOTE]

You need to compile mfaktc for CUDA 8 by adding
[CODE]NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.x GPUs will use this code
[/CODE]to the makefile and install the CUDA 8 SDK.
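
As a rough illustration of why the "invalid device function" error appears, here is a small Python sketch of the matching rule as I understand NVIDIA's compatibility model: a GPU can run machine code (SASS) built for its own major compute capability with an equal or lower minor revision, or it can JIT-compile embedded PTX built for an equal or lower compute capability. The function names and the `OLD_BUILD_SASS` list are assumptions made up for this example; the list is not the actual architecture list of any mfaktc binary.

```python
def parse_cc(text):
    """Turn a compute capability string such as '6.1' into the tuple (6, 1)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def can_run(device_cc, sass_targets, ptx_targets=()):
    """Return True if a binary with the given SASS/PTX targets can run on the device."""
    device = parse_cc(device_cc)
    # SASS is only compatible within the same major revision,
    # from a lower (or equal) minor revision to a higher one.
    for target in sass_targets:
        major, minor = parse_cc(target)
        if major == device[0] and minor <= device[1]:
            return True
    # PTX can be JIT-compiled for any device of equal or higher capability.
    for target in ptx_targets:
        if parse_cc(target) <= device:
            return True
    return False


# Hypothetical architecture list for an older build (illustration only).
OLD_BUILD_SASS = ["2.0", "3.0", "3.5", "5.0"]

if __name__ == "__main__":
    print(can_run("6.1", OLD_BUILD_SASS))            # GTX 1070, old build
    print(can_run("6.1", OLD_BUILD_SASS + ["6.1"]))  # same build plus sm_61
```

Under these assumptions the old build has nothing a CC 6.1 card can execute, while adding an sm_61 target (what the makefile line above does) fixes that. A build that also embedded PTX could have been JIT-compiled instead.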

Edit: I cannot provide Windows binaries (only Linux), but somebody has probably uploaded one within this thread.

Hope this helps.

storm5510 2017-09-12 16:21

[QUOTE=kriesel;467596]The Windows 64-bit [B]CUDA 6.5[/B] v0.21 (Feb-5-2015) version, or the v0.20 equivalent, produces an error early in the self-test.

mfaktc v0.21 (64bit built)

[/QUOTE]

Is there a special reason you are using CUDA 6.5 instead of 8?

kriesel 2017-09-12 18:20

[QUOTE=MrRepunit;467605]You need to compile mfaktc for CUDA 8 by adding
[CODE]NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.x GPUs will use this code
[/CODE]to the makefile and install the CUDA 8 SDK.

Edit: I cannot provide Windows binaries (only Linux), but somebody has probably uploaded one within this thread.

Hope this helps.[/QUOTE]

Thanks for the responses.

There are a number of precompiled versions for CUDA 4.2, 6.5, or 8.0, available for Mfaktc at [URL]http://www.mersennewiki.org/index.php/Mfaktc#Resources[/URL]

It's my understanding that a CUDA 8 capable driver can support software built against many earlier CUDA versions, as well as cards of lower compute capability.

If I had the reverse situation, a CUDA 6.5 capable driver, and software compiled to require at least CUDA 8 driver, I would need to upgrade the driver.
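
The driver side of this can be written down as a one-line rule. The Python sketch below is only an illustration of that rule (the function names are made up, and it is not how any of these programs actually check versions): a program built against a given CUDA runtime needs a driver that supports at least that CUDA version.

```python
def parse_version(text):
    """Turn a CUDA version string such as '8.0' into the tuple (8, 0)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def driver_supports_runtime(driver_version, runtime_version):
    """True if the driver's supported CUDA version is at least the runtime's version."""
    return parse_version(driver_version) >= parse_version(runtime_version)


if __name__ == "__main__":
    print(driver_supports_runtime("8.0", "6.5"))  # newer driver, older runtime
    print(driver_supports_runtime("6.5", "8.0"))  # the reverse situation
```

This only covers the driver/runtime pairing. Passing this check says nothing about whether the binary contains code for the card's compute capability, which is a separate requirement. Note that the minor number is compared as an integer, so version strings should be written in one consistent style (for example "6.5" rather than a mix of "6.5" and "6.50").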

With other software, I have run DLLs and programs as old as V4.0 with CUDA 8 capable drivers on this GPU and other GPUs. Generally, an exact match is not a requirement; backward compatibility over a wide range is provided. For example, a CUDA 5.0 version of CUDAPm1 runs fine on the same GPU with a CUDA 8.0 capable driver:
CUDAPm1 v0.20
Warning: Couldn't parse ini file option UnusedMem; using default.
------- DEVICE 0 -------
name GeForce GTX 1070
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 8589934592
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1

CUDA reports 7991M of 8192M GPU memory free.
Index 73
Using threads: norm1 32, mult 32, norm2 32.
Using up to 4360M GPU memory.
Selected B1=1010000, B2=32572500, 5.78% chance of finding a factor
Starting stage 1 P-1, M91001161, B1 = 1010000, B2 = 32572500, fft length = 5120K

CUDALucas, both the 32-bit CUDA 5.5 and the 64-bit CUDA 6.0 builds, runs on it too. (In fact I've benchmarked it on all 17 flavors of the May 5 2017 2.06beta.)

CUDALucas v2.06beta 32-bit build, compiled May 5 2017 @ 12:32:36

binary compiled for CUDA 5.50
CUDA runtime version 5.50
CUDA driver version 8.0

------- DEVICE 0 -------
name GeForce GTX 1070
UUID **64-bit only on Windows**
ECC Support? Disabled
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 4294967295
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1
pciDeviceID 0
pciBusID 3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M79341173 fft length = 4608K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Jun 12 21:43:10 | M79341173 50000 0x5670ca9237d7c904 | 4608K 0.05273 5.6778 283.89s | 5:05:03:23 0.06% |
| Jun 12 21:48:08 | M79341173 100000 0xc7deb1ca3091a0ff | 4608K 0.04785 5.9326 296.63s | 5:07:46:55 0.12% |



batch wrapper reports CUDALucas2.06beta-CUDA6.0-Windows-x64 -d 0(re)launch at Sat 09/02/2017 13:12:35.10

CUDALucas v2.06beta 64-bit build, compiled May 5 2017 @ 12:59:32

binary compiled for CUDA 6.0
CUDA runtime version 6.0
CUDA driver version 8.0

------- DEVICE 0 -------
name GeForce GTX 1070
UUID GPU-9b15b648-ccfe-f878-b7cb-2bba3cffd5b1
ECC Support? Disabled
Compatibility 6.1
clockRate (MHz) 1708
memClockRate (MHz) 4004
totalGlobalMem 8589934592
totalConstMem 65536
l2CacheSize 2097152
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1
pciDeviceID 0
pciBusID 3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M75316289 fft length = 4096K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Sep 02 13:20:44 | M75316289 100000 0xef47fad89747c3f4 | 4096K 0.23438 4.8794 487.94s | 4:05:56:54 0.13% |
| Sep 02 13:28:52 | M75316289 200000 0x26966af002b3846b | 4096K 0.21875 4.8795 487.95s | 4:05:48:51 0.26% |
| Sep 02 13:37:00 | M75316289 300000 0x94eeb2ce0af176ef | 4096K 0.21875 4.8800 488.00s | 4:05:40:55 0.39% |

I can and will, though, try a CUDA 8 version of mfaktc on this setup.

Usually I run Mersenne code built for about CUDA 6.5, because on most of my GPUs that is faster most of the time.

storm5510 2017-09-17 15:45

I had the errors below occur over a 30 minute period yesterday evening:

[CODE]ERROR: cudaGetLastError() returned 4: unspecified launch failure
ERROR: cudaGetLastError() returned 30: unspecified launch failure[/CODE]

Not knowing the exact source, I restarted the machine and then updated the drivers. This 'appears' to have solved the problem. I have been running [I]mfaktc[/I] for ten months and this is the first issue to arise. The hardware is a GTX-480 with Windows 10 Pro, x64.

Does anyone have any ideas regarding the cause?

kriesel 2017-09-18 13:12

[QUOTE=storm5510;467953]I had the errors below occur over a 30 minute period yesterday evening:

[CODE]ERROR: cudaGetLastError() returned 4: unspecified launch failure
ERROR: cudaGetLastError() returned 30: unspecified launch failure[/CODE]Not knowing the exact source, I restarted the machine and then updated the drivers. This 'appears' to have solved the problem. I have been running [I]mfaktc[/I] for ten months and this is the first issue to arise. The hardware is a GTX-480 with Windows 10 Pro, x64.

Does anyone have any ideas regarding the cause?[/QUOTE]

What version were you running when these occurred?

storm5510 2017-09-18 16:03

[QUOTE=kriesel;468029]What version were you running when these occurred?[/QUOTE]

v0.21 running an exponent in the 149M range.

monst 2017-09-18 19:13

Request for new binaries
 
[STRIKE]Are there Windows binaries of mfaktc available for CUDA 9.0?[/STRIKE]

Disregard this request. I had a hiccup while upgrading an NVIDIA driver and it wiped out the CUDA runtime. A clean install of the driver got me back to CUDA 8.0.

storm5510 2017-09-19 05:08

1 Attachment(s)
[QUOTE=monst;468048][STRIKE]Are there Windows binaries of mfaktc available for CUDA 9.0?[/STRIKE]

Disregard this request. I had a hiccup while upgrading an NVIDIA driver and it wiped out the CUDA runtime. A clean install of the driver got me back to CUDA 8.0.[/QUOTE]

This is a snip from a startup.

