#2718
If I May
"Chris Halsall"
Sep 2002
Barbados
9,767 Posts
Quote:
Like I said, I don't use Winblows (several of my clients do). Does the Winblows shell have the "man" command? For example, under Linux in a shell you can type "man tee" and get documentation. It covers everything from userspace commands down to deep system functions for programmers, such as printf() or fork(). One thing to note: at least under Linux (and the non-free Unixes of the past), the options for appending with tee are either "-a" or "--append". Please note the double dashes for the latter. I have to say I find it a bit amusing that Winblows is finally catching up with Unix for those who script.
#2719
If I May
"Chris Halsall"
Sep 2002
Barbados
9,767 Posts
Quote:
#2720
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts
The Windows 64-bit CUDA 6.5 V0.21 (Feb-5-2015) version, or the V0.20 equivalent, produces an error early in the self-test.

mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           300s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  Kriesel
  ComputerID                (none)
  AllowSleep                yes
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       8.0

CUDA device info
  name                      GeForce GTX 1070
  compute capability        6.1
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 15
  clock rate (CUDA cores)   1708MHz
  memory clock rate:        4004MHz
  memory bus width:         256 bit

Automatic parameters
  threads per grid          983040
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

########## testcase 1/2867 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59 GHz-days)
Using GPU kernel "75bit_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Sep 11 19:36 | 3387   0.1% |  0.001    n.a. |      n.a.    82485    n.a.%
ERROR: cudaGetLastError() returned 8: invalid device function

CUDALucas and CUDAPm1 run fine on the same GPU. Another GPU (a GTX480) runs mfaktc 0.20 just fine. Ideas?
Last fiddled with by kriesel on 2017-09-12 at 01:17
#2721
Mar 2011
Germany
1011101₂ Posts
Quote:
Code:
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.x GPUs will use this code

Edit: I cannot provide Windows binaries (only Linux), but somebody has probably uploaded one within this thread. Hope this helps.
Last fiddled with by MrRepunit on 2017-09-12 at 05:55 Reason: Added hint for binary
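As a rough illustration of why a missing architecture target can end in "invalid device function", the sketch below models the general CUDA compatibility rules: machine code built for sm_XY runs only on GPUs of the same major compute capability with an equal or higher minor revision, while embedded PTX for compute_XY can be JIT-compiled on any GPU of capability XY or higher. The function name and the lists of embedded targets are hypothetical; what the posted Windows binaries actually embed is an assumption, not something stated in this thread.

```python
def can_run(device_cc, sass_targets, ptx_targets):
    """Return True if a binary with the given embedded code can run on the device.

    Compute capabilities are (major, minor) tuples, e.g. (6, 1) for a GTX 1070.
    """
    dev_major, dev_minor = device_cc
    # SASS (code=sm_XY) is binary compatible only within the same major
    # revision, and only with an equal or newer minor revision.
    for major, minor in sass_targets:
        if major == dev_major and minor <= dev_minor:
            return True
    # PTX (code=compute_XY) can be JIT-compiled by the driver for any
    # device whose compute capability is at least XY.
    for target in ptx_targets:
        if target <= device_cc:
            return True
    return False


GTX_1070 = (6, 1)

# Hypothetical older build: machine code for CC 2.0 to 5.0 only, no PTX.
old_build_sass = [(2, 0), (3, 0), (3, 5), (5, 0)]
print(can_run(GTX_1070, old_build_sass, []))             # False

# Same build with the extra sm_61 target from the NVCCFLAGS line above.
print(can_run(GTX_1070, old_build_sass + [(6, 1)], []))  # True

# Alternatively, embedded PTX for compute_50 would allow JIT compilation.
print(can_run(GTX_1070, old_build_sass, [(5, 0)]))       # True
```

Under the same assumed target list, a CC 2.0 card such as the GTX480 would be covered by the sm_20 code, which would be consistent with that card running the older binary without trouble.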
#2722
Random Account
Aug 2009
2²×3×163 Posts
#2723
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101011₂ Posts
Quote:
There are a number of precompiled versions of Mfaktc, for CUDA 4.2, 6.5, or 8.0, available at http://www.mersennewiki.org/index.php/Mfaktc#Resources

It's my understanding that a CUDA 8 capable driver is able to support many earlier versions of software and cards of lower compute capability. If I had the reverse situation, a CUDA 6.5 capable driver and software compiled to require at least a CUDA 8 driver, I would need to upgrade the driver. With other software, I have run DLLs and software as old as V4.0 with CUDA 8 capable drivers on this GPU and other GPUs. Generally, an exact match is not required; backward compatibility is provided over a wide range.

For example, a CUDA 5.0 version of CUDAPm1 runs fine on the same GPU and CUDA 8.0 capable driver:

CUDAPm1 v0.20
Warning: Couldn't parse ini file option UnusedMem; using default.
------- DEVICE 0 -------
name                GeForce GTX 1070
Compatibility       6.1
clockRate (MHz)     1708
memClockRate (MHz)  4004
totalGlobalMem      8589934592
totalConstMem       65536
l2CacheSize         2097152
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 15
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1

CUDA reports 7991M of 8192M GPU memory free.
Index 73
Using threads: norm1 32, mult 32, norm2 32.
Using up to 4360M GPU memory.
Selected B1=1010000, B2=32572500, 5.78% chance of finding a factor
Starting stage 1 P-1, M91001161, B1 = 1010000, B2 = 32572500, fft length = 5120K

CUDALucas, both 32-bit CUDA 5.5 and 64-bit CUDA 6.0, runs on it too. (In fact I've benchmarked it on all 17 flavors of the May 5 2017 2.06beta.)

CUDALucas v2.06beta 32-bit build, compiled May 5 2017 @ 12:32:36

binary compiled for CUDA  5.50
CUDA runtime version      5.50
CUDA driver version       8.0

------- DEVICE 0 -------
name                GeForce GTX 1070
UUID                **64-bit only on Windows**
ECC Support?        Disabled
Compatibility       6.1
clockRate (MHz)     1708
memClockRate (MHz)  4004
totalGlobalMem      4294967295
totalConstMem       65536
l2CacheSize         2097152
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 15
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1
pciDeviceID         0
pciBusID            3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M79341173 fft length = 4608K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Jun 12 21:43:10 | M79341173 50000 0x5670ca9237d7c904 | 4608K 0.05273 5.6778 283.89s | 5:05:03:23 0.06% |
| Jun 12 21:48:08 | M79341173 100000 0xc7deb1ca3091a0ff | 4608K 0.04785 5.9326 296.63s | 5:07:46:55 0.12% |

batch wrapper reports CUDALucas2.06beta-CUDA6.0-Windows-x64 -d 0(re)launch at Sat 09/02/2017 13:12:35.10

CUDALucas v2.06beta 64-bit build, compiled May 5 2017 @ 12:59:32

binary compiled for CUDA  6.0
CUDA runtime version      6.0
CUDA driver version       8.0

------- DEVICE 0 -------
name                GeForce GTX 1070
UUID                GPU-9b15b648-ccfe-f878-b7cb-2bba3cffd5b1
ECC Support?        Disabled
Compatibility       6.1
clockRate (MHz)     1708
memClockRate (MHz)  4004
totalGlobalMem      8589934592
totalConstMem       65536
l2CacheSize         2097152
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 15
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1
pciDeviceID         0
pciBusID            3

You may experience a small delay on 1st startup to due to Just-in-Time Compilation

Using threads: square 256, splice 128.
Starting M75316289 fft length = 4096K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Sep 02 13:20:44 | M75316289 100000 0xef47fad89747c3f4 | 4096K 0.23438 4.8794 487.94s | 4:05:56:54 0.13% |
| Sep 02 13:28:52 | M75316289 200000 0x26966af002b3846b | 4096K 0.21875 4.8795 487.95s | 4:05:48:51 0.26% |
| Sep 02 13:37:00 | M75316289 300000 0x94eeb2ce0af176ef | 4096K 0.21875 4.8800 488.00s | 4:05:40:55 0.39% |

I can and will, though, try a CUDA 8 version of Mfaktc on this setup. I usually run Mersenne code built for about CUDA 6.5, because on most of my GPUs that is faster most of the time.
Last fiddled with by kriesel on 2017-09-12 at 18:29
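The compatibility rule described in this post (a driver supports software built against the same or an older CUDA runtime, but not a newer one) can be sketched as a simple version comparison. This is only a minimal illustration, not NVIDIA's actual check; the function names are hypothetical, and the version strings are the ones printed in the logs in this thread.

```python
def parse_version(text):
    """Turn a version string such as '6.50' or '8.0' into (major, minor).

    The logs print CUDA 6.5 as '6.50' and CUDA 8.0 as '8.0'.
    """
    major, minor = text.split(".")
    # '50' -> 5, '0' -> 0: keep only the leading digit of the minor part.
    return int(major), int(minor[0])


def driver_supports(driver, runtime):
    """A driver can run software built for the same or an older runtime."""
    return parse_version(driver) >= parse_version(runtime)


for runtime in ["5.50", "6.0", "6.50"]:
    print(runtime, driver_supports("8.0", runtime))   # True for each

# The reverse situation: a CUDA 6.5 driver with software requiring CUDA 8.
print(driver_supports("6.50", "8.0"))                 # False
```

Note that this check also passes for the failing Mfaktc run (runtime 6.50, driver 8.0), so the driver version alone does not appear to explain that error.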
#2724
Random Account
Aug 2009
2²×3×163 Posts
I had the errors below occur over a 30 minute period yesterday evening:
Code:
ERROR: cudaGetLastError() returned 4: unspecified launch failure
ERROR: cudaGetLastError() returned 30: unspecified launch failure

Does anyone have any ideas regarding the cause?
#2725
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
152B₁₆ Posts
Quote:
#2726
Random Account
Aug 2009
2²×3×163 Posts
#2727
Mar 2007
263₈ Posts
Disregard this request. I had a hiccup while upgrading an NVIDIA driver and it wiped out the CUDA runtime. A clean install of the driver got me back to CUDA 8.0.
Last fiddled with by monst on 2017-09-18 at 20:08 Reason: there was an installation issue
#2728
Random Account
Aug 2009
2²·3·163 Posts
This is a snip from a startup.