Some of the changes in my earlier post turned out to be unnecessary. In compatibility.h, I restored #ifdef _MSC_VER, but I did change the following line:
Code:
#define strncasecmp -srtnicmp
by deleting the -. Again, I'm not sure why or how this worked, but I no longer get compiler messages about srtnicmp not being found.
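For anyone puzzling over the same thing, here is a minimal sketch of how this kind of compatibility define is usually meant to work. It assumes the guard in compatibility.h is the usual #ifdef _MSC_VER; if so, the #define is only ever seen by Microsoft's compiler, and on OS X or Linux strncasecmp is simply taken from <strings.h>. The helper has_prefix_nocase is hypothetical and not part of mfaktc; it only gives the macro something to do.

```c
/* Sketch only: has_prefix_nocase is a hypothetical helper, not mfaktc code. */
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
/* MSVC has no strncasecmp; map it to the equivalent _strnicmp from <string.h>. */
#define strncasecmp _strnicmp
#else
/* POSIX systems (OS X, Linux) declare strncasecmp in <strings.h>. */
#include <strings.h>
#endif

/* Returns 1 if `line` starts with `key`, ignoring case; 0 otherwise. */
int has_prefix_nocase(const char *line, const char *key)
{
    size_t n = strlen(key);
    return strncasecmp(line, key, n) == 0;
}
```

With this in place, has_prefix_nocase("SievePrimes=25000", "sieveprimes") returns 1 on either platform. If the guard really is active, whatever name follows strncasecmp in the #define is never compiled on a Mac, which may be why editing that line appeared to make the messages go away.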
After googling to learn more about rpaths, I did manage to get the compiled mfaktc.exe to run! The necessary change was in the following line of the Makefile:
Code:
LDFLAGS = -fPIC $(CUDA_LIB) -lcudart -lm
changed to
Code:
LDFLAGS = -fPIC $(CUDA_LIB)/libcudart.6.5.dylib -lm -rpath /Developer/NVIDIA/CUDA-6.5/lib/
and now the dylib is found properly at runtime. I think there may be some redundancy in that line now, but I'm not sure. This may also make line 4 of the Makefile
Code:
CUDA_LIB = -L$(CUDA_DIR)/lib/
which I changed to
Code:
CUDA_LIB = $(CUDA_DIR)/lib
redundant, but again I'm not sure. Also, in line 3 of the Makefile, I removed the final /, because I was getting compiler messages with double slashes (//) in pathnames.
Finally, I reinstalled CUDA 6.5 because I thought I might have messed up some install paths earlier. The compiled binary ran, but it was finding my GeForce 8500 GT (which actually drives the monitor) rather than the Tesla!
Quote:
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Oct 02 23:51 | 201 4.3% | 8.401 2h08m | 1.54 82485 n.a.%
Oct 02 23:51 | 204 4.4% | 8.438 2h09m | 1.54 82485 n.a.%
Oct 02 23:51 | 209 4.5% | 7.829 1h59m | 1.65 82485 n.a.%
Oct 02 23:51 | 212 4.6% | 8.148 2h04m | 1.59 82485 n.a.%
Oct 02 23:51 | 216 4.7% | 8.158 2h04m | 1.59 82485 n.a.%
Oct 02 23:51 | 224 4.8% | 8.272 2h06m | 1.57 82485 n.a.%
Oct 02 23:52 | 225 4.9% | 7.851 1h59m | 1.65 82485 n.a.%
Oct 02 23:52 | 236 5.0% | 5.763 1h27m | 2.25 82485 n.a.%
ERROR: cudaGetLastError() returned 4: unspecified launch failure
I have no idea what went wrong there. Apparently mfaktc continued on and found a factor of the sample exponent provided; for several minutes afterward the GUI responded very slowly, as if the GPU were still in use:
Quote:
UID: Random GPUs/MacPro.M2050, M66362159 has a factor: 63205291599831307391 [TF:64:67*:mfaktc 0.21 barrett76_mul32_gs]
UID: Random GPUs/MacPro.M2050, found 1 factor for M66362159 from 2^64 to 2^67 (partially tested) [mfaktc 0.21 barrett76_mul32_gs]
After this I focused on getting the Tesla M2050 working. I connected auxiliary power and rebooted. About This Mac/More Info.../System Report/PCI Cards now reports an error when trying to identify PCI cards. Under System Preferences.../CUDA there was a message that a driver update was available, so I installed it. The 7.5 driver seems to be needed by this card.
I compiled the deviceQuery program from the NVIDIA samples (NVIDIA's "CUDA_Getting_Started_Mac" was very helpful), and it correctly reports "Tesla M2050". But actually running the mfaktc binary gives the same error message, and then the system crashes!
I think the crash may have been due to overheating. I have installed Macs Fan Control and linked it to the "MCP heatsink" sensor, which seems to be either on the Tesla or very near it, as it gets very hot when the Tesla runs. I'll see if that fixes the problem. Otherwise I'll need an actual fan on the card, in addition to the MacPro's fan. Here's the limited output from that run:
Quote:
MacPro:~ shn$ /Users/shn/Desktop/mfaktc-0.21_inst1/mf1 ; exit;
mfaktc v0.21 (64bit built)
Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled
Runtime options
WARNING: Cannot read SievePrimes from mfaktc.ini, using default value (25000)
SievePrimes 25000
WARNING: Cannot read SievePrimesAdjust from mfaktc.ini, using default value (1)
SievePrimesAdjust 1
WARNING: Cannot read SievePrimesMin from mfaktc.ini, using min value (2000)
SievePrimesMin 2000
WARNING: Cannot read SievePrimesMax from mfaktc.ini, using max value (200000)
SievePrimesMax 200000
WARNING: Cannot read NumStreams from mfaktc.ini, using default value (3)
NumStreams 3
WARNING: Cannot read CPUStreams from mfaktc.ini, using default value (3)
CPUStreams 3
WARNING: Cannot read GridSize from mfaktc.ini, using default value (3)
GridSize 3
WARNING: Cannot read SieveOnGPU from mfaktc.ini, enabled by default
GPU Sieving enabled
WARNING: Cannot read GPUSievePrimes from mfaktc.ini, using default value (82486)
GPUSievePrimes 82486
WARNING: Cannot read GPUSieveSize from mfaktc.ini, using default value (64)
GPUSieveSize 64Mi bits
WARNING: Cannot read GPUSieveProcessSize from mfaktc.ini, using default value (16)
GPUSieveProcessSize 16Ki bits
WARNING: Cannot read Checkpoints from mfaktc.ini, enabled by default
Checkpoints enabled
WARNING: Cannot read CheckpointDelay from mfaktc.ini, set to 30s by default
CheckpointDelay 30s
WARNING: Cannot read WorkFileAddDelay from mfaktc.ini, set to 600s by default
WorkFileAddDelay 600s
WARNING: Cannot read Stages from mfaktc.ini, enabled by default
Stages enabled
WARNING: Cannot read StopAfterFactor from mfaktc.ini, set to 1 by default
StopAfterFactor bitlevel
WARNING: Cannot read PrintMode from mfaktc.ini, set to 0 by default
PrintMode full
V5UserID (none)
ComputerID (none)
WARNING, no ProgressHeader specified in mfaktc.ini, using default
WARNING, no ProgressFormat specified in mfaktc.ini, using default
WARNING: Cannot read AllowSleep from mfaktc.ini, set to 0 by default
AllowSleep no
WARNING: Cannot read TimeStampInResults from mfaktc.ini, set to 0 by default
TimeStampInResults no
CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 7.50
CUDA device info
name Tesla M2050
compute capability 2.0
max threads per block 1024
max shared memory per MP 49152 byte
number of multiprocessors 14
CUDA cores per MP 32
CUDA cores - total 448
clock rate (CUDA cores) 1147MHz
memory clock rate: 1546MHz
memory bus width: 384 bit
Automatic parameters
threads per grid 917504
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144
running a simple selftest...
per class final cudaThreadSynchronize failed!
ERROR: cudaGetLastError() returned 4: unspecified launch failure
logout
I'm not sure what else to try at this point. Maybe downgrade to CUDA 7.0 again? I couldn't get 7.0 to work at first, but I've made a lot of changes since then. I've had trouble trying to run mfaktc (compiled with CUDA 6.5) under 7.5 on Manjaro Linux, but it runs OK under 7.0. Any advice would be appreciated, but remember you're addressing a tyro.