#1 |
|
"David Kirkby"
Jan 2021
Althorne, Essex, UK
1CA₁₆ Posts |
I'm not sure where best to post this. It is mainly a software issue, and although there is a GPU-specific forum for hardware, there is none for software. However, there is a bit of hardware-specific stuff, so I guess here is okay.
I tried to build mfaktc 0.21 https://www.mersenneforum.org/mfaktc...tc-0.21.tar.gz on a Dell 7920 tower workstation with an Nvidia Quadro P2200 graphics card, running Ubuntu 20.04 Linux. It would not build, essentially because Nvidia has dropped support for the early compute capability versions. That means mfaktc will not build with the latest Nvidia CUDA development tools without some changes. These tips might help.
1) When one downloads the development kit from Nvidia, it puts most/all files in /usr/local/cuda-11.3. The Makefile expects them to be in /usr/local/cuda, but that's obviously easy to change.
2) The Makefile has these lines
Code:
NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
Code:
gcc -Wall -Wextra -O2 -I/usr/local/cuda-11.3/include/ -malign-double -c output.c -o output.o
nvcc -I/usr/local/cuda-11.3/include/ --ptxas-options=-v --generate-code arch=compute_11,code=sm_11 --generate-code arch=compute_20,code=sm_20 --generate-code arch=compute_30,code=sm_30 --generate-code arch=compute_35,code=sm_35 --generate-code arch=compute_50,code=sm_50 --compiler-options=-Wall -c tf_72bit.cu -o tf_72bit.o
nvcc fatal : Unsupported gpu architecture 'compute_11'
make: *** [Makefile:56: tf_72bit.o] Error 1

a) The executable has the .exe extension - most unusual on a Linux system.
b) The executable is put in the directory above the location of the source code - I have never seen this before. There was an exe there before, so the build overwrites that.

However, although the executable will run, it would not work with my card. I think both the CC 3.5 and 5.0 are too old for my Nvidia P2200 graphics card. I get the following error message when its self-test runs.
Code:
drkirkby@jackdaw:~/mfaktc-0.21$ ./mfaktc.exe
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  11.30
  CUDA runtime version      11.30
  CUDA driver version       11.30

CUDA device info
  name                      NVIDIA Quadro P2200
  compute capability        6.1
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 10
  clock rate (CUDA cores)   1493MHz
  memory clock rate:        5005MHz
  memory bus width:         160 bit

Automatic parameters
  threads per grid          655360
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device

compute capability 6.1

At that point, I took a guess that adding this line in the Makefile would add the 6.1 capability my card possibly needs.
Code:
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61
Code:
The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
Code:
drkirkby@jackdaw:~/mfaktc-0.21/src$ diff Makefile Makefile.bak
2c2
< CUDA_DIR = /usr/local/cuda-11.3
---
> CUDA_DIR = /usr/local/cuda
16,18c16,18
< #NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
< #NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
< #NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
---
> NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
> NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
> NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
21d20
< NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # Needed for my Nvidia P2200 with compute capability 6.1
Code:
./mfaktc.exe -tf 754454689 72 73

https://www.mersenne.org/report_expo...exp_hi=&full=1
mfaktc found the factor 7136025663302317823497 okay. The reported speed is around 191 GHz-day/day.
Code:
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 13 21:04 | 0 0.1% | 0.595 9m31s | 191.77 82485 n.a.%
May 13 21:04 | 11 0.2% | 0.580 9m16s | 196.73 82485 n.a.%
May 13 21:04 | 15 0.3% | 0.593 9m28s | 192.42 82485 n.a.%
May 13 21:04 | 20 0.4% | 0.588 9m22s | 194.05 82485 n.a.%

Based on my interests, I will not be shelling out the cost of a new GPU.
Dave
Last fiddled with by drkirkby on 2021-05-13 at 20:22 |
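A possible reading of the error 209 in the post above, offered as a minimal sketch and not as anything taken from mfaktc. As far as NVIDIA's compatibility rules go, a `--generate-code arch=compute_XY,code=sm_XY` entry embeds only native code for sm_XY, and that code runs only on devices with the same major compute capability and an equal or higher minor version. Under that assumption the 3.5 and 5.0 images cannot run on a 6.1 card, which would match the "no kernel image" message. The helper names `parse_cc`, `gencode_line` and `sass_can_run` are made up for illustration.

```python
# Sketch only: hypothetical helpers, not part of mfaktc or the CUDA toolkit.

def parse_cc(cc):
    """Turn a compute capability string such as '6.1' into the tuple (6, 1)."""
    major, minor = cc.split(".")
    return int(major), int(minor)


def gencode_line(cc):
    """Build the Makefile line that embeds native code for one compute capability."""
    major, minor = parse_cc(cc)
    arch = f"{major}{minor}"
    return f"NVCCFLAGS += --generate-code arch=compute_{arch},code=sm_{arch}"


def sass_can_run(built_cc, device_cc):
    """Assumed rule: code built for sm_XY runs on device X.Z only when Z >= Y."""
    built = parse_cc(built_cc)
    device = parse_cc(device_cc)
    return built[0] == device[0] and built[1] <= device[1]


device = "6.1"            # Quadro P2200, as reported in the device info above
built = ["3.5", "5.0"]    # targets left once 1.1, 2.0 and 3.0 are commented out

# List the embedded images the card could run (none of them, under the rule above).
print([cc for cc in built if sass_can_run(cc, device)])

# Print the line that adds a matching image for this card.
print(gencode_line(device))
```

The last line printed is the same Makefile entry shown in the diff above.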
|
|
|
|
|
#2 |
|
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2·229 Posts |
Oops, I just realised that gpuowl was running on the graphics card at the same time as mfaktc! Now the speed of mfaktc seems much more impressive than the Intel Xeon Platinum 8167M. In fact, I have a PRP test of 103777013 to do, but I'm going to do a trial factor to 77 bits on the GPU, as it will only take a few hours. If it finds a factor I will not bother with the PRP test.
https://www.mersenne.org/report_expo...3777013&full=1
Code:
Date Time Pct ETA | Exponent Bits | GHz-d/day Sieve Wait
May 13 21:45 0.2 8h22m | 103777013 76-77 | 421.78 82485 n.a.%
May 13 21:46 0.3 8h24m | 103777013 76-77 | 419.59 82485 n.a.%
Last fiddled with by drkirkby on 2021-05-13 at 20:56 |
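Since a found factor decides whether the PRP test is worth running, it may help to see how a reported factor can be checked independently. The sketch below uses the factor quoted in post #1 and relies on two standard facts: q divides 2^p - 1 exactly when 2^p mod q is 1, and any factor of a Mersenne number with prime exponent p has the form 2kp + 1. The function names are hypothetical.

```python
# Sketch only: independent check of a trial-factoring result.

def is_mersenne_factor(p, q):
    """True when q divides 2**p - 1, using modular exponentiation."""
    return pow(2, p, q) == 1


def k_of(p, q):
    """Return k if q = 2*k*p + 1, otherwise None."""
    k, remainder = divmod(q - 1, 2 * p)
    return k if remainder == 0 else None


p = 754454689                      # exponent from the -tf command in post #1
q = 7136025663302317823497         # factor reported by mfaktc in post #1

print("divides M(p):", is_mersenne_factor(p, q))
print("k:", k_of(p, q))
print("bits:", q.bit_length())     # a factor between 2^72 and 2^73 has 73 bits
```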
|
|
|
|
|
#3 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts |
The Quadro P2200 TF GHz-d/day performance, while much higher than its PRP or LL or P-1 performance, is modest.
RTX2080 (by no means fastest):
got assignment: exp=114021059 bit_min=75 bit_max=76 (67.11 GHz-days)
Starting trial factoring M114021059 from 2^75 to 2^76 (67.11 GHz-days)
k_min = 165666466326900
k_max = 331332932655512
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 13 23:22 | 0 0.1% | 1.921 30m42s | 3144.20 106037 n.a.%

An alternative to the iterative compile approach to finding what compute levels are supported is to read the release notes. Or https://www.mersenneforum.org/showpo...1&postcount=11
Next step is to read about and try tuning mfaktc for your card and workload.
Last fiddled with by kriesel on 2021-05-14 at 04:56 |
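The numbers in that RTX2080 log can be roughly reproduced by hand, which may help when comparing cards. The sketch below assumes candidates of the form q = 2kp + 1, so the k range is about 2^bits / (2p), and that the ETA is simply the work in GHz-days divided by the throughput in GHz-days per day. The function names are hypothetical. The logged k_min is slightly lower than the plain quotient, presumably because mfaktc aligns the start to its class structure; that part is a guess.

```python
# Sketch only: back-of-envelope reconstruction of the logged k range and ETA.

def k_range(p, bit_min, bit_max):
    """Approximate k bounds for candidates q = 2*k*p + 1 between 2^bit_min and 2^bit_max."""
    return (2 ** bit_min) // (2 * p), (2 ** bit_max) // (2 * p)


def eta_seconds(ghz_days, ghz_days_per_day):
    """Run time in seconds for a given amount of work at a steady rate."""
    return ghz_days / ghz_days_per_day * 86400


p = 114021059                       # exponent from the log
k_lo, k_hi = k_range(p, 75, 76)
print("k range:", k_lo, k_hi)       # compare with k_min and k_max in the log

seconds = eta_seconds(67.11, 3144.20)
print(f"ETA: {int(seconds // 60)}m{int(seconds % 60):02d}s")   # log shows 30m42s after class 0
```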
|
|
|
|
|
#4 | |
|
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2·229 Posts |
Quote:
Quadro cards are generally over-priced compared to the more mainstream cards, but the over-priced (£21,000 per year for use of just 4 cores) software I used was optimised for Quadro cards. I had a limited time to use that software, so I just went for a supported operating system (CentOS) and graphics card (Quadro). FWIW, I did let that trial factor complete
Code:
May 14 06:14 99.7 1m36s | 103777013 76-77 | 414.84 82485 n.a.%
May 14 06:15 99.8 1m04s | 103777013 76-77 | 414.29 82485 n.a.%
May 14 06:15 99.9 0m32s | 103777013 76-77 | 414.84 82485 n.a.%
May 14 06:16 100.0 0m00s | 103777013 76-77 | 415.10 82485 n.a.%
no factor for M103777013 from 2^76 to 2^77 [mfaktc 0.21 barrett87_mul32_gs]
tf(): time spent since restart: 8h 30m 42.972s
estimated total time spent: 8h 31m 14.925s
|
|
|
|
|
|
|
#5 |
|
"David Kirkby"
Jan 2021
Althorne, Essex, UK
458₁₀ Posts |
Given I have a PRP test reserved on 103777013, but no trial-factor assignment ID, can the results be usefully uploaded?
Code:
no factor for M103777013 from 2^76 to 2^77 [mfaktc 0.21 barrett87_mul32_gs]

I don't have an ego and so need to collect CPU days, although I would like to find the trick to getting allocated category 0 assignments. I've had one of them. I can get category 1 assignments easily enough, and am completing more than one per day, but I can't seem to find the way to get category 0 assignments.
Last fiddled with by drkirkby on 2021-05-14 at 12:42 |
|
|
|
|
|
#6 |
|
Jan 2021
California
2⁴·5·7 Posts |
Cat 0 assignments are always fully assigned. They can only become available when an assignment within the cat 0 range expires or TF/PM1 completes on an assignment in cat 0 range, and will only remain unassigned for a few minutes unless a large number expire at once.
You shouldn't worry about getting cat 0 assignments. The category only exists to make sure that the assignments at the trailing edge will get cleared eventually.
Last fiddled with by slandrum on 2021-05-14 at 13:26 |
|
|
|
|
|
#7 | ||||
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
17220₈ Posts |
Quote:
Quote:
Quote:
Quote:
There's a very good chance of completing double checking up to Mp#48* this year. There's also the strategic rechecking list Uncwilly updates regularly, of exponents with conflicting results, which would benefit from some quick tie breaker runs from those 26-core cpus. |
||||
|
|
|