#3235 | |
|
Sep 2002
Database er0rr
111010110001₂ Posts |
Quote:
Last fiddled with by paulunderwood on 2019-12-09 at 18:31 |
|
|
|
|
|
|
#3236 |
|
Random Account
Aug 2009
7A4₁₆ Posts |
Aptitude was not installed on my system, so I ran the Nvidia update from a terminal window. There did not seem to be any problems.
I ran "make" and the last line said "unsupported hardware" and ended there. So I went back to my original mfaktc folder, tried the executable there, and it runs. The Nvidia update gave it what it needed. I ran a self-test and everything was successful. For James Heinrich's project, there does not seem to be a "less classes" version for Linux. |
|
|
|
|
|
#3237 | |
|
Sep 2002
Database er0rr
7261₈ Posts |
Quote:
Code:
# Compiler settings for .cu files (CPU/GPU)
NVCC = nvcc -ccbin clang-3.8
NVCCFLAGS = $(CUDA_INCLUDE) --ptxas-options=-v
Code:
# generate code for various compute capabilities
#NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
#NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
#NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
#NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
Code:
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1)
Last fiddled with by paulunderwood on 2019-12-10 at 01:28 |
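The --generate-code lines in the Makefile excerpt above all follow one pattern: the compute capability with the dot dropped appears in both compute_XY and sm_XY. A minimal sketch of that pattern follows. The helper name gencode_flag is hypothetical, and "3.5" is only an example input; use whatever compute capability your own card reports.

```python
def gencode_flag(compute_capability: str) -> str:
    """Return a Makefile line enabling code generation for one compute capability.

    compute_capability is a string such as "3.5"; the dot is dropped to form
    the compute_XY / sm_XY names used in the Makefile excerpt.
    """
    major, minor = compute_capability.strip().split(".")
    arch = f"{int(major)}{int(minor)}"
    return f"NVCCFLAGS += --generate-code arch=compute_{arch},code=sm_{arch}"


# Example input only; substitute the value for your own GPU.
print(gencode_flag("3.5"))
```

Whether a given arch is accepted still depends on the installed nvcc version, so treat the printed line as a candidate to try, not a guaranteed fix.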
|
|
|
|
|
|
#3238 |
|
Bemusing Prompter
"Danny"
Dec 2002
California
2395₁₀ Posts |
I noticed that compiling mfaktc for a system with a Tesla V100 and CUDA 10.x results in a compilation error because the V100 only supports up to compute capability 7.0, unlike newer Volta cards. Running make build=cuda100 produces an "Unsupported gpu architecture" error. Therefore, I updated my custom makefile to support CUDA 9.0 builds.
I've also updated my script for launching multiple mfaktc instances. It is now more compact and uses fewer variables.
Last fiddled with by ixfd64 on 2019-12-29 at 19:10 |
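The launch script itself is not shown in the post, so the following is only a rough sketch of what starting several instances could look like, not the poster's script. Everything specific here is an assumption: one working directory per instance named instance0, instance1, and so on, an executable called mfaktc.exe inside each, and a "-d <device number>" option to select the GPU (verify that against your build's help output).

```python
import subprocess
from pathlib import Path


def build_commands(base_dir, device_ids, exe_name="mfaktc.exe"):
    """Return one (working_dir, argv) pair per GPU device id.

    The instanceN directory layout and the "-d" option are assumptions
    made for this sketch.
    """
    base = Path(base_dir)
    commands = []
    for dev in device_ids:
        workdir = base / f"instance{dev}"
        argv = [str(workdir / exe_name), "-d", str(dev)]
        commands.append((workdir, argv))
    return commands


def launch_all(commands):
    """Start every instance in its own working directory and return the handles."""
    return [subprocess.Popen(argv, cwd=workdir) for workdir, argv in commands]
```

Keeping command construction separate from launching makes it easy to print and inspect the commands before anything is started.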
|
|
|
|
|
#3239 | |
|
"Marv"
May 2009
near the Tannhäuser Gate
292₁₆ Posts |
Quote:
Since they all use the same GPU, they are all compute capability 7.0. |
|
|
|
|
|
|
#3240 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
152A₁₆ Posts |
Quote:
Code:
GTX 480, GPU clock 701Mhz, memory clock 924 Mhz
tune mfaktc.ini gpu sieve parameters in this order:
GPUSieveProcessSize
GPUSieveSize
GPUSievePrimes

Single application instance tests except where indicated dual, starting from
GPUSievePrimes=82486
GPUSieveSize=64
GPUSieveProcessSize=16
~330 GhzD/day, tune Mfaktc v0.21 Feb 5 2015 mfaktc-win-64.exe (cuda 6.5)

GPUSieveProcessSize=32 214.5
GPUSieveProcessSize=24 317.94
GPUSieveProcessSize=16 328.10 *
GPUSieveProcessSize=8 318.18

GPUSieveSize=128 329.85 *
GPUSieveSize=64 318.55
GPUSieveSize=96 327.13
GPUSieveSize=32 299.28
GPUSieveSize=16 267.98 (82% gpu load)

GPUSieveProcessSize=16, GPUSieveSize=128
GPUSievePrimes=90000 340.8
GPUSievePrimes=100000 339.80
GPUSievePrimes=80000 340.97
GPUSievePrimes=110000 338.37
GPUSievePrimes=85000 340.77
GPUSievePrimes=82000 341.08 * ~97% gpu load

2 instances 178+177.34 = 355.34 (GPU load 99%)

1 instance, GPUSievePrimes=82000, GPUSieveProcessSize=16, nomead mfaktc-more-cuda65-64
GPUSieveSize=128 346.27
GPUSieveSize=256 352.25
GPUSieveSize=512 356.48
GPUSieveSize=1023 357.98 99-100% gpu load on gpu-z
GPUSieveSize=1024 failed after 1 class, with error message: ERROR: cudaGetLastError() returned 9: invalid configuration argument
GPUSieveSize=2047 not attempted

2 instances GPUSieveSize=1023, GPUSievePrimes=82000, GPUSieveProcessSize=16, nomead mfaktc-more-cuda65-64
100034929,75,76 first instance (also used in preceding single instance tests) 179.12
100108363,75,76 second instance (also used in preceding double instance tests) 182.77
combined throughput 361.89 GhzD/day, consistent 100% gpu load in gpu-z
second instance raised throughput over single instance, 361.89/357.98 = 1.0109
Ratio between GPUSieveSize-optimized versions, 2 instances, "2047"/128 max, 361.89/355.34 = 1.0184

Perhaps the error relates to the CUDA 1D Linear Texture Size 134217728? (134217728 bytes = 1024 Mib)
The rating for the GTX480 at mersenne.ca is ~368. GhzD/day, but is in practice a function of exponent and bit depth.
Last fiddled with by kriesel on 2020-01-27 at 19:26 |
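A quick arithmetic check of the texture-size guess in the post above. This assumes GPUSieveSize is counted in units of 2^20 bits, which is how the post's "134217728 bytes = 1024 Mib" reads; the helper name sieve_bytes is made up for the sketch.

```python
BITS_PER_MIBIT = 1 << 20  # assumption: one GPUSieveSize unit = 2**20 bits

# Figure quoted in the post for the CUDA 1D linear texture size.
TEXTURE_LIMIT_BYTES = 134217728


def sieve_bytes(gpu_sieve_size_mibit: int) -> int:
    """Bytes occupied by a sieve of the given size under the unit assumption above."""
    return gpu_sieve_size_mibit * BITS_PER_MIBIT // 8


for size in (1023, 1024):
    nbytes = sieve_bytes(size)
    print(size, nbytes, "below limit" if nbytes < TEXTURE_LIMIT_BYTES else "at or above limit")
```

Under that assumption 1024 lands exactly on the quoted figure while 1023 stays just below it. That is consistent with the guess but does not prove it is the cause of the error.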
|
|
|
|
|
|
#3241 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101010₂ Posts |
Quote:
Code:
Quadro 5000, GPU clock 513 Mhz, memory clock 750 Mhz
tune mfaktc.ini gpu sieve parameters in this order:
GPUSieveProcessSize
GPUSieveSize
GPUSievePrimes

starting from
GPUSievePrimes=82486
GPUSieveSize=64
GPUSieveProcessSize=16
184.23 GhzD/day, tune Mfaktc v0.21 Feb 5 2015 mfaktc-win-64.exe (cuda 6.5), 96-97% gpu load

Factor=97482377,74,75
GPUSieveProcessSize=32 119.69
GPUSieveProcessSize=24 184.07
GPUSieveProcessSize=16 184.75 *
GPUSieveProcessSize=8 178.41

Factor=100103767,75,76; GPUSieveProcessSize=16, GPUSievePrimes=82486
GPUSieveSize=128 188.19 (98% gpu load) *
GPUSieveSize=96 186.70 (98% gpu load)
GPUSieveSize=64 184.11 (97% gpu load)
GPUSieveSize=32 176.72 (93% gpu load)
GPUSieveSize=16 163.33 (88% gpu load)

GPUSieveProcessSize=16, GPUSieveSize=128
GPUSievePrimes=90000 188.32
GPUSievePrimes=100000 187.72
GPUSievePrimes=80000 188.73
GPUSievePrimes=85000 188.56
GPUSievePrimes=82000 188.38 ~98% gpu load
GPUSievePrimes=75000 188.64
GPUSievePrimes=78000 188.72
GPUSievePrimes=79000 188.80 *

2 instances 96.33 + 95.32 = 191.65 (GPU load 99-100%)

1 instance, GPUSievePrimes=79000, GPUSieveProcessSize=16, nomead mfaktc-more-cuda65-64
GPUSieveSize=128 189.25
GPUSieveSize=256 191.23 (99% gpu load)
GPUSieveSize=512 192.00
GPUSieveSize=1023 192.75 100% gpu load on gpu-z *
GPUSieveSize=1024 error
GPUSieveSize=2047 not tried

2 instances GPUSieveSize=1023, GPUSievePrimes=79000, GPUSieveProcessSize=16, nomead mfaktc-more-cuda65-64
97482377,74,75 first instance (also used in preceding tests) 99.12
100124191,73,74 second instance (also used in preceding tests) 95.01
combined throughput 194.13 GhzD/day, consistent 100% gpu load in gpu-z, 194.13/192.75 = 1.0072 x single instance throughput
Ratio between GPUSieveSize-optimized versions, 2 instances, "2047"/128 max, = 194.13/191.65 = 1.0129
Overall gain, 194.13/184.23 = 1.0537. |
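The ratios quoted at the end of the Quadro 5000 notes are just the sum of the per-instance rates divided by a baseline. A small sketch that reproduces them from the posted numbers (the function name combined_gain is made up for illustration):

```python
def combined_gain(instance_rates, baseline):
    """Return (sum of per-instance rates, that sum divided by the baseline rate)."""
    total = sum(instance_rates)
    return total, total / baseline


# Rates in GHzD/day taken from the post: two instances at GPUSieveSize=1023,
# compared against the best single instance and against the starting point.
two_instances = [99.12, 95.01]
for label, baseline in (("vs single instance", 192.75), ("vs starting point", 184.23)):
    total, gain = combined_gain(two_instances, baseline)
    print(f"{label}: {total:.2f} / {baseline} = {gain:.4f}")
```

The same helper works for the other cards' notes by swapping in their per-instance and baseline figures.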
|
|
|
|
|
|
#3242 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101010₂ Posts |
Code:
tune mfaktc.ini gpu sieve parameters in this order:
GPUSieveProcessSize
GPUSieveSize
GPUSievePrimes

Q2000 gpu clock 625.9 Mhz, memory 652 Mhz
(GPUSieveSize=64, GPUSievePrimes=82486)
GPUSieveProcessSize=16 86.794
GPUSieveProcessSize=24 86.77
GPUSieveProcessSize=32 46.3
GPUSieveProcessSize=8 87.19 *

GPUSieveSize=128 87.56 *
GPUSieveSize=96 87.42
GPUSieveSize=32 86.34
GPUSieveSize=16 84.95

GPUSievePrimes=100000 87.75 *
GPUSievePrimes=120000 87.70
GPUSievePrimes=110000 87.72
GPUSievePrimes=95000 87.698
GPUSievePrimes=102000 87.75

above with cuda6.5win32 2015 executable:
88.09 * GPUSieveProcessSize=8 GPUSieveSize=128 GPUSievePrimes=100000

mfaktc-more-cuda65-64 from nomead October 2019 post:
GPUSieveSize=128 88.44
GPUSieveSize=256 88.67
GPUSieveSize=511 88.79 *
GPUSieveSize=512 fail
higher values not tried
advantage to increased gpusievesize 88.79/88.09 =~ 1.0079 |
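The tuning procedure in these notes (vary one parameter, keep the best value, then move on to the next parameter in the stated order) amounts to a one-pass coordinate-wise search. A minimal sketch of that idea follows. The function tune_in_order and the measure callback are hypothetical; in practice measure would mean editing mfaktc.ini, running a test, and reading off the GHzD/day figure by hand. The demo uses a synthetic scoring function, not real benchmark data.

```python
def tune_in_order(measure, start, candidates, order):
    """Sweep one parameter at a time, keeping the best value found so far.

    measure    -- callable taking a settings dict, returning throughput (higher is better)
    start      -- dict of starting values
    candidates -- dict mapping parameter name to the values to try
    order      -- parameter names in the order they should be tuned
    """
    best = dict(start)
    best_rate = measure(best)
    for name in order:
        for value in candidates[name]:
            trial = dict(best, **{name: value})
            rate = measure(trial)
            if rate > best_rate:
                best, best_rate = trial, rate
    return best, best_rate


def synthetic_rate(s):
    """Made-up stand-in for a benchmark run; peaks at 8 / 128 / 100000."""
    return (-abs(s["GPUSieveProcessSize"] - 8)
            - abs(s["GPUSieveSize"] - 128) / 16
            - abs(s["GPUSievePrimes"] - 100000) / 10000)


start = {"GPUSieveProcessSize": 16, "GPUSieveSize": 64, "GPUSievePrimes": 82486}
candidates = {
    "GPUSieveProcessSize": [8, 16, 24, 32],
    "GPUSieveSize": [16, 32, 64, 96, 128],
    "GPUSievePrimes": [82486, 95000, 100000, 110000, 120000],
}
order = ["GPUSieveProcessSize", "GPUSieveSize", "GPUSievePrimes"]
print(tune_in_order(synthetic_rate, start, candidates, order))
```

A single pass like this can miss interactions between parameters, which may be why some of the posted runs revisit an earlier parameter after changing executables.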
|
|
|
|
|
#3243 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3²×7×43 Posts |
Spellcheck. It will probably ignoar misplled keywrds, and use the defaults instead.
Last fiddled with by kriesel on 2020-01-28 at 17:44 |
|
|
|
|
|
#3244 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3²×7×43 Posts |
Code:
gtx1060 mfaktc tune with nomead's CUDA8 2047-capable executable on Win7 X64 Pro
GPUSievePrimes=82486
GPUSieveSize=128
GPUSieveProcessSize=32 503 GhzD/day (32=max; increments of 8) *
GPUSieveProcessSize=24 492.7
GPUSieveProcessSize=16 492.7
GPUSieveProcessSize=8 489

GPUSieveProcessSize=32 503; GPUSievePrimes=82486
GPUSieveSize=128 503.5
GPUSieveSize=64 493.1
GPUSieveSize=256 509.46
GPUSieveSize=512 511.71
GPUSieveSize=1024 513.2
GPUSieveSize=2047 513.64 *

GPUSieveProcessSize=32, GPUSieveSize=2047
GPUSievePrimes=82486 513.64
GPUSievePrimes=90000 514.64
GPUSievePrimes=100000 515.2
GPUSievePrimes=110000 515.32
GPUSievePrimes=120000 514.71
GPUSievePrimes=115000 515.01
GPUSievePrimes=106000 515.63 *

2 instances: 258.04+258.03 = 516.07
516.07/515.63-1 = .085% gain from two instances over one
There is an additional gain if shutting one down for some sort of software maintenance; the other uses the full gpu while one is stopped, so no productive time lost.
Last fiddled with by kriesel on 2020-01-28 at 17:46 |
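Notes kept in this "Key=value rate" form are easy to scan mechanically. A small sketch that picks the fastest value for one parameter out of such lines follows; the function name best_setting and the exact line format it expects are assumptions based on the layout of the notes above. Lines that do not parse as a number, such as a "failed" entry, are skipped.

```python
def best_setting(lines, key):
    """Return (value, rate) for the fastest 'key=value rate' line, or None if none parse."""
    best = None
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or not parts[0].startswith(key + "="):
            continue
        try:
            value = int(parts[0].split("=", 1)[1])
            rate = float(parts[1])
        except ValueError:
            continue  # e.g. "GPUSieveSize=1024 failed"
        if best is None or rate > best[1]:
            best = (value, rate)
    return best


# GPUSieveSize sweep copied from the gtx1060 notes above.
notes = """\
GPUSieveSize=128 503.5
GPUSieveSize=64 493.1
GPUSieveSize=256 509.46
GPUSieveSize=512 511.71
GPUSieveSize=1024 513.2
GPUSieveSize=2047 513.64 *
""".splitlines()

print(best_setting(notes, "GPUSieveSize"))
```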
|
|
|
|
|
#3245 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·3²·7·43 Posts |
Code:
Q4000 mfaktc tuning
1) 2015 mfaktc v0.21
GPUSievePrimes=82486 (0-1075000)
GPUSieveSize=64 (4-128)
varying GPUSieveProcessSize (values required to be multiples of 8)
GPUSieveProcessSize=16 127.56 GhzD/day
GPUSieveProcessSize=24 127.8 *
GPUSieveProcessSize=32 82.24
GPUSieveProcessSize=8 127.69

GPUSieveProcessSize=24, GPUSievePrimes=82486, vary GPUSieveSize
GPUSieveSize=64 127.8
GPUSieveSize=32 125.89
GPUSieveSize=96 123.72
GPUSieveSize=128 128.9 *

GPUSieveProcessSize=24, GPUSievePrimes varied, GPUSieveSize=128
GPUSievePrimes=82486 128.9 *
GPUSievePrimes=90000 128.87
GPUSievePrimes=70000 128.36
GPUSievePrimes=100000 126.97
GPUSievePrimes=86000 127.86

2) mfaktc-more-cuda65-64.exe from nomead allowing 2047Mib GPUSieveSize:
GPUSievePrimes=82486, GPUSieveProcessSize=24
Factor=95389123,75,76
GPUSieveSize=96 124.16
GPUSieveSize=192 124.83
GPUSieveSize=384 125.17
GPUSieveSize=768 125.33
GPUSieveSize=1024 failed

GPUSievePrimes=82486, GPUSieveProcessSize=16
GPUSieveSize=16 122.82
GPUSieveSize=32 126.09
GPUSieveSize=64 128.12
GPUSieveSize=128 129.16
GPUSieveSize=256 129.71
GPUSieveSize=512 129.99
GPUSieveSize=1008 130.13 (221MiB used) *
advantage due to increased GPUSieveSize 130.13/128.9 = 1.0095

2 instances
Factor=95389123,75,76 64.35
Factor=100110187,75,76 65.78
total 130.13
Two-instance gain = none. GPU ram occupancy 389 MiB |
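Once a best combination is found, it has to be written back into mfaktc.ini. A rough sketch of doing that on the file's text follows, assuming plain Key=value lines and # comments; the function name apply_settings is hypothetical, and the sample text below is a stand-in, not the real mfaktc.ini.

```python
def apply_settings(ini_text, settings):
    """Return ini_text with 'Key=value' lines replaced for each key in settings.

    Comment lines and unrelated keys pass through unchanged; keys that are
    missing from the text are appended at the end.
    """
    remaining = dict(settings)
    out = []
    for line in ini_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and not line.lstrip().startswith("#") and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Stand-in ini text; the values applied are the Q4000 best case from the notes above.
sample = "# gpu sieve settings\nGPUSievePrimes=82486\nGPUSieveSize=64\nGPUSieveProcessSize=24\n"
print(apply_settings(sample, {"GPUSieveSize": 1008, "GPUSieveProcessSize": 16}))
```

Working on the text rather than a parsed structure keeps the file's comments and ordering intact.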
|
|
|