|
|
#3576 |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4×199 Posts |
I've been playing with my 8 x 3070 machine tonight. It's got a lowly 2 core/2 thread G4400 CPU and single channel DDR4-2133 memory.
Compared to my older 1070s, the 3070s in this system benefit greatly from changing GPUSieveSize to 128 (from 64), especially when working on 332M exponents. I saw GHz-d/d go from ~2600 to ~3050, and volatile GPU usage in nvidia-smi go from ~75% to ~85%. CPU usage also dropped from ~60% to ~40% (system usage 40% -> 25%). With GPUSieveSize at 128, running two instances per card gave equal or lower overall throughput, depending on CPU saturation.

Recompiling and setting GPUSieveSize=2047 gives around 3500 GHz-d/d on a 332M number (80->81 is faster than 79->80, both using barrett87_mul32_gs) and 3700 GHz-d/d for a 128M assignment (76->77). Volatile GPU usage rose to 99% and CPU usage dropped to under 10%. Guess I didn't need to buy that quad-core i5-7400 for $40.

So it turns out a low GPUSieveSize is a major bottleneck on newer cards. It didn't impact my 1070s at all. So that's roughly a 35% increase in performance on the 332M work (~2600 to ~3500 GHz-d/d) for a couple hours of fiddling. Not bad.
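For anyone who wants to check the arithmetic, here is a small Python sketch that works out the relative gains from the approximate GHz-d/d figures in this post. The name `pct_change` is made up for illustration, and the inputs are the rounded per-card numbers for 332M exponents, so the results are only as good as those estimates.

```python
def pct_change(before: float, after: float) -> float:
    """Return the percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Approximate GHz-d/d per 3070 on 332M exponents, as quoted in the post.
baseline_64 = 2600.0   # GPUSieveSize=64
sieve_128 = 3050.0     # GPUSieveSize=128
sieve_2047 = 3500.0    # GPUSieveSize=2047 (recompiled build)

gain_128 = pct_change(baseline_64, sieve_128)
gain_2047 = pct_change(baseline_64, sieve_2047)
step_128_to_2047 = pct_change(sieve_128, sieve_2047)

print(f"64 -> 128:    {gain_128:.1f}%")
print(f"64 -> 2047:   {gain_2047:.1f}%")
print(f"128 -> 2047:  {step_128_to_2047:.1f}%")
```

The 128M figure (~3700) is left out on purpose, since it is a different exponent size and bit level and so not a like-for-like comparison.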
|
|
|
|
|
#3577 | |
|
Sep 2011
Germany
70258 Posts |
Quote:
Code:
./mfaktc.exe: error while loading shared libraries: libcudart.so.12: cannot open shared object file: No such file or directory
|
|
|
|
|
|
#3578 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
It would be interesting to see how much a high GPUSieveSize matters on 40xx cards. (Anyone?)
Last fiddled with by kriesel on 2022-12-30 at 08:00
|
|
|
|
|
|
#3579 | |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4×199 Posts |
Quote:
I did try running two instances on a single 3070 with GPUSieveSize=2047 on a 332M 80->81 assignment: throughput dropped 13% from ~3500 to ~3050.

I took a look at increasing gpu_sieve_size beyond 2047. I don't see much reason why it couldn't be increased beyond that, though it would probably require changing some types passed to the CUDA kernels. I haven't peeked at how it would affect CPU sieving.
|
|
|
|
|
|
#3580 |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4×199 Posts |
And I just caught a 100%. I'm happy with 99% using a single instance.
Code:
$ nvidia-smi
Fri Dec 30 10:05:47 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03 Driver Version: 510.108.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| 86% 75C P2 235W / 270W | 424MiB / 8192MiB | 100% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:02:00.0 Off | N/A |
| 76% 67C P2 234W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:03:00.0 Off | N/A |
| 83% 72C P2 241W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA GeForce ... Off | 00000000:04:00.0 Off | N/A |
| 72% 63C P2 240W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 4 NVIDIA GeForce ... Off | 00000000:08:00.0 Off | N/A |
| 81% 71C P2 236W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 5 NVIDIA GeForce ... Off | 00000000:0B:00.0 Off | N/A |
| 84% 73C P2 243W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 6 NVIDIA GeForce ... Off | 00000000:0C:00.0 Off | N/A |
| 86% 74C P2 239W / 270W | 424MiB / 8192MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 7 NVIDIA GeForce ... Off | 00000000:0D:00.0 Off | N/A |
| 0% 49C P8 26W / 270W | 3MiB / 8192MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 8757 C ./mfaktc.exe 421MiB |
| 1 N/A N/A 8244 C ./mfaktc.exe 421MiB |
| 2 N/A N/A 9980 C ./mfaktc.exe 421MiB |
| 3 N/A N/A 8753 C ./mfaktc.exe 421MiB |
| 4 N/A N/A 8801 C ./mfaktc.exe 421MiB |
| 5 N/A N/A 8808 C ./mfaktc.exe 421MiB |
| 6 N/A N/A 8816 C ./mfaktc.exe 421MiB |
+-----------------------------------------------------------------------------+
|
|
|
|
|
|
#3581 |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4·199 Posts |
Playing around with power limits, I set them to 200 watts (down from 270, though they were using ~245) and lost only ~5% throughput. Each card behaves slightly differently.
I'll gladly save the 400 watts. It's now 4 am. Time to sleep!
Last fiddled with by Mark Rose on 2022-12-30 at 11:08
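A quick Python sketch of the power trade-off, using only the rough figures from this post. It assumes each card actually draws the full 200 W cap after the change and that all 8 cards are loaded. Both are assumptions for the sake of the estimate, and the variable names are just illustrative.

```python
CARDS = 8                     # cards in the machine
WATTS_BEFORE = 245.0          # approximate draw per card under the 270 W limit
WATTS_AFTER = 200.0           # assumes each card draws its full new cap
REL_THROUGHPUT_AFTER = 0.95   # ~5% throughput lost at the lower limit

# Total power saved across the machine.
watts_saved = (WATTS_BEFORE - WATTS_AFTER) * CARDS

# Work done per watt after the change, relative to before.
power_ratio = WATTS_AFTER / WATTS_BEFORE
efficiency_gain = REL_THROUGHPUT_AFTER / power_ratio

print(f"Power saved: {watts_saved:.0f} W")
print(f"Relative work per watt: {efficiency_gain:.2f}x")
```

Under those assumptions the saving comes out in the same ballpark as the ~400 W mentioned, with noticeably more work done per watt.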
|
|
|
|
|
#3582 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
GPUSieveSize feeds an int32 variable. There is some usage in mfaktx of a negative value, so that would need to be split off into a different variable. Or one could partially 64-bit-ize mfaktc. If one wanted to play above exponent 2^32, a more complete 64-bit conversion would be needed, including rewriting the kernels.
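A rough Python sketch of why 2047 looks like the natural ceiling. It assumes (this is an assumption, not checked against the mfaktc source) that the GPUSieveSize setting is in Mbit and is multiplied by 1024*1024 into a bit count held in a signed 32-bit integer. The helper names are made up for illustration.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1
BITS_PER_MBIT = 1024 * 1024  # assumption: setting is in Mbit, stored as bits


def sieve_bits(gpu_sieve_size_mbit: int) -> int:
    """Sieve size in bits for a given GPUSieveSize setting (hypothetical helper)."""
    return gpu_sieve_size_mbit * BITS_PER_MBIT


def fits_int32(value: int) -> bool:
    """True if the value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


for setting in (128, 2047, 2048):
    bits = sieve_bits(setting)
    print(f"GPUSieveSize={setting}: {bits} bits, fits int32: {fits_int32(bits)}")
```

Under that assumption, 2048 is the first setting that overflows a signed 32-bit value. Going unsigned would buy one more doubling, but only after the negative-value usage is moved to its own variable.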
|
|
|
|
|
|
#3583 | |
|
Jul 2003
2^7×5 Posts |
Quote:
I have Ubuntu 22.04 and CUDA Toolkit 12 installed. I compiled mfaktc v0.21 and let the self-tests (-st and -st2) run, and that works. Compiling it in the directory where the libs are does not work (it changes nothing; I also tried with a live CD). I am sorry, but I don't know how to do this (I am not a programmer).
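The libcudart.so.12 error quoted earlier in the thread means the dynamic loader cannot find the CUDA runtime when mfaktc starts, which is a separate matter from compiling. Here is a minimal Python sketch to check whether a library name can be loaded on the current system. `can_load` is a hypothetical helper, and the directory in the hint (/usr/local/cuda/lib64) is only the common default for a toolkit install, so it may differ on your machine.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can find and load the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libcudart.so.12"
    if can_load(name):
        print(f"{name} was found by the dynamic loader.")
    else:
        # Assumption: the toolkit put its libraries in /usr/local/cuda/lib64.
        print(f"{name} was not found. Try adding the directory that contains it")
        print("to LD_LIBRARY_PATH (often /usr/local/cuda/lib64) before running mfaktc.")
```

If the check fails in the same shell where mfaktc fails, the library directory is most likely just missing from the loader's search path.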
|
|
|
|
|
|
#3584 | |
|
"James Heinrich"
May 2004
ex-Northern Ontario
7×13×47 Posts |
Quote:
What would you like me to test?
|
|
|
|
|
|
#3585 |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4·199 Posts |
|
|
|
|
|
|
#3586 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4×3×163 Posts |
Quote:
If at GpuSieveSize=2047, you run two instances (in separate folders, very similar work; similar exponent, same bit level), do you gain combined throughput, or lose? |
|
|
|
|