#3335 | |
Random Account
Aug 2009
U.S.A.
2×3×13×23 Posts |
Quote:
Edit: I found it in the Linux archive linked above. The settings were mostly the same as those quoted above. The best 0.1% time was 1.876 seconds (3349 GHz-d/day).
Last fiddled with by storm5510 on 2020-09-28 at 14:20
#3336 | |
Dec 2018
China
23·5 Posts |
Quote:
I borrowed a machine with four RTX 3090s and found something quite strange: GPU utilization cannot reach 100%, even though I am running a single mfaktc instance on a single GPU. I have uploaded 2 results and may try gpuowl later.
#3337 | |
Dec 2018
China
23·5 Posts |
Quote:
All the results generated by mfaktc are here: results.7z
#3338 |
Jul 2009
Germany
547₁₀ Posts |
Please run a short gpuowl benchmark with the exponent 77936867 so that we can compare the graphics cards' values directly. Thank you.
https://mersenneforum.org/showthread.php?p=558317#post558317
Last fiddled with by moebius on 2020-10-01 at 07:52
#3339 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
4845₁₀ Posts |
I would expect that. Fast GPUs typically need multiple instances, as well as a large gpusieveprimes value and other tuning. My approach is to tune a single instance first, then test performance against the number of tuned instances. The effect seems to be stronger the faster the GPU is. A solid-state disk or ramdisk might also help.
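The "test performance versus number of instances" step comes down to summing the GHz-d/day reported by each concurrently running instance and picking the count with the highest total. A minimal Python sketch of that comparison follows; the function name and all the numbers are hypothetical placeholders, not measurements from this thread.

```python
def best_instance_count(per_instance_rates):
    """Return (instance_count, total_rate) with the highest aggregate throughput.

    per_instance_rates maps an instance count to the list of GHz-d/day
    figures reported by each instance while that many ran concurrently.
    """
    totals = {n: sum(rates) for n, rates in per_instance_rates.items()}
    # On a tie in total throughput, prefer the smaller number of instances.
    best = max(totals, key=lambda n: (totals[n], -n))
    return best, totals[best]


# Hypothetical measurements, NOT taken from this thread.
measured = {
    1: [2700.0],
    2: [1600.0, 1590.0],
    3: [1100.0, 1095.0, 1090.0],
}

if __name__ == "__main__":
    count, total = best_instance_count(measured)
    print(f"{count} instance(s): {total:.0f} GHz-d/day in total")
```

Each instance should already be tuned before its rate is recorded, otherwise the comparison mixes tuning effects with instance-count effects.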
#3340 |
"James Heinrich"
May 2004
ex-Northern Ontario
3×23×47 Posts |
This is normal on high-performance GPUs. A 1080 will get to about 95%, a 2080 to about 80% (apparently the same for 30x0). The GPU is just too fast; the little bit of work the CPU does can't keep up. In production use, running two instances of mfaktc should allow optimal throughput (splitting the CPU load across two cores).
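One way to run two instances on the same GPU is to give each its own working directory, with its own copy of the binary, mfaktc.ini and worktodo.txt, and start both. The Python sketch below only plans and prints the commands. The directory names, the `mfaktc.exe` binary name and the use of `-d` to select the device are assumptions to check against your own build's help output.

```python
from pathlib import Path
import subprocess


def plan_instances(base_dir, count, device=0, exe_name="mfaktc.exe"):
    """Return one (working_directory, command) pair per instance.

    Assumed layout: base_dir/instance0, base_dir/instance1, ... each holding
    its own copy of the binary plus its own mfaktc.ini and worktodo.txt.
    """
    plans = []
    for i in range(count):
        workdir = Path(base_dir) / f"instance{i}"
        exe = (workdir / exe_name).resolve()
        # "-d <device>" is assumed to select the GPU; verify for your build.
        plans.append((workdir, [str(exe), "-d", str(device)]))
    return plans


def launch(plans):
    """Start every planned instance, each inside its own directory."""
    return [subprocess.Popen(cmd, cwd=workdir) for workdir, cmd in plans]


if __name__ == "__main__":
    # Print the plan only; call launch(...) yourself once the directories exist.
    for workdir, cmd in plan_instances("mfaktc-runs", 2):
        print(workdir, "->", " ".join(cmd))
```

Separate directories keep the instances from sharing work and checkpoint files.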
#3341 | |
Random Account
Aug 2009
U.S.A.
2×3×13×23 Posts |
Quote:
There is a solution for this. I do not know about Linux, but on Windows it is possible to set the minimum CPU speed as a percentage of its capability. The default minimum is something like 5%, and at that setting the CPU does not respond much to a quick pulse of load. Set it to 85%, for example, and it will respond much faster even with no load. I noticed that when I have Prime95 running, my GPU performance with mfaktc increased considerably.
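The Windows setting described here appears to be the power plan's "Minimum processor state". As a hedged sketch, the Python below builds the `powercfg` commands that would set it on AC power for the active plan. The function name is hypothetical, and the `SCHEME_CURRENT`, `SUB_PROCESSOR` and `PROCTHROTTLEMIN` aliases should be confirmed with `powercfg /aliases` on your system. The script prints the commands rather than running them.

```python
def min_processor_state_commands(percent):
    """Build powercfg commands that set the AC minimum processor state.

    The alias names are assumed to match `powercfg /aliases` on the target
    machine; nothing is executed here.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMIN", str(percent)],
        # Re-activate the current scheme so the new value takes effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


if __name__ == "__main__":
    for cmd in min_processor_state_commands(85):
        print(" ".join(cmd))
```

Running the printed commands from an elevated prompt on Windows is left as a manual step, so the setting can be reviewed first.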
#3342 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×5×17×19 Posts |
Quote:
Needing multiple instances for full performance is typical for GPUs faster than roughly a GTX 1050 Ti, even with prime95 fairly fully utilizing the CPU, CPU clock rates kept at the highest sustainable levels, and on-GPU sieving enabled in mfaktc. The faster the GPU, the more it matters. Two instances do a pretty good job on some GPU models; I use 3 instances to get the most throughput from RTX 2080x cards. So it's no surprise that the RTX 3080 is underutilized with a single instance. Also, was the RTX 3080 mfaktc test thoroughly tuned?

I think the lower than 100% utilization in mfaktc has to do with the time spent saving checkpoint files and generating console output, and with activities that may be limited by PCIe bandwidth. Running multiple instances lets the GPU work on one instance while another is waiting for the CPU side of mfaktc and the OS to get things done, and for occasional communication across PCIe to occur.

For comparison, a GTX 1080 Ti shows 98% utilization in gpuowl with one instance. Mfakto shows much less effect of tuning than mfaktc at equivalent GPU speed, so maybe it has to do with CUDA call overhead.

For more, see the detailed mfaktc tune analyses on GTX 1080 Ti and RTX 2080 Super here. I saw 90% utilization with a gpusievesize of 256 on the RTX 2080 Super, but a gpusievesize of 2047 and a good tune otherwise boosted it a lot.
Last fiddled with by kriesel on 2020-10-01 at 17:29
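The gpusievesize change from 256 to 2047 is a one-line edit in mfaktc.ini. The Python sketch below rewrites a `Key=Value` line in the file's text. It assumes the key is spelled `GPUSieveSize` in the ini file and that the build in use accepts 2047 (such as the archive whose name ends in gpusievesize2047). The sample ini text is hypothetical.

```python
import re


def set_ini_value(ini_text, key, value):
    """Return ini_text with `key=value` set, appending the line if absent.

    Only uncommented `key=...` lines are rewritten; comment lines that
    start with '#' are left untouched.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", ini_text)
    if count == 0:
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += f"{key}={value}\n"
    return new_text


# Hypothetical ini fragment; the key names are assumptions about mfaktc.ini.
sample = "# on-GPU sieving\nSieveOnGPU=1\nGPUSieveSize=256\nGPUSievePrimes=82486\n"

if __name__ == "__main__":
    print(set_ini_value(sample, "GPUSieveSize", 2047), end="")
```

Changing one value at a time and re-measuring keeps the effect of each tuning parameter attributable.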
#3343 |
Random Account
Aug 2009
U.S.A.
1794₁₀ Posts |
I just began using Ubuntu 20.04 LTS. The archive mfaktc-0.21-linux64.cuda10.1-gpusievesize2047.tar.gz does not contain the libraries needed to run. Where can I find them?
#3344 | |
"Viliam Furík"
Jul 2018
Martin, Slovakia
7·47 Posts |
Quote: