|
|
#1 |
|
Mar 2013
Dallas, TX
37 Posts |
On my new system, a Ryzen 7950X with an RTX 3070 Ti GPU, I noticed that trial factoring throughput on the GPU drops from about 3500 GHz-D/D to about 2900 GHz-D/D when I start Prime95 on the CPU and fully load it.
Is this normal? Is there a way to mitigate it? What causes this? Thanks |
|
|
|
|
|
#2 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7,823 Posts |
What does task manager in Windows or top in Linux say about the TF process's CPU usage?
Hopefully you are using GPU sieving not CPU sieving. In that case CPU usage of mfaktc should be quite low. I see ~0% CPU use on mfaktc on an RTX2080 Super; less than 12 logical-core-minutes accumulated in 4 days+ on an old dual-4-core & x2HT Xeon E5520 system; <0.21% of one logical core of the 16. I'm guessing it would be even lower with a new Ryzen 7950x. That's with 1/20 the default checkpoint saving frequency in mfaktc.ini: Code:
# CheckpointDelay is the time in seconds between two checkpoint writes.
# Allowed values are 0 <= CheckpointDelay <= 900.
#
# Minimum: CheckpointDelay=0
# Maximum: CheckpointDelay=900
#
# Default: CheckpointDelay=30

CheckpointDelay=600

You could try running less-classes, or higher bit levels, or both, which take longer per class. Tuning the TF app, if you haven't yet, could help. Using a version allowing GPUSieveSize up to 2047M helps performance. See also the mfaktc reference thread, especially the posts on tuning: https://www.mersenneforum.org/showthread.php?t=23386

Another possibility is indirect thermal coupling. Is the case well ventilated, with all fans functioning well?

Last fiddled with by kriesel on 2023-03-28 at 16:50 |
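As a rough check of the CPU-usage figure quoted above, here is a minimal Python sketch. It assumes "12 logical-core-minutes accumulated in 4 days" means the accumulated CPU time of the mfaktc process over 4 days of wall-clock time; the function name is only illustrative.

```python
def cpu_share_percent(cpu_minutes, wall_days):
    """Percent of one logical core used, given accumulated CPU time
    (in minutes) and elapsed wall-clock time (in days)."""
    wall_minutes = wall_days * 24 * 60
    return 100.0 * cpu_minutes / wall_minutes


# Figures from the post: under 12 core-minutes over 4+ days.
share = cpu_share_percent(12, 4)
print(f"{share:.2f}% of one logical core")
```

With those inputs the result is about 0.21% of one logical core, which matches the upper bound given in the post.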
|
|
|
|
|
#3 | |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4·199 Posts |
Quote:
|
|
|
|
|
|
|
#4 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7,823 Posts |
Quote:
Another way to go is to run two mfaktc instances on the same GPU, from different folders. While one waits for the CPU, the other can keep the GPU busy. A brief test on a dual E5-2670 system, running prime95 P-1 on all cores but without using hyperthreading, shows only a very slight impact on mmff 0.28 (which is similar to mfaktc and derived from it) on a GTX1650. At MM127 TF, stopping all prime95 workers provided only a ~0.23% increase in GPU throughput, not far above measurement noise and normal fluctuation. Each output line represents the completion of one factor class. Code:
[Mar 28 10:44] M127 [186-187]: 27.9% 1288/4620,268/960 | n.a. | 3107.8s | 24d21h | 1082.25G | 348.24M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
[Mar 28 11:36] M127 [186-187]: 28.0% 1292/4620,269/960 | n.a. | 3108.6s | 24d20h | 1082.25G | 348.14M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
(stop prime95 workers at 11:29)
[Mar 28 12:27] M127 [186-187]: 28.1% 1297/4620,270/960 | n.a. | 3101.3s | 24d18h | 1082.25G | 348.97M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
[Mar 28 13:19] M127 [186-187]: 28.2% 1304/4620,271/960 | n.a. | 3100.6s | 24d17h | 1082.25G | 349.05M/s | 1050165 | n.a.% | kriesel@emu-gtx1650

Last fiddled with by kriesel on 2023-03-28 at 19:05 |
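The size of the change can be checked directly from the log lines above. The following is a minimal Python sketch, not part of mmff or mfaktc. It assumes the rate is the field ending in `M/s`, and it treats the first two lines as "prime95 running" and the last two as "prime95 stopped". That split is only approximate, since the workers were stopped a few minutes before the 11:36 class finished.

```python
import re

LOG = """\
[Mar 28 10:44] M127 [186-187]: 27.9% 1288/4620,268/960 | n.a. | 3107.8s | 24d21h | 1082.25G | 348.24M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
[Mar 28 11:36] M127 [186-187]: 28.0% 1292/4620,269/960 | n.a. | 3108.6s | 24d20h | 1082.25G | 348.14M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
(stop prime95 workers at 11:29)
[Mar 28 12:27] M127 [186-187]: 28.1% 1297/4620,270/960 | n.a. | 3101.3s | 24d18h | 1082.25G | 348.97M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
[Mar 28 13:19] M127 [186-187]: 28.2% 1304/4620,271/960 | n.a. | 3100.6s | 24d17h | 1082.25G | 349.05M/s | 1050165 | n.a.% | kriesel@emu-gtx1650
"""


def rates(log):
    """Extract the per-class rate (the field ending in 'M/s') from each log line."""
    return [float(m) for m in re.findall(r"\|\s*([\d.]+)M/s", log)]


def mean(values):
    return sum(values) / len(values)


r = rates(LOG)
before, after = r[:2], r[2:]  # assumed split: prime95 running vs. stopped
gain = 100.0 * (mean(after) / mean(before) - 1.0)
print(f"throughput change after stopping prime95: {gain:.2f}%")
```

The result lands between 0.2% and 0.3%, in line with the ~0.23% quoted in the post.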
|
|
|
|
|
|
#5 | |
|
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2^4×199 Posts |
Quote:
With my 1070s, the impact of running mprime with no hyperthreading is minimal. With my 3070s, GPUSieveSize=128, each was taking 10% of a core (including system usage in the nvidia driver). With GPUSieveSize=2047, they still use about 1% each. On my system with the 3070s I have mprime running on a single core, leaving the other free. |
|
|
|
|
|
|
#6 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7,823 Posts |
Quote:
mfaktc throughput per https://www.mersenne.ca/mfaktc.php is 3076 GHz-D/D for an RTX 3070 Ti and 806 GHz-D/D for a GTX 1650. Further, allow 2:1 for mfaktc versus mmff, so 3076 * 2 / 806 = 7.63:1. That scales the 0.21% for the GTX1650 in mmff to 1.6% of a CPU core for an RTX 3070 Ti in mfaktc, which is still quite low CPU core utilization. And that pessimistically assumes a Ryzen 7950X core is only equal to an old Xeon E5-2670 core, which seems unlikely. I don't know what your CPU model is, but the ~1% is comparable to the 1.6% estimated above. The order-of-magnitude reduction in CPU overhead with proper GPU app tuning is useful data. Thanks for that.

Last fiddled with by kriesel on 2023-03-28 at 21:00 |
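The scaling estimate above can be reproduced in a few lines of Python. All inputs are the figures quoted in the post. The variable names are only illustrative, and the 2:1 mfaktc/mmff allowance is the poster's own assumption.

```python
# Benchmark figures quoted in the post (mersenne.ca mfaktc table).
RTX_3070_TI_RATE = 3076.0   # mfaktc throughput, RTX 3070 Ti
GTX_1650_RATE = 806.0       # mfaktc throughput, GTX 1650
MFAKTC_TO_MMFF = 2.0        # poster's allowance for mfaktc vs. mmff
BASELINE_CPU_PERCENT = 0.21 # percent of one core, baseline from the post

# Scale the baseline CPU use by the relative GPU throughput.
scale = RTX_3070_TI_RATE * MFAKTC_TO_MMFF / GTX_1650_RATE
estimated = BASELINE_CPU_PERCENT * scale
print(f"scale {scale:.2f}:1, estimated CPU use {estimated:.1f}% of one core")
```

This prints a scale of 7.63:1 and an estimate of 1.6% of one core, the same numbers as in the post.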
|
|
|
|
|
|
#7 | |
|
Mar 2013
Dallas, TX
37 Posts |
Quote:
|
|
|
|
|
|
|
#8 | |
|
Jul 2009
Germany
2×353 Posts |
Quote:
|
|
|
|
|