[QUOTE=firejuggler;631468]FP64 is what matter.[/QUOTE]For TF it's FP[B]32[/B] ("Single Precision") GFLOPS that matters.
For gpuowl etc. I have low confidence in my ability to predict performance from either FP32 or FP64 theoretical throughput, but TF always translates directly. All the mfaktx performance numbers on [URL="https://www.mersenne.ca/mfaktc.php"]my chart[/URL] are derived from FP32 GFLOPS and a magic multiplier for architecture (generally corresponding to CUDA compute capability for NVIDIA):[code]$TF_GFLOPS_per_GHzDayPerDay = array(
	'N' => array(
		10 =>  0.00,
		11 => 14.00,
		12 => 14.00,
		13 => 14.00,
		20 =>  3.65,
		21 =>  5.35,
		30 => 10.75,
		35 => 11.55,
		37 => 11.05, // Tesla K80 -- single benchmark, note that K80 is dual-GPU model
		50 =>  9.00,
		52 =>  9.00,
		60 =>  9.70, // Tesla P100
		61 =>  7.90,
		70 =>  3.58, // Titan V100 -- only one benchmark so far
		75 =>  3.30, // RTX 20x0
		80 =>  2.90, // A100-SXM4
		86 =>  6.15, // RTX 30?0/A?000
		89 =>  6.35, // RTX 40x0
	),
	'A' => array(
		 1 => 11.3, // VLIW5
		 2 => 11.0, // VLIW4
		10 =>  9.3, // GCN 1.0
		11 =>  9.3, // GCN 1.1
		12 =>  9.3, // GCN 1.2
		13 => 10.9, // GCN 1.3
		14 => 10.9, // GCN 1.3
		15 => 10.8, // GCN 1.5
		20 => 13.0, // RDNA 1 (RX 5700)
		30 => 11.0, // RDNA 2 (RX 6600/6700/6800/6900)
		40 => 15.0, // RDNA 3 (RX 7x00)
	),
	'I' => array(
		10 =>  9.5, // Arc A380, A770
	),
);[/code] |
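To make the table concrete, here is a minimal Python sketch of how such a multiplier could be applied: theoretical FP32 GFLOPS divided by the per-architecture value gives an estimated GHz-days/day. The function names are hypothetical, and the 4864-core count for the RTX 3060 Ti and the "2 FLOPs per core per clock" convention are assumptions on my part, not something stated in the post. Only two table entries are copied, for brevity.

```python
# Hypothetical helper names; only two multipliers are copied from the table above.
TF_GFLOPS_PER_GHZDAY_PER_DAY = {
    ("N", 86): 6.15,  # RTX 30?0/A?000
    ("N", 89): 6.35,  # RTX 40x0
}


def fp32_gflops(shader_cores: int, clock_mhz: float) -> float:
    """Theoretical FP32 GFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return shader_cores * 2 * clock_mhz / 1000.0


def estimate_ghzd_per_day(vendor: str, arch: int, gflops: float) -> float:
    """Estimated TF throughput: GFLOPS divided by the architecture multiplier."""
    return gflops / TF_GFLOPS_PER_GHZDAY_PER_DAY[(vendor, arch)]


# Assumed RTX 3060 Ti figures: 4864 cores at the 1410 MHz stock clock.
gflops = fp32_gflops(4864, 1410)
print(round(estimate_ghzd_per_day("N", 86, gflops)))  # 2230
```

Under those assumptions the result agrees with the 2230 GHzd/d stock figure quoted later in the thread, though the chart's actual calculation may differ in detail.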
[QUOTE=James Heinrich;631470]For TF it's FP[B]32[/B] ("Single Precision") GFLOPS that matters for TF.
[/QUOTE] Neat, thanks for the information, FP32 it is! Sorry if I missed it before, but how do I submit my 3060 Ti benchmark? Mine is averaging around 3000, while the benchmark table shows a 2100 average. EDIT: Never mind, I found it and uploaded my benchmark. |
[QUOTE=Jurzal;631474]how I submit my 3060 Ti benchmark? Mine is averaging around 3000, while benchmark table shows 2100 average.[/QUOTE]That's pretty normal these days. The performance table is based on nominal "stock" clock speeds, while in reality with decent cooling you're likely to see both Boost and any manufacturer overclock on top of that, which is often a significant difference. Your [URL="https://www.mersenne.ca/mfaktc.php?filter=3060+Ti"]3060 Ti[/URL], for example, has a stock clock of [URL="https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3060-3060ti/"]1410[/URL] MHz and your submitted benchmark showed 1965 MHz, nearly 40% higher, so the theoretical 2230 GHzd/d * 1.394 = 3108 GHzd/d, which is in line with your reported performance.
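
The clock scaling above can be written out as a short Python sketch. It assumes TF throughput scales linearly with core clock, which is what the estimate in the post relies on; the function name is hypothetical.

```python
def scale_by_clock(stock_ghzd_per_day: float, stock_mhz: float, actual_mhz: float) -> float:
    """Scale a stock-clock throughput estimate linearly to the observed clock."""
    return stock_ghzd_per_day * actual_mhz / stock_mhz


ratio = 1965 / 1410
print(f"{ratio:.3f}")                            # 1.394
print(round(scale_by_clock(2230, 1410, 1965)))   # 3108
```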
|
[QUOTE=James Heinrich;631486]That's pretty normal these days. The performance table is based on nominal "stock" clockspeeds, while in reality with decent cooling you're likely to see both Boost and any manufacturer overclock on top of that, oftentimes a significant difference. Your [URL="https://www.mersenne.ca/mfaktc.php?filter=3060+Ti"]3060 Ti[/URL], for example, has a stock clock of [URL="https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3060-3060ti/"]1410[/URL] and your submitted benchmark showed 1965, nearly 40% higher, so the theoretical 2230 GHzd/d * 1.394 = 3108 GHd/d which is in line with your reported performance.[/QUOTE]
Thanks for confirming! Nvidia cards respond very well to proper undervolting + overclocking. You can gain about 40% higher performance with roughly 20% lower power consumption: the 1965 MHz clock (up from the 1410 MHz base) runs at 165 W instead of the 200 W default. |
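As a rough check on those numbers, the sketch below computes the implied performance-per-watt gain. It assumes throughput scales linearly with clock and that the stock card would sit at the 1410 MHz base clock while drawing 200 W; in practice a stock card boosts above base, so the real gain is likely smaller. The function name is hypothetical.

```python
def perf_per_watt_gain(stock_mhz: float, stock_watts: float,
                       tuned_mhz: float, tuned_watts: float) -> float:
    """Ratio of tuned to stock performance-per-watt, assuming perf ~ clock."""
    perf_ratio = tuned_mhz / stock_mhz
    power_ratio = tuned_watts / stock_watts
    return perf_ratio / power_ratio


gain = perf_per_watt_gain(1410, 200, 1965, 165)
power_saving = 1 - 165 / 200
print(f"{gain:.2f}x perf/W, {power_saving:.1%} less power")  # 1.69x perf/W, 17.5% less power
```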
[QUOTE=Jurzal;631518]Thanks for confirming!
Nvidia cards respond very well to proper undervolting + overclocking. Can gain 40% higher performance with -20% power consumption reduction. 1965 MHz clock from base 1410 MHz is with 165W power consumption, instead of 200W default.[/QUOTE] Seems strange that they would release cards like that. |
[QUOTE=Jurzal;631518]
Nvidia cards respond very well to proper undervolting + overclocking. Can gain 40% higher performance with -20% power consumption reduction. 1965 MHz clock from base 1410 MHz is with 165W power consumption, instead of 200W default.[/QUOTE] BUT does it compute correctly at that overclock+undervolt? In my experience, especially with TF, it's very easy to overlook incorrect computation: you simply don't find factors, and there is no other indication that the GPU is not working correctly. So if your GPU undervolts+overclocks fantastically, you should spend significant effort making sure it still works correctly before jumping into serious TF. To check, you need to run TF on known factors and verify that the factors are all detected correctly, without exception. |
[QUOTE=preda;631570]To check you need to run known-factors TF and verify that the factors are all detected correctly without exception.[/QUOTE]The easiest way to do this is [c]mfaktc -st2[/c] which will test a large number of known factors of all sizes across multiple kernels and different exponent sizes, and give you confirmation at the end:[code]Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  71bit_mul24        |    2586 |      0
  75bit_mul32        |    2682 |      0
  95bit_mul32        |    2867 |      0
  barrett76_mul32    |    1096 |      0
  barrett77_mul32    |    1114 |      0
  barrett79_mul32    |    1153 |      0
  barrett87_mul32    |    1066 |      0
  barrett88_mul32    |    1069 |      0
  barrett92_mul32    |    1084 |      0
  75bit_mul32_gs     |    2420 |      0
  95bit_mul32_gs     |    2597 |      0
  barrett76_mul32_gs |    1079 |      0
  barrett77_mul32_gs |    1096 |      0
  barrett79_mul32_gs |    1130 |      0
  barrett87_mul32_gs |    1044 |      0
  barrett88_mul32_gs |    1047 |      0
  barrett92_mul32_gs |    1062 |      0

selftest PASSED![/code]
(there is also the less-extensive [c]-st[/c] test which does the same thing, just less of it, since -st2 can easily take several hours on a slower gpu) |
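If you want to check a saved selftest log automatically rather than by eye, something like the following Python sketch could work. It only assumes the text layout shown in the output above (one kernel per line, pipe-separated success and fail counts); the function name is hypothetical, the sample is abbreviated, and how you capture the output to a string is left to you.

```python
import re

# Abbreviated sample in the layout shown above (not a complete run).
SAMPLE = """Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  71bit_mul24        |    2586 |      0
  barrett76_mul32    |    1096 |      0

selftest PASSED!
"""


def selftest_ok(output: str) -> bool:
    """Return True only if the selftest text reports no failures at all."""
    total = re.search(r"number of tests\s+(\d+)", output)
    good = re.search(r"successfull? tests\s+(\d+)", output)
    if not total or not good or total.group(1) != good.group(1):
        return False
    # Kernel rows look like "barrett76_mul32 | 1096 | 0"; the header row and
    # the dashed separator do not match because they lack numeric columns.
    rows = re.findall(r"^\s*(.+?)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*$", output, re.MULTILINE)
    if not rows:
        return False
    if any(int(fail) != 0 for _name, _ok, fail in rows):
        return False
    return "selftest PASSED" in output


print(selftest_ok(SAMPLE))  # True
```

The check is deliberately strict: a missing summary line, any non-zero fail count, or a missing PASSED line all count as failure, since a silently miscomputing GPU is exactly the case being guarded against.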
[Code]
Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  71bit_mul24        |    2586 |      0
  75bit_mul32        |    2682 |      0
  95bit_mul32        |    2867 |      0
  barrett76_mul32    |    1096 |      0
  barrett77_mul32    |    1114 |      0
  barrett79_mul32    |    1153 |      0
  barrett87_mul32    |    1066 |      0
  barrett88_mul32    |    1069 |      0
  barrett92_mul32    |    1084 |      0
  75bit_mul32_gs     |    2420 |      0
  95bit_mul32_gs     |    2597 |      0
  barrett76_mul32_gs |    1079 |      0
  barrett77_mul32_gs |    1096 |      0
  barrett79_mul32_gs |    1130 |      0
  barrett87_mul32_gs |    1044 |      0
  barrett88_mul32_gs |    1047 |      0
  barrett92_mul32_gs |    1062 |      0

selftest PASSED![/Code] |
2 Attachment(s)
I have completed 4001 assignments and found 52 factors.
Mostly at the wavefront, 75-77 bits. I added screenshots of my GPU72 config. I will run a recently tested known factor, 168785003 at 76-77 bits, to see if I match it. |
1 Attachment(s)
[QUOTE=preda;631570]BUT does it compute correctly at that overclock+undervolt?
In my experience, expecially with TF, it's very easy to overlook wrong compute. You simply don't find factors, and there is no other indication that the GPU is not working correctly. So if your GPU undervolts+overclocks fantastically, you should spend a significant effort making sure it still works correctly before jumping into serious TF. To check you need to run known-factors TF and verify that the factors are all detected correctly without exception.[/QUOTE] Test successful, factor found. This GPU goes up to 2175 MHz with 1.081 V on the core, or 1965 MHz with 0.925 V on the core, and that is a safe margin, since it can also do it at 0.900 V. I know my overclocks. Cheers. |