Relative performance of GPUs for P1
I realize P1 as a separate task is discontinued ... however ...
I am still running the version that allows it. Does it seem reasonable that, for the various Colab GPUs available, I am seeing these relative Stage1 iteration times (based on my specific B1, but still relative)?
P4: 3,600
T4: 2,630
K80: 1,800
P100: 470 (yes, 4 to 8 times faster) |
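As a quick check of the "4 to 8 times faster" remark, the ratios can be worked out directly from the four numbers in the post. This is only a minimal sketch: the units are not stated in the post, and the names `iteration_times` and `speedup_over` are made up for illustration.

```python
# Relative Stage1 iteration times as quoted in the post (units not stated there).
iteration_times = {"P4": 3600, "T4": 2630, "K80": 1800, "P100": 470}


def speedup_over(times, baseline):
    """Return how many times faster `baseline` is than each other GPU."""
    base = times[baseline]
    return {gpu: t / base for gpu, t in times.items() if gpu != baseline}


speedups = speedup_over(iteration_times, "P100")

# Print from the smallest to the largest speedup.
for gpu, factor in sorted(speedups.items(), key=lambda kv: kv[1]):
    print(f"P100 is {factor:.1f}x faster than {gpu}")
```

On these figures the P100 comes out about 3.8x faster than the K80, 5.6x faster than the T4 and 7.7x faster than the P4, which is in line with the "4 to 8 times" estimate.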
[QUOTE=petrw1;560015]I realize P1 as a separate task is discontinued ... however ...
I am still running the version that allows it: Does it seem reasonable that for the various Colab GPUs available I am seeing relative Stage1 iteration times of (based on my specific B1 but still relative): P4: 3,600 T4: 2,630 K80: 1,800 P100: 470 (yes 4 to 8 times faster)[/QUOTE]
Yes, these times make perfect sense. Neither the P4 nor the T4 has many FP64 cores, and those cores are essential for Stage1 performance. Their specs are fairly close, but since the T4 is newer, with faster memory and a few other improvements, it should be faster than the P4. Even though the K80 is quite old, it still has decent FP64 performance AND it has 2 GPUs. The P100 has lots of FP64 cores, so it yields the best performance. AFAIK the P4 and T4 are touted as being designed explicitly for training AIs, since that does not require high-precision computations. |
[QUOTE=tServo;560044]Even though the K80 is quite old, it still has decent FP64 performance AND it has 2 GPUs.[/QUOTE]
The K80 itself has two GPUs on the card, but only one is given to each VM. |
[QUOTE=petrw1;560015]I realize P1 as a separate task is discontinued ... however ...
I am still running the version that allows it: Does it seem reasonable that for the various Colab GPUs available I am seeing relative Stage1 iteration times of (based on my specific B1 but still relative): P4: 3,600 T4: 2,630 K80: 1,800 P100: 470 (yes 4 to 8 times faster)[/QUOTE]
us/iteration for ~100M exponents? The time required for any FFT-based multiplication mod m is strongly related to log2(m); it scales roughly as p[SUP]1.1[/SUP] for a Mersenne number m=2[SUP]p[/SUP]-1. Some data for Colab GPUs is at [url]https://www.mersenneforum.org/showpost.php?p=533245&postcount=15[/url], showing that the P4 & T4 have a DP/SP throughput ratio of 1/32, which makes them better suited to TF and not well suited to LL, PRP, or P-1. |
[QUOTE=kriesel;560057]us/iteration for ~100M exponents?[/QUOTE]
44.6M B1=1,250,000 B2=25,000,000 |
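To relate the 44.6M exponent here to the ~100M case asked about above, the rough p[SUP]1.1[/SUP] rule from the earlier reply can be applied directly. This is a back-of-the-envelope sketch only: the function name `scale_iteration_time` is hypothetical, the 1.1 exponent is the approximate figure given in the thread, and real timings change in steps as the FFT length changes.

```python
def scale_iteration_time(time_at_p, p_from, p_to, exponent=1.1):
    """Estimate per-iteration time at exponent p_to from a measurement at p_from.

    Assumes time grows roughly as p**exponent, the approximate rule quoted
    in the thread; this is not a measured or guaranteed relationship.
    """
    return time_at_p * (p_to / p_from) ** exponent


# Scale factor from a 44.6M exponent to a 100M exponent (unit time in, so the
# result is a pure multiplier that applies to any of the four GPU figures).
factor = scale_iteration_time(1.0, 44.6e6, 100e6)
print(f"Estimated slowdown going from 44.6M to 100M: {factor:.2f}x")
```

Under that assumption an iteration at ~100M would take roughly 2.4 times as long as one at 44.6M, on any of the GPUs listed.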