mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Why is there a noticeable change in P-1 speed between 107M and 109M exponents (https://www.mersenneforum.org/showthread.php?t=26497)

ZFR 2021-02-14 20:00

Why is there a noticeable change in P-1 speed between 107M and 109M exponents
 
I'm using an i7 4790K CPU. I've mostly been doing exponents in the 105M-107M range on 4 workers, getting around 45 ms/iter on each. I've now got 2 new 109M exponents, and the time per iteration has suddenly gone up to about 50 ms/iter.

I'm not really familiar with how FFT multiplication works, but it's been my experience that speed per iteration is more or less proportional to the exponent size. I'm curious what's the reason behind this jump.

[CODE]
[Worker #1 Feb 14 03:52] Iteration: 66110000 / 107300849 [61.61%], ms/iter: 45.124, ETA: 21d 12:18
[Worker #3 Feb 14 03:53] Iteration: 2270000 / 109861267 [2.06%], ms/iter: 50.764, ETA: 63d 05:08
[Worker #2 Feb 14 03:53] Iteration: 72220000 / 107007577 [67.49%], ms/iter: 44.721, ETA: 18d 00:08
[Worker #4 Feb 14 03:54] M109819159 stage 2 is 76.84% complete. Time: 755.476 sec.
[Worker #1 Feb 14 03:59] Iteration: 66120000 / 107300849 [61.62%], ms/iter: 45.038, ETA: 21d 11:11
[Worker #2 Feb 14 04:01] Iteration: 72230000 / 107007577 [67.49%], ms/iter: 44.960, ETA: 18d 02:20
[Worker #3 Feb 14 04:02] Iteration: 2280000 / 109861267 [2.07%], ms/iter: 51.028, ETA: 63d 12:54
[Worker #4 Feb 14 04:07] M109819159 stage 2 is 77.42% complete. Time: 754.166 sec.
[Worker #1 Feb 14 04:07] Iteration: 66130000 / 107300849 [61.63%], ms/iter: 45.155, ETA: 21d 12:24
[Worker #2 Feb 14 04:08] Iteration: 72240000 / 107007577 [67.50%], ms/iter: 44.645, ETA: 17d 23:09
[Worker #3 Feb 14 04:10] Iteration: 2290000 / 109861267 [2.08%], ms/iter: 50.902, ETA: 63d 09:00
[Worker #1 Feb 14 04:15] Iteration: 66140000 / 107300849 [61.63%], ms/iter: 45.368, ETA: 21d 14:43
[Worker #2 Feb 14 04:16] Iteration: 72250000 / 107007577 [67.51%], ms/iter: 44.773, ETA: 18d 00:16
[Worker #3 Feb 14 04:19] Iteration: 2300000 / 109861267 [2.09%], ms/iter: 50.831, ETA: 63d 06:43
[Worker #4 Feb 14 04:19] M109819159 stage 2 is 78.00% complete. Time: 755.928 sec.
[Worker #1 Feb 14 04:22] Iteration: 66150000 / 107300849 [61.64%], ms/iter: 45.075, ETA: 21d 11:14
[Worker #2 Feb 14 04:23] Iteration: 72260000 / 107007577 [67.52%], ms/iter: 44.795, ETA: 18d 00:21
[Worker #3 Feb 14 04:27] Iteration: 2310000 / 109861267 [2.10%], ms/iter: 50.800, ETA: 63d 05:40
[Worker #1 Feb 14 04:30] Iteration: 66160000 / 107300849 [61.65%], ms/iter: 45.155, ETA: 21d 12:01
[Worker #2 Feb 14 04:31] Iteration: 72270000 / 107007577 [67.53%], ms/iter: 44.800, ETA: 18d 00:17
[Worker #4 Feb 14 04:32] M109819159 stage 2 is 78.58% complete. Time: 755.585 sec.
[Worker #3 Feb 14 04:36] Iteration: 2320000 / 109861267 [2.11%], ms/iter: 50.816, ETA: 63d 06:00
[Worker #1 Feb 14 04:37] Iteration: 66170000 / 107300849 [61.66%], ms/iter: 45.225, ETA: 21d 12:42
[Worker #2 Feb 14 04:38] Iteration: 72280000 / 107007577 [67.54%], ms/iter: 44.738, ETA: 17d 23:34
[Worker #3 Feb 14 04:44] Iteration: 2330000 / 109861267 [2.12%], ms/iter: 50.742, ETA: 63d 03:40
[Worker #4 Feb 14 04:44] M109819159 stage 2 is 79.16% complete. Time: 754.739 sec.
[Worker #1 Feb 14 04:45] Iteration: 66180000 / 107300849 [61.67%], ms/iter: 45.128, ETA: 21d 11:28
[Worker #2 Feb 14 04:46] Iteration: 72290000 / 107007577 [67.55%], ms/iter: 44.738, ETA: 17d 23:26
[Worker #1 Feb 14 04:52] Iteration: 66190000 / 107300849 [61.68%], ms/iter: 45.147, ETA: 21d 11:33
[Worker #3 Feb 14 04:53] Iteration: 2340000 / 109861267 [2.12%], ms/iter: 50.832, ETA: 63d 06:12
[Worker #2 Feb 14 04:53] Iteration: 72300000 / 107007577 [67.56%], ms/iter: 44.753, ETA: 17d 23:27
[Worker #4 Feb 14 04:57] M109819159 stage 2 is 79.74% complete. Time: 753.788 sec.
[Worker #1 Feb 14 05:00] Iteration: 66200000 / 107300849 [61.69%], ms/iter: 45.208, ETA: 21d 12:08
[Worker #2 Feb 14 05:01] Iteration: 72310000 / 107007577 [67.57%], ms/iter: 44.702, ETA: 17d 22:50
[Worker #3 Feb 14 05:01] Iteration: 2350000 / 109861267 [2.13%], ms/iter: 51.212, ETA: 63d 17:25
[Worker #1 Feb 14 05:07] Iteration: 66210000 / 107300849 [61.70%], ms/iter: 45.073, ETA: 21d 10:28
[Worker #2 Feb 14 05:08] Iteration: 72320000 / 107007577 [67.58%], ms/iter: 44.755, ETA: 17d 23:13
[Worker #4 Feb 14 05:09] M109819159 stage 2 is 80.32% complete. Time: 753.423 sec.
[Worker #3 Feb 14 05:10] Iteration: 2360000 / 109861267 [2.14%], ms/iter: 50.909, ETA: 63d 08:13
[Worker #1 Feb 14 05:15] Iteration: 66220000 / 107300849 [61.71%], ms/iter: 45.220, ETA: 21d 12:01
[Worker #2 Feb 14 05:16] Iteration: 72330000 / 107007577 [67.59%], ms/iter: 44.699, ETA: 17d 22:34
[Worker #3 Feb 14 05:18] Iteration: 2370000 / 109861267 [2.15%], ms/iter: 50.739, ETA: 63d 02:59
[Worker #4 Feb 14 05:22] M109819159 stage 2 is 80.90% complete. Time: 756.304 sec.
[Worker #1 Feb 14 05:22] Iteration: 66230000 / 107300849 [61.72%], ms/iter: 45.145, ETA: 21d 11:02
[Worker #2 Feb 14 05:23] Iteration: 72340000 / 107007577 [67.60%], ms/iter: 44.772, ETA: 17d 23:08
[Worker #3 Feb 14 05:27] Iteration: 2380000 / 109861267 [2.16%], ms/iter: 50.786, ETA: 63d 04:16
[Worker #1 Feb 14 05:30] Iteration: 66240000 / 107300849 [61.73%], ms/iter: 45.089, ETA: 21d 10:16
[Worker #2 Feb 14 05:31] Iteration: 72350000 / 107007577 [67.61%], ms/iter: 44.787, ETA: 17d 23:10
[Worker #4 Feb 14 05:35] M109819159 stage 2 is 81.48% complete. Time: 755.844 sec.
[Worker #3 Feb 14 05:35] Iteration: 2390000 / 109861267 [2.17%], ms/iter: 50.933, ETA: 63d 08:31
[Worker #1 Feb 14 05:37] Iteration: 66250000 / 107300849 [61.74%], ms/iter: 45.142, ETA: 21d 10:45
[Worker #2 Feb 14 05:38] Iteration: 72360000 / 107007577 [67.62%], ms/iter: 44.667, ETA: 17d 21:53
[Worker #3 Feb 14 05:44] Iteration: 2400000 / 109861267 [2.18%], ms/iter: 50.839, ETA: 63d 05:34
[Worker #1 Feb 14 05:45] Iteration: 66260000 / 107300849 [61.75%], ms/iter: 45.032, ETA: 21d 09:22
[Worker #2 Feb 14 05:46] Iteration: 72370000 / 107007577 [67.63%], ms/iter: 44.751, ETA: 17d 22:34
[Worker #4 Feb 14 05:47] M109819159 stage 2 is 82.06% complete. Time: 754.849 sec.
[Worker #3 Feb 14 05:52] Iteration: 2410000 / 109861267 [2.19%], ms/iter: 50.928, ETA: 63d 08:05
[Worker #1 Feb 14 05:53] Iteration: 66270000 / 107300849 [61.76%], ms/iter: 45.126, ETA: 21d 10:19
[Worker #2 Feb 14 05:53] Iteration: 72380000 / 107007577 [67.64%], ms/iter: 44.719, ETA: 17d 22:08
[/CODE]

EDIT: Worker 3 is the one with the 109M exponent. Worker 4 was still doing P-1 stage 2 on its 109M exponent, but after finishing that, it's also getting 50-51 ms/iter.
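
To put numbers on the jump, here is a small sketch comparing the two ratios. The exponents are taken from the log above; the ms/iter figures are rough averages read off the log by eye, so treat them as approximate.

```python
# Exponents come from the log above; ms/iter values are eyeballed averages.
exp_old, ms_old = 107_300_849, 45.1   # Worker #1
exp_new, ms_new = 109_861_267, 50.8   # Worker #3

exponent_ratio = exp_new / exp_old    # how much larger the new exponent is
time_ratio = ms_new / ms_old          # how much longer each iteration takes

print(f"exponent grew by {100 * (exponent_ratio - 1):.1f}%")
print(f"ms/iter grew by  {100 * (time_ratio - 1):.1f}%")
```

The exponent is only a few percent larger, while the time per iteration is more than ten percent larger, so the slowdown is clearly not proportional to the exponent.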

chalsall 2021-02-14 20:37

[QUOTE=ZFR;571595]I'm not really familiar with how FFT multiplication works, but it's been my experience that speed per iteration is more or less proportional to the exponent size. I'm curious what's the reason behind this jump.[/QUOTE]

That looks like an "FFT size cross-over" to me. The FFT size is picked by a complex function that tries to maximize throughput while keeping errors acceptable.

The size of the FFT chosen *is* a function of the Exponent (possibly different between different versions of the software). But the execution time is not linear against that particular variable.

(Admittedly, I didn't write the code nor understand the Maths as deeply as those who did. So take this statement as subject to correction by those who actually know what they're doing...)
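
A minimal sketch of the stair-step idea, under loudly stated assumptions: the bits-per-word limit, the family of FFT lengths, and the N*log2(N) cost model below are all illustrative guesses, not Prime95's actual tables or selection code.

```python
import math

# Hypothetical cap on the average number of bits stored per FFT word.
# Real software tunes this per FFT length and per CPU.
MAX_BITS_PER_WORD = 18.5

def candidate_fft_lengths(max_len=1 << 24):
    """Assumed family of FFT lengths: k * 2^n for small odd k."""
    lengths = set()
    for k in (1, 3, 5, 7):
        n = k
        while n <= max_len:
            lengths.add(n)
            n *= 2
    return sorted(lengths)

def pick_fft_length(exponent, lengths=None):
    """Smallest candidate length that keeps bits per word under the cap."""
    lengths = lengths or candidate_fft_lengths()
    for n in lengths:
        if exponent / n <= MAX_BITS_PER_WORD:
            return n
    raise ValueError("exponent too large for the candidate lengths")

def relative_cost(n):
    """Textbook FFT cost model; ignores caches and memory bandwidth."""
    return n * math.log2(n)

below = pick_fft_length(96_000_000)   # just under a (made-up) crossover
above = pick_fft_length(97_000_000)   # just over it
print(below, above, relative_cost(above) / relative_cost(below))
```

With these made-up numbers, a 1% increase in the exponent pushes the FFT length up by 20%, and the modeled cost jumps with it. The real crossover points differ, but the shape (flat, then a step) is the same idea.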

Uncwilly 2021-02-14 20:45

Chris likely has the right answer. FFT sizes will stair-step up at some point. One needs to have enough room at the end of the FFT so the noise at the end does not affect the calculations. Once that cushion is eaten into, one needs to go to the next higher FFT size. And because different machines have different size caches, available instructions, etc., their cross-overs will differ. That is why Prime95 will run benchmarks every once in a while.
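
The "cushion" can be seen in a small experiment. The sketch below squares a random number with a floating-point FFT and measures how far the results land from whole numbers as more bits are packed into each word. It is a deliberate simplification (plain zero-padded squaring with a textbook recursive FFT), not how Prime95 does its transforms; the function names and parameters are made up for illustration.

```python
import cmath
import random

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def max_roundoff(bits_per_word, n_words=256, seed=1):
    """Square a number stored as n_words words of bits_per_word bits each.
    Return the largest distance of any result from the nearest integer."""
    rng = random.Random(seed)
    words = [rng.randrange(1 << bits_per_word) for _ in range(n_words)]
    padded = words + [0] * n_words            # zero-pad so nothing wraps
    spectrum = fft([complex(w) for w in padded])
    squared = fft([s * s for s in spectrum], invert=True)
    size = len(padded)
    worst = 0.0
    for v in squared:
        x = v.real / size                     # should be an exact integer
        worst = max(worst, abs(x - round(x)))
    return worst

for bits in (8, 12, 16, 20):
    print(bits, max_roundoff(bits))
```

The exact answers are integers, so anything left over after rounding is floating-point noise. Once that noise gets close to 0.5, rounding can no longer be trusted, and the only fix is fewer bits per word, which means a longer FFT.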

ZFR 2021-02-14 20:55

OK, thanks for the answers. I think I'll have to read up on FFT multiplication some day to understand all this better.

Regarding this:

[QUOTE=Uncwilly;571597]That is why Prime95 will run benchmarks every once in a while.[/QUOTE]

mprime too? Are these done automatically, or is anything needed from me?

chalsall 2021-02-14 20:59

[QUOTE=Uncwilly;571597]Chris likely has the right answer.[/QUOTE]

Just to quickly share...

As a serious dyslexic (and on the ASD) I prefer the word "correct" over "right".

"Turn right here." "Um (while driving), there's no turn to our right." "No, I mean, turn to your left immediately..." :chalsall:

Uncwilly 2021-02-14 22:12

[QUOTE=ZFR;571599]OK, thanks for the answers. I think I'll have to read up on FFT multiplication some day to understand all this better.[/quote][hedge]This may not be current:
Since the software uses the CPU's floating-point unit, there is some noise in the lowest bits of each result. That is OK for most normal calculations, as it is way below what the user needs. But since we are using nearly the entire precision of the floating-point format, we care about the accumulated errors. FFTs can be run using integer math, but that would be slower on the CPU.[/hedge]
[quote]mprime too? Are these done automatically, or is anything needed from me?[/QUOTE]mprime and Prime95 are twins. The only difference is hair and make-up.

Uncwilly 2021-02-14 22:16

[QUOTE=chalsall;571600]As a serious dyslexic (and on the ASD) I prefer the word "correct" over "right".[/QUOTE]Your right I should have said "correct". You're answer was most likely closest to the truth.
[FONT="Fixedsys"][SIZE="1"][COLOR="White"][SUP]And I did intentionally fix up your and you're.[/SUP][/COLOR][/SIZE][/FONT]
:razz:

chalsall 2021-02-14 22:47

[QUOTE=Uncwilly;571605]:razz:[/QUOTE]

Coolness. Although to share, it took some effort to be comfortable with that. :smile:

Batalov 2021-02-15 02:44

[QUOTE=chalsall;571600]"Turn right here." "Um (while driving), there's no turn to our right." "No, I mean, turn to your left [STRIKE]immediately[/STRIKE]..." :chalsall:[/QUOTE]
"No, I mean, turn to your left right now!"
"What, left?"
"Right!!"

retina 2021-02-15 05:26

Little Jimmy at the local aquarium
 
"Look everyone, there are lots to left, and fewer [url=https://en.wikipedia.org/wiki/Right_whale]right whales[/url] on [url=https://en.wikipedia.org/wiki/Left%E2%80%93right_political_spectrum]the right[/url]!"
"I'm sure the proper way to refer to them is as less correct conservative high rollers."

LaurV 2021-02-15 08:25

Pale and fade. All of you. With all your efforts, you don't beat [URL="https://www.youtube.com/watch?v=YC9_Aan_S9Q"]Ali Nadim[/URL]. (Watch the first 2:40 minutes and ignore the subtitles, they are bullshit. I mostly hate when he says "e-squeeze me please lady" and they translate it as "excusing me...".)


All times are UTC.