mersenneforum.org More twin primes below Mersenne exponents than above Mersenne exponents.
 User Name Remember Me? Password
 Register FAQ Search Today's Posts Mark Forums Read

 2021-05-22, 14:57 #23 kriesel     "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 5,783 Posts Thanks for the somewhat supportive response, Batalov. Multiplying iteration count ~p by time per iteration which is dominated by squaring time for large p, t ~ c p2 log p log log p, or closer to quoting, t=(O)p2 log p log log p according to the number theorists for IBDWT and derivation from greats such as Knuth for Schonhage-Strassen ("How fast can we multiply?"). And actual program behavior, for Gpuowl and prime95, fits well to t~c p2.1, over the range ~103M to 1G exponent, the range of greatest current interest for Mersenne prime hunting. (One could argue ~56M-103M is also of some lesser interest, until we're done with verification, in several years.) Of course primality-test time is not a simple well behaved function in reality, because of finite spacing of n-smooth fft length transitions adding some staircase structure to per-iteration timing versus exponent. (n often no more than 7 as in CUDALucas; up to 13 in Gpuowl; sometimes up to 31 as in Mlucas; originally n was 2 in CUDALucas and Gpuowl and prime95 IIRC.) Judiciously or blindly choosing comparison exponents, the staircasing may misleadingly indicate an inferior fit as performing better. (Within a single fft length, it will indicate run time is linear with exponent, as an extreme example.) In practice, with program branches, environmental influences, etc. for a given program, exponent, etc, iteration timings vary during a run. And when errors occur and are detected and responded to, the total number of iterations increases unpredictably. Having a simple to calculate approximation such as c pn for run time scaling is useful at times. It could for example spare some newbies the shock of how long 100Mdigit or larger take compared to 100M, after they've already spent hours or days of hardware time, on a run they haven't the patience or resources to finish. The program authors (Woltman and Preda and Mayer IIRC) have stated that the iteration time difference between computing s2-2 mod Mp (an LL iteration) and s2 mod Mp (a PRP iteration) is negligible. The -2 is a very small part of the carry propagation step for an exponent of relevant size. This is part of why there is not sufficient justification for making a separate PRP benchmarking page in addition to LL for GPUs. Another part is that program efficiency varies considerably versus version and input operands, and also efficiency among program titles: Gpuowl > CUDALucas > CLLucas for example. The reference info collection is indeed large and growing. I now suggest people use their browser's search function to search for likely keywords, especially for those who are less patient. I come at this project from the other extreme. I spent days reading single threads such as the foundational Mfaktc thread showing its development progress from the beginning. Then went back and did it again, taking notes along the way, including the more valuable posts' urls copied and pasted into the notes files. And have revisited it since, such as when preparing https://www.mersenneforum.org/showpo...23&postcount=6. Had such a compendium existed when I started GPU GIMPS computing, I expect I would have read and reread and studied it. The quibble that much reference info is GPU oriented and presumably therefore not relevant or valid for CPU-based GIMPS computing is wrong. CPU, GPU, cloud-based, spreadsheet, or pencil and paper are all bound by intrinsic limits imposed by the same number theory. Last fiddled with by kriesel on 2021-05-22 at 15:52
2021-05-22, 19:17   #24
drkirkby

"David Kirkby"
Jan 2021
Althorne, Essex, UK

1BF16 Posts

Quote:
 Originally Posted by kriesel t ~ c p2.1, as the reference info states multiple places, including this for beginners. (Why do you persist in not using the reference info?)
Can you please help me understand how to interpret "p log p log log p"? I might be putting parentheses in the wrong place, as I'm not getting the expected value. I took took the data you gave on the reference material
https://www.mersenneforum.org/showpost.php?p=523345

~100M is 381.39 GhzDays, and wrote a program in Mathematica to compute the time in GHz days as a function of the exponent. I added a fiddle factor of 0.0710644634190389 so that the program returns 381.39 with an input of 100.

Code:
In[50]:= time2[p_]:=0.0710644634190389 p  Log[p 1000000] Log[Log[p 1000000]]

In[52]:= time2[100]

Out[52]= 381.39

In[53]:= time2[110]

Out[53]= 422.447
The output of 422.447 GHz days for an input of 110 is significantly different to the 482 GHz days stated at https://www.mersenneforum.org/showpost.php?p=523345
(Using the simpler formula, of p^2.1 would give 465.901 GHz days, which is a lot closer to 482.).

I'm probably doing something wrong, or maybe I am expecting too much.

Last fiddled with by drkirkby on 2021-05-22 at 19:28

2021-05-22, 20:26   #25
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

2×32×83 Posts

Quote:
 Originally Posted by drkirkby Can you please help me understand how to interpret "p log p log log p"?
It is:
Code:
f(p)=p*log(p)*log(log(p))

2021-05-22, 21:03   #26
drkirkby

"David Kirkby"
Jan 2021
Althorne, Essex, UK

3×149 Posts

Quote:
 Originally Posted by R. Gerbicz It is: Code: f(p)=p*log(p)*log(log(p))
Thank you. That's what I thought, and that's what I believe my Mathematica code is computing, with the log to the base e. I will have another look tomorrow. Either there's an error somewhere, or I'm expecting too much. The simpler p^2.1 gives a significantly better estimate

Dave

Last fiddled with by drkirkby on 2021-05-22 at 21:06

2021-05-22, 21:18   #27
charybdis

Apr 2020

2·3·83 Posts

Quote:
 Originally Posted by drkirkby Thank you. That's what I thought, and that's what I believe my Mathematica code is computing, with the log to the base e. I will have another look tomorrow. Either there's an error somewhere, or I'm expecting too much. The simpler p^2.1 gives a significantly better estimate
There is an error. Kriesel's reference post states that the effort scales as p log p log log p per iteration, not per test. There are p iterations so the effort per test scales as p^2 log p log log p.

2021-05-22, 22:06   #28
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

10110100101112 Posts

Quote:
 Originally Posted by drkirkby Can you please help me understand how to interpret "p log p log log p"? ... I'm probably doing something wrong, or maybe I am expecting too much.
Yes and yes you seem to be.
1) You're missing a power of ~p. Time per iteration is estimated as c p log p log log p. c * p * ln(p) * ln(ln(p)). Or log2, or log10, but be consistent.
Time for the primality test is time per iteration, times number of iterations (p-2 iterations for LL, p-1 iterations for PRP), leaving aside error checking, occasional saves, other overhead and proof generation. In the range of current interest to GIMPS prime hunting, that's ~p iterations, within ~36 parts per billion or better.
2) benchmarking shows considerable variation. Different fft lengths/smoothnesses are not the same efficiency; experimental error enters into it too; hardware properties such as finite cache sizes, main memory bandwidth, etc.
3) There's no reason to expect a branching program's run time scaling to follow precisely a simple function with well behaved derivatives, and good reason to expect the program to deviate sometimes. Transitioning from one fft length to a larger, and not entirely filling its numerical size capability is one reason.
Back when only power of two fft lengths were available, people avoided running just above the changeover from one length to the next. As a greater variety of fft lengths became available, the effect was diminished, but it's still there; the staircasing I referred to in an earlier post.
Attached Files
 two exponents compared.pdf (24.3 KB, 41 views)

Last fiddled with by kriesel on 2021-05-22 at 22:14

2021-05-22, 22:15   #29
drkirkby

"David Kirkby"
Jan 2021
Althorne, Essex, UK

44710 Posts

Quote:
 Originally Posted by charybdis There is an error. Kriesel's reference post states that the effort scales as p log p log log p per iteration, not per test. There are p iterations so the effort per test scales as p^2 log p log log p.
Thank you. That's much better. The revised estimate of 464.691 GHz days is a lot closer to the 482 stated by Kriesel and on the GIMPS website. The difference is now around 3.5%.

Code:
In[73]:= time3[p_] :=
0.000710644634190389  p^2  Log[1000000 p] Log[Log[1000000 p]]

In[74]:= time3[100]
Out[74]= 381.39

In[75]:= time3[110]
Out[75]= 464.691

 2021-05-23, 08:38 #30 drkirkby   "David Kirkby" Jan 2021 Althorne, Essex, UK 3·149 Posts Thank you kriesel - I did not see your post before posting mine, but charybdis had pointed out my error - the missing p, since the time is per iteration. Might I suggest kriesel, that since that link is aimed at beginners, many of whom might not be mathematicians, that you add a couple of parentheses around the p log p log log p. I was about 90% confident I was interpreting it correctly, but stuck it in Wolfram Alpha for clarification. Then when the results did not seem to make sense, I started to wonder if I was interpreting it correctly. In interesting example is how to calculate a^b^c. If you try that in MATLAB and Mathematica, they give different answers, as MATLAB interprets it as (a^b)^c, but Mathematica interprets at as a^(b^c). From what I understand, Mathematica (and also Python) do it the way mathematicians do, but these are relationships are not well known to people who are not mathematicians. A few parentheses make it clearer. Dave Last fiddled with by drkirkby on 2021-05-23 at 08:39
 2021-05-23, 08:55 #31 Nick     Dec 2012 The Netherlands 5×349 Posts Mathematicians do it that way because (a^b)^c can simply be written as a^(bc) with no need for an extra level!
2021-05-23, 09:50   #32
drkirkby

"David Kirkby"
Jan 2021
Althorne, Essex, UK

3×149 Posts

Quote:
 Originally Posted by Nick Mathematicians do it that way because (a^b)^c can simply be written as a^(bc) with no need for an extra level!
I an understand the logic of mathematicians writing it in the most concise form, but I think it's fair to say that many people, other than mathematicians will often not know how to interpret 2^3^4. Octave, which is a clone of MATLAB, and used a lot in engineering, does it like this
Code:
>> 2^3^4
ans =  4096
>>
(MATLAB would do it exactly the same).
Mathematica computes it like a mathematician.
Code:
In[86]:= 2^3^4
Out[86]= 2417851639229258349412352
Although I don't have a copy, I remember someone saying one of the other maths programs would not accept 2^3^4, but wanted parentheses. I forget what program that was - perhaps Maxima, Macsyma, Maple or one of the other mathematical programs. I don't have any of those programs, so can't verify that.

Dave

Last fiddled with by drkirkby on 2021-05-23 at 09:51

 2021-05-23, 14:12 #33 kriesel     "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 10110100101112 Posts https://www.intmath.com/blog/mathema...-algebra-12416 PEMDAS Innermost parentheses first in case of nested. Top exponent first in case of stacks. Isn't every high school graduate supposed to know this? I'm not keen on turning reference info beginner material into remedial general math. Last fiddled with by kriesel on 2021-05-23 at 14:13

 Thread Tools

 Similar Threads Thread Thread Starter Forum Replies Last Post paulunderwood Miscellaneous Math 15 2016-01-23 20:53 ProximaCentauri Miscellaneous Math 15 2014-12-25 14:26 Lee Yiyuan Miscellaneous Math 60 2011-03-01 12:22 baha2007 Information & Answers 2 2007-12-08 20:12 Xordan Math 4 2007-06-07 16:05

All times are UTC. The time now is 11:17.

Wed Oct 20 11:17:14 UTC 2021 up 89 days, 5:46, 0 users, load averages: 0.84, 1.15, 1.20

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.