2021-05-22, 14:57  #23 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1011111001011_{2} Posts 
Thanks for the somewhat supportive response, Batalov.
Multiplying iteration count ~p by time per iteration, which is dominated by squaring time for large p, gives t ~ c p^{2} log p log log p, or, closer to quoting, t = O(p^{2} log p log log p) according to the number theorists, for IBDWT, with the derivation descending from greats such as Knuth for Schönhage-Strassen ("How fast can we multiply?"). And actual program behavior, for Gpuowl and prime95, fits well to t ~ c p^{2.1} over the range ~103M to 1G exponent, the range of greatest current interest for Mersenne prime hunting. (One could argue ~56M-103M is also of some lesser interest, until we're done with verification, in several years.)

Of course primality-test time is not a simple well-behaved function in reality, because the finite spacing of n-smooth FFT length transitions adds some staircase structure to per-iteration timing versus exponent. (n is often no more than 7, as in CUDALucas; up to 13 in Gpuowl; sometimes up to 31, as in Mlucas; originally n was 2 in CUDALucas and Gpuowl and prime95, IIRC.) Judiciously or blindly choosing comparison exponents, the staircasing may misleadingly indicate an inferior fit as performing better. (Within a single FFT length, it will indicate run time is linear with exponent, as an extreme example.) In practice, with program branches, environmental influences, etc., for a given program, exponent, etc., iteration timings vary during a run. And when errors occur and are detected and responded to, the total number of iterations increases unpredictably.

Having a simple-to-calculate approximation such as c p^{n} for run time scaling is useful at times. It could, for example, spare some newbies the shock of how long 100M-digit or larger runs take compared to 100M, after they've already spent hours or days of hardware time on a run they haven't the patience or resources to finish.

The program authors (Woltman and Preda and Mayer, IIRC) have stated that the iteration time difference between computing s^{2}-2 mod Mp (an LL iteration) and s^{2} mod Mp (a PRP iteration) is negligible. 
The -2 is a very small part of the carry propagation step for an exponent of relevant size. This is part of why there is not sufficient justification for making a separate PRP benchmarking page in addition to LL for GPUs. Another part is that program efficiency varies considerably versus version and input operands, and also among program titles: Gpuowl > CUDALucas > CLLucas, for example.

The reference info collection is indeed large and growing. I now suggest people use their browser's search function to search for likely keywords, especially those who are less patient. I come at this project from the other extreme. I spent days reading single threads, such as the foundational Mfaktc thread showing its development progress from the beginning. Then I went back and did it again, taking notes along the way, including the more valuable posts' URLs copied and pasted into the notes files. And I have revisited it since, such as when preparing https://www.mersenneforum.org/showpo...23&postcount=6. Had such a compendium existed when I started GPU GIMPS computing, I expect I would have read and reread and studied it.

The quibble that much reference info is GPU-oriented and presumably therefore not relevant or valid for CPU-based GIMPS computing is wrong. CPU, GPU, cloud-based, spreadsheet, or pencil and paper are all bound by intrinsic limits imposed by the same number theory.

Last fiddled with by kriesel on 2021-05-22 at 15:52 
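A quick sketch of that "shock": under the empirical t ~ c p^2.1 fit quoted above, the run-time ratio between two exponents follows from the ratio of the exponents alone, since the constant c cancels. The Python below is illustrative only; the ~332.2M figure is roughly the smallest exponent whose Mersenne number has 100M digits.

```python
# Empirical run-time scaling t ~ c * p**2.1; the constant c cancels in ratios.
def runtime_ratio(p_big, p_small, power=2.1):
    """Estimated primality-test run-time ratio between two exponents."""
    return (p_big / p_small) ** power

# Compare a ~100M-digit exponent (~332.2M) against a 100M exponent.
ratio = runtime_ratio(332.2e6, 100e6)
print(f"A 100M-digit test takes roughly {ratio:.1f}x as long as a 100M test")
```

So a newbie moving from a 100M exponent to a 100M-digit one should expect on the order of a dozen times the run time, not a few tens of percent more.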
2021-05-22, 19:17  #24  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Quote:
Per https://www.mersenneforum.org/showpost.php?p=523345, ~100M is 381.39 GHz-days. I wrote a program in Mathematica to compute the time in GHz-days as a function of the exponent. I added a fiddle factor of 0.0710644634190389 so that the program returns 381.39 with an input of 100.

Code:
In[50]:= time2[p_] := 0.0710644634190389 p Log[p 1000000] Log[Log[p 1000000]]

In[52]:= time2[100]
Out[52]= 381.39

In[53]:= time2[110]
Out[53]= 422.447

(Using the simpler formula of p^2.1 would give 465.901 GHz-days, which is a lot closer to 482.) I'm probably doing something wrong, or maybe I am expecting too much.

Last fiddled with by drkirkby on 2021-05-22 at 19:28 

2021-05-22, 20:26  #25 
"Robert Gerbicz"
Oct 2005
Hungary
5FA_{16} Posts 

2021-05-22, 21:03  #26 
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Thank you. That's what I thought, and that's what I believe my Mathematica code is computing, with the log to base e. I will have another look tomorrow. Either there's an error somewhere, or I'm expecting too much. The simpler p^2.1 gives a significantly better estimate.
Dave

Last fiddled with by drkirkby on 2021-05-22 at 21:06 
2021-05-22, 21:18  #27  
Apr 2020
247_{16} Posts 
Quote:


2021-05-22, 22:06  #28  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
6,091 Posts 
Quote:
1) You're missing a power of ~p. Time per iteration is estimated as c p log p log log p, that is, c * p * ln(p) * ln(ln(p)). (Or log2, or log10, but be consistent.) Time for the primality test is time per iteration times number of iterations (p-2 iterations for LL, p-1 iterations for PRP), leaving aside error checking, occasional saves, other overhead, and proof generation. In the range of current interest to GIMPS prime hunting, that's ~p iterations, within ~36 parts per billion or better.

2) Benchmarking shows considerable variation. Different FFT lengths/smoothnesses are not the same efficiency; experimental error enters into it too, as do hardware properties such as finite cache sizes, main memory bandwidth, etc.

3) There's no reason to expect a branching program's run time scaling to follow precisely a simple function with well-behaved derivatives, and good reason to expect the program to deviate sometimes. Transitioning from one FFT length to a larger one, and not entirely filling its numerical size capability, is one reason. Back when only power-of-two FFT lengths were available, people avoided running just above the changeover from one length to the next. As a greater variety of FFT lengths became available, the effect was diminished, but it's still there; this is the staircasing I referred to in an earlier post.

Last fiddled with by kriesel on 2021-05-22 at 22:14 
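The "~p iterations, within ~36 parts per billion" claim in point 1 is easy to check: approximating the LL count p-2 by p is off by 2/p, which is largest at the low end of the exponent ranges mentioned above (~56M). A small Python check, using just those range endpoints:

```python
# Relative error of approximating the LL iteration count (p - 2) by p.
def ll_iter_rel_error(p):
    return (p - (p - 2)) / p   # = 2 / p

# Largest at the low end (~56M); smaller still at 103M and 1G.
for p in (56_000_000, 103_000_000, 1_000_000_000):
    print(f"p = {p:>13,}: {ll_iter_rel_error(p) * 1e9:.1f} parts per billion")
```

At p = 56M the error is about 35.7 ppb, consistent with the "~36 parts per billion or better" figure; for PRP (p-1 iterations) it is half that.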

2021-05-22, 22:15  #29  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
111000000_{2} Posts 
Quote:
Code:
In[73]:= time3[p_] := 0.000710644634190389 p^2 Log[1000000 p] Log[Log[1000000 p]]

In[74]:= time3[100]
Out[74]= 381.39

In[75]:= time3[110]
Out[75]= 464.691 
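For anyone without Mathematica, the corrected estimator ports directly to Python (natural logs, as in Mathematica's Log; p is the exponent in millions; the constant is the fitted fiddle factor from the posts above):

```python
import math

# GHz-days estimate for a primality test, modeling total time as
# c * p^2 * ln(1000000 p) * ln(ln(1000000 p)),
# with c fitted so that time3(100) returns 381.39.
C = 0.000710644634190389

def time3(p_millions):
    p = 1_000_000 * p_millions          # the actual exponent
    return C * p_millions**2 * math.log(p) * math.log(math.log(p))

print(time3(100))   # ~381.39, matching Out[74] above
print(time3(110))   # ~464.691, matching Out[75] above
```

Note the restored factor of p: the per-iteration cost ~ c p ln(p) ln(ln(p)) times ~p iterations gives the p^2 in front.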

2021-05-23, 08:38  #30 
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}·7 Posts 
Thank you kriesel - I did not see your post before posting mine, but charybdis had pointed out my error: the missing p, since the time is per iteration.
Might I suggest, kriesel, that since that link is aimed at beginners, many of whom might not be mathematicians, you add a couple of parentheses around the p log p log log p? I was about 90% confident I was interpreting it correctly, but stuck it in Wolfram Alpha for clarification. Then, when the results did not seem to make sense, I started to wonder if I was interpreting it correctly.

An interesting example is how to calculate a^b^c. If you try that in MATLAB and Mathematica, they give different answers: MATLAB interprets it as (a^b)^c, but Mathematica interprets it as a^(b^c). From what I understand, Mathematica (and also Python) do it the way mathematicians do, but these relationships are not well known to people who are not mathematicians. A few parentheses make it clearer.

Dave

Last fiddled with by drkirkby on 2021-05-23 at 08:39 
2021-05-23, 08:55  #31 
Dec 2012
The Netherlands
3×587 Posts 
Mathematicians do it that way because (a^b)^c can simply be written as a^(bc) with
no need for an extra level! 
2021-05-23, 09:50  #32  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Quote:
Code:
>> 2^3^4
ans =
        4096
>>

Mathematica computes it like a mathematician.

Code:
In[86]:= 2^3^4
Out[86]= 2417851639229258349412352

Dave

Last fiddled with by drkirkby on 2021-05-23 at 09:51 
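The same comparison in Python, whose ** operator is, like Mathematica's ^, right-associative:

```python
# Python's ** is right-associative, matching Mathematica's convention.
right_assoc = 2 ** 3 ** 4      # parsed as 2 ** (3 ** 4) = 2 ** 81
left_assoc = (2 ** 3) ** 4     # the MATLAB-style reading, = 8 ** 4

print(right_assoc)   # 2417851639229258349412352, matching Out[86]
print(left_assoc)    # 4096, matching the MATLAB answer
```

Explicit parentheses make either reading unambiguous, which is the point of the suggestion above.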

2021-05-23, 14:12  #33 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
13713_{8} Posts 
https://www.intmath.com/blog/mathema...algebra12416
PEMDAS. Innermost parentheses first, in case of nesting. Top exponent first, in case of stacks. Isn't every high school graduate supposed to know this? I'm not keen on turning reference-info beginner material into remedial general math.

Last fiddled with by kriesel on 2021-05-23 at 14:13 