Time for a new entry in this data series for the just-discovered 49th M-prime. Using the classic B-smoothness measure and the L2-smoothness measure I defined in post #32, we have

p = 74207281: p - 1 = 2^4 * 3 * 5 * 7 * 44171. (For comparison, p + 1 = 2 * 107 * 346763.)

Compared to a sample of over 1000 of its peers (primes in [p-10^4, p+10^4]; cf. attachment #2),

B-smoothness: 74207281 is 574 of 1146, percentile = 50.00

L2-smoothness: 74207281 is 180 of 1146, percentile = 84.38

Since B-smoothness only cares about the largest prime factor - here 44171 is ~60% the size of 74207280, logarithmically speaking - we land smack in the middle of the sample according to that metric. L2-smoothness includes all the factors, so there the small factors boost the resulting percentile.
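
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It factors p - 1 by trial division and computes the log-size of the largest prime factor relative to p - 1. The function names are my own, and the percentile formula, 100*(N - rank + 1)/N, is only an assumption inferred from the two rank/percentile pairs quoted above; it is not taken from the original scripts. The L2-smoothness measure itself is defined in post #32 and is not reproduced here.

```python
import math


def factorize(n):
    """Trial-division factorization; returns a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever is left after removing all factors <= sqrt(n) is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


def log_ratio_of_largest_factor(n):
    """log(largest prime factor of n) / log(n)."""
    return math.log(max(factorize(n))) / math.log(n)


def percentile_from_rank(rank, sample_size):
    """ASSUMED convention (inferred from the quoted numbers): rank 1 is smoothest."""
    return 100.0 * (sample_size - rank + 1) / sample_size


p = 74207281
pm1_factors = factorize(p - 1)
pm1_ratio = log_ratio_of_largest_factor(p - 1)
b_percentile = percentile_from_rank(574, 1146)
l2_percentile = percentile_from_rank(180, 1146)

print("p - 1 =", pm1_factors)
print("log-size of largest factor: %.3f" % pm1_ratio)
print("B  percentile: %.2f" % b_percentile)
print("L2 percentile: %.2f" % l2_percentile)
```

The log ratio comes out just under 0.6, which is where the "~60%" figure above comes from.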

-------------------

Further: With a view towards large-exponent asymptotics I did various best-fit experiments, starting with the full 49-point 'knowns' dataset and truncating various chunks at the low end. Here x is the index of the M(p) (in size-sorted order rather than by discovery date, obviously) and y = log2(p). As my comment notes, the final sample of just the largest 9 M(p)s leads to a big shift in the fit-line:

Code:

[1] Least-squares of full 49-point dataset gives slope = 0.5465, y-intercept = 1.1208
[2] Omitting 10 smallest M(p): Sample size = 39, xavg = 30.0000, yavg = 17.6238
Least-squares omitting 10 smallest M(p) gives slope = 0.5252, y-intercept = 1.8685
[3] Omitting 20 smallest M(p): Sample size = 29, xavg = 35.0000, yavg = 20.2390
Least-squares omitting 20 smallest M(p) gives slope = 0.5240, y-intercept = 1.8977
[4] Omitting 30 smallest M(p): Sample size = 19, xavg = 40.0000, yavg = 23.0612
Least-squares omitting 30 smallest M(p) gives slope = 0.4351, y-intercept = 5.6575
[5] Omitting 40 smallest M(p): Sample size = 9, xavg = 45.0000, yavg = 25.1945
Least-squares omitting 40 smallest M(p) gives slope = 0.1895, y-intercept = 16.6660 <*** Holy crap! ***
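
The fits above can be re-derived with a short script. What follows is a sketch rather than the code that produced the listing: it uses the standard closed-form least-squares formulas, with x the size-sorted index and y = log2(p) as described above. The exponent list is the well-known list of the 49 Mersenne-prime exponents known as of this discovery, and the function names are mine.

```python
import math

# Exponents p of the 49 known Mersenne primes M(p), in size-sorted order.
EXPONENTS = [
    2, 3, 5, 7, 13, 17, 19, 31, 61, 89,
    107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423,
    9689, 9941, 11213, 19937, 21701, 23209, 44497, 86243, 110503, 132049,
    216091, 756839, 859433, 1257787, 1398269, 2976221, 3021377, 6972593,
    13466917, 20996011,
    24036583, 25964951, 30402457, 32582657, 37156667, 42643801, 43112609,
    57885161, 74207281,
]


def least_squares(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, xavg, yavg)."""
    n = len(xs)
    xavg = sum(xs) / n
    yavg = sum(ys) / n
    sxx = sum((x - xavg) ** 2 for x in xs)
    sxy = sum((x - xavg) * (y - yavg) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, yavg - slope * xavg, xavg, yavg


def fit_omitting(k):
    """Fit y = log2(p) against the 1-based index x, dropping the k smallest M(p)."""
    xs = list(range(k + 1, len(EXPONENTS) + 1))
    ys = [math.log2(p) for p in EXPONENTS[k:]]
    return least_squares(xs, ys)


fits = {k: fit_omitting(k) for k in (0, 10, 20, 30, 40)}
for k, (slope, intercept, xavg, yavg) in fits.items():
    print("omit %2d: n = %2d, xavg = %.4f, yavg = %.4f, slope = %.4f, y-intercept = %.4f"
          % (k, len(EXPONENTS) - k, xavg, yavg, slope, intercept))
```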

Using the 5 distinct regressions to predict both the 49th and the 50th M-prime we get a wide range of estimates:

Code:

E.g. using bc -l:
l2 = l(2)
a = 0.1895; b = 16.6660
x=49;lgp=a*x+b;e(lgp*l2)
x=50;lgp=a*x+b;e(lgp*l2)
[1] p49 ~= 250337642; p50 ~= 365627666
[2] p49 ~= 203901903; p50 ~= 293441972
[3] p49 ~= 199761040; p50 ~= 287243698
[4] p49 ~= 132131573; p50 ~= 178642487
[5] p49 ~= 64890322; p50 ~= 73998875
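
For those without bc handy, the same extrapolation can be sketched in Python. It plugs the rounded slope/intercept pairs from the listing above into p = 2^(slope*x + intercept) for x = 49 and x = 50. The dictionary and function names are mine, not part of the original workflow.

```python
# (slope, y-intercept) pairs copied from the five regressions listed above.
FITS = {
    1: (0.5465, 1.1208),
    2: (0.5252, 1.8685),
    3: (0.5240, 1.8977),
    4: (0.4351, 5.6575),
    5: (0.1895, 16.6660),
}


def predict_exponent(slope, intercept, index):
    """Invert y = log2(p): the fit-line value at this index, as an exponent p."""
    return 2.0 ** (slope * index + intercept)


predictions = {
    label: (predict_exponent(a, b, 49), predict_exponent(a, b, 50))
    for label, (a, b) in FITS.items()
}

for label, (p49, p50) in predictions.items():
    print("[%d] p49 ~= %d; p50 ~= %d" % (label, p49, p50))
```

Since x = 49 is already part of every fitted sample, the p49 column is really a fitted value to hold against the actual 74207281; only p50 is a genuine extrapolation.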

Attachment #1 has the above in graphical form. Clearly, omitting the smallest 40 M(p) leaves much too small a statistical sample to take seriously in the big-picture sense, but it is rather striking how the largest 9 M(p) line up quite neatly on a very different trendline.

Serge Batalov comments:

That is indeed what I (re)posted as a riddle (from David Eddy) in the Ooops forum for the users to chew on.

9 heads = 9 times in a row the ratio of successive exponents is below the expected geometric mean ratio of 1.48...

Hence the odd slope for the last 10 known primes.

It still doesn't challenge the conclusion that the confidence interval based on all known primes includes the Wagstaff slope.
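
The "9 heads" observation is easy to check numerically. The sketch below assumes that the expected geometric mean ratio is the Wagstaff-conjecture value 2^(e^-gamma), roughly 1.476, with gamma the Euler-Mascheroni constant. That identification is my reading of the comment, not something it spells out. The list holds the exponents of the 10 largest known M(p); variable names are mine.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

# Exponents of the 10 largest known Mersenne primes, in size order.
LAST_TEN = [
    20996011, 24036583, 25964951, 30402457, 32582657,
    37156667, 42643801, 43112609, 57885161, 74207281,
]

# ASSUMPTION: conjectured geometric-mean ratio of successive exponents.
expected_ratio = 2.0 ** math.exp(-EULER_GAMMA)

# Ratios of each exponent to the one before it: 10 primes give 9 ratios.
ratios = [b / a for a, b in zip(LAST_TEN, LAST_TEN[1:])]
below = sum(1 for r in ratios if r < expected_ratio)

print("expected ratio ~= %.4f" % expected_ratio)
for a, b, r in zip(LAST_TEN, LAST_TEN[1:], ratios):
    print("%d -> %d: ratio %.4f" % (a, b, r))
print("%d of %d ratios fall below the expected value" % (below, len(ratios)))
```

On this reading, the slope implied for y = log2(p) is e^-gamma, about 0.5615, which is the value the full-dataset fit would be compared against.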