mersenneforum.org Digit sum of a Mersenne-prime exponent
 Register FAQ Search Today's Posts Mark Forums Read

 2021-07-15, 13:04 #23 Dr Sardonicus     Feb 2017 Nowhere 141B16 Posts At the risk of arrest for criminal belaboring of the obvious, I give one obvious reason to deem digit sums a poor candidate for any sort of correlation with Mersenne prime exponents. If b > 1 is an integer, the digit sum of n to the base b exhibits noticeable oscillations. The most extreme, of course, is the base-b digit sum of b^k - 1 is k*(b-1) and the base-b digit sum of b^k is 1. So if the largest Mersenne prime exponent with a given number k of base-b digits is only slightly less than b^k, while the next larger Mersenne prime exponent is only slightly larger than b^k, or perhaps a larger power of b, the base-b digit sum will drop noticeably. Similar oscillations can occur between consecutive exponents if the first has a large block of digits b-1 which is gone in the next. For example, consider the consecutive Mersenne prime exponents 127 and 521. To the base b = two, the digit sums are 7 and 3 respectively. In base b = ten, the consecutive Mersenne prime exponents 19 and 31 have digit sums 10 and 4 respectively, while 9941, 11213, 19937, and 21701 have digit sums 23, 8, 29, and 11 respectively.
2021-07-15, 14:09   #24
charybdis

Apr 2020

10448 Posts

Quote:
 Originally Posted by Dobri Oscillations around the linear fit are common but to have a change in slope for a dozen consecutive Mersenne primes from M40 till M51 and assume that it is happening just by chance would be in blind defense of the linear heuristic. This thread is about paths less traveled. It is a gray area, there are no definite answers as more factual information has to be obtained both theoretically and computationally. Concerning the term 'saturation', it happens when the growth slows down and eventually stops. A common example is the logistic function, see .
Yes, the deviation from the heuristic may well be statistically significant at the 2-sigma level, but I don't think that should be enough to change our views. If the new slope were to continue for another 12 Mersenne primes then I'd be more willing to accept your hypothesis. I suppose we disagree about what the prior is. I would say the prior should be heavily weighted towards the relationship being best approximated by some smooth function, as there's no precedent or plausible mathematical explanation for a sudden change in behaviour. Perhaps the relationship isn't linear and Mersenne primes do start appearing more often than the heuristic suggests as p gets larger. But I'd be far more willing to believe that the true behaviour is some smooth curve rather than two straight lines with a transition at p=20M.

Similarly, if you're at a casino and the roulette wheel comes up red 15 times in a row after behaving "normally" for a long time, you might say that is statistically significant, but the prior is heavily in favour of nothing having changed, so this would still most likely be down to chance alone. If red comes up 50 times in a row, then someone has almost certainly found a way to rig the wheel so that red comes up every time.

Saturation in the sense you mention doesn't just happen out of nowhere. It happens when the growth rate of a quantity is slowed as a result of its own growth, or when there is a natural limit to how large the quantity can grow. In other words, something is getting saturated (hence my question to you a few posts back). For example, as a population grows, it can only reach a certain level before competition for resources slows the growth down to a halt. There is only a certain amount of salt that will dissolve in water before precipitation happens just as often as dissolution. How could the existence of a particular number of Mersenne primes make higher Mersennes less likely to be prime?

Quote:
 If selecting mainly digit sums from the histogram of known Mersenne primes, obviously one would miss the discovery of Mersenne primes with digit sums observed for the first time. However, this increases the chances of discovering Mersenne primes with digits sums most frequently occurring in the digit sum histogram of the known Mersenne primes. It is a trade-off between the risks one would take with limited resources for a prolonged period of time. It is not like a game of poker when the results are known immediately.
So if you pick and choose your targets like this, you're more likely to find Mersenne primes that fit your pattern, and less likely to find ones that don't. When there's no evidence that Mersennes fitting your pattern are more likely to be prime, there's no advantage in doing this - in fact there's a disadvantage because you'd get to larger numbers more quickly. It gets worse: if we kept on following your advice, we would not only miss Mersenne primes whose digit sums haven't appeared yet, but we'd miss any more primes sharing the same digit sums as those primes. We'd be forever limiting ourselves to searching exponents with the 24 known digit sums.

2021-07-15, 14:22   #25
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17·349 Posts

Quote:
 Originally Posted by Dobri In my opinion, further sophistication aided by machine learning with the use of multiple incomplete patterns could reduce the element of chance in the manual selection of exponents.
We have ample data set size for exponents with factors found by TF, and set with factors found by P-1, and set with no factors found by economical factoring effort but found with additional uneconomic factoring effort, and set with no factors known but conclusive primality test indicating composite. What we don't have is a large set of training data for known Mersenne primes, especially if some get classified in categories of high, medium, or low likelihood, or finer gradations for selection. https://machinelearningmastery.com/m...hine-learning/ speaks of training sets in the tens, hundreds, or thousands.

I recall reading of a system that had been trained to distinguish photos of dogs from photos of wolves. After considerable training effort with many photos, it was queried to display what distinguished one from the other. The highlighted area was the snow in the background of wolf photos. It ignored the animals completely, and revealed problems with the training data.

All of this assumes that there is an additional pattern we're not yet aware of, in effect an as yet not found shortcut to identifying "good" candidates for Mersenne primes vs. "mediocre" or "poor" candidates that requires far less computational time than a primality test or additional factoring effort. And that statistical analysis or machine learning can find the pattern if it exists. And that the signal/noise or false positive and false negative rates are acceptably low for it to be useful. And that it will also apply sufficiently well to the untested range p>104M. Those assumptions can be wrong. We're already using the product of a great deal of info. Skipping composite exponents is one example. Eliminating ~2/3 of the remaining exponents by factoring is another.

If we somehow find some way via ML of pre-scoring untested exponents, how would we validate it, and would the PrimeNet server adopt prioritizing the higher likelihood exponents, or continue with a systematic exhaustive mostly-monotonic-ascent search? We could implement that prioritizing now, prioritizing p=1 mod 8. It is however based on a very small sample size. The effect would be limited. A 2:1 probability advantage only justifies testing up to about a 1.39x larger exponent. (effort ~ p2.1; 21/2.1 ~1.39)

There's a big difference, between generating probability estimates for each individual untested unfactored exponent, by the millions, to be stored for later lookup in the database, requiring modification to the database, and having a simple easily applied mathematical description for what to prioritize, requiring some lines of code.

As to the two-slopes argument, two rejoinders. We are designed to see patterns. We see them whether they are real patterns or the predictable product of expected statistical variation. (The same thing happens with other phenomena, such as combustion engine cylinder peak pressure cycle-to-cycle variation.) The 10M-100M interval of Mersenne primes can appear as a slope increase. Or it can be seen as staircases of short gaps interrupted by landings (longer gaps). There are modest staircases at lower exponent. Compare 19K-24K to 70M-90M.

The best available theorizing says the distribution of Mersenne primes is governed by Poisson process, and an expected outcome is that some seemingly improbable outcomes are probable. Toss enough dice, and you'll get as many sixes consecutively as you have patience to pursue. There were an enormous number of primality tests performed between 10M and 100M. That improves the chances of the statistics matching well what is theoretically predicted. And that is what the graph shows. Back at 128-512, one could have made the argument that all the Mersenne primes may have been found. Or at 220,000-750,000. Slopes there were zero over ~4:1 and ~3.4:1. But these or even longer gaps may occur. If we are at the beginning of one now, the next Mersenne prime could be >100Mdigit.

Last fiddled with by kriesel on 2021-07-15 at 14:25

2021-07-15, 14:34   #26
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

13×19×41 Posts

Quote:
 Originally Posted by kriesel I recall reading of a system that had been trained to distinguish photos of dogs from photos of wolves. After considerable training effort with many photos, it was queried to display what distinguished one from the other. The highlighted area was the snow in the background of wolf photos. It ignored the animals completely, and revealed problems with the training data.
How about this one? The AI being trained to determine if a skin spot was bad learned to look for the ruler in the image. Dermatologist take pictures of the bad spots with a rule to measure size. Those were knowns. The ok spots did not have rulers, as they weren't worried about the size.

 2021-07-15, 14:47 #27 kriesel     "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 17×349 Posts Good one. Maybe we should not be concerned about AI, but about artificial stupidity. We don't need to augment the naturally occurring kind that's already a surplus. Of dogs, wolves, snow, grass, tanks, forest, clouds, and sun: https://hackernoon.com/dogs-wolves-d...o-41c43bc7f982 Now imagine trying to do the classification and subsequent analysis of multiple possibilities reliably in real time at road speed to avoid or minimize harming or killing people. Last fiddled with by kriesel on 2021-07-15 at 14:54
 2021-07-15, 20:56 #28 Batalov     "Serge" Mar 2008 Phi(4,2^7658614+1)/2 9,629 Posts Not only Luke 23:34 but also a well-known quote from Macbeth comes to mind " ........ ........ full of sound and fury, Signifying nothing.”
2021-07-15, 21:37   #29
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

13·19·41 Posts

Quote:
 Originally Posted by Batalov Not only Luke 23:34 but also a well-known quote from Macbeth comes to mind
Maybe 1Tim 6:20

2021-07-16, 11:46   #30
Dr Sardonicus

Feb 2017
Nowhere

5,147 Posts

Quote:
 Originally Posted by Uncwilly Maybe 1Tim 6:20
... or the ever-popular Matthew 15:14

 2021-07-16, 16:50 #31 Dobri   "刀-比-日" May 2018 239 Posts Let's mention also another empirical observation which might be obvious but related to this thread nonetheless. When comparing the digit sum histograms of the Mersenne-prime (Mp) and Mersenne-number (Mn) discrete distributions (in the interval from 2 to 71) in the 8-digit exponent range, the Mp histogram has entries mainly below the mean value of the Mn histogram. Assuming that there will be not so many Mp discoveries in the 9-digit exponent range (in order to go back to the linear fit of the prime number heuristic), then the Mp digit sum histogram would lay almost entirely below the mean value of the Mn digit sum histogram (in the interval from 2 to 80) except for the digit sum 47 of Mp51. I will plot and post the Mp and Mn histograms later. Meanwhile, I enjoyed reading Shakespeare and the Scriptures. It was a good distraction. Thank you!
2021-07-16, 17:09   #32
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

100101100111012 Posts

Quote:
 Originally Posted by Dr Sardonicus ... or the ever-popular Matthew 15:14
And of course this great principle -->
Quote:
 Originally Posted by Ecclesiastes 1:15 That which is crooked cannot be made straight: and that which is wanting cannot be numbered.

2021-07-16, 17:15   #33
charybdis

Apr 2020

22·137 Posts

Quote:
 Originally Posted by Dobri When comparing the digit sum histograms of the Mersenne-prime (Mp) and Mersenne-number (Mn) discrete distributions (in the interval from 2 to 71) in the 8-digit exponent range, the Mp histogram has entries mainly below the mean value of the Mn histogram. Assuming that there will be not so many Mp discoveries in the 9-digit exponent range (in order to go back to the linear fit of the prime number heuristic), then the Mp digit sum histogram would lay almost entirely below the mean value of the Mn digit sum histogram (in the interval from 2 to 80) except for the digit sum 47 of Mp51.
Smaller Mersennes are much more likely to be prime, so this is about as useful as pointing out that most of the Mersenne prime exponents up to 100M are below 50M.

Quote:
 Originally Posted by Dr Sardonicus ... or the ever-popular Matthew 15:14
Hoping that I haven't inadvertently fallen into the ditch...

 Similar Threads Thread Thread Starter Forum Replies Last Post PawnProver44 Miscellaneous Math 26 2016-03-18 08:48 odin Software 7 2010-04-18 13:57 ewmayer Lounge 4 2006-09-06 20:57 ET_ Factoring 39 2006-05-11 18:27 Dougy Math 4 2005-03-11 12:14

All times are UTC. The time now is 04:58.

Wed Dec 8 04:58:31 UTC 2021 up 137 days, 23:27, 1 user, load averages: 1.18, 1.42, 1.51