2019-07-15, 16:04   #3
kriesel
Mar 2017, US midwest

Background
The purpose of this post is to describe a shared foundation. Posting claims counter to what's here may indicate someone is an unaware novice, or is a troll. (Once I'm done weeding out my errors, with some help from others, that is.) This necessarily covers some points and leaves out much other material; whole books have been written about individual aspects, such as factorization. It's probably a good idea to read or watch George Woltman (forum user Prime95) describe the origin and methods of GIMPS, and then maybe return and resume here.
  1. "Natural numbers" are the counting numbers 1, 2, 3, ...
  2. Expression of math operations here follows the usual conventions: addition +, subtraction -, division /, multiplication * or implied (as in the expression 2 k p = 2 * k * p), exponentiation of a to the b power a^b, square root sqrt(), modulo a mod b, equal =, less than or equal <=. Operator precedence is standard PEMDAS: parentheses first, then exponentiation, then multiplication and division (equal precedence), then addition and subtraction, with ties resolved left to right. Innermost parentheses are applied first, the uppermost exponent is the first exponentiation applied, and function evaluation proceeds from the inside outward. Preference is given to conciseness, so 2*k*p+1 will be expressed as 2 k p + 1, and log(log(x)) as log log x. Log without a base specified is the natural log, log_e = ln. A base may be given as a number following "log", as in log2 or log10.
  3. I use here n, a, b, f or k for natural numbers that may be composite or prime, but p for primes. Occasionally a hopefully helpful subscript is attached.
  4. Natural numbers are either prime (having no natural numbers greater than one but smaller than themselves as exact integer divisors), or composite with factor(s) greater than one and less than the composite number. Five is an example of a prime. Six is an example of a composite: 6 = 2 * 3.
  5. Finding and verifying a single nontrivial factor (a natural number f with 1 < f < n) of a natural number n proves, by definition, that n is composite, not prime.
  6. One is considered neither prime nor composite; it is a special case.
  7. Smaller factor candidates are more likely to be factors than larger candidates: 2 divides half the natural numbers, 3 a third, 5 a fifth, and in general f divides 1/f of them. So it's more efficient to search with the smallest potential factors first.
  8. The largest possible smallest prime factor of a natural number n is the largest prime no greater than the square root of n: fmax <= sqrt(n).
  9. To search for factors of a number, it's sufficient to check only the primes up to the square root of the number. Composite numbers can be skipped as potential factors, since their prime factors will divide any number the composites would divide, and also some numbers the composites won't. For example, 4 = 2 * 2 and 6 = 2 * 3; 2 divides 14 and 3 divides 15, but neither 4 nor 6 divides either. Thirteen is prime, and it is sufficient to trial factor it with 2 and 3, since sqrt(13) ~ 3.6055 and the next prime, 5, exceeds 13's square root.
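Items 7-9 can be sketched in a few lines of Python. This is an illustration only; the function name `smallest_prime_factor` is chosen here and does not come from any GIMPS code:

```python
def smallest_prime_factor(n):
    """Return the smallest prime factor of n, or n itself if n is prime."""
    if n < 2:
        raise ValueError("n must be at least 2")
    f = 2
    while f * f <= n:        # only candidates f <= sqrt(n) are needed (item 8)
        if n % f == 0:       # the first hit is necessarily prime: any smaller
            return f         # prime factor of a composite f was tried earlier
        f += 1
    return n                 # no factor <= sqrt(n): n is prime

print(smallest_prime_factor(13))   # 13 (prime)
print(smallest_prime_factor(91))   # 7 (91 = 7 * 13)
```

Trying candidates in increasing order finds the most likely (smallest) factors first, per item 7.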
  10. Mersenne numbers are a small subset of the natural numbers, having the form of a simple function M(n) = 2^n - 1 where n is a natural number greater than 1.
  11. Given a factor a of a number n = a b, the cofactor of a is b = n / a (definition from the linked reference).
  12. Any Mersenne number with a composite exponent n = a b is itself composite, since 2^a - 1 divides 2^(a b) - 1; it cannot be prime.
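The divisibility fact behind item 12 is easy to check numerically; a minimal sketch for n = 6 = 2 * 3:

```python
# If n = a * b, then 2^a - 1 divides 2^n - 1 (and so does 2^b - 1),
# so M(n) is composite whenever n is.
n, a, b = 6, 2, 3
M = 2**n - 1                            # M(6) = 63
print(M % (2**a - 1), M % (2**b - 1))   # 0 0: both 3 and 7 divide 63
```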
  13. Mersenne primes are a very small subset of the set of Mersenne numbers with prime exponents. (It is not known whether the set of Mersenne primes is infinite.)
  14. Any factors of Mersenne numbers with prime exponents are of the form f = 2 k p + 1, where p is the prime exponent, and k is a natural number.
  15. Numbers of the form 2 k p + 1 can only be factors of Mersenne numbers with prime exponent p if 2 k p + 1 = 1 or 7, mod 8.
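Items 14-15 suggest a simple, deliberately naive trial-factoring sketch. The helper name `small_mersenne_factors` is hypothetical, and real TF software is vastly more optimized (see item 19 on the modpow approach):

```python
def small_mersenne_factors(p, kmax):
    """Return factors f = 2*k*p + 1 of M(p) = 2^p - 1 for 1 <= k <= kmax."""
    found = []
    for k in range(1, kmax + 1):
        f = 2 * k * p + 1
        if f % 8 not in (1, 7):   # item 15: the mod-8 rule filters candidates
            continue
        if pow(2, p, f) == 1:     # f divides 2^p - 1 iff 2^p = 1 (mod f)
            found.append(f)
    return found

print(small_mersenne_factors(11, 50))   # [23, 89]; M(11) = 2047 = 23 * 89
```

The `pow(2, p, f)` call is modular exponentiation, far cheaper than dividing the huge number 2^p - 1 by f.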
  16. Multiple methods of computerized factoring are employed for Mersenne numbers. Which are worthwhile, and the extent or level of effort to which they are worthwhile, depend on the size of the exponent, the efficiency of algorithm and implementation, and characteristics of the computing hardware involved.
  17. For some additional background, see the linked references regarding trial factoring, P-1 factoring, and the Lucas-Lehmer test; probable-prime testing and the Gerbicz error check; and an excellent backgrounder including FFT-based multiplication of very large numbers. (But note that PRP with GEC and proof generation is superior in multiple ways to LL; do PRP first tests, not LL.)
  18. Large amounts of effort have been invested in developing and optimizing factoring and primality-testing code relevant to Mersenne numbers for Intel CPUs, non-Intel CPUs, and GPUs from NVIDIA and AMD, and also trial factoring on Intel or AMD CPUs' integrated graphics processors. Summary information about these is in the linked reference threads.
  19. Trial division is one factorization method. Much more about it is available in the linked references (especially #12, about implementation as modpow rather than long division, for greatly increased speed) and elsewhere. There's a lot of discussion, and some Pari/GP code, in the links of the reference threads.
  20. For exponents much smaller than the current GIMPS search area, ECM (elliptic curve method) is also useful. But not at the current Great Internet Mersenne Prime Search (GIMPS) first-test or double-check wavefront of activity.
  21. P-1 factoring is another useful method, including at currently active and higher exponents; see the linked references.
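A toy stage-1 P-1 sketch may help make the method concrete. Assumptions: the function name and the toy composite 1403 = 23 * 61 are chosen here purely for illustration; production P-1 code uses carefully built smooth exponents, a stage 2, and FFT arithmetic. The idea: if q divides N and q - 1 is B1-smooth, then q divides gcd(a^E - 1, N) for E a product of all integers up to B1:

```python
from math import gcd

def pminus1_stage1(N, B1):
    """Try to find a factor q of N for which q - 1 is B1-smooth."""
    a = 3
    for m in range(2, B1 + 1):
        a = pow(a, m, N)          # net effect: a = 3^(2*3*...*B1) mod N
    g = gcd(a - 1, N)
    return g if 1 < g < N else None   # None: nothing found, or all factors found

# 61 - 1 = 60 = 2^2 * 3 * 5 is 5-smooth; 23 - 1 = 22 = 2 * 11 is not.
print(pminus1_stage1(1403, 5))    # 61
```

With B1 large enough to cover every factor's q - 1, the gcd would be N itself and the attempt reveals nothing, which is why bounds matter.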
  22. Eventually, factorization attempts produce diminishing returns for increasing computational resources, declining to the point where it is more efficient to run either a conclusive primality test (the Lucas-Lehmer test) or a quite reliable probable-prime test (PRP3) that can prove a number composite but cannot prove it prime.
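The Lucas-Lehmer iteration is short enough to sketch directly. This naive version relies on Python big integers where production code uses the FFT-based squaring of item 23; the mathematics is identical: M(p) is prime iff s_(p-2) = 0, where s_0 = 4 and s_(i+1) = s_i^2 - 2 (mod M(p)):

```python
def lucas_lehmer(p):
    """True if M(p) = 2^p - 1 is prime, for an odd prime exponent p."""
    m = 2**p - 1
    s = 4                        # the standard seed value 4 (see item 27)
    for _ in range(p - 2):
        s = (s * s - 2) % m      # production code does this squaring via FFT
    # For a composite M(p), s mod 2^64 is the res64 reported to PrimeNet.
    return s == 0

print([p for p in (3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(p)])
# [3, 5, 7, 13, 17, 19] -- 11 is absent since M(11) = 2047 = 23 * 89
```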
  23. For the very large numbers involved in primality tests and P-1 factoring, squarings and multiplications are performed using the complex but efficient method of irrational-base discrete weighted transforms, modulo the Mersenne number.
  24. In such lengthy computations, errors will sometimes occur; see the linked reference for error rates.
  25. The programs contain various error checks that are performed during the computation. These often allow detecting an error, rolling back to a point saved before the error occurred, and attempting that portion of the computation again, which can save the lengthy computation if the error is not reproducible.
  26. Some applications report final error counts with the result. Results whose detected-error counts raise concern about accuracy can be scheduled for early rechecking.
  27. Primality tests were eventually done at least twice, to catch the approximately 2 percent chance of an incorrect result per LL test (the error rate is lower for a PRP/GEC test), with certain precautions taken to ensure results are arrived at independently. The final result of a test mod 2^64 (called the 64-bit residue or res64) is reported to the PrimeNet server. In the case of mismatches for the same exponent and primality-test type, additional runs are made until matching results are obtained; a coordinated effort to follow up on mismatches is linked in the reference material. In rare cases it takes 4, 5 or 6 LL tests to get two that match.
  28. Much more on error types and handling is available in the linked reference.
  29. An exciting recent development is the PRP proof, certificate, and verification process supported in gpuowl, mprime, prime95 and the PrimeNet server, and being developed for Mlucas, after first deployment in gpuowl. For new tests, this is expected to replace the double-check effort described in #27, reducing verification effort to typically less than 1% of the current cost of verifying an LL test or a PRP test without proof generation. Existing completed tests, and future runs performed without the proof code present and enabled, will still need double-checks. USE PRP/GEC/proof for all first primality tests going forward whenever practical!
  30. Upon initial indication of a prime or probable prime by a reliable program, careful checking of the run for possible errors is performed. Multiple independent Lucas-Lehmer retests are made. To be eligible for prize awards, the rules must be followed, including a nondisclosure period while the multiple checks are run.
  31. Many have attempted to predict the next Mersenne prime discovery. All predictions that have been tested have failed. One prediction that is very reliable: any given prime exponent produces a composite Mersenne number. There are only 51 known exceptions currently, out of ~5.76 million exponents tested or factored below 100M as of 2021-01-11 6PM UTC, so the known Mersenne primes make up less than 9 ppm of the candidate exponents in this range. Double-checking is complete through 55M as of 2021-04-21.
  32. On average, it is conjectured, Mersenne primes occur at exponent ratios of 2^(1/e^gamma), where gamma is the Euler-Mascheroni constant; this evaluates to approximately 1.475761397. However, this is useless for predicting individual primes, since the ratio fluctuates widely. Observed ratios of exponents of successive Mersenne primes range from ~1.01 to ~4.10.
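The conjectured mean ratio is easy to evaluate. In this sketch the Euler-Mascheroni constant is hard-coded, since Python's math module does not provide it; the list of small Mersenne prime exponents is the well-known published sequence:

```python
import math

GAMMA = 0.5772156649015329            # Euler-Mascheroni constant
ratio = 2 ** (1 / math.exp(GAMMA))
print(ratio)                          # ~1.4757614

# Observed ratios fluctuate widely around that mean; first twelve exponents:
exps = [2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127]
print([round(b / a, 2) for a, b in zip(exps, exps[1:])])
```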
  33. An additional factoring method, P+1, has been added as of mprime/prime95 v30.6. This may be useful to those who hunt for additional factors of Mersenne numbers considerably smaller than the current wavefronts of primality testing and verification; its productivity is believed to be too low to be worthwhile for factoring work at the wavefront.
  34. Exponents will sometimes be abbreviated/approximated with suffixes of M or G; these are powers of 10, as in the SI prefixes. Elsewhere the binary near-equivalents (Mi, Gi, Ti) may be used, as in discussing RAM capacity or requirements. The difference, ~2.4% per power of 1000, becomes appreciable: 1 TiB = 1.0995... TB. Hard drives are marketed using the smaller units employing powers of ten, frequently rounded upward. (A 1 TB drive is ~931 GiB.)
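The unit arithmetic in item 34 is a one-liner to verify:

```python
TB, TiB = 10**12, 2**40
print(TiB / TB)       # 1.099511627776 -> 1 TiB = ~1.0995 TB
print(TB / 2**30)     # ~931.32 -> a "1 TB" drive holds ~931 GiB
```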
  35. The progression of a particular exponent in the search for new Mersenne primes is typically as follows:
    1. Addition to a database when the database is created, or expanded to cover a larger range of exponents. Usually this includes some very fast TF to a low bit level. Database creation or expansion is a rare event.
    2. TF to increasing bit levels, up to a recommended maximum. The maximum is somewhat dependent on the type of hardware used for TF, and significantly dependent on Mersenne exponent, and number of primality tests potentially saved by finding a factor. Since GPUs are far more effective on TF than CPUs, it's recommended all TF be performed on GPUs, none on CPUs. A factor reported to the server is confirmed there and is conclusive proof the Mersenne number is composite.
    3. P-1 factoring in a single run with adequate bounds and memory to efficiently perform and retire the task. A factor reported to the server is confirmed there and is conclusive proof the Mersenne number is composite. Sometimes the factor reported is itself composite, delivering 2 or more factors of the Mersenne number.
    4. (No ECM, and no P+1, before primality testing)
    5. PRP with GEC and proof generation. (Previously, this step was PRP/GEC, followed by verification by PRP-DC/GEC, or earlier, LL first test, LL DC, and sometimes LL TC etc.) Ideally the PRP run is done with the proof power at or near optimal for the exponent.
    6. Verification. For PRP/GEC/proof-generation first test, a quick certification run on prime95 / mprime is performed. How quick depends on hardware, software, exponent, and proof power. For proof power 8, it takes less than 0.5% as long as the PRP/GEC/proof. Each additional increment on proof power is about twice as quick to certify. (Previously, verification was by a second PRP-DC/GEC, or for an LL first test, LLDC and sometimes LLTC etc., each taking as long as a first test.) Upon matching LL res64 values for the same exponent and standard seed value 4 (providing they are not known problematic residues & software), or upon matching PRP final res64 values and matching PRP types, or upon PRP/GEC/proof certification, primality testing is complete.
    7. Almost always, the result is a factor or composite test result. These will eventually (perhaps years later) draw the attention of the additional-factor hunters, with ECM or P+1, or P-1 with larger bounds, or TF to higher bit levels. Rarely, an initial primality test will indicate prime and without any clear indication of error. These will immediately be retested in parallel with LL on different software and hardware to confirm or refute a new Mersenne prime discovery.
Some additional reading that may be of use at some point includes:
"A Friendly Introduction to Number Theory", Joseph H. Silverman
The Prime Pages
"The Art of Computer Programming", Donald Knuth
"Prime Numbers and Computer Methods for Factorization", Hans Riesel
Number Theory Web
"Prime Numbers: A Computational Perspective", Crandall and Pomerance
"Humble Pi", Matt Parker
"The C Programming Language", Kernighan and Ritchie
"Recreations in the Theory of Numbers", Albert Beiler

(Thanks to Batalov, Dylan14, LaurV and Dr. Sardonicus for contributing to the accuracy and readability of this post; see reference discussion thread.)


Last fiddled with by kriesel on 2021-11-11 at 15:32 Reason: minor edits; add optimal proof power & link