2019-07-15, 15:05 | #1 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×19×107 Posts |
new participant reference
This is intended as a reference thread. Do not post here. Post comments at https://www.mersenneforum.org/showthread.php?t=23383 instead. Posts placed here may be moved or deleted without warning or recourse.
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 2022-01-02 at 14:33 Reason: added OS fundamentals for GIMPS GPU application use |
2019-07-15, 15:52 | #2 |
new participant guidance
Welcome.
The Mersenne forum content is almost all expressed in English (a blend of UK and American). For those whose native language is not English, we native English speakers very much appreciate your efforts to communicate in English! Mersenneforum.org covers many different ways of aiding the Great Internet Mersenne Prime Search (GIMPS) progress, through trial factoring, P-1 factoring, Lucas-Lehmer testing, probable-prime testing, double checking, proof certification etc. of Mersenne numbers, 2^{p}-1. (You may see these abbreviated as TF, P-1, LL, PRP or PRP3, DC, LL DC, PRP DC, Cert.) It also provides a home for a lot of threads about related number types that have no direct connection to finding Mersenne prime numbers, such as Proth numbers, or Cunningham numbers, or activities or programs that have little or no direct connection, such as factoring for general natural numbers as an end in itself. Most of this description is GIMPS oriented. There are many different operating systems, and processor types, in common use or development. Most activity is on Windows or Linux, on Intel or AMD processors, or NVIDIA or AMD GPUs, but there is a little Mac activity, and Linux on smaller hardware including Raspberry Pi, Odroid C2, Samsung cell phones, and Intel compute sticks, as well as trial factoring via Mfakto on some models of Intel or AMD integrated graphics processors, and in rare cases gpuowl on IGPs. Most of the hardware is personally owned or employer owned. (Getting written permission for use of employer-owned gear from an authorized person is very highly recommended. It has happened that someone got permission from the wrong person, and got personally acquainted with how the FBI executes a search warrant, seizing and holding for years, all computer equipment including that of a client of the person being investigated.) Others use cloud computing. Or a combination. 
If you have newly signed up to the project, it may take the busy volunteers running things a while to get to approving you for making manual submissions to the PrimeNet server. Please be patient, but if you do not get set up for days, contact them.
When joining the forum: Some go to considerable length to use an alias and remain essentially anonymous. Others see no reason to do that, and see drawbacks to it. Nearly all the prominent software coders are identified. Use good judgment on what personally identifying information you do and do not disclose.
The forum offers reading and making public posts, optionally including up to 5 attachments of certain specified types and size limits, and sending and receiving private messages (PM) without attachments. There's also a search capability by multiple keywords. Please limit PMs sent to the leading software coders. They are volunteers and there is always more to code and test. They are rare and their time is valuable.
When posting on the forum: Take some time to read the existing threads, FAQs, reference information, the Netiquette post, etc., before posting. Some suggested material is at
Intro to GIMPS and prime95 math (omits PRP etc) https://www.mersenne.org/various/math.php
Note: whenever practical, use PRP/GEC/proof generation, NOT LL for first tests. PRP/GEC/proof generation is also superior in speed and reliability for double-checks.
Do not post to any forum thread:
64-bit residues ("res64") for first tests prior to verification
assignment IDs
MD5 hashes
passwords
links to porn sites, malware sites, etc., or links with tracking info included
inflammatory material
Mersenne prime discovery before the official Mersenne Research Inc press release is issued (early disclosure would disqualify you for any prize money)
Available software (cpu and gpu summary) listing is found at http://www.mersenneforum.org/showpos...91&postcount=2 (and its historical thread https://www.mersenneforum.org/showthread.php?t=22450)
reference material compilation (large!) https://mersenneforum.org/showthread.php?t=24607
cloud computing subforum https://www.mersenneforum.org/forumdisplay.php?f=134
prime95 v30.8 for CPUs https://mersenneforum.org/showthread.php?t=27366
prime95 v30.7 for CPUs https://mersenneforum.org/showthread.php?t=27180
GPU Trial Factoring FAQ (old) https://www.mersenneforum.org/showthread.php?t=16140
GPU LL testing FAQ (old) https://www.mersenneforum.org/showthread.php?t=16142
(And again, do PRP/GEC/proof whenever practical for first tests, NOT LL! GPUs that can't run a recent version of gpuowl should only do LL DC, or TF. They're probably too old to be reliable enough for P-1 without much error checking implemented, as in CUDAPM1 which has very little checking built in. They're probably too old to be energy efficient, and are candidates for retirement.)
Some of the threads are very long. Some show the history of development of certain applications. It can be useful to take notes as you go through those threads, including URLs for easy return to the more informative or seminal posts. Some content may not mean much to you when first encountered, but will later, after enough is learned for its importance to be apparent or its content to be better understood. It can be useful to read again or skim the same thread six months or a year later.
The forum is not the place for submitting unremarkable results of GIMPS computations. To apply your GPU to needed work, and improve your chances of avoiding unneeded duplication of someone else's work, go to https://www.mersenne.org/manual_assignment/ or https://www.mersenne.org/manual_gpu_assignment/ to reserve work assignments. To report the results, copy and paste into https://www.mersenne.org/manual_result/ Or make use of a relevant client management package. See http://www.mersenneforum.org/showpos...92&postcount=3 and note that some applications have a separate work reservation and/or result submission script. It's standard for PrimeNet to assign some double-check (DC) work to a new computer. This is to qualify it as reliable, and helps with the backlog of double-checks, which is currently several years long. It's also a good idea to run a qualifying double check on anything you manage manually or via one of the client management tools. If the DC matches, great, that justifies confidence in the system. But if the DC does not match, it might not be a problem with your hardware, but with the other system that did the first test. What's needed is a tie-breaker triple check (TC) to determine if either is correct. These can be requested at https://www.mersenneforum.org/showthread.php?t=24148 Be courteous; avoid sounding hostile or rude. People do get temporarily banned if going too far. A long pattern can get long bans. Occasionally, permanent bans. It's generally preferable to apply some principles from Dale Carnegie's book "How to Win Friends and Influence People." If someone goes way over the line, the offensive post can be reported by clicking on the little red and white triangle at the lower left of the post. Above all, avoid profanity, bullying, threats or harassment or the appearance of such. A "smily" attached does not mean there are no limits on what will get you blocked, banned, or into legal trouble. Realize that there's a real person at the other end. 
Realize that text without mannerisms such as smiles as in person comes across as more negative. When posting, be generous and make an effort to keep tone positive. Effective communication is more likely when no one is annoyed, offended, insulted, disparaged, or feels those whether it was intended or not. Conversely, when reading, realize perceived offense may be unintended. Its appearance may be due in part to cultural or language differences, hurry, frustration, history, various maladies, etc. Remember that this is a worldwide community, including people from many nations, cultures, age brackets, occupations, educations, perspectives etc. Anyone from about age 13, to published number theory professors or world-class programmers/software engineers, to retired & senior citizens. Use an existing topic-specific thread when practical. Use the existing already-posted answers when possible. If English is not your native language, tell us. Responses will be more tolerant if so. Understand that people get tired of answering the same questions asked year after year from people who don't bother to read program documentation or reference posts or previous answers or use the search function first. Making an effort to find an answer yourself first helps, especially if you tell us that you did. Proofread your draft of a post. Preview before submit. After submitting it, you'll have a limited time to edit it (about an hour currently). For complex posts, an ordinary editor may be more useful; copy/paste from a draft in the forum page's editing window, to the separate editor, and back again. Use the quote and code and other tags to organize and format the post. Learn this from examining others' usage in posts that you quote. Make an effort to provide an easily read complete set of the needed context information in the same post with a question or bug report. 
If you're asking why something is not working how you expect, tell us at the beginning what software you're asking about, what version of the software, what OS you're running it on, what OS version or flavor, what hardware, what computation type (for mprime / prime95, Gpuowl, Mlucas or any other software that can perform more than one computation type), stage if relevant, exponent, bounds or bit levels as applicable, non-default tuning, any parameters it seems to be having difficulties with, and any other pertinent information. If asking about Linux, what version of what distribution. In the case of a GPU related question, include the GPU model, driver name and version, and perhaps hardware specs that are relevant (GPU ram for example, or NVIDIA compute capability level). A little time spent once, providing that info in a convenient format, can save many readers and the original poster a little time each, and reduce the need for Q&A that sometimes follows when such information is missing, or hidden away somewhat in a long code box line. It's especially important to be considerate of the time of the rare few very talented volunteer programmers. If posting benchmark information, specify not only time per exponent or iteration or whatever, but the exponent, computation type, details such as the TF bit level or P-1/P+1/ECM bounds, FFT length if applicable, non-default tuning parameters, the hardware, the OS, etc. so that it is meaningful. An under-specified benchmark result is not useful; a fully specified one is useful. If posting code, please indicate in a comment at the top of the code, what language the code is. There are so many programming or scripting languages, that it is very unlikely all readers will recognize any language you might use unless perhaps it is an exceedingly common one. It would also be helpful if you post what version of the language it's known to work in, or versions of popular compiler it works with. 
Depending on the code, it might also be useful to know operating system or hardware requirements. There are thousands of users and thousands of threads. Don't expect us to remember what hardware you own, what OS and version you run, etc. We forget, or never saw that one time or ten you posted it elsewhere. Tell us again, when and where it is relevant. It's simple enough to make a little description file once, and then copy and paste it into your posts as needed. Do this when it's helpful, and not otherwise. (We don't need or want to see anyone naively bragging in their every post about how many cores are in the one overpriced medium performance system they bought instead of a more efficient system or GPU.) Be especially rigorous and forthright if posting in math or number theory threads. Some participants may be very assertive regarding unclarity or error or what appears to be ignorance of the prior art, getting terms switched, proposing "improvements" that reduce reliability or lengthen run times, etc. A lot of very skilled people have worked the area for a long time. Extraordinary claims require extraordinary evidence, and will be met with considerable skepticism and lots of questions. It's not personal. It may help to consider questions as demonstration of a desire to understand your position, and feedback as interest in helping you. Minimize the use of idioms. They're less likely to be understood well by people whose native language is not English. An idiom example is "a piece of cake." The non-idiom equivalent is "very easy." Clear writing, with good spelling, punctuation, grammar, and structure, is always appreciated. The care in writing is paid once. The benefit is lasting. It is considerate of the other forum participants to post clean content. It's considered impolite to ask another member how much computing gear or throughput they have access to. Even if the reason you ask is to understand their perspective. 
Or to brag about how many cores and GPUs you have control of. Let your work speak for itself. Specifying what hardware produces a puzzling result or a fast benchmark is fine. People are happy to have (well-specified) benchmark results posted for the newest GPU or CPU models. It may help with someone else's purchasing decisions. Please read https://mersenneforum.org/showthread.php?t=19993 and https://www.mersenneforum.org/showthread.php?t=24199 for guidance what to do or how to do it, and perhaps https://www.mersenneforum.org/showthread.php?t=12945 for what not to do or how not to do it. Other parts of the Other Stuff/Forum Feedback subforum may also be useful. Do not post assignment IDs for current assignments, or 64-bit residues for results not yet verified, or MD5 hashes for proof files. Refrain from trolling or activity that will likely be interpreted as such. "a person who posts inflammatory, insincere, digressive,[1] extraneous, or off-topic messages in an online community (such as social media (Twitter, Facebook, Instagram, etc.), a newsgroup, forum, chat room, or blog), with the intent of provoking readers into displaying emotional responses,[2] or manipulating others' perception." https://en.wikipedia.org/wiki/Internet_troll Such behavior may earn a spot on individual forum users' "ignore list" (how-to-add here), or other pushback, or progressive discipline from forum moderators, consisting of reprimands, warnings, loss of blogger or moderator status, temporary ban from posting on the forum, permanent ban from the forum, or perhaps other measures. The different subforums and threads have differing purposes. Some are rather more informal and for entertainment or humor. Others are seriously oriented to a single specific purpose, such as development or support of one GPU application. 
Values in the more serious ones include: relevant, accurate, novel, constructive, civil, helpful, informative, positive.
Please consider how you may contribute positively, other than in computing throughput, toward the project's goals. Do you have skills in programming, in a flavor of C/C++/C#, Python, CUDA, OpenCL, multithreading, assembler, profiling, web scripting, databases, etc? Technical writing? Software testing? Math or number theory specifically? Other ways to help? See also https://www.mersenneforum.org/showthread.php?t=24905 and the many other online tutorials to help develop such skills.
(note, an earlier version of this was available at https://www.mersenneforum.org/showpo...3&postcount=11 for comment since April 17 2019)
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-12-31 at 18:47 Reason: revised section on trolling and countermeasures
2019-07-15, 16:04 | #3 |
Background
The purpose of this post is to describe a shared foundation. Posting claims counter to what's here may indicate someone is an unaware novice, or is a troll. (Once I'm done weeding out my errors, with some help from others, that is.) This necessarily covers some points and leaves out much other material; whole books have been written about individual aspects, such as factorization. It's probably a good idea to read https://www.mersenne.org/various/math.php or watch George Woltman (forum user Prime95) describe the origin and methods of GIMPS, and then maybe return and resume here.
"A Friendly Introduction to Number Theory", Joseph H. Silverman, https://www.math.brown.edu/~jhs/frint.html
The Prime Pages https://primes.utm.edu/
Knuth's Algorithms, Donald Knuth
Prime Numbers and Computer Methods for Factorization, Hans Riesel
Number Theory Web http://www.numbertheory.org/ntw/
"Prime Numbers: A Computational Perspective", Crandall and Pomerance
"Humble Pi", Matt Parker
"The C Programming Language", Kernighan and Ritchie https://en.wikipedia.org/wiki/The_C_...mming_Language
https://en.wikipedia.org/wiki/Prime_number
"Recreations In The Theory of Numbers", Albert Beiler https://books.google.com/books/about/Recreations_in_the_Theory_of_Numbers.html
(Thanks to Batalov, Dylan14, LaurV and Dr. Sardonicus for contributing to the accuracy and readability of this post; see reference discussion thread.)
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-11-11 at 15:32 Reason: minor edits; add optimal proof power & link
2019-08-08, 13:35 | #4 |
How much work is it to do x
Effort can be computed at https://www.mersenne.ca/credit.php for TF, P-1 factoring, or LL testing. Effort is expressed in GHzDays, the measure of what one core of a Core2 CPU running at 1 GHz could do in a day. Estimated performance of a given GPU is available at https://www.mersenne.ca/mfaktc.php for TF and https://www.mersenne.ca/cudalucas.php for other work types. Gpuowl performance with a recent 6.11-x or 7.x version is considerably better than indicated there.
To TF a Mersenne number with exponent 100M from starting bit level 73 to finishing bit level 76 is 133.9 GHzDays. Doubling the exponent roughly halves the effort for equal bit levels. Each bit level is twice as much effort as the one preceding. Note, in TF a GHzDay is not comparable to a GHzDay for other computation types, since GPUs are MUCH faster at TF. The ratio can be 11:1 ranging up to 40:1 or higher depending on GPU model and computation parameters.
To P-1 factor a Mersenne number with exponent ~100M to PrimeNet bounds B1=1040000,B2=28080000 is 13.90 GHzDays. This scales similarly to how PRP or LL testing do, ~p^{2.1}.
GCD phase of P-1 or P+1 run time is O(p (log p)^{2} log log p), and strongly dependent on CPU core speed since known GIMPS implementations use single-threaded gmplib for GCD. For p~110M on a Xeon Phi 7210, GCD time is ~5.7 minutes. In the range of p relevant to DC and upward to 1G, GCD run time scales as ~p^{1.14}. In most applications GCD runs sequentially, stalling other CPU cores of a worker, or a GPU, for the duration of the GCD, while in some versions of Gpuowl it runs in parallel with the next P-1 stage or next assignment if a valid one exists in the worktodo file.
Server confirmation of a reported factor for TF or P-1 is a trivially fast computation.
To LL test a Mersenne number with exponent ~100M is 381.39 GHzDays. For ~110M it is ~482 GHzDays, or about a day on a Radeon VII GPU in a relatively recent version of gpuowl. (But do PRP with GEC and proof generation instead for greater reliability and efficiency.) Effort scales as p log p log log p per iteration, or about p^{2.1} per test.
LL double checking ("LLDC") and the occasional triple check, quadruple check, etc. are the same effort per attempt as a first test for a given exponent. Therefore, first testing using LL should cease as soon as possible.
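The scaling rules of thumb quoted above can be turned into a rough estimator. This is only a sketch: the function names are made up for illustration, the reference points are the 100M figures quoted in this post, and the p^{2.1} power law is approximate, so the calculator at https://www.mersenne.ca/credit.php remains the better source for real numbers.

```python
# Language: Python 3
# Rough effort estimates scaled from the reference figures quoted in this post.
# Function and constant names are hypothetical, chosen for illustration only.

TF_REF_EXPONENT = 100_000_000
TF_REF_BITS = (73, 76)          # start and end bit level of the reference TF job
TF_REF_GHZ_DAYS = 133.9

TEST_REF_EXPONENT = 100_000_000
TEST_REF_GHZ_DAYS = 381.39      # LL (or PRP) test at ~100M
TEST_SCALING_POWER = 2.1        # approximate per-test scaling, ~p^2.1


def tf_ghz_days(exponent, start_bit, end_bit):
    """Estimate TF effort: each bit level costs twice the previous one,
    and effort is inversely proportional to the exponent."""
    ref_weight = sum(2.0 ** b for b in range(*TF_REF_BITS))
    weight = sum(2.0 ** b for b in range(start_bit, end_bit))
    return TF_REF_GHZ_DAYS * (weight / ref_weight) * (TF_REF_EXPONENT / exponent)


def primality_test_ghz_days(exponent):
    """Estimate LL or PRP effort from the ~p^2.1 rule of thumb."""
    ratio = exponent / TEST_REF_EXPONENT
    return TEST_REF_GHZ_DAYS * ratio ** TEST_SCALING_POWER


if __name__ == "__main__":
    print(round(tf_ghz_days(100_000_000, 73, 76), 2))   # the reference job itself
    print(round(tf_ghz_days(200_000_000, 73, 76), 2))   # double the exponent
    print(round(tf_ghz_days(100_000_000, 76, 77), 2))   # one further bit level
    print(round(primality_test_ghz_days(110_000_000), 1))
```

For 110M the power law gives a figure somewhat below the ~482 GHzDays quoted above, which illustrates that it is a rule of thumb and not an exact formula.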
Using PRP with proof generation instead is more than twice as efficient, given LL's higher real-world error rate, extremely high verification cost, and long delays before verification occurs. (Eight years is not unusual.)
To PRP test a Mersenne number is basically the same effort as an LL test. In gpuowl on a Radeon VII that could be a day for ~110M. On a Core 2 Duo it could be 11 weeks or more.
Gerbicz error check (GEC) cost, as a fraction of a PRP test, depends inversely on block size, typically ~0.2% of a PRP test at block size 1000. Overhead * blocksize ~ constant.
Jacobi symbol check cost, as a fraction of an LL test, depends on frequency, typically ~0.3% of an LL test.
PRP DC (without proof and verification as below) is the same effort as a first PRP test for the same exponent. Upgrade to proof generation capability as soon as possible.
PRP proof generation and verification
Total effort, assuming a single verification on a system separate from the PRP tester/proof-generator system and server, is, for a 100M exponent, approximately:
Code:
A) power= 8, 3.2 GB temporary disk space needed, proof file size 113MB, 413K squarings = 0.41% of a full DC, default
B) power= 9, 6.4 GB temporary disk space needed, proof file size 125MB, 239K squarings = 0.24% of a full DC
C) power=10, 12.8 GB temporary disk space needed, proof file size 138MB, 182K squarings = 0.18% of a full DC.
Code:
A) power= 8, 3.2 GB temporary disk space needed, proof file size 113MB, computation ~0.02% of a full DC, default
B) power= 9, 6.4 GB temporary disk space needed, proof file size 125MB, computation ~0.04% of a full DC;
C) power=10, 12.8 GB temporary disk space needed, proof file size 138MB, computation ~0.08% of a full DC.
Prime95 will reserve the disk space required for proof generation at the beginning and hold it for the duration, releasing the temporary disk space upon completion. "As exponents increase, squarings, disk space, and proof size increase roughly linearly." https://www.mersenneforum.org/showpo...1&postcount=75
For Gpuowl, proof generation takes only about a minute at the end of a PRP computation for p~104M, occupying 1 CPU core. Maximum working system RAM during proof generation for proof power 9 was observed in Task Manager as ~0.25 GB. RAM in use increased as it began at level 1 and successively built higher levels of the proof, with ~0.25 GB seen as it performed the level 9 proof build step.
Server computation related to PRP proof is a small fraction of the total verification effort, at 1414 squarings ~14 ppm of a PRP test for p~100M, power 8; 1577 squarings ~16 ppm for power 9. It's unclear how that varies versus exponent. https://www.mersenneforum.org/showpo...&postcount=189
Note, the server CPU is SSE2 hardware and its code is based on gwnum routines, so it is limited to handling up to ~595.8M exponent automatically. Higher requires manual intervention by George.
PRP Proof Verification as a fraction of a PRP or PRPDC, for a hypothetical 100M exponent:
Code:
A) power= 8, proof file size 113MB, topk= ceiling(p/2^{8})*2^{8} = 100M, topk/2^{8} = 390,625 squarings = 0.39% of a full DC
B) power= 9, proof file size 125MB, topk= ceiling(p/2^{9})*2^{9} = 100000256; topk/2^{9} = 195313 squarings ~0.195% of a full DC
C) power=10, proof file size 138MB, topk= ceiling(p/2^{10})*2^{10} = 100000768; topk/2^{10} = 97657 squarings = 0.098% of a full DC.
Overall, LL vs. PRP compared:
LL + DC + occasional TC, QC, etc: ~2.04 tests at ~100M exponent, ~2.5 tests at 100Mdigits, to get a matched pair of res64s, which are presumed to constitute verification of those two runs. (There are some bugs which will cause erroneous residues that are far from random.)
PRP with GEC & proof generation & cert: ~1.01 test equivalent, to get a proven correct result.
PRP's error detection is far superior, and the overall project efficiency is more than double that of LL. (Increasingly so at larger exponents.) That's why first time LL assignments are no longer issued by the PrimeNet server.
The reliability of LL has historically declined as exponents increase. Longer run times create more chance of a computing error that may escape detection. That strengthens the case against LL, which inherently has inferior error detection and recovery, as exponent and run time increase. For first tests, run PRP with GEC & proof generation whenever possible. Only run LL, with its lesser error detection and lesser efficiency, if PRP is not possible.
The preceding is for implementations of equal efficiency on equal or equivalent hardware. If comparing recent gpuowl to CUDALucas or ClLucas, add about another factor of 2 disadvantage for LL, and note neither of them includes the Jacobi symbol check. Just don't LL!
To find the next Mersenne prime, compared to the current largest: R D Silverman lays it out at https://www.mersenneforum.org/showpo...58&postcount=8 as approximately 8 times as much effort, based on conjectures about the expected distribution. GIMPS has had a very lucky run for the past several years, in which the Mersenne primes have been more closely spaced than expected on average. If the remaining number of Mersenne primes with exponent p<10^{9} fits conjectures, there are 6 left to find. A rough estimate of time to complete the search of p<10^{9} is 150 years. If they are equally spaced in time that's 25 years apart. That's far longer than GIMPS' previous experience, averaging ~17/25 ~ 0.68 per year.
(Particular thanks go to Preda and Prime95 who helped me understand the proof and verification resource usage)
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-10-02 at 16:41 Reason: add GCD time, P-1 scaling, update discovery rate, misc edits
2019-12-18, 15:15 | #5 |
GIMPS Glossary
See also the background post. https://www.mersenneforum.org/showpo...65&postcount=3
Software application names are intentionally omitted from this glossary. See the "Available Mersenne prime hunting software" table for those. http://www.mersenneforum.org/showpos...91&postcount=2
100Mdigit: There's an Electronic Frontier Foundation prize for finding the first prime large enough to occupy 100 million digits (100 megadigits). A GIMPS subproject is to search for such a prime and win that prize. This requires an exponent of at least 100,000,000/log_{10}(2)-3.3, which rounds up to 332,192,807, for the Mersenne number. The smallest prime exponent above that minimum is 332,192,831.
AID: assignment identifier. The 32 hexadecimal character unique identifier that the PrimeNet server and clients use to identify and communicate about a specific assignment. These appear in worktodo entries, results output, and internet traffic. AIDs are to be kept confidential while they are valid (active). If they are not kept confidential they could be abused by some miscreant. A fundamental rule of GIMPS etiquette is to never post a valid AID on the forum. Moderators will intervene and remove such content. Don't make more such work for the moderators; it may annoy them.
assignment: designation by the PrimeNet server, through either automated or manual operation, of a specific computation task as reserved to one GIMPS user. See https://www.mersenneforum.org/showpo...8&postcount=22 for what the various forms look like. Manual assignments are obtained through https://www.mersenne.org/manual_assignment/ or https://www.mersenne.org/manual_gpu_assignment/. Note these assignments can expire before completion and result report.
base: in P-1 or PRP, the starting number which is raised to successive powers. Usually 3.
bit level: in TF, indication of how far TF has been performed. For example, 74 corresponds to attempting factors up to 2^{74} in size. A single bit level would be advancing the TF performed by one, such as from 2^{73} to 2^{74}.
bound: a limit. Specific to P-1 factoring, an upper limit on the set of primes used in a stage of factoring, referred to as B1 and B2.
certificate: a file showing completion of a PRP proof verification. Such a file can be very rapidly verified as valid by a server.
CERT: depending on context, a worktype, or the performance of the work. Mprime / prime95 support receiving CERT assignments via the PrimeNet API, receiving the required hashed input file generated by the PrimeNet server (from a proof file the server received from a client earlier), and computing from the received hashed file a CERT type result record for upload to the PrimeNet server.
counterfeit: a deliberate fake, a fraud (crime) perpetrated seeking fame or fortune. In GIMPS results, a fabricated result for work not performed, or only slightly performed. In computing hardware, devices disguised and sold to seem to be more than they are. USB external SSDs are one current example, and high capacity flash drives appear to be another. The PrimeNet server implements some anti-counterfeiting measures. Hardware buyers beware, and be prepared to test hardware upon receipt.
CPU: central processing unit. Can be used for any GIMPS computation, although CPUs have largely been supplanted by GPUs for trial factoring within the p<2^{32} exponent range.
CUDA: "CUDA is a parallel computing platform and programming model invented by NVIDIA." https://www.geforce.com/hardware/technology/cuda It is proprietary and only usable on NVIDIA GPUs. There are many different version levels. As of 2021-05-29, latest is v11.3.
DC: abbreviation for double-check.
double-checking: running a second primality test on an already tested exponent, with the same primality test type, seed, residue-type, as applicable. If the runs are independent, and error counts are low, matching res64 values are presumed to both be correct.
ECM: Elliptic curve method of factoring. This is suitable for Mersenne numbers with exponents below 40 x 10^{6}. It is not suitable at the current GIMPS wavefronts or above.
expiration: the end of an assignment before it's completed, because time ran out. Expiration occurs earlier if there's no progress reported. Time from assignment to expiration depends on the assignment type, exponent, and last date of progress reported if any.
exponent: for a Mersenne number, the power, or number of times 2 is multiplied together, as the 3 in 2^{3}=2x2x2=8 before subtracting a 1, yielding M(3)=2^{3}-1=7.
extension: adding additional time to complete an assignment. See https://www.mersenne.org/manual_extension/
FFT: Fast Fourier Transform; Finite Fourier Transform.
fft length: the length, in double precision words, of an FFT used for computing products or squares. In the range of exponents of current GIMPS wavefront interest, fft length ~ exponent/17.5. For more, see https://www.mersenneforum.org/showpo...21&postcount=7
FPGA: Field programmable gate array. To my knowledge there have been many suggestions/proposals to use FPGAs in GIMPS over the years, but no working designs or compatible software created or announced or demonstrated or shared. See https://mersenneforum.org/showthread.php?t=2674 for a recent iteration.
GEC: Gerbicz error check: a highly reliable error check on the PRP test, identified by Robert Gerbicz, with nearly 100% detection for errors in the PRP iteration process, which allows resuming from an earlier saved state and trying again. It is independent of earlier error detection methods. The high detection rate for software or hardware error in computing PRP iterations provides MUCH higher reliability for completed tests than a Lucas-Lehmer test on the same software and hardware for large exponents. See https://www.mersenneforum.org/showpo...17&postcount=9 Other errors not caught by this are possible and have been observed, rarely. The check is built into PRP code of recent versions of gpuowl, Mlucas, mprime, prime95.
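The GEC entry above does not spell out the check itself. As a rough illustration only, the sketch below shows the idea as it is commonly described: keep a running product d of the residues at block boundaries, and after each block of L squarings confirm that the new product equals base * d^(2^L) mod N. The function name, the tiny exponent, the block size, and the error injection are all made up for this demonstration. Production programs run the comparison far less often than once per block and roll back to a saved state on a mismatch, which is omitted here.

```python
# Language: Python 3
# Toy illustration of a Gerbicz-style check on a chain of modular squarings.
# Names and parameters are hypothetical; this is not code from any GIMPS program.


def gerbicz_checked_squarings(p, block_size, num_blocks, corrupt_at=None):
    """Square base 3 repeatedly mod N = 2^p - 1, checking each block.

    Returns True if every block passes the check, False at the first mismatch.
    corrupt_at, if given, is the squaring count at which a simulated
    hardware error is injected into the residue.
    """
    n = (1 << p) - 1
    base = 3
    x = base          # x = base^(2^i) mod n after i squarings
    d = base          # running product of x at block boundaries
    done = 0
    for _ in range(num_blocks):
        for _ in range(block_size):
            x = x * x % n
            done += 1
            if done == corrupt_at:
                x = (x + 1) % n          # simulated error in the residue
        d_new = d * x % n
        # If no error occurred, d_new must equal base * d^(2^block_size) mod n.
        expected = base * pow(d, 1 << block_size, n) % n
        if d_new != expected:
            return False                 # real software would roll back here
        d = d_new
    return True


if __name__ == "__main__":
    print(gerbicz_checked_squarings(31, 4, 5))                 # clean run
    print(gerbicz_checked_squarings(31, 4, 5, corrupt_at=8))   # injected error
```

Checking every block, as done here, would be far too expensive in practice; it is done this way only to keep the example short.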
GHzDay: in GIMPS, a unit of computing work equivalent to what a 1 GHz Core2 processor core would accomplish in a day with efficient GIMPS code.
GHzD/day: in GIMPS, a computing rate equivalent to what a 1 GHz Core2 processor core would accomplish. Note that the same hardware may have very different ratings for very different computation types. CPUs generally have ratios between TF and other computations that are near one, while GPUs can be very different; ratios of 11:1 to 40:1 faster TF have been observed.
gigadigit: There's an Electronic Frontier Foundation prize for finding the first prime large enough to occupy 1000 million digits (1000 megadigits). A small subproject is to search for candidates for such a prime and hypothetically someday win that prize. This requires an exponent of at least 1,000,000,000/log_{10}(2)-3.3 = 3,321,928,092 for the Mersenne number. The smallest prime exponent above that minimum for which no factor has yet been found is 3,321,928,307. Note that sufficient TF depth for such a Mersenne number is 91 or 92 bits, representing a total TF effort investment per exponent of up to ~151,000 - 302,000 GHzD of TF. Preparatory P-1 to sufficient bounds (B1 17,000,000, B2 1,000,000,000) is estimated in 2021 to require about a year per exponent on some available higher end consumer hardware, and to require 16 GiB RAM for stage 1 and 64 GiB RAM for stage 2. Software to do PRP with proof generation at this exponent level does not yet exist. PRP run time estimates in 2021 are of order 5 years per try on a Radeon VII GPU, much longer on available CPUs. Such exponents are outside the mersenne.org server range (by 3.322 times), so preliminary TF & P-1 effort is being coordinated in "Operation OBD" forum threads. As the exponent increases, the odds of a prime diminish, while the testing effort grows rapidly.
GIMPS: the Great Internet Mersenne Prime Search
GPU: graphics processing unit.
https://en.wikipedia.org/wiki/Graphics_processing_unit
hoard: to obtain assignments, especially a large number of them, and do nothing with them for much of the assignment lifetime. Manual assignments performed on applications which are not PrimeNet interfaced for status update may appear to be hoarded while actually quietly making unreported progress toward completion.
IGP: integrated graphics processor. A CPU and IGP together on one chip or in one package share power budget and memory bandwidth. IGPs tend to be low performance compared to discrete GPUs. For more see https://hexus.net/tech/tech-explaine...processor-igp/
Jacobi check: computation of a Jacobi symbol value. Applied to the LL interim full residue, it has a 50% chance of detecting a software or hardware error affecting the LL iterations, which allows resuming from an earlier saved state and trying again. It is independent of earlier error detection methods. If applied after every iteration, it would have a 75% chance of detecting error. It is too computationally expensive for that. It is typically applied every several hours. It is not applicable to TF or PRP. It is also somewhat applicable to P-1 but quite expensive to do there. See also https://www.mersenneforum.org/showpo...1&postcount=10 The check is built into LL code of recent versions of gpuowl, Mlucas, mprime, prime95, except that LL has been dropped from gpuowl versions V7.0 and higher.
latency: the time from beginning to end of a specific computing task. Reducing latency can come at the cost of lowered throughput.
LL: Lucas-Lehmer test which is a conclusive primality test for Mersenne numbers if performed accurately. Without the Jacobi check, the LL test observed residue error rate is about 2% historically, increasing with run time. Beginning with seed value 4 is standard.
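As an illustration of the LL entry above, the following Python sketch runs the Lucas-Lehmer iteration s -> s^2 - 2 (mod 2^p-1) from the standard seed 4 for a small odd prime exponent, and reports the low 64 bits of the final residue in hexadecimal. The function name `lucas_lehmer` and the output format are assumptions made for this example; real GIMPS software uses FFT-based multiplication, shifts, and error checks that are omitted here.

```python
def lucas_lehmer(p, seed=4):
    """Lucas-Lehmer test for M(p) = 2^p - 1, for an odd prime exponent p.

    Returns (is_prime, res64_hex). Plain big-integer arithmetic is used,
    so this is only practical for small exponents.
    """
    n = (1 << p) - 1
    s = seed % n
    for _ in range(p - 2):       # exactly p-2 iterations
        s = (s * s - 2) % n
    res64 = f"{s & 0xFFFFFFFFFFFFFFFF:016X}"   # least significant 64 bits
    return s == 0, res64


if __name__ == "__main__":
    for p in (7, 11, 13):
        print(p, lucas_lehmer(p))
```

A final residue of zero means M(p) is prime; any nonzero residue means composite, and it is the res64 of that nonzero residue that a double check is expected to reproduce.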
LLDC: Repeating the LL on the same exponent, ideally by a different participant, shift, software, and hardware, but same seed value, in an attempt to verify the first LL test by producing a matching final Res64 value. See DC and double checking above. If either LL test has an error affecting the final result, a triple check (or more) becomes necessary.
M_{x}: see (1) below.
Mx: in the GIMPS context, (1) the Mersenne number with exponent x. For example, M7 would represent 2^{7}-1 = 127, the fourth entry in the list of known Mersenne primes https://www.mersenne.org/primes/ (2) the xth known Mersenne prime. Note this usage, while historically common, both conflicts with (1) and creates ambiguity. Generally the ambiguity can be resolved by context, or by noting that suffix values significantly greater than that for the highest known Mersenne prime, #51*, p=82589933, very likely indicate exponent, not sequence number. The ambiguity remains for small prime x for which 2^{x}-1 is also prime; 2 3 5 7 13 17 19 31. To reduce ambiguity, and the need for every reader to reason out which is meant, please avoid the second usage going forward.
Mpx or Mp#x: the x-th known Mersenne prime in sorted order. Its ranking is regarded as provisional and it is followed customarily by an asterisk * or sometimes a question mark "?" if verification of all others below its exponent has not yet completed. For example, Mp48 but Mp49* and above currently, since double-checking has not yet completed up to 74207281 as of 6 October 2021. Mp#4 is 2^{7}-1 = 127. (There is not, to my knowledge, usage of the analogous Mnx, which would be the Mersenne number (composite or prime) given by the x-th prime exponent.)
milestone: finding a new Mersenne prime or proving a known Mersenne prime is the nth in the sorted set is considered a major GIMPS milestone. Completing an additional million-wide range of exponent values, in first test or successful double-check, is considered a minor milestone.
There's a web page for those.
necropost: post in an existing thread which has had no posts in the past several months or years. It's mildly discouraged. Sometimes it's useful, as in cases where the hardware technology or number theory have advanced in a relevant way.
numerology: unsound or baseless mathematical reasoning. See also https://en.wikipedia.org/wiki/Numerology#In_science
OEIS: the Online Encyclopedia of Integer Sequences. OEIS includes https://oeis.org/A000668 (Mersenne primes) and https://oeis.org/A000043 (exponents of Mersenne primes)
offset: an integer number of bit positions to which the seed or base is initially shifted. Shifting the computations differently causes the expected numerical errors in floating point FFT computations to affect the operands differently. Different, pseudorandomly selected, and nonzero shift counts are preferred.
OpenCL: "OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units, graphics processing units, digital signal processors, field-programmable gate arrays and other processors or hardware accelerators." It is a standard supported by AMD, Intel, NVIDIA, and ARM. https://www.khronos.org/opencl/
OpenGL: "a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. The API is typically used to interact with a graphics processing unit (GPU), to achieve hardware-accelerated rendering." https://en.wikipedia.org/wiki/OpenGL (while reportedly OpenGL has been used to perform nongraphics computations, there is no known GIMPS application using OpenGL)
overclock: running computing hardware faster than the rated frequency. This tends to reduce reliability of the output, reduce performance per watt-hour, and can reduce hardware lifetime.
P-1: A factoring technique for finding factors with the special property that they are one greater than a number with many small prime factors; a standard part of the TF, P-1, primality test, and verification sequence applied in the GIMPS hunt for new Mersenne primes https://en.wikipedia.org/wiki/Pollar...92_1_algorithm
P+1: A factoring technique for finding factors with the special property that they are one less than a number with many small prime factors. This method is thought to be too unproductive to be a part of the GIMPS hunt for new Mersenne primes, but may be useful in searching for additional factors of smaller Mersenne numbers, as an additional capability introduced in mprime / prime95 v30.6 https://en.wikipedia.org/wiki/Willia...2B_1_algorithm
poach: work on and report a result for an exponent and computation type combination that is currently assigned to someone else. This is frowned upon, both because it can irritate the person with the assignment, and because it often results in wasteful duplication of computation.
PrimeNet: the API, software, and server that enables automated issuance of assignments and submission of results from GIMPS client software such as mprime and prime95 and updating of a central GIMPS database.
Proof: computations accompanying a PRP test, after preserving residues at selected points, creating a file allowing independent verification of the PRP test's completion and correctness. Generating a proof requires (a) software that supports it (currently sufficiently recent versions of gpuowl, mprime, prime95), (b) configured to use sufficient available disk space to hold all the needed temporary files, (c) a PRP run not LL, (d) from (almost) the beginning of the test. Proof-capable software is required before the iterations reach ~p/2^{proof_power}.
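The P-1 entry above can be illustrated with a small stage 1 sketch in Python. It assumes the usual textbook form of the method: raise a base to the product of all prime powers up to B1, and take a gcd with the number being factored. Because factors of M(p) have the form 2kp+1, the exponent also includes 2p. The names `small_primes` and `pminus1_stage1` are hypothetical, stage 2 (the B2 bound) is omitted, and no claim is made that this mirrors any GIMPS program.

```python
from math import gcd


def small_primes(limit):
    """Primes up to `limit` by trial division (fine for tiny bounds)."""
    return [q for q in range(2, limit + 1)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]


def pminus1_stage1(p, b1, base=3):
    """Stage 1 of P-1 on M(p) = 2^p - 1 with bound B1.

    Returns gcd(base^E - 1, M(p)), where E = 2 * p * (product of the
    largest power of each prime not exceeding B1). A result strictly
    between 1 and M(p) is a nontrivial factor.
    """
    n = (1 << p) - 1
    e = 2 * p
    for q in small_primes(b1):
        qk = q
        while qk * q <= b1:      # largest power of q that is <= B1
            qk *= q
        e *= qk
    return gcd(pow(base, e, n) - 1, n)


if __name__ == "__main__":
    # M(29) = 233 * 1103 * 2089, and 233 - 1 = 2^3 * 29 is very smooth,
    # so a bound as small as B1 = 5 is enough to expose the factor 233.
    print(pminus1_stage1(29, 5))
```

The factor 233 is found because 233-1 divides the stage 1 exponent; factors whose k has a larger prime factor would need a larger B1 or a second stage.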
PRP: Probable prime test, a primality test that is conclusive when performed accurately and indicating composite, and highly likely to have identified a prime if it indicates probably prime, but requiring that a conclusive primality test, such as LL, be run to confirm the prime. PRP tests are much more reliable than LL, when guarded by the excellent GEC error detection. There are several different PRP test types. For final residues to match for the same exponent, tests must use the same PRP test type and seed. See residue type.
relative prime, or relatively prime: Two (or more) numbers are said to be relatively prime if they have no factors in common, i.e. their greatest common divisor is 1.
QA: quality assurance, the act of testing computations on a wide variety of exponents, fft lengths in primality testing or P-1, bit levels in TF, unusual hardware, and reporting in detail any anomalies encountered, and testing one software against another for matching results, relying on the likelihood they are both right rather than both wrong in such a way as to give matching results
res64: 64-bit residue, the least significant 64 bits of a larger number, usually represented in ASCII hexadecimal in console output
residue type: in PRP, there are at least 6 identified residue types. For the residues to match, the residue types must match. Those obtaining manual assignments for PRP DC should take care to reserve assignments whose residue type fits the capability of the software they intend to use to run the double-checks. For PRP, type 1 is standard. For PRP-CF, type 5 is standard. See also https://www.mersenneforum.org/showpo...32&postcount=8
scaling: how run-time or memory requirements vary with exponent or other relevant variables for a given software application. Number theory provides lower limits to what is possible scaling. Testing by sampling at widely spaced variable values provides an indication of achieved scaling.
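To show how the PRP and res64 entries above relate, here is a short Python sketch that computes a base-3 Fermat-style residue 3^(N-1) mod N for N = 2^p-1 and formats its least significant 64 bits as 16 hexadecimal digits. The function name `fermat_prp_res64` is made up for this example, and no claim is made here about which numbered residue type a given GIMPS program would assign to this particular computation.

```python
def fermat_prp_res64(p, base=3):
    """Base-`base` Fermat probable-prime test of N = 2^p - 1.

    Returns (is_probable_prime, res64_hex). The residue is 1 when N
    passes the test; otherwise N is certainly composite.
    """
    n = (1 << p) - 1
    r = pow(base, n - 1, n)                  # modular exponentiation
    res64 = f"{r & (2**64 - 1):016X}"        # low 64 bits, zero padded
    return r == 1, res64


if __name__ == "__main__":
    print(127, fermat_prp_res64(127))   # M127 is a known Mersenne prime
    print(67, fermat_prp_res64(67))     # M67 is composite
```

Two runs on the same exponent can only be compared by res64 if they compute the same quantity, which is why the seed and residue type must match between a first test and its double check.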
Schrodinger number: a number which someone has claimed is prime and has also claimed is composite, without proof in either case, or has claimed that the discovery of a prime factor is irrelevant. Analogous to Schrodinger's cat. Sort of like an illegitimate child; it's not their doing either. Two examples.
seed: in LL testing, the starting value to which the LL iterations are successively applied. Standard seed for GIMPS LL primality testing is 4. Other values, such as 10 or 2/3, are valid for finding a Mersenne prime, but are not used, since differing seed values make full-residue values, res2048 values, and res64 values for composites not match between tests of differing seed values, with serious negative consequences for double-checking. In PRP testing, the starting value on which successive powerings are performed. Standard seed for GIMPS PRP primality testing is 3.
shift: an alternate term for offset
smooth: having small factors. A number that has no prime factors greater than n is called n-smooth. For example, 42 = 2 * 3 * 7 is 7-smooth.
straggler: exponent that is late in completing, delaying reaching minor milestones.
TC: triple check. When a doublecheck residue does not match the first-test residue, for the same test type and seed, a tie breaker is needed, so a third check is run, and occasionally a fourth, etc, until matches are obtained, with acceptably low error counts. Around 2% of LL tests get triple checks. (More if the exponent is much larger than wavefront exponents or the hardware is unusually slow.) A much lower fraction of PRP tests get triple checks since the Gerbicz error check performed along the way in most PRP tests makes PRP test results much more reliable.
TF: trial factoring
throughput: the long term rate of production of computational results. See also latency
TPU: "Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads.
TPUs are designed from the ground up with the benefit of Google’s deep experience and leadership in machine learning." https://cloud.google.com/tpu/docs/tpus There is currently no known GIMPS application that makes use of TPUs.
tuning: adjusting various parameters in the command line or ini file or configuration file of an application, usually to increase throughput
underclock: running computing hardware slower than the rated frequency. This tends to increase reliability of the output, increase performance per watt-hour, and can cause equipment to run at lower temperatures and extend hardware lifetime.
undervolt: operating computing hardware at lower than the nominal voltage, to reduce power dissipation. The reduced power at a given clock rate may in turn allow additional overclocking.
VDF: verifiable delay function. The function that made the low cost PRP proof generation and independent certification possible, and in turn, allowed reducing DC work required per PRP test to less than 0.5% from 100%.
verification: generally, confirming correctness of a first primality test by repetition independently. More recently, performing the independent "CERT" check beginning with a PRP proof file that has been processed by the PrimeNet server and downloaded to a CERT-capable client.
wavefront: the range of exponents, for a given computation type, where the bulk of the activity is occurring. It moves upward over time. See https://www.mersenneforum.org/showpo...73&postcount=2 for a definition and some historical values
wildcat: (proposed; not in common use) work on tasks without assignments to anyone. This is relatively low but not zero risk in P-1 or primality testing above ~350M. TF this way is both more common and more problematic. This is sometimes less than accurately referred to as poaching, which is activity in conflict with assignments to others.
Possible synonyms or variants are squat, or "working off the books" (which is also a term for tax evasion, so has a negative connotation not justified in context, and it's just too long and wordy); pioneer if working somewhat ahead of the wavefront; scout if working substantially ahead of the wavefront; explorer if working all over the mersenne.org range.
zombie: (proposed; not in common use) exponent or system that is making little or no progress. It could be that the hardware is slow, the assignment follows another being worked on, the error rate and retry count are high, or the application has stopped, or the system was upgraded and the application not relaunched. See also straggler.
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 2021-12-26 at 15:42 Reason: added AID / assignment identifier; gigadigit |
2019-12-20, 16:55 | #6 |
Older reference thread
There is a reference thread from 2003 by PrimeMonster at https://www.mersenneforum.org/showthread.php?t=1534
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 2019-12-20 at 16:58 |
2020-01-16, 14:44 | #7 |
Best practices
In general, what would constitute best practices for GIMPS effort? My proposal:
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 2021-12-31 at 17:59 Reason: edit for style |
2022-01-02, 14:32 | #8 |
OS fundamentals for GIMPS GPU application use
Modern OSes offer both GUIs and command line interaction. Learn the fundamentals of both before attempting to run GIMPS GPU applications. GIMPS GPU applications are not GUI applications; they are console text applications. As such they can be invoked from Windows batch files or Linux shell scripts. Attempts to launch GPU applications from the GUI side might work, but if they don't, which is almost certain on the first tries, you probably will not get a chance to see what's wrong. Run them from the command line in a session that will remain open long enough to show, and allow capture of, the output, including error messages from the program or OS.
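One way to make sure console output and error messages are not lost is to launch the program from a small wrapper that records everything it prints. The Python sketch below is only an illustration of that idea: `run_and_log` and the log file name are hypothetical, and the command shown is a stand-in, not the actual invocation of any GIMPS GPU application (substitute the real program and its arguments as documented for that program).

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run a console program, capture stdout and stderr together,
    save them to `log_path`, and return (exit_code, captured_text).

    Raises FileNotFoundError if the program itself cannot be found.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge error messages into the same stream
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(result.stdout)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in command: the Python interpreter printing one line.
    code, text = run_and_log([sys.executable, "-c", "print('hello')"],
                             "session.log")
    print("exit code:", code)
    print(text)
```

The same effect can be had with plain output redirection in a Windows batch file or Linux shell script; the point is simply that the session and its messages survive after the program exits.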
A new user will not be successful in running GIMPS GPU applications without understanding adequately most of the following, for the OS in use:
Some possible resources for learning these things can be found online, such as:
Linux: https://ubuntu.com/tutorials/command...ers#1-overview
Windows: https://docs.microsoft.com/en-us/win...ndows-commands
There are also "help" on Windows and "man" on Linux.
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 2022-01-02 at 19:20 |