![]() |
Reference material discussion thread
This is where I'd prefer the reference material be publicly discussed. (Not in the reference material threads themselves.)
|
You are doing an important job.
:two cents: |
[QUOTE=kriesel;488884]What's the exponent required for 10, 100 or 1000 megadigit Mersenne primes?
How was that calculated? Also included is a rough ballpark estimate of what's feasible on a GTX1070 in CUDALucas 2.06beta.[/QUOTE] 33,219,283 , 332,192,831 and 3,321,928,097 respectively. |
[QUOTE=kriesel;488884]How was that calculated?[/QUOTE]
[URL="https://www.rapidtables.com/math/algebra/Logarithm.html"]Logarithms[/URL]. I won't repeat what ET said, but just use the logarithms properties to compute the binary logarithm of 10 at the power 1M (the first number with 1M decimal digits), considering that \(\log_a x^n=n\log_a x\) and \(\log_a x=\frac{log_b x}{log_b a}\). To calculate how many digits in base 5 will \(10^{1000000}\) have, you need to compute \(\log_5 10^{1000000}\). To calculate how many bits will \(10^{1000000}\) have, you need to compute \(\log_2 10^{1000000}\). That is the power of 2 you need to raise 2 to get 10^1M (i.e a number with 1M digits). Then round it to the next prime. |
[QUOTE=LaurV;488947]To calculate how many digits in base 5 will \(10^{1000000}\) have, you need to compute \(\log_5 10^{1000000}\).
To calculate how many bits will \(10^{1000000}\) have, you need to compute \(\log_2 10^{1000000}\). That is the power of 2 you need to raise 2 to get 10^1M (i.e a number with 1M digits). Then round it to the next prime.[/QUOTE] But keep also in mind that 10^1000000 has 1000001 digits, and 10^999999 has 1000000 digits.:smile: |
[QUOTE=Uncwilly;488529]You are doing an important job.
:two cents:[/QUOTE] Thanks! Try as I might, I can not get those two lincolns from the screen to my pocket ;) (That's not why I'm doing this. When joining the gpu Mersenne hunting effort a little over a year ago, I looked for reference material and found less than I expected. What I found was scattered about. Made my own for my own use, and figured I might as well share and save someone else some time or puzzlement or wasted cycles. And feedback from doing so could help enlighten me; win-win.) |
Discuss reference material, here, not in reference threads please; and some questions
The posts #3-5 above are moved here and were in reference to [URL]http://www.mersenneforum.org/showpost.php?p=488884&postcount=11[/URL].
At this point there's been only one view of the attachment to that post, which is what my rhetorical questions were intended as the setup for. (I've modified that post's text a bit to be hopefully more clear about that.) Some nice posts, thoughtful, well formatted; I just don't want them in the reference thread, so they're relocated to here. Are people reluctant to view attachments for some reason, or pdfs in particular? If so, why? Do you prefer other attachment types? Some way of inlining the content? Do attachments not show up as available in some browsers? What would you recommend or prefer? |
[QUOTE=axn;488952]But keep also in mind that 10^1000000 has 1000001 digits, and 10^999999 has 1000000 digits.:smile:[/QUOTE] Good point, and part of why I originally included two columns in the quick reference table in the attachment, the lowest and highest integer exponents for Mersenne numbers to have precisely 10[SUP]n[/SUP] digits, vs. 10[SUP]n[/SUP] values, n=1,2,..10.
10[SUP]10^6[/SUP] has 10[SUP]6[/SUP]+1 decimal digits, but 10[SUP]10^6[/SUP]-1 has 10[SUP]6[/SUP], as does 10[SUP]10^6[/SUP]/9.99 or 10[SUP]10^6[/SUP]/8. I just now checked the cell formulas in the underlying spreadsheet against [URL]http://oeis.org/A034887[/URL] which covers2[SUP]p[/SUP], p=0,1,...72. |
[QUOTE=kriesel;489059]Some things to check if the system uptime or other reliability is less than quite good.
[LIST=1][*]How old is the hardware? (Hard drive etc not too ancient? All components and connectors well seated and making good contact?)[*]Recent backups, running on schedule, well monitored to ensure they're actually running to completion?[*]How well patched is the system?[*]How well is it protected from power interruption or transients or sags? (Voltage regulating UPS?)[*]Do you have a way of monitoring the line voltage?[*]What do system logs have to say?[*]How detailed and complete is your system logging? (Is some logging going to another system or storage device? Will it survive a HD problem in the system of interest?)[*]What OS is it running?[*]What other software?[*]Is it safe from children and other small animals?[*]System components and memory pass reliability tests? What if anything does/would a serious diagnostics attempt tell you? [URL]https://lifehacker.com/5551188/best-computer-diagnostic-tools[/URL][*]What assumptions are you making and may not even realize it?[*]Temperature of components and ambient environment in a reasonable range?[*]Relative humidity in a reasonable range?[*]All fans in the system in good working order? Grilles and components free of dust, lint, and pet hair?[*]A full complement of drivers, of reliable versions, typically up to date except for recent releases with known issues?[*]Well secured?[*]Correct power supply output voltages, and adequate current output for all the components now installed on all voltage levels? System components get added, and power supply components degrade over time. Wattage required varies with operating temperature, clock rate, program execution, etc.[/LIST][/QUOTE] I'd say that you have to upgrade the bios as well. |
[QUOTE=SELROC;489062]I'd say that you have to upgrade the bios as well.[/QUOTE]
Thanks, added. Also modified the other one you commented on. |
[QUOTE=axn;488952]But keep also in mind that 10^1000000 has 1000001 digits, and 10^999999 has 1000000 digits.:smile:[/QUOTE]
Haha, I think it is the second time when you catch me with this...:redface: Anyhow, it is irrelevant, because log(10,2) is 3.32192809488736 and when you multiply it with either 10M or 10M-1, you get 33219280.xx and 33219277.xx, respectively, and there is no prime in between. The next prime candidate for the exponent is (as ET already said) 33219283 (which has 10M+2 digits, probably). :razz: |
[QUOTE=LaurV;489237]Haha, I think it is the second time when you catch me with this...:redface:
Anyhow, it is irrelevant, because log(10,2) is 3.32192809488736 and when you multiply it with either 10M or 10M-1, you get 33219280.xx and 33219277.xx, respectively, and there is no prime in between. The next prime candidate for the exponent is (as ET already said) 33219283 (which has 10M+2 digits, probably). :razz:[/QUOTE] Added some columns for nearest prime exponents. [URL]https://www.alpertron.com.ar/ECM.HTM[/URL] shows 33219281 prime, a closer fit than 33219283 for least prime exponent >=10million digits.Mersenne number. Calculatorsoup calculator agrees on primality of 332919281. |
[URL="http://www.mersenne.ca/exponent/33219281"]You are right[/URL], and [URL="http://www.mersenne.ca/exponent/33219283"]both of them[/URL] have 10M+1 digits. I trusted ET's numbers without check.
|
[QUOTE=LaurV;489277][URL="http://www.mersenne.ca/exponent/33219281"]You are right[/URL], and [URL="http://www.mersenne.ca/exponent/33219283"]both of them[/URL] have 10M+1 digits. I trusted ET's numbers without check.[/QUOTE]
Thanks for the confirmation, and the additional cross-check, James' excellent site. |
How best to determine exponent limit for a transform type and fft length?
1 Attachment(s)
As I understand it, there are approximate rules, such as compute the number of bits of the candidate Mersenne number being expressed per word of the fft length, and compare to the accepted ok values, for DP transforms; approx 20 bits/word for 2M, and subtract about a bit for every power of two increase, so 8M gets limited to 18-18.2 or so. (Don't rely on these, they're from memory, can vary from program to program, do vary from transform type to type, etc.)
For the 4M -fft DP in gpuOwL v1.9, there was no prior experience available to consult. Preda guessed a number of bits/word. But a difference of 1 or even 0.1 bit per word seems small but means considerably different limit exponents. The more general issue is there does not seem to be a known or sharp function for what will run successfully n iterations or what won't, or how many iterations it will take versus exponent to have excessive round-off error or some other error. Rather it is fuzzy, variable. (An exponent p that completes successfully is one that doesn't generate an error in p iterations. The one I'm looking for is the highest up to which they all complete, defining the useful range of the transform and size.) In trying to determine the bounds experimentally, I settled on the following. Run some short trials at the expected bits/word exponent value and a little above and a little below, going up or down until a spot is found where some failures occur within a small number of iterations and nearby longer runs ok to higher number of iterations. Then approach the bound from the right (higher exponent), because a quick fail provides bounds information much more efficiently than a slow success. In some cases errors are detected in 500 to 5000 iterations. Tabulate iterations to failure versus exponent. Patterns will seem to emerge from the data collected this way. Often the next run or two will provide a contradiction to the last perceived pattern. See for example the green points on the plot attached, which were run after most of the red points. (Orange were first on the right) The slope of trend lines plotted seems to steepen as the limit is approached from above. The point where such trend lines predict running enough iterations to complete before an error occurs, is an estimate for the bound. The vertical position for the blue points merely reflects where I chose not to spend more iterations yet, so a trend line through those means very little. Is there a better way to explore the transform's limits? |
The "limits" are "soft" (think cotton fabric, not hard steel), i.e. we are talking about "bands" in which some exponents works, some not, and the width and position of each limit on the number line is dependent of how the FFT is implemented.
|
[QUOTE=kriesel;488537]Here is the latest posted version of the list I am maintaining for gpuOwL. As always, this is in appreciation of the authors' past contributions. Users may want to browse this for workarounds included in some of the descriptions, or for an awareness of some known pitfalls. Please respond with any comments, additions or suggestions you may have, preferably by PM to kriesel.
(last updated 2018-12-31)[/QUOTE] About item #47 on your pdf: I have just recompiled both gpuowl and mfakto with the new g++8 but I am not seeing any performance improvements. |
(in reference to [URL]https://www.mersenneforum.org/showpost.php?p=505634&postcount=18[/URL])
Good post. The calculus is not exact, but the idea is right. With the patience of the user, you could add something about his personality/mood too, hehe, some of us consider that having a factor to prove compositeness is more "cute" than having two LL tests, that is why we go a little bit higher with the factoring. Or, better said, longer (in time, because factor hunters will prefer lower bitlevel, not higher, and they will also prefer higher exponents, exactly for the reason you said, to find more factors per time unit). As we advance in the exponents list, LL is getting more and more difficult, and the factoring (TF) is getting easier (but not by much, I mean for the same bitlevel, where the same probability lies - or is it lays?) |
Reference material discussion thread
[QUOTE=kriesel;512813]For any Mersenne number formed from a composite exponent, the resulting Mersenne number is composite. Some of the Mersenne number's factors are easily found by deriving them from the prime factors of the composite exponent.[/QUOTE]A common convention in exhibiting factors in Cunningham tables is to deal just with the "primitive part," supplemented with any "intrisnsic" prime factor, or, as they arise, cracked into Aurifeuillian factors. The standard factorization of x^n - 1 as the product of cyclotomic polynomials gives a partial factorization. This is easily implemented for integers like 2^30 - 1 as follows.
[code]? fordiv(30,m,f=1;fordiv(m,d,f*=(2^d-1)^moebius(m/d));print(m" "f)) 1 1 2 3 3 7 5 31 6 3 10 11 15 151 30 331[/code]Each divisor of the exponent gives a cyclotomic factor. Your example fortuitously includes an "intrinsic" prime factor, the factor 3 for the divisor 6 of 30. The last factor, 331, for the divisor 30 of 30, is the "primitive part" of 2^30 - 1. |
Perhaps another topic for inclusion is:
Why don't we run the algorithm backwards. Start at s=0 and work in reverse to see if we get s=4. Which follows on from: why doesn't someone just go back one iteration and create a save file that yields a zero residue upon running the last step, and fool the system? [size=1][color=grey]Obviously because discrete logs are hard, but many seem unaware of that.[/color][/size] |
(In response apparently to [url]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/url])
There is no such thing like "iterations independent of exponent". The only "independent" iterations are the few steps where the squaring didn't reach m, so x^2 (natural) is equal to (x^2 (mod m)). Once you take the first modulus, the "independence" is gone. So, they can't serve as a "self test" because the modulus part of your software is never tested, unless you [U]take[/U] a modulus calculation, and to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? few hundred milliseconds? for a LL test that takes a month? Don't forget that by squaring, the result doubles every time, so to reach 80 million bits, you only need 26 squares, so you can start with the iteration 25 or 26 and run the other 79 999 9xx iterations instead of 80 M. Big savings, hehe, it would be more complicate to keep track of every exponents to get the starting iteration... plus memory to store them... |
[QUOTE=LaurV;515366](In response apparently to [URL]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/URL]
or [URL]https://www.mersenneforum.org/showpost.php?p=510741&postcount=3[/URL]) There is no such thing like "iterations independent of exponent". [/QUOTE]Yes, there is, at the very beginning; seed 4 LL, iteration one is the same 0xE value for any p>4, for a trivial example, disregarding pseudo random offsets where applicable. Add another iteration at about every doubling of p. [QUOTE]The only "independent" iterations are the few steps where the squaring didn't reach m, so x^2 (natural) is equal to (x^2 (mod m)). Once you take the first modulus, the "independence" is gone. So, they can't serve as a "self test" because the modulus part of your software is never tested, unless you [U]take[/U] a modulus calculation, and to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? few hundred milliseconds? for a LL test that takes a month? Don't forget that by squaring, the result doubles every time, so to reach 80 million bits, you only need 26 squares, so you can start with the iteration 25 or 26 and run the other 79 999 9xx iterations instead of 80 M. Big savings, hehe, it would be more complicate to keep track of every exponents to get the starting iteration... plus memory to store them...[/QUOTE] This may be a case of vigorous agreement re [URL]https://www.mersenneforum.org/showpost.php?p=510741&postcount=3[/URL] or incomplete understanding of [URL]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/URL]. The proposed selftest in 9 is admittedly very brief and not comprehensive. Some confirmation of function early could be a good thing. There are periodically novice CUDALucas users unknowingly running for hours or days or longer with fft lengths or other misconfigurations that produce garbage from the start. (For one recent example, see [URL]https://www.mersenneforum.org/showpost.php?p=509707&postcount=178[/URL]) It is not necessary to "keep track of every exponents" to know the starting iteration possible, or know what self-test iteration is the limit. The usable initial-self-test iteration number can be calculated from the exponent constituting the current assignment. Or the minimum exponent for a given self-test iteration number could be predetermined for LL seed 4, LL seed 10, PRP3 cases, and stored also in very small tables. Then the maximum suitable iteration number could be found from a fast reverse order sequential search. It is likely that the various applications are programmed so that the iteration loop does the Lucas Lehmer s[SUP]2[/SUP]-2 carry and modulo, or the PRP3 equivalent loop including modulo, from the very start, whether the s term has grown large enough to actually be affected by the modulo or not. So that modulo related code would be getting somewhat exercised in the very early iterations. As I recall, the modulo is basically for free in the IBDWT representation, so it's hard to avoid doing it. (I think Ernst's explanation at [URL]https://www.mersenneforum.org/showpost.php?p=1483&postcount=10[/URL] is applicable and helpful here.) It's probably faster and more reliable to include the modulo and carry from the start. It's unlikely there's any branch included to avoid it, since as both Laurv and I have already said more or less, it's less than 1ppm of the iterations at the beginning, which is beyond trivial. (But detecting early a serious error would not be trivial.) |
About the manual assignment progress reports. Different programs have wildly different progress logs. Example from Mlucas:
[C][May 20 03:38:41] M50346019 Iter# = 24280000 [48.23% complete] clocks = 00:26:42.587 [160.2587 msec/iter] Res64: 54912F2731A45A41. AvgMaxErr = 0.042192089. MaxErr = 0.062500000. Residue shift count = 12144693. [/C] But minimum common requirements should be current iteration count and Res64/shift so that interim residues could be stored, if needed. I propose that instead of having the manual results submission form decipher many different log formats, there should be a simple common format, and let the client submission script do the conversion to that. |
[QUOTE=nomead;517219]About the manual assignment progress reports. Different programs have wildly different progress logs. Example from Mlucas:
[C][May 20 03:38:41] M50346019 Iter# = 24280000 [48.23% complete] clocks = 00:26:42.587 [160.2587 msec/iter] Res64: 54912F2731A45A41. AvgMaxErr = 0.042192089. MaxErr = 0.062500000. Residue shift count = 12144693. [/C] But minimum common requirements should be current iteration count and Res64/shift so that interim residues could be stored, if needed. I propose that instead of having the manual results submission form decipher many different log formats, there should be a simple common format, and let the client submission script do the conversion to that.[/QUOTE] I'm not concerned about Mlucas format, because the word is Madpoo and EWMayer were working together on implementation of the Primenet API in Mlucas v19, in development now. Mlucas is a cpu application, which is what the Primenet API was designed for and implemented for on the server and in prime95 & mprime. I requested only CUDALucas progress reporting support, initially. But what client submission script? Some gpu applications lack any client submission script, so all work reservation and submission is manual. Manual progress updates will be done only as needed in that use case. CUDALucas, for example, and CUDAPm1, have no client submission scripts. (Also many earlier versions of gpuowl, but there are so many log styles in gpuowl versus version, and many different gpuowl versions are fastest depending on the exponent / fft length, so I think supporting all the older useful versions of gpuowl progress reporting by the manual submission page is too much to ask.) |
[QUOTE=kriesel;517312]I'm not concerned about Mlucas format, because the word is Madpoo and EWMayer were working together on implementation of the Primenet API in Mlucas v19, in development now. [/QUOTE]
And it currently uses a Python script adapted from mfloop.py, works well enough but of course doesn't handle progress reports. [QUOTE] But what client submission script? [/QUOTE] For CUDALucas / clLucas, there's llloop.py included in [URL="https://github.com/MarkRose/primetools"]primetools[/URL] - haven't tested it though, since I put about 99% of GPU effort into factoring now. But it shouldn't be that hard to adapt to other GPU LL / PRP applications. [QUOTE](Also many earlier versions of gpuowl, but there are so many log styles in gpuowl versus version, and many different gpuowl versions are fastest depending on the exponent / fft length, so I think supporting all the older useful versions of gpuowl progress reporting by the manual submission page is too much to ask.)[/QUOTE] Pretty much what my point was, choose some simple common format and then stick to it. So if the client output format changes, adapt the script, not the submission page. Like worktodo.txt - easy to parse and it's been around long enough that pretty much everything should support it directly. mfaktc seems to be pretty flexible with its output even now, but I see that as less of a problem anyway. I mean, how many people do GPU trial factoring work that takes so long, that expiration without progress reports could be a problem? |
[QUOTE=nomead;517325]
For CUDALucas / clLucas, there's llloop.py included in [URL="https://github.com/MarkRose/primetools"]primetools[/URL] - haven't tested it though, since I put about 99% of GPU effort into factoring now. But it shouldn't be that hard to adapt to other GPU LL / PRP applications.[/QUOTE]I contacted the author some time ago; essentially no one used lloop.py and it was abandoned years ago. And it does not do progress reports. [QUOTE]how many people do GPU trial factoring work that takes so long, that expiration without progress reports could be a problem?[/QUOTE]Agreed. And why I emphasized gpu primality test progress reporting, not P-1 or TF. Weeks-long TF or P-1 assignments are possible but not common. A possible benefit of CUDALucas or cllucas progress reporting, that is not very relevant to the other gpu applications, is early detection of pathological runs that are producing all 0x0 or 0x02 or 0xfffffff800000000 interim res64, with potential savings of days or weeks of wasted run time. Some users don't recognize those as an issue, but if they could be persuaded to do interim progress reports, the server might educate them! Mainly though, manual progress reporting is about preventing expiration of computations that are progressing, and reducing the chafing of other users who see no indicated progress at server reports and imagine wrongly that means no progress is occurring. |
Background
The purpose of this post is to describe a shared foundation. Posting claims counter to what's here may indicate someone is an unaware novice, or is a troll. (Once I'm done weeding out my errors, with some help from others, that is.) This necessarily covers some points and leaves out much other material; whole books have been written about individual aspects, such as factorization. It's probably a good idea to read [URL]https://www.mersenne.org/various/math.php[/URL] and then maybe return and resume here. [LIST=1][*]"Natural numbers" are the counting numbers 1, 2, 3, ... [URL]https://en.wikipedia.org/wiki/Natural_number[/URL][*]Expression of math operations here follow usual conventions and so are indicated as follows: addition +, subtraction -, division /, multiplication * or implied (as in the expression 2 k p = 2 * k * p), exponentiation of a to the b power a[SUP]b[/SUP], square root sqrt(), modulo a mod b, equal =, less than or equal <=.[*]I use here n, a, b, f or k for natural numbers that may be composite or prime, but p for primes. Occasionally a hopefully helpful subscript is attached.[*]Natural numbers are either prime (having no natural numbers greater than one but smaller than themselves as exact integer divisors), or composite with factor(s) greater than one and less than the composite number. Five is an example of a prime. Six is an example of a composite: 6 = 2 * 3.[*]Finding and verifying a single nontrivial factor (a natural number 1 < f < n) of a natural number n proves it is composite, not prime. By definition. [URL]https://en.wikipedia.org/wiki/Prime_number[/URL][*]One is not considered a prime, but a more special case.[*]Smallest possible factor candidates are more likely to be factors than larger candidates. Two divides half the natural numbers, 3 a third, 5 1/5, f 1/f. So it's more efficient to search for factors with the smallest potential factors first.[*]The largest possible smallest prime factor of a natural number is, the largest natural number no greater than the square root of the number. f[SUB]max[/SUB] <= sqrt(n)[*]To search for factors of a number, it's sufficient to check only the primes below the square root of the number. Composite numbers can be skipped as potential factors, since their prime factors will divide any number they would, and some the composites won't. For example, 4 = 2 * 2, 6 = 2 * 3, 2 divides 14, 3 divides 15, but 4 or 6 divide neither. Thirteen is prime and it is sufficient to trial factor it with 2 and 3, since sqrt(13) ~3.6055 and the next prime, 5, exceeds 13's square root.[*]Mersenne numbers are a small subset of the natural numbers, having the form of a simple function M(n) = 2[SUP]n[/SUP] - 1 where n is a natural number greater than 1.[*]Given a factor a of a number n = a b, the cofactor of a is b = n / a. (definition copied from [URL]http://mathworld.wolfram.com/Cofactor.html[/URL])[*]Any Mersenne number with a composite exponent n is composite; cannot be a prime. See [URL]https://www.mersenneforum.org/showpost.php?p=512813&postcount=4[/URL][*]Mersenne primes are a small subset of the set of Mersenne numbers with prime exponents.[*]Any factors of Mersenne numbers with prime exponents are of the form f = 2 k p + 1, where p is the prime exponent, and k is a natural number.[*]Numbers 2 k p + 1 can only be factors of Mersenne numbers with prime exponent if 2 k p + 1 = either 1 or 7, mod 8.[*]Multiple methods of computerized factoring are employed for Mersenne numbers. Which are worthwhile, and the extent or level of effort to which they are worthwhile, depend on the size of the exponent, the efficiency of algorithm and implementation, and characteristics of the hardware involved.[*]For some additional background, see [URL]https://www.mersenne.org/various/math.php[/URL] regarding trial factoring, P-1 factoring, and the Lucas Lehmer test, [URL]https://www.mersenneforum.org/showpost.php?p=514417&postcount=9[/URL] regarding probable prime testing and the Gerbicz error check, and [URL]https://magazine.odroid.com/article/prime-number-discovery-use-odroid-c2-make-mathematical-history/[/URL] for an excellent backgrounder including FFT based multiplication of very large numbers[*]Large amounts of effort have been invested in developing and optimizing factoring and primality testing code relevant to Mersenne numbers for Intel cpus, nonIntel cpus, and gpus from NVIDIA, AMD and also trial factoring on Intel integrated graphics processors. There is summary info about these at [URL]http://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/URL][*]Trial division is one factorization method. Much more about it is available at [URL]https://www.mersenneforum.org/showpost.php?p=508523&postcount=6[/URL] and [URL]https://www.mersenneforum.org/showpost.php?p=505634&postcount=18[/URL] and elsewhere[*]For exponents much smaller than the current GIMPS search area, ECM (elliptic curve method) is also useful. But not at the current Great Internet Mersenne Prime Search (GIMPS) first-test or double-check wavefront of activity.[*]P-1 factoring is another useful method, including at currently active and higher exponents. See also [URL="https://en.wikipedia.org/wiki/Pollard%27s_p_%E2%88%92_1_algorithm"]https://en.wikipedia.org/wiki/Pollard%27s_p_%E2%88%92_1_algorithm[/URL] and [URL]https://www.mersenneforum.org/showpost.php?p=501984&postcount=17[/URL][*]Eventually, factorization attempts produce diminishing returns, declining to the point where it is more efficient to run either a conclusive primality test (Lucas-Lehmer test), or a quite reliable probable-prime test (PRP3) that can prove a number composite but cannot prove it prime. See also [URL]https://www.mersenneforum.org/showpost.php?p=488897&postcount=12[/URL][*]In such lengthy computations, errors will sometimes occur. For error rates see [URL]https://www.mersenneforum.org/showpost.php?p=509283&postcount=19[/URL][*]The programs contain various error checks that are performed during the computation. These often allow retreating to the point before an error was detected and attempting that portion of the computation again, which can save the lengthy computation if the error is not reproducible.[*]Some applications report the final error counts with the result. Results with counts of detected errors that may be serious concerning result accuracy can be scheduled for early checking.[*]Primality tests are done at least twice, to catch the approximately 2 percent chance of incorrect result per test, with certain precautions taken to ensure results are arrived at independently. The final result of a test mod 2[SUP]64[/SUP] (called 64-bit residue or res64) is reported to the primenet server. In the case of mismatches for the same exponent and primality test type, additional runs are made until matches are obtained. A coordinated effort to follow up on mismatches is at [URL]https://www.mersenneforum.org/showthread.php?t=24148[/URL][*]Much more on error types and handling is available at [URL]https://www.mersenneforum.org/showthread.php?t=24136[/URL][/LIST]Some additional reading that may be of use at some point includes:
"A Friendly Introduction to Number Theory", Joseph H. Silverman, [URL]https://www.math.brown.edu/~jhs/frint.html[/URL] The Prime Pages [URL]https://primes.utm.edu/[/URL] Knuth's Algorithms, Donald Knuth Prime Numbers and Computer Methods for Factorization, Hans Riesel Number Theory Web [URL]http://www.numbertheory.org/ntw/[/URL] "Prime Numbers: A Computational Perspective", Crandall and Pomerance "Humble Pi", Matt Parker "The C Programming Language", Kernighan and Ritchie [URL]https://en.wikipedia.org/wiki/The_C_Programming_Language[/URL] (Thanks to Batalov, Dylan14, and LaurV for contributing to the accuracy and readability of this post.) |
[QUOTE=Batalov;521567]Sorry, no, not better. Now this post looks like a tumor. Volume keeps growing, information content stays constant.
To quote from Linda Richman, "A little basic background" is neither little nor basic. "Talk amongst yourselves. I am feeling varklempt" (...after reading just 1/3 of it). Sticking to the same question as before (and just that, for brevity): "current item #14" still says "For example, 2[SUP]6[/SUP] - 1 = (2[SUP]2[/SUP]-1)[SUP]2[/SUP] 2[SUP]3[/SUP]-1." Which neither explains the point, nor is (ahem) correct. (2[SUP]2[/SUP]-1)[SUP]2[/SUP] 2[SUP]3[/SUP]-1 = 71, not 63.[/QUOTE] I began writing this is an attempt to go from beginning definitions to a rudimentary basis for understanding or a bit more, with links forward, both for sincere newbies, and as a place to point nonsensical trolls and perhaps save time correcting or countering them. (Trolls can be initially amusing but quickly become tiresome.) I'm open to constructive suggestions for the title, organization, and content. Re [URL]https://www.mersenneforum.org/showpost.php?p=521532&postcount=2[/URL], I saw M6 != M(6) and previously missed the missing parentheses, since added. (In the usual notation, M6=2[SUP]17[/SUP]-1; for other readers, [URL]https://www.mersenne.org/primes/[/URL]) |
A few suggestions:
For 5. it should be that a single nontrivial factor of a natural number proves the number composite (meaning a number other than 1 or the number itself). For 20. link to the full list of programs (which is [url]https://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/url]). And finally, perhaps after 28., we should note that some tests need to be done 3 times (or more) due to mismatches, and that there is a dedicated effort to handle these ([url]https://www.mersenneforum.org/showthread.php?t=24148[/url]). |
[QUOTE=LaurV;521611]You do not need to write 2kp+1 with spaces. My joke related to the "operation of k" was to say that you need to be constant in notation, and do not use "x" for multiplication, but it was not my intention to say that writing 2kp with no spaces is wrong. :blush:[/QUOTE]
No worries. Your post led me to more generally reviewing conventions used, and state them, and getting rid of some ambiguities and inconsistencies along with making it I think more readable in spots. For example I had used f2 when meaning f[SUB]2[/SUB] not f times 2. Thanks also to Dylan14. |
Parentheses are important but that is only the first problem that comes up with that explanation.
[QUOTE=kriesel;521531]For example, 2[SUP]6[/SUP] - 1 = 2[SUP](2*3)[/SUP] - 1 = (2[SUP]2[/SUP]-1)[SUP]2[/SUP] (2[SUP]3[/SUP]-1)[/QUOTE] No. That's not algebra, that's like explaining that "this chemical reaction works because of... alchemistry!" "Abracadabra, 63 = 3 * 3 * 7 = {and because of that} = (2[SUP]2[/SUP]-1)[SUP]2[/SUP] (2[SUP]3[/SUP]-1)" No, the "3"s here are not (2[SUP]2[/SUP]-1)[SUP]2[/SUP]. they are (2+1)(2[SUP]2[/SUP]-2[SUP]1[/SUP]+1) Straightforward explanation: 2[SUP]6[/SUP] - 1 = (2[SUP]3[/SUP])[SUP]2[/SUP] - 1 = (2[SUP]3[/SUP]+1)(2[SUP]3[/SUP]-1), because this is a x[SUP]2[/SUP] - 1 and you learned this at school. Or 2[SUP]6[/SUP] - 1 = (2[SUP]2[/SUP])[SUP]3[/SUP] - 1 = (2[SUP]2[/SUP]-1)(2[SUP]4[/SUP]+2[SUP]2[/SUP]+1), because this is a x[SUP]3[/SUP] - 1 and you learned this at school or you learned this now. Take a piece of paper, open the parentheses, and observe how everything cancels. That is sufficient. If you want to get fancy, you can continue and say, "but there is more, (2[SUP]3[/SUP]+1) = (2+1)(2[SUP]2[/SUP]-2[SUP]1[/SUP]+1) and [B]that [/B]equals 3 *3 " [QUOTE=kriesel;521531]...and 2[SUP]15[/SUP]-1 = 2[SUP](3*5)[/SUP] - 1 = 32767 = 7 * 31 * 151 = (2[SUP]3[/SUP] - 1) (2[SUP]5[/SUP] - 1) 151.[/QUOTE] Same thing. Because there is no explanation what 151 is, there is no coherent story just "this is so, because I said so". Instead: 2[SUP]15[/SUP] - 1 = (2[SUP]5[/SUP])[SUP]3[/SUP] - 1 = (2[SUP]5[/SUP]-1)(2[SUP]10[/SUP]+2[SUP]5[/SUP]+1) Or 2[SUP]15[/SUP] - 1 = (2[SUP]3[/SUP])[SUP]5[/SUP] - 1 = (2[SUP]3[/SUP]-1)(2[SUP]12[/SUP]+2[SUP]9[/SUP]+2[SUP]6[/SUP]+2[SUP]3[/SUP]+1) What's more for every 2[SUP](r*s)[/SUP] - 1 (with s>1), have a look at how the above factorization repeats every time: [url]https://primes.utm.edu/notes/proofs/Theorem2.html[/url] And finally, there is a much better way. Just say - "you will benefit from reading this ( [url]https://www.mersenne.org/various/math.php[/url] ). Ask questions if you will have them." P95's webpage is coherent, brief and convincing, and has links to more detailed explanations. Yours leaves the reader more confused that they were when they started reading it. |
Composite exponents
As is well known,
(x[sup]n[/sup] - 1) /(x - 1) = x[sup]n-1[/sup] + x[sup]n-2[/sup] + ... + 1. Now let a > 1 and b > 1 be integers. We apply the above formula to x[sup]ab[/sup] - 1. If we substitute x[sup]a[/sup] for x and b for n, we obtain (x[sup]ab[/sup] - 1)/(x[sup]a[/sup] - 1) = x[sup]ab - a[/sup] + x[sup]ab-2a[/sup] + ... + 1. Similarly, substituting x^b for x and a for n, (x[sup]ab[/sup] - 1)/(x[sup]b[/sup] - 1) = x[sup]ab - b[/sup] + x[sup]ab-2b[/sup] + ... + 1. The [i]ne plus ultra[/i] of this is the "cyclotomic factorization" corresponding to the factorization into cyclotomic polynomials [tex]x^{n}\; -\; 1\; =\; \prod_{d\mid n}\Phi_{d}(x)\text{.}[/tex] Examples of its application may be found in this Forum, e.g. [url=https://www.mersenneforum.org/showpost.php?p=486009&postcount=13]here[/url] For more on the application of cyclotomic polynomials to integer factorization I suggest [url=https://www.maths.lancs.ac.uk/~jameson/cyp.pdf]this paper[/url]. |
From new participant reference:
[QUOTE=kriesel;521664] Mersenneforum.org covers many different ways of aiding the GIMPS progress, through trial factoring, P-1 factoring, Lucas-Lehmer testing, probable-prime testing, double checking, etc. of Mersenne numbers, 2p-1.[/QUOTE] Needs a superscript: 2[sup]p[/sup]-1 [code]2[b][[/b]sup[b]][/b]p[b][[/b]/sup[b]][/b]-1[/code] |
[QUOTE=hansl;521669]From new participant reference:
Needs a superscript: 2[sup]p[/sup]-1 [code]2[B][[/B]sup[B]][/B]p[B][[/B]/sup[B]][/B]-1[/code][/QUOTE] Oops. Done. Say hi to gretl. [url]https://www.youtube.com/watch?v=9246msCh7x4[/url] |
[QUOTE=nomead;517219]About the manual assignment progress reports. Different programs have wildly different progress logs. Example from Mlucas:
[C][May 20 03:38:41] M50346019 Iter# = 24280000 [48.23% complete] clocks = 00:26:42.587 [160.2587 msec/iter] Res64: 54912F2731A45A41. AvgMaxErr = 0.042192089. MaxErr = 0.062500000. Residue shift count = 12144693. [/C] But minimum common requirements should be current iteration count and Res64/shift so that interim residues could be stored, if needed. I propose that instead of having the manual results submission form decipher many different log formats, there should be a simple common format, and let the client submission script do the conversion to that.[/QUOTE] What I was looking for was manual progress reports that are submitted manually, by the user, not by some script. Motivations are to provide the server with interim residues from manual assignments, keep a manual assignment from being expired, and possibly detect error conditions earlier. Perhaps a useful but easy form to enter would be: <appname> <version> <console output line> Only few <appname> <version> combinations would be supported. Candidates are CUDALucas v2.06 GpuOwL v6.5 and maybe v5.0 v3.8 These could be case-insensitve. So an example would be cudalucas 2.06 followed by an entire console output line copy/pasted. Like this: cudalucas 2.06 | May 30 22:04:09 | M89000671 10000000 0xf2af4c9456abf0de | 5040K 0.09766 39.4100 197.05s | 36:01:03:19 11.23% | Then the server can easily parse any such interim progress submission as <appnamestring><spacedelimiter><appversionstring><spacedelimiter><consolelinestring>, parse the consolelinestring in that app/version's way, and get the exponent, iteration number, and interim residue, or if the appname version combination at the front is not a supported one, respond "Application and version combination submitted is not supported by the manual results page." If the error is because the user typed cudlucs instead of cudalucas, or 2.60 instead of 2.06, they can try again. |
[QUOTE=kriesel;522098][B]Mlucas[/B]:
[Test|DoubleCheck]=[<AID>,]<exponent>,<how_far_factored>,<p-1_done> DoubleCheck=B83D23BF447184F586470457AD1E03AF,22831811,66,1 Test=DDD21F2A0B252E499A9F9020E02FE232,48295213,69,0 Test=332220523,80,1[/QUOTE] Mlucas can also handle just a prime exponent on a line by itself, which triggers an LL test of the corresponding Mersenne number. But if one is using the primenet.py script for assignment/results management, its "do I need to fetch new work?" calculation only uses official Primenet-server-style assignments, with the 32-digit hex code, like the first 3 lines in the above sample. The next release, v19, will add PRP support. |
From point 5 of post 18 of the Prime GPU Computing reference material thread, you mentioned that mersenne.ca has exponent status up to 10^10. This is true, and it also has a list of exponents beyond that limit which have factors [URL="https://www.mersenne.ca/export/"]available to be downloaded (see the bottom of the page).[/URL]
|
(grrr, man, my OCD gets ticked when I see #value and #div0 in those tables!)
|
Thanks for doing this!
Feel free to delete this post (as it isn't reference material), but I just wanted to say thanks for doing this.
The curation of this relatively complex subject domain is important and beneficial! :tu: (Apparently referring to the Google Colab blog thread I've started in this blog; moved here from that thread; Kriesel) |
How much faster is 5700 XT vs RX480? How long it takes to PRP on the 90M exponent range?
|
I have built mlucas on the Colab, using the reverse tunnel code. See [URL="https://mersenneforum.org/showpost.php?p=528099&postcount=40"]this[/URL] and subsequent replies for debug stuff from me and Ernst.
And for post 9 of the Colab reference, another way to work with a worktodo.txt would be to directly connect to the server via SSH using the method described [URL="https://mersenneforum.org/showthread.php?t=24840"]here[/URL] and then use your favorite text editor to modify/create the file (which may require using apt-get to get certain ones, like emacs). |
A suggestion for the GIMPS progress thread: some people like to factor numbers, either partially or completely. For example, petrw1 has his [URL="https://www.mersenneforum.org/showthread.php?t=22476"]thread on his efforts to get under 20M unfactored exponents below the PrimeNet boundary of 1G[/URL] and GP2 routinely posts [URL="https://www.mersenneforum.org/showthread.php?t=19407"]exponents which are fully factored.[/URL] Perhaps these are things to track?
|
[QUOTE=Dylan14;530236]Perhaps these are things to track?[/QUOTE]Thanks for your input. They occur to me as having little or nothing to do with finding new mersenne primes, the goal of GIMPS.
|
"Top of" navigation links addition
To hopefully aid navigation in this large and growing collection of reference posts, I've begun adding links for
[B]Top of thread[/B] where I think it makes sense (page 2 or higher of multipage threads) and [B]Top of reference tree[/B] to reference posts, beginning with the gpu computing reference thread, which got [QUOTE]Top of thread: [URL]https://www.mersenneforum.org/showpost.php?p=488289&postcount=1[/URL] Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL][/QUOTE] on pages 2 and 3, and [QUOTE]Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL][/QUOTE] on page 1 Does this seem useful? Less clear than some other naming? Suggestions? |
[QUOTE=kriesel;532127]On a dual E5-2697-V2 Xeon system, I made a brief test run of 0.2% of [CODE]Factor5 39845887 73 74 16[/CODE]It indicated completion in 145. hours occupying 16 cores of the 24-core system.[/QUOTE]Try testing performance on 332192831 and 3321928171
|
This appears to be an excellent compilation. I hope it will [U]always[/U] be easy to find. Many things here get buried in the mix rapidly.
|
[QUOTE=storm5510;533055]This appears to be an excellent compilation. I hope it will [U]always[/U] be easy to find. Many things here get buried in the mix rapidly.[/QUOTE]
Thanks. I've scattered bread crumbs (pointers) here and there in the forum. I find it useful to link back to these posts which saves me typing when responding, and may lead people to more. A web browser bookmark to the sub-site map page works too. It grew so big a post or two at a time that I created that for my own use in quickly finding what I knew I had already written. Sub site map to the reference info, and also discussion threads like this one: [URL]https://www.mersenneforum.org/showthread.php?t=24607[/URL] |
While I understand the forum is a good place to discuss such things, much more comfortable than the discussion pages of a wiki, a forum is not a good place to store reference material (except perhaps in closed threads) : it gets buried in the discussion about its elaboration and adjustments.
Jacob |
(re: Glossary)[QUOTE=kriesel;533167]what else?[/QUOTE]
As usual, useful work. Thanks! A suggestion/addition: rather than "squat", nomenclature which has been used previously to describe this is "working `off-the-books`". Edit: Oh, and also perhaps, add "Wave-back", AKA "Stragglers". |
(re: Glossary)
As Chris said, "Useful". Roadblock/holdout/holdup Exponent Bounds 100M I would suggest that you look at the wiki for other ideas. Also, miletsones acknowledge DC milestones, proving a prime is the Xth MP, and all exponents tested once up to a given MP |
[QUOTE=S485122;533157]While I understand the forum is a good place to discuss such things, much more comfortable than the discussion pages of a wiki, a forum is not a good place to store reference material (except perhaps in closed threads) : it gets buried in the discussion about its elaboration and adjustments.
Jacob[/QUOTE]This is precisely why I have collected the materials into a blog space that xyzzy offered me. The dilution by discussion posts here has so far been minimal and manageable. Discussion threads here are only ~2% of the total thread count. Posters are warned elsewhere that discussion posts in reference threads may be moved or removed (deleted) without warning or recourse. In the event that I become unwilling or unable to maintain it at some point, Xyzzy can close all the reference threads. I suppose as moderator of this tree I could close and open them when I edit, but so far it has worked rather well to leave them open. Anyone with the knowledge and desire to create a GIMPS reference wiki is free to create links to these reference posts. I plan to not do so. The location of the GIMPS reference posts site map is unlikely to change; bookmark it. [URL]https://www.mersenneforum.org/showthread.php?t=24607[/URL] It is now book size, with over 200 reference posts. |
[QUOTE=kriesel;533285]There is a reference thread from 2003 by PrimeMonster at [URL]https://www.mersenneforum.org/showthread.php?t=1534[/URL][/QUOTE]
If you want to overhaul any post in that thread (except the links to M40), a super-mod can take your revised version and sub it in for PrimeMonster's. |
[QUOTE=Uncwilly;533296]If you want to overhaul any post in that thread (except the links to M40), a super-mod can take your revised version and sub it in for PrimeMonster's.[/QUOTE]
Thanks for the invitation. My first reaction is, not at this time. Maybe someday. Meanwhile, I suggest if PrimeMonster is no longer active, appending a pointer link there, over to my reference info here which is being actively maintained and extended, specifically, to [url]https://www.mersenneforum.org/showthread.php?p=521922#post521922[/url] |
(in reference to [URL]https://www.mersenneforum.org/showpost.php?p=534472&postcount=5[/URL])
So why not using my [URL="https://www.rieselprime.de/ziki/Main_Page"]Prime Wiki[/URL], the reanimated MersenneWiki with those old data and pages around GIMPS. Daily backups (holding for two weeks) for the database are set. Other advantages: - history of any page edit - create your own pages naming like you want - linking to other pages or externals are easy done - many programs are described in the Wiki but need some updates in information - many pages for Riesel, Proth, Williams, Carol/Kynea, Leyland primes are currently available and should become more and more holding the history of search spreaded all over the internet (e.g. see [URL="https://www.rieselprime.de/ziki/Carol-Kynea_prime_2"]Carol/Kynea base 2[/URL]) - Math rendering available - upload/display of PDFs is available A clear categorizing should be done first for your pages although most of your topics are in the Wiki. |
Admittedly a rookie here..ADVICE please
[QUOTE=kriesel;527930]Tested draft for setup of gpuowl (will create gpuowl folder, git clone and build from latest committed gpuowl source)
[/QUOTE] I ran the first part and got several errors....maybe that is as expected: [CODE]/content/drive/My Drive /content/drive/My Drive/gpuowl fatal: destination path 'gpuowl' already exists and is not an empty directory. /content/drive/My Drive/gpuowl/gpuowl Reading package lists... Done Building dependency tree Reading state information... Done libgmp-dev is already the newest version (2:6.1.2+dfsg-2). 0 upgraded, 0 newly installed, 0 to remove and 25 not upgraded. update-alternatives: error: no alternatives for gcc update-alternatives: error: no alternatives for g++ Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: cpp-8 libasan5 libgcc-8-dev libstdc++-8-dev libubsan1 Suggested packages: gcc-8-locales g++-8-multilib gcc-8-doc libstdc++6-8-dbg gcc-8-multilib libgcc1-dbg libgomp1-dbg libitm1-dbg libatomic1-dbg libasan5-dbg liblsan0-dbg libtsan0-dbg libubsan1-dbg libmpx2-dbg libquadmath0-dbg libstdc++-8-doc The following NEW packages will be installed: cpp-8 g++-8 gcc-8 libasan5 libgcc-8-dev libstdc++-8-dev libubsan1 0 upgraded, 7 newly installed, 0 to remove and 25 not upgraded. Need to get 33.1 MB of archives. After this operation, 117 MB of additional disk space will be used. Ign:1 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 cpp-8 amd64 8.3.0-6ubuntu1~18.04.1 Err:1 http://security.ubuntu.com/ubuntu bionic-updates/universe amd64 cpp-8 amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Ign:2 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libasan5 amd64 8.3.0-6ubuntu1~18.04.1 Ign:3 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libubsan1 amd64 8.3.0-6ubuntu1~18.04.1 Ign:4 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libgcc-8-dev amd64 8.3.0-6ubuntu1~18.04.1 Ign:5 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 gcc-8 amd64 8.3.0-6ubuntu1~18.04.1 Ign:6 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libstdc++-8-dev amd64 8.3.0-6ubuntu1~18.04.1 Ign:7 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 g++-8 amd64 8.3.0-6ubuntu1~18.04.1 Err:2 http://security.ubuntu.com/ubuntu bionic-updates/main amd64 libasan5 amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Err:3 http://security.ubuntu.com/ubuntu bionic-updates/main amd64 libubsan1 amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Err:4 http://security.ubuntu.com/ubuntu bionic-updates/main amd64 libgcc-8-dev amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Err:5 http://security.ubuntu.com/ubuntu bionic-updates/universe amd64 gcc-8 amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Err:6 http://security.ubuntu.com/ubuntu bionic-updates/universe amd64 libstdc++-8-dev amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] Err:7 http://security.ubuntu.com/ubuntu bionic-updates/universe amd64 g++-8 amd64 8.3.0-6ubuntu1~18.04.1 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/universe/g/gcc-8/cpp-8_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/gcc-8/libasan5_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/gcc-8/libubsan1_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/gcc-8/libgcc-8-dev_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/universe/g/gcc-8/gcc-8_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/universe/g/gcc-8/libstdc++-8-dev_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/universe/g/gcc-8/g++-8_8.3.0-6ubuntu1~18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80] E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing? update-alternatives: error: alternative path /usr/bin/gcc-8 doesn't exist update-alternatives: error: alternative path /usr/bin/g++-8 doesn't exist update-alternatives: error: no alternatives for gcc update-alternatives: error: no alternatives for g++ g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. echo \"`git describe --long --dirty --always`\" > version.new diff -q -N version.new version.inc >/dev/null || mv version.new version.inc echo Version: `cat version.inc` Version: "v6.11-180-g6e722e2-dirty" g++ -MT Worktodo.o -MMD -MP -MF .d/Worktodo.Td -Wall -O2 -std=c++17 -c -o Worktodo.o Worktodo.cpp In file included from Worktodo.cpp:6:0: File.h:10:10: fatal error: filesystem: No such file or directory #include <filesystem> ^~~~~~~~~~~~ compilation terminated. Makefile:30: recipe for target 'Worktodo.o' failed make: *** [Worktodo.o] Error 1 create config.txt, worktodo.txt before continuing[/CODE] I created worktodo.txt..... However I do NOT know what goes into config.txt ... Thanks THEN when I ran this: [QUOTE]from google.colab import drive drive.mount('/content/drive') !chmod 777 '/content/drive/My Drive/gpuowl' !cd '/content/drive/My Drive/gpuowl' && LD_LIBRARY_PATH="lib:${LD_LIBRARY_PATH}" && chmod 777 gpuowl && chmod 777 worktodo.txt && ./gpuowl -use ORIG_X2 -block 200 -log 120000 -maxAlloc 10240 -user petrw1 -cpu colab/K80[/QUOTE] [CODE]Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True). /bin/bash: ./gpuowl: Is a directory[/CODE] Please and thanks Wayne |
In reference to building and running gpuowl on Google Colab:[QUOTE=petrw1;539559]I ran the first part and got several errors....maybe that is as expected:
I created worktodo.txt..... However I do NOT know what goes into config.txt ... Thanks THEN when I ran this: [CODE]Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True). /bin/bash: ./gpuowl: Is a directory[/CODE][/QUOTE]I haven't built it in quite a while, so don't know what the current build behavior or quirks are, but that stream you posted seemed unusual to me. The IP# problems for example. The build failed, with a fatal compile error, so there is no gpuowl executable as a result, only the directory for it. For config.txt, options you put on the command line can go in there. In your example, [CODE]-use ORIG_X2 -block 200 -log 120000 -maxAlloc 10240 -user petrw1 -cpu colab/K80[/CODE] If you're not attached to doing the compile of the latest commit, you could try running a build posted by Fan Ming. |
In the script "conditional gpu runs full" in the Colab reference thread, I'd replace the following lines:
[CODE]gpu_info = !nvidia-smi # and one for the script to look at gpu_info = '\n'.join(gpu_info)[/CODE]to these, which is based on my mlucas compile script: [CODE]!nvidia-smi > gpu_info.txt # and one for the script to look at count_gpu = !grep -c failed gpu_info.txt # this counts the number of instances failed has appeared, should be 0 if we want to run GPU code if count_gpu != 0: #code continues as before[/CODE]For the parts where you look for specific GPU models, we could run the following [code]count_T4 = !grep -c T4 gpu_info.txt count_P100 = !grep -c P100 gpu_info.txt count_P4 = !grep -c P4 gpu_info.txt count_K80 = !grep -c K80 gpu_info.txt # then branch off into the different gpu codes using if/else, like if count_T4 != 0, run T4 code [/code] Not sure if it's more efficient, but it's a good exercise in using grep and its options. |
I'd be interested in knowing of any gpuowl version/ igp model combination that works together. Gpuowl requires some aspects of OpenCL2, atomics as I recall, and DP capability.
Some igps have those available, some don't. UHD630 does. But I don't know of a build that works there. And igps are slow, even for TF, so it's not a priority. [CODE]gpuowl-v6.11-318>gpuowl-win 2020-06-12 12:16:38 gpuowl v6.11-318-g3109989 2020-06-12 12:16:38 config: -cpu peregine/uhd630 -user kriesel -device 0 -log 10000 -safeMath -use NO_ASM 2020-06-12 12:16:38 device 0, unique id '' 2020-06-12 12:16:38 peregine/uhd630 worktodo.txt line ignored: ";PRP=N/A,1,2,86243,-1,60,0" 2020-06-12 12:16:38 peregine/uhd630 worktodo.txt line ignored: ";PRP=0,1,2,110503,-1,60,0" 2020-06-12 12:16:38 peregine/uhd630 worktodo.txt line ignored: ";PRP=0,1,2,132049,-1,60,0" 2020-06-12 12:16:38 peregine/uhd630 216091 FFT: 128K 256:1:256 (1.65 bpw) 2020-06-12 12:16:38 peregine/uhd630 Expected maximum carry32: 00000 2020-06-12 12:16:38 peregine/uhd630 using long carry kernels 2020-06-12 12:16:38 peregine/uhd630 OpenCL args "-DEXP=216091u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=1u -DPM1=0 -DWEIGHT_STEP=0xa.34c173fbe77ep-3 -DIWEIGHT_STEP=0xc.8aa2df2079a98p-4 -DNO_ASM=1 -cl-std=CL2.0 -cl-finite-math-only " 2020-06-12 12:16:43 peregine/uhd630 ASM compilation failed, retrying compilation using NO_ASM 2020-06-12 12:16:47 peregine/uhd630 OpenCL compilation error -11 (args -DEXP=216091u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=1u -DPM1=0 -DWEIGHT_STEP=0xa.34c173fbe77ep-3 -DIWEIGHT_STEP=0xc.8aa2df2079a98p-4 -DNO_ASM=1 -cl-std=CL2.0 -cl-finite-math-only -DNO_ASM=1) 2020-06-12 12:16:47 peregine/uhd630 1:58:26: warning: unsupported OpenCL extension 'cl_khr_int64_base_atomics' - ignoring #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable ^ 1:59:26: warning: unsupported OpenCL extension 'cl_khr_int64_extended_atomics' - ignoring #pragma OPENCL EXTENSION cl_khr_int64_extended_atomics : enable ^ error: undefined reference to `_Z8atom_addPU3AS1Vmm()' error: backend compiler failed build. 2:58:26: warning: unsupported OpenCL extension 'cl_khr_int64_base_atomics' - ignoring #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable ^ 2:59:26: warning: unsupported OpenCL extension 'cl_khr_int64_extended_atomics' - ignoring #pragma OPENCL EXTENSION cl_khr_int64_extended_atomics : enable ^ error: undefined reference to `_Z8atom_addPU3AS1Vmm()' error: backend compiler failed build. 2020-06-12 12:16:47 peregine/uhd630 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:246 build 2020-06-12 12:16:47 peregine/uhd630 Bye [/CODE]It came closer in V6.5-61:[CODE]2019-05-30 13:17:51 Note: no config.txt file found 2019-05-30 13:17:51 config: -device 0 2019-05-30 13:17:51 85469147 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 18.11 bits/word 2019-05-30 13:17:51 using short carry kernels 2019-05-30 13:18:42 OpenCL compilation in 50608 ms, with "-DEXP=85469147u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0" 2019-05-30 13:18:44 85469147.owl not found, starting from the beginning. 2019-05-30 13:25:50 85469147 EE 2000 0.00%; 95.53 ms/sq; ETA 94d 11:54; 91e7259a0ae0534b (check 96.17s) 2019-05-30 13:25:50 85469147.owl not found, starting from the beginning. 2019-05-30 13:32:39 85469147 EE 2000 0.00%; 156.09 ms/sq; ETA 154d 09:38; 91e7259a0ae0534b (check 96.44s) 2019-05-30 13:32:39 85469147.owl not found, starting from the beginning. 2019-05-30 13:41:12 Note: no config.txt file found 2019-05-30 13:41:12 config: -device 2 2019-05-30 13:41:12 85469147 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 18.11 bits/word 2019-05-30 13:41:12 using short carry kernels[/CODE] |
iGPU's share RAM with the CPU so unless the CPU can be kept busy largely from cache I don't think there's much point even if it's possible? Regardless I'd be interested in how the iGPU in the Renoir mobile APU's does if it's able at all (it might be, [URL="https://github.com/RadeonOpenCompute/ROCm#supported-gpus"]ROCm[/URL] indicates the earlier APU's have limited unofficial ROCm OpenCL functionality). By rights those APU's should end up quite popular and the iGPU is much more performant than intel's UHD630, although "much more performant" is admittedly still a blip compared to the big boys.
|
[QUOTE=M344587487;547824]iGPU's share RAM with the CPU so unless the CPU can be kept busy largely from cache I don't think there's much point even if it's possible? Regardless I'd be interested in how the iGPU in the Renoir mobile APU's does if it's able at all (it might be, [URL="https://github.com/RadeonOpenCompute/ROCm#supported-gpus"]ROCm[/URL] indicates the earlier APU's have limited unofficial ROCm OpenCL functionality). By rights those APU's should end up quite popular and the iGPU is much more performant than intel's UHD630, although "much more performant" is admittedly still a blip compared to the big boys.[/QUOTE]
Right. I've seen mfakto on an Intel igp reduce prime95 throughput on the cpu by half. And TF is not as memory hungry. If I had put more effort into making the igp/gpu part of the poll choices, it would probably have split off into a separate poll question. A little more digging revealed this list. [LIST][*]AMD igpu ([URL]https://www.cpu-monkey.com/en/amd_igpus[/URL])[*]AMD discrete gpu such as RX550 to Radeon VII[*]NVIDIA discrete gpu GTX10xx, RTX20XX etc[*]Intel igp HDxxx, HD4xxx or UHDxxx etc[*]Smartphone gpu (see [URL]https://deviceatlas.com/blog/most-used-smartphone-gpu[/URL]) Many makers[/LIST]Supposedly there will eventually be Intel discrete gpus ([URL]https://www.tomshardware.com/news/intel-xe-graphics-all-we-know[/URL]) Huawei is a possible future PC gpu entrant. |
1 Attachment(s)
I can confirm there is no HWinfo output for the serial number of a Intel GPU. However, using a different utility (GPU-Z) one can get the Device ID. See the attached photograph which shows the output on a laptop with a i7-8750H, GTX 1050 ti and UHD 630.
The downside with this method is that GPU-Z only works in Windows (XP+). |
[QUOTE=kriesel;559131][B]History and evolution of Mersenne prime software[/B]
Antiquity Precomputer[/QUOTE] Say, search, or something else instead of software. What software are we talking about in the pre-computers' era or in antiquity? :razz: A good start could be found in Mr. Caldwell's pages (he talks about antiquity, math, and how LL tests were first implemented long ago by Noll and co). |
1 Attachment(s)
Attached is a spreadsheet with my full testing data (prime95, mfakto, and gpuowl) for the A8-9600. I though about posting it in the GPU Computing forum but there did not seem to be a good spot for it and I did not feel like starting a new thread for an old CPU. I saw that you are keeping a reference thread for integrated graphic so I thought you might be interested in the additional details.
Interesting information:[LIST][*]The iGPU is faster running gpuowl than the CPU is running prime95.[*]With both Prime95 and gpuowl running, gpuowl errored three times in the same spot and ended.[*]the iGPU seems to be favored over the CPU. Prime95 CPU utilization dropped to 45% or less when either gpuowl or mfakto were also running.[*]The combined output of prime95 and gpuowl together is almost the same as gpuowl alone but with gpuowl failing it does not matter.[*]Prime95 does not seem to slow down mfakto at all.[*]The most efficient use of this CPU is to only use the iGPU for either PRP or TF.[/LIST]Feel free to use this information or delete it. |
[QUOTE=LaurV;575289]To add to it, if you want to show off to your friends, you buy the best swiss knife, but if you want to get any job done, buy a toolbox (hammer, screwdriver, pliers, etc). Each tool adequate to its purpose is more productive.[/QUOTE][url]https://forum.multitool.org/index.php?topic=5649.0[/url]
|
The worst passenger ship disaster in the US, the burning of the [I]General Slocum[/I] on June 15, 1904, killed over a thousand people. The ship had been inspected and found "safe" a month earlier. Contributing to the loss of life were inferior fire hoses which burst, lifeboats which were rusted in place and could not be launched, and life jackets which were stuffed with cork dust rather than solid cork. They soaked up water like sponges, and the rotted fabric split. As a result, many of the fatalities were people who donned life jackets, jumped into the water, were pulled under by the water-laden life jackets, and drowned. Many of them had large amounts of cork dust in their hair when they were pulled from the water. The ship company's records had been falsified to show it had purchased new life jackets.
The ship's owner and the inspector were acquitted at trial, while the ship's captain was sentenced to ten years in prison. He was pardoned in 1911 by President Taft. edit kriesel: [URL]https://en.wikipedia.org/wiki/PS_General_Slocum[/URL] is an interesting read. Notably, its legal capacity was 2500 passengers, and one of its prior accidents occurred with 4700 passengers aboard. The event Dr. Sardonicus wrote about was not however the worst passenger ship disaster on US waterways. That would be the [URL="https://www.asme.org/topics-resources/content/the-greatest-maritime-disaster-in-u-s-history"]Sultana[/URL], in 1865, returning Civil War POWs, with a badly repaired boiler and carrying far more than its rated capacity, killing an estimated 1500-1900 on a ship rated for 376. |
[QUOTE=kriesel;582966] Since the cost of P-1 B1 is very small in Gpuowl V7.x, assuming a full PRP will be completed if no factor is found by either stage of P-1, there is good reason for the B1 selection to differ considerably for Gpuowl V7.x, using much larger B1. The basis of the much larger B2 is less clear.[/QUOTE]
v7 also has faster stage 2 (similar to 30.4), so a doubling of B2 would be understandable. But 5x? Not sure about that. EDIT:- Since v7 integrates P-1 & PRP, it is probably optimal to do 4x B1 and ~2x B2 of the mersenne.ca bounds, so maybe something like (2.5M, 50M) |
[QUOTE=axn;582975]v7 also has faster stage 2[/QUOTE]That's not what I measured. See [URL]https://www.mersenneforum.org/showpost.php?p=574629&postcount=29[/URL]. For the same fft length / similar exponent, bounds, OS, system, GPU, etc., V7.2-53 stage 2 took significantly longer.
Thanks for using the discussion thread as intended BTW. (That is surprisingly rare among some other forumites.) |
[QUOTE=SELROC;489062]I'd say that you have to upgrade the bios as well.[/QUOTE]Generally I would agree, bit I am aware of a possible reason why it might not be a good idea to update the BIOS for some people.
Some CPUs on eBay are either engineering samples of mainstream CPUs, or non-standard CPUs that were supplied to a specific customer - Amazon in particular often have non-standard CPUs. Apparently it is possible for a BIOS update to render a system unusable with a CPU that was previously working fine. So if one uses weird CPUs, and they are working, it might be better to not update the BIOS. |
kriesel,
A comment on your post about worktodo.txt [URL]https://www.mersenneforum.org/showpost.php?p=522098&postcount=22[/URL] [B]"tests_saved = integer number of future primality tests saved if a factor is found, usually 2 for a first test candidate, 1 for a double-check candidate, 0 if a sufficient P-1 has already been completed, or optionally up to 9 for aggressive P-1 factoring"[/B] Although I have not verified it, from things I've seen written, this can be a [B]real [/B]number, so is not restricted to integer values as stated. Someone wrote that mprime only takes two digits after the decimal point. |
[QUOTE=drkirkby;582984]this can be a [B]real [/B]number, so is not restricted to integer values as stated. Someone wrote that mprime only takes two digits after the decimal point.[/QUOTE]Strictly speaking, mprime / prime95 would be supporting rational numbers x/10[SUP]n[/SUP]. I've only ever seen single-digit integer values issued by PrimeNet or manual assignment page. I've run CUDAPm1 with several values.
|
[QUOTE=kriesel;582981]That's not what I measured. See [URL]https://www.mersenneforum.org/showpost.php?p=574629&postcount=29[/URL]. For the same exponent, bounds, OS, system, GPU, etc., V7.2-53 stage 2 took significantly longer.[/QUOTE]
Interesting. Do you have the screen output / logs from the v7 run (specifically with the maxAlloc enforced)? I am interested in seeing all the internal parameters it chose. |
[QUOTE=kriesel;582985]Strictly speaking, mprime / prime95 would be supporting rational numbers x/10[SUP]n[/SUP]. I've only ever seen single-digit integer values issued by PrimeNet or manual assignment page.[/QUOTE]Currently PrimeNet has been issuing only 1 and 2. But if the user sets the value to, say, 1.3 Prime95 will gladly calculate bounds based upon that value. I have taken to using values 1<x<1.4 when doing P-1 to free up exponents. I like finding factors.
|
[QUOTE=kriesel;582985]Strictly speaking, mprime / prime95 would be supporting rational numbers x/10[SUP]n[/SUP]. I've only ever seen single-digit integer values issued by PrimeNet or manual assignment page. I've run CUDAPm1 with several values.[/QUOTE]I have only seen integers issued by the server. I convert the 2's to 1's. (Up until quite recently, I normally found that the P-1 factoring had been done by someone else on category 0 and 1 assignments, so [B]tests_saved[/B] was 0. However, now I'm finding that's no longer the case, and it's usually necessary to do the P-1 factoring).
However, if one wants to use non-integer values for [B]tests_saved[/B], as Uncwilly chooses to use 1.4, [B]tests_saved [/B]must be written as a decimal, and not as a fraction m/n. As a test to see if fractions were interpreted as fractions, I put these three lines in worktodo.add, then run mprime -c. [CODE] PRP=1,2,504575707,-1,76,14/10 PRP=1,2,504575749,-1,76,7/5 PRP=1,2,504575831,-1,76,1.4[/CODE]After that, worktodo.txt contained the following lines. [CODE] PRP=CF4DE4E86F51E9B8ADD3DC07311E828F,1,2,504575707,-1,76,10 PRP=5DDE68758BD37D820D703388F5153630,1,2,504575749,-1,76,7 PRP=F7DD04FC451DCF6B7A9A2A38FDC828C1,1,2,504575831,-1,76,1.4[/CODE][B](All exponents have been unassigned, so the AIDs are no longer valid). [/B] Anyway, I thought your entry about worktodo.txt could do with a slight tweak to reflect the fact that the numbers do not have to be integers. |
[QUOTE=drkirkby;582998]if one wants to use non-integer values for [B]tests_saved[/B], as Uncwilly chooses to use 1.4, [B]tests_saved [/B]must be written as a decimal, and not as a fraction m/n.
Anyway, I thought your entry about worktodo.txt could do with a slight tweak to reflect the fact that the numbers do not have to be integers.[/QUOTE] My point regarding rationals not the larger set called reals is there is no way to enter irrational or transcendental numbers as input. I did not mean that ratios or any algebraic or function forms were accepted input. The source code shows %f format in sscanf. Any tests_saved value of finite precision put on a worktodo entry, entered by %f, or its approximate internal representation in binary, is a [URL="https://en.wikipedia.org/wiki/Rational_number"]rational number[/URL]. Some applications further restrict tests_saved to natural numbers or a small subset. [URL]https://www.mersenneforum.org/showpost.php?p=522098&postcount=22[/URL] has been modified repeatedly regarding tests_saved and with misc. edits. |
[QUOTE=axn;582986]Interesting. Do you have the screen output / logs from the v7 run (specifically with the maxAlloc enforced)?
I am interested in seeing all the internal parameters it chose.[/QUOTE]Log excerpts are added to the post linked earlier. |
[QUOTE=kriesel;583024]Log exerpts are added to the post linked earlier.[/QUOTE]
I was looking for the stage 2 parameters used by v7. It's ok if you don't have it handy. Is preda aware of this regression? |
[QUOTE=axn;583036]I was looking for the stage 2 parameters used by v7. It's ok if you don't have it handy. Is preda aware of this regression?[/QUOTE]Added to [URL]https://www.mersenneforum.org/showpost.php?p=574629&postcount=29[/URL]. I can't say whether Preda has seen that, [url]https://www.mersenneforum.org/showpost.php?p=488535&postcount=2[/url] updated in April, or
[URL]https://mersenneforum.org/showpost.php?p=573188&postcount=197[/URL] posted on March 7. The last gpuowl [URL="https://github.com/preda/gpuowl/commit/f5bb7c83e4438dc87398ad904582d08ad12120f0"]Github[/URL] change was March 2. |
[QUOTE=kriesel;583044]Added to [URL]https://www.mersenneforum.org/showpost.php?p=574629&postcount=29[/URL].[/QUOTE]
Thanks. It looks like it is doing the right thing and should be, in theory, little bit (10-15%) faster than v6 stage 2. I believe you said there was a general FFT regression from v6 to v7. Could this explain why v7 stage 2 is slower than v6, despite having a better stage 2 algorithm? |
[QUOTE=axn;583047]I believe you said there was a general FFT regression from v6 to v7.[/QUOTE]It depends on the fft length apparently, per [URL]https://www.mersenneforum.org/showpost.php?p=488535&postcount=2[/URL]
Note due to the time required for even a single PRP benchmark per fft length across all supported lengths per selected Gpuowl version and single OS version, the tabulated values are single measurements each [quote]Could this explain why v7 stage 2 is slower than v6, despite having a better stage 2 algorithm?[/quote]Partially. It does not explain the full magnitude of discrepancy in V7/v6 P-1 performance ratio, between ~112% expectation and ~75% observation. |
| All times are UTC. The time now is 18:58. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.