[QUOTE=LaurV;489237]Haha, I think this is the second time you have caught me on this...:redface:
Anyhow, it is irrelevant, because log2(10) is 3.32192809488736, and when you multiply it by either 10M or 10M-1, you get 33219280.xx and 33219277.xx respectively, and there is no prime in between. The next prime candidate for the exponent is (as ET already said) 33219283 (which has 10M+2 digits, probably). :razz:[/QUOTE] Added some columns for nearest prime exponents. [URL]https://www.alpertron.com.ar/ECM.HTM[/URL] shows 33219281 is prime, a closer fit than 33219283 for the least prime exponent giving a Mersenne number of >= 10 million digits. The CalculatorSoup calculator agrees that 33219281 is prime. |
[URL="http://www.mersenne.ca/exponent/33219281"]You are right[/URL], and [URL="http://www.mersenne.ca/exponent/33219283"]both of them[/URL] have 10M+1 digits. I trusted ET's numbers without checking.
|
[QUOTE=LaurV;489277][URL="http://www.mersenne.ca/exponent/33219281"]You are right[/URL], and [URL="http://www.mersenne.ca/exponent/33219283"]both of them[/URL] have 10M+1 digits. I trusted ET's numbers without check.[/QUOTE]
Thanks for the confirmation, and for the additional cross-check via James' excellent site. |
How best to determine exponent limit for a transform type and fft length?
1 Attachment(s)
As I understand it, there are approximate rules, such as: compute the number of bits of the candidate Mersenne number per word of the FFT length, and compare to the accepted OK values for DP transforms; approximately 20 bits/word at 2M, subtracting about a bit for every power-of-two increase, so 8M gets limited to roughly 18-18.2 bits/word. (Don't rely on these figures; they are from memory, can vary from program to program, and do vary from transform type to transform type.)
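The rule of thumb above can be sketched numerically. This is a hedged illustration only: the function name is invented, and the 20 bits/word baseline at 2M and the one-bit drop per doubling are the from-memory figures quoted in this post, not any program's actual limit tables.

```python
import math

# Rough sketch of the bits/word rule of thumb. The 20 bits/word
# baseline at a 2M FFT and the ~1 bit drop per power-of-two increase
# are the from-memory figures quoted in the text; real limits vary by
# program and transform type.

def max_exponent(fft_len, base_len=2 * 1024 * 1024,
                 base_bpw=20.0, drop_per_doubling=1.0):
    """Estimate the largest exponent a DP FFT of fft_len words handles."""
    doublings = math.log2(fft_len / base_len)
    bits_per_word = base_bpw - drop_per_doubling * doublings
    return int(bits_per_word * fft_len)

for megs in (2, 4, 8):
    n = megs * 1024 * 1024
    print(f"{megs}M FFT: ~{max_exponent(n):,} max exponent")
```

With these assumed constants, 2M comes out near 42M and 8M near 151M; the point is only how sensitive the limit is to a small change in bits/word.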
For the 4M-fft DP transform in gpuOwL v1.9, there was no prior experience available to consult. Preda guessed a number of bits/word. A difference of 1 or even 0.1 bit per word seems small, but it means considerably different limit exponents.

The more general issue is that there does not seem to be a known, sharp function for what will run successfully for n iterations and what won't, or how many iterations it will take, versus exponent, to hit excessive round-off error or some other error. Rather, it is fuzzy and variable. (An exponent p that completes successfully is one that doesn't generate an error in p iterations. The one I'm looking for is the highest exponent up to which they all complete, defining the useful range of the transform type and size.)

In trying to determine the bounds experimentally, I settled on the following. Run some short trials at the expected bits/word exponent value and a little above and a little below, going up or down until a spot is found where some failures occur within a small number of iterations and nearby longer runs are OK to a higher number of iterations. Then approach the bound from the right (higher exponent), because a quick fail provides bounds information much more efficiently than a slow success. In some cases errors are detected in 500 to 5000 iterations. Tabulate iterations to failure versus exponent.

Patterns will seem to emerge from the data collected this way. Often the next run or two will provide a contradiction to the last perceived pattern. See for example the green points on the attached plot, which were run after most of the red points. (Orange were first, on the right.) The slope of the plotted trend lines seems to steepen as the limit is approached from above. The point where such trend lines predict running enough iterations to complete before an error occurs is an estimate for the bound. The vertical position of the blue points merely reflects where I chose not to spend more iterations yet, so a trend line through those means very little. 
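The search procedure just described can be sketched as code. This is a hypothetical illustration: the function names and the toy failure model are invented, and run_trial stands in for launching a real short test (e.g. a brief gpuOwL run) that reports the iteration at which a round-off error appeared, or None if it completed cleanly.

```python
# Hypothetical sketch of the experimental bound search: probe
# exponents from above the expected limit downward, since a quick
# failure bounds the limit far more cheaply than a long success.

def find_limit(run_trial, hi, lo, step, max_iters=10_000):
    """Probe exponents from hi down to lo; return the first exponent
    whose short trial completes cleanly, plus the table of results."""
    results = []                       # (exponent, iteration_of_failure)
    p = hi
    while p >= lo:
        fail_at = run_trial(p, max_iters)
        results.append((p, fail_at))
        if fail_at is None:            # completed cleanly: our estimate
            return p, results
        p -= step
    return None, results

# Toy model of the fuzzy band: exponents above 79M fail within ~500
# iterations; real behavior is noisier, as the plot shows.
def fake_trial(p, max_iters):
    return 500 if p > 79_000_000 else None

limit, table = find_limit(fake_trial, 79_300_000, 78_500_000, 100_000)
print(limit)
```

In practice the "tabulate iterations to failure versus exponent" step is the valuable part; the returned table is what the trend lines on the plot are fit to.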
Is there a better way to explore the transform's limits? |
The "limits" are "soft" (think cotton fabric, not hard steel); i.e., we are talking about "bands" in which some exponents work and some do not, and the width and position of each band on the number line depend on how the FFT is implemented.
|
[QUOTE=kriesel;488537]Here is the latest posted version of the list I am maintaining for gpuOwL. As always, this is in appreciation of the authors' past contributions. Users may want to browse this for workarounds included in some of the descriptions, or for an awareness of some known pitfalls. Please respond with any comments, additions or suggestions you may have, preferably by PM to kriesel.
(last updated 2018-12-31)[/QUOTE] About item #47 in your pdf: I have just recompiled both gpuowl and mfakto with the new g++ 8, but I am not seeing any performance improvement. |
(in reference to [URL]https://www.mersenneforum.org/showpost.php?p=505634&postcount=18[/URL])
Good post. The calculation is not exact, but the idea is right. With the patience of the user, you could add something about his personality/mood too, hehe; some of us consider that having a factor to prove compositeness is more "cute" than having two LL tests, and that is why we go a little bit higher with the factoring. Or, better said, longer (in time, because factor hunters will prefer a lower bit level, not higher, and they will also prefer higher exponents, exactly for the reason you said: to find more factors per unit of time). As we advance in the exponent list, LL is getting more and more difficult, and the factoring (TF) is getting easier (but not by much, I mean for the same bit level, where the same probability lies - or is it lays?) |
Reference material discussion thread
[QUOTE=kriesel;512813]For any Mersenne number formed from a composite exponent, the resulting Mersenne number is composite. Some of the Mersenne number's factors are easily found by deriving them from the prime factors of the composite exponent.[/QUOTE]A common convention in exhibiting factors in Cunningham tables is to deal just with the "primitive part," supplemented with any "intrinsic" prime factor, or, as they arise, cracked into Aurifeuillian factors. The standard factorization of x^n - 1 as the product of cyclotomic polynomials gives a partial factorization. This is easily implemented in PARI/GP for integers like 2^30 - 1 as follows.
[code]? fordiv(30,m,f=1;fordiv(m,d,f*=(2^d-1)^moebius(m/d));print(m" "f))
1 1
2 3
3 7
5 31
6 3
10 11
15 151
30 331[/code]Each divisor of the exponent gives a cyclotomic factor. Your example fortuitously includes an "intrinsic" prime factor, the factor 3 for the divisor 6 of 30. The last factor, 331, for the divisor 30 of 30, is the "primitive part" of 2^30 - 1. |
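For readers without gp, the same computation can be transcribed into Python; this is a sketch, with helper names of my own invention. For each divisor m of the exponent, the cyclotomic factor Phi_m(2) is the product over divisors d of m of (2^d - 1)^mu(m/d).

```python
# Python transcription of the PARI/GP one-liner above (helper names
# are invented for this sketch). Phi_m(2) = prod over d|m of
# (2^d - 1)^mu(m/d), where mu is the Moebius function.

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def moebius(n):
    # Moebius function by trial factorization (fine for tiny n).
    distinct_primes = 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0               # squared prime factor
            distinct_primes += 1
        else:
            d += 1
    if n > 1:
        distinct_primes += 1
    return -1 if distinct_primes % 2 else 1

def cyclotomic_factor(m):
    num = den = 1
    for d in divisors(m):
        mu = moebius(m // d)
        if mu == 1:
            num *= 2 ** d - 1
        elif mu == -1:
            den *= 2 ** d - 1
    return num // den                  # always an exact integer

for m in divisors(30):
    print(m, cyclotomic_factor(m))
```

The product of all eight factors recovers 2^30 - 1 = 1073741823, matching the gp output above.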
Perhaps another topic for inclusion is:
Why don't we run the algorithm backwards? Start at s=0 and work in reverse to see if we get s=4. Which follows on from: why doesn't someone just go back one iteration and create a save file that yields a zero residue upon running the last step, and fool the system? [size=1][color=grey]Obviously because discrete logs are hard, but many seem unaware of that.[/color][/size] |
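A minimal sketch of the forward direction for a tiny case (standard Lucas-Lehmer, nothing program-specific): forward, each step is a single squaring mod 2^p - 1; going backwards would require a modular square root of s + 2 at every step, which is believed infeasible for a composite 2^p - 1 of unknown factorization.

```python
# Minimal forward Lucas-Lehmer sketch. Each forward step is one cheap
# squaring; inverting a step means extracting a modular square root of
# s + 2 mod 2^p - 1, believed infeasible when the modulus is composite
# with unknown factorization.

def lucas_lehmer(p):
    """Return True iff 2^p - 1 is prime, for odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

print(lucas_lehmer(7))    # 2^7 - 1 = 127 is prime
print(lucas_lehmer(11))   # 2^11 - 1 = 2047 = 23 * 89 is composite
```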
(In response apparently to [url]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/url])
There is no such thing as "iterations independent of the exponent". The only "independent" iterations are the few steps where the squaring hasn't yet reached m, so x^2 (natural) is equal to x^2 (mod m). Once you take the first modulus, the "independence" is gone. So they can't serve as a "self test", because the modulus part of your software is never tested unless you [U]take[/U] a modulus calculation. And to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? A few hundred milliseconds, for an LL test that takes a month? Don't forget that by squaring, the size of the result doubles every time, so to reach 80 million bits you only need about 26 squarings; you could start with iteration 25 or 26 and run the other 79,999,9xx iterations instead of 80M. Big savings, hehe. It would be more complicated to keep track of every exponent to get the starting iteration... plus the memory to store them... |
[QUOTE=LaurV;515366](In response apparently to [URL]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/URL]
or [URL]https://www.mersenneforum.org/showpost.php?p=510741&postcount=3[/URL]) There is no such thing like "iterations independent of exponent". [/QUOTE]Yes, there is, at the very beginning: with seed 4 LL, iteration one is the same 0xE value for any p>4, for a trivial example, disregarding pseudorandom offsets where applicable. Add another such exponent-independent iteration at about every doubling of p. [QUOTE]The only "independent" iterations are the few steps where the squaring didn't reach m, so x^2 (natural) is equal to (x^2 (mod m)). Once you take the first modulus, the "independence" is gone. So, they can't serve as a "self test" because the modulus part of your software is never tested, unless you [U]take[/U] a modulus calculation, and to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? few hundred milliseconds? for a LL test that takes a month? Don't forget that by squaring, the result doubles every time, so to reach 80 million bits, you only need 26 squares, so you can start with the iteration 25 or 26 and run the other 79 999 9xx iterations instead of 80 M. Big savings, hehe, it would be more complicate to keep track of every exponents to get the starting iteration... plus memory to store them...[/QUOTE] This may be a case of vigorous agreement re [URL]https://www.mersenneforum.org/showpost.php?p=510741&postcount=3[/URL], or of incomplete understanding of [URL]https://www.mersenneforum.org/showpost.php?p=515172&postcount=9[/URL]. The proposed self-test in post 9 is admittedly very brief and not comprehensive. Some early confirmation of correct function could be a good thing. There are periodically novice CUDALucas users unknowingly running for hours or days or longer with FFT lengths or other misconfigurations that produce garbage from the start. 
(For one recent example, see [URL]https://www.mersenneforum.org/showpost.php?p=509707&postcount=178[/URL])

It is not necessary to "keep track of every exponent" to know the possible starting iteration, or to know what self-test iteration is the limit. The usable initial self-test iteration number can be calculated from the exponent constituting the current assignment. Or the minimum exponent for a given self-test iteration number could be predetermined for the LL seed 4, LL seed 10, and PRP3 cases, and stored in very small tables; then the maximum suitable iteration number could be found by a fast reverse-order sequential search.

It is likely that the various applications are programmed so that the iteration loop does the Lucas-Lehmer s[SUP]2[/SUP]-2, carry, and modulo, or the PRP3 equivalent loop including modulo, from the very start, whether the s term has grown large enough to actually be affected by the modulo or not. So the modulo-related code would be getting somewhat exercised in the very early iterations. As I recall, the modulo is basically free in the IBDWT representation, so it's hard to avoid doing it. (I think Ernst's explanation at [URL]https://www.mersenneforum.org/showpost.php?p=1483&postcount=10[/URL] is applicable and helpful here.) It's probably faster and more reliable to include the modulo and carry from the start. It's unlikely there's any branch included to avoid it, since, as both LaurV and I have already said more or less, it's less than 1 ppm of the iterations at the beginning, which is beyond trivial. (But detecting a serious error early would not be trivial.) |
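The exponent-independent prefix both posts discuss can be computed directly; a sketch with an invented function name, using exact integers for small p. The first iteration from seed 4 always gives 14 = 0xE, and the count grows by about one per doubling of p, matching the discussion above.

```python
# Sketch: count how many early LL iterations (seed 4) are independent
# of the exponent. Starting from s = 4, the result roughly doubles in
# size with each squaring, so the mod 2^p - 1 reduction has no effect
# for roughly log2(p) iterations; below is the exact count.

def mod_free_iterations(p):
    """Count LL iterations from seed 4 before the modulus first matters."""
    m = (1 << p) - 1
    s, count = 4, 0
    while s * s - 2 < m:      # next step still fits without reduction
        s = s * s - 2
        count += 1
    return count

for p in (31, 61, 127):
    print(p, mod_free_iterations(p))
```

As LaurV notes, this prefix is a vanishing fraction of a modern test, which is why nobody bothers skipping it; the point in post 9 was only that those few iterations have known, exponent-independent values usable as an early sanity check.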