mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Blogorrhea > kriesel

Reply
 
Thread Tools
Old 2018-06-05, 21:32   #12
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

22·3·5·97 Posts
Default

Quote:
Originally Posted by LaurV View Post
Haha, I think it is the second time when you catch me with this...
Anyhow, it is irrelevant, because log(10,2) is 3.32192809488736 and when you multiply it with either 10M or 10M-1, you get 33219280.xx and 33219277.xx, respectively, and there is no prime in between. The next prime candidate for the exponent is (as ET already said) 33219283 (which has 10M+2 digits, probably).
Added some columns for nearest prime exponents.

https://www.alpertron.com.ar/ECM.HTM shows 33219281 prime, a closer fit than 33219283 for least prime exponent >=10million digits.Mersenne number.
Calculatorsoup calculator agrees on primality of 332919281.

Last fiddled with by kriesel on 2018-06-06 at 13:27
kriesel is online now   Reply With Quote
Old 2018-06-06, 10:36   #13
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
"name field"
Jun 2011
Thailand

980110 Posts
Default

You are right, and both of them have 10M+1 digits. I trusted ET's numbers without check.
LaurV is offline   Reply With Quote
Old 2018-06-06, 13:29   #14
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

22·3·5·97 Posts
Default

Quote:
Originally Posted by LaurV View Post
You are right, and both of them have 10M+1 digits. I trusted ET's numbers without check.
Thanks for the confirmation, and the additional cross-check, James' excellent site.

Last fiddled with by kriesel on 2018-06-06 at 13:29
kriesel is online now   Reply With Quote
Old 2018-06-07, 19:49   #15
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

22×3×5×97 Posts
Default How best to determine exponent limit for a transform type and fft length?

As I understand it, there are approximate rules, such as compute the number of bits of the candidate Mersenne number being expressed per word of the fft length, and compare to the accepted ok values, for DP transforms; approx 20 bits/word for 2M, and subtract about a bit for every power of two increase, so 8M gets limited to 18-18.2 or so. (Don't rely on these, they're from memory, can vary from program to program, do vary from transform type to type, etc.)

For the 4M -fft DP in gpuOwL v1.9, there was no prior experience available to consult. Preda guessed a number of bits/word. But a difference of 1 or even 0.1 bit per word seems small but means considerably different limit exponents.

The more general issue is there does not seem to be a known or sharp function for what will run successfully n iterations or what won't, or how many iterations it will take versus exponent to have excessive round-off error or some other error. Rather it is fuzzy, variable. (An exponent p that completes successfully is one that doesn't generate an error in p iterations. The one I'm looking for is the highest up to which they all complete, defining the useful range of the transform and size.)

In trying to determine the bounds experimentally, I settled on the following. Run some short trials at the expected bits/word exponent value and a little above and a little below, going up or down until a spot is found where some failures occur within a small number of iterations and nearby longer runs ok to higher number of iterations. Then approach the bound from the right (higher exponent), because a quick fail provides bounds information much more efficiently than a slow success. In some cases errors are detected in 500 to 5000 iterations. Tabulate iterations to failure versus exponent. Patterns will seem to emerge from the data collected this way. Often the next run or two will provide a contradiction to the last perceived pattern. See for example the green points on the plot attached, which were run after most of the red points. (Orange were first on the right) The slope of trend lines plotted seems to steepen as the limit is approached from above. The point where such trend lines predict running enough iterations to complete before an error occurs, is an estimate for the bound. The vertical position for the blue points merely reflects where I chose not to spend more iterations yet, so a trend line through those means very little.

Is there a better way to explore the transform's limits?
Attached Thumbnails
Click image for larger version

Name:	m61 4M exponent limit probing.png
Views:	217
Size:	22.6 KB
ID:	18476  
kriesel is online now   Reply With Quote
Old 2018-06-08, 08:37   #16
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
"name field"
Jun 2011
Thailand

34×112 Posts
Default

The "limits" are "soft" (think cotton fabric, not hard steel), i.e. we are talking about "bands" in which some exponents works, some not, and the width and position of each limit on the number line is dependent of how the FFT is implemented.

Last fiddled with by LaurV on 2018-06-08 at 08:38
LaurV is offline   Reply With Quote
Old 2019-01-01, 06:31   #17
SELROC
 

7×1,303 Posts
Default

Quote:
Originally Posted by kriesel View Post
Here is the latest posted version of the list I am maintaining for gpuOwL. As always, this is in appreciation of the authors' past contributions. Users may want to browse this for workarounds included in some of the descriptions, or for an awareness of some known pitfalls. Please respond with any comments, additions or suggestions you may have, preferably by PM to kriesel.


(last updated 2018-12-31)

About item #47 on your pdf: I have just recompiled both gpuowl and mfakto with the new g++8 but I am not seeing any performance improvements.
  Reply With Quote
Old 2019-01-12, 05:03   #18
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
"name field"
Jun 2011
Thailand

980110 Posts
Default

(in reference to https://www.mersenneforum.org/showpo...4&postcount=18)
Good post.

The calculus is not exact, but the idea is right.

With the patience of the user, you could add something about his personality/mood too, hehe, some of us consider that having a factor to prove compositeness is more "cute" than having two LL tests, that is why we go a little bit higher with the factoring. Or, better said, longer (in time, because factor hunters will prefer lower bitlevel, not higher, and they will also prefer higher exponents, exactly for the reason you said, to find more factors per time unit).

As we advance in the exponents list, LL is getting more and more difficult, and the factoring (TF) is getting easier (but not by much, I mean for the same bitlevel, where the same probability lies - or is it lays?)

Last fiddled with by kriesel on 2019-01-12 at 19:42 Reason: moved by kriesel from reference thread to discussion thread
LaurV is offline   Reply With Quote
Old 2019-04-05, 23:02   #19
Dr Sardonicus
 
Dr Sardonicus's Avatar
 
Feb 2017
Nowhere

2·23·109 Posts
Default Reference material discussion thread

Quote:
Originally Posted by kriesel View Post
For any Mersenne number formed from a composite exponent, the resulting Mersenne number is composite. Some of the Mersenne number's factors are easily found by deriving them from the prime factors of the composite exponent.
A common convention in exhibiting factors in Cunningham tables is to deal just with the "primitive part," supplemented with any "intrisnsic" prime factor, or, as they arise, cracked into Aurifeuillian factors. The standard factorization of x^n - 1 as the product of cyclotomic polynomials gives a partial factorization. This is easily implemented for integers like 2^30 - 1 as follows.
Code:
? fordiv(30,m,f=1;fordiv(m,d,f*=(2^d-1)^moebius(m/d));print(m" "f))
1 1
2 3
3 7
5 31
6 3
10 11
15 151
30 331
Each divisor of the exponent gives a cyclotomic factor.

Your example fortuitously includes an "intrinsic" prime factor, the factor 3 for the divisor 6 of 30.

The last factor, 331, for the divisor 30 of 30, is the "primitive part" of 2^30 - 1.
Dr Sardonicus is online now   Reply With Quote
Old 2019-04-06, 04:36   #20
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

33×233 Posts
Default

Perhaps another topic for inclusion is:

Why don't we run the algorithm backwards. Start at s=0 and work in reverse to see if we get s=4.

Which follows on from: why doesn't someone just go back one iteration and create a save file that yields a zero residue upon running the last step, and fool the system?

Obviously because discrete logs are hard, but many seem unaware of that.
retina is online now   Reply With Quote
Old 2019-05-01, 07:18   #21
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
"name field"
Jun 2011
Thailand

34×112 Posts
Default

(In response apparently to https://www.mersenneforum.org/showpo...72&postcount=9)
There is no such thing like "iterations independent of exponent". The only "independent" iterations are the few steps where the squaring didn't reach m, so x^2 (natural) is equal to (x^2 (mod m)). Once you take the first modulus, the "independence" is gone. So, they can't serve as a "self test" because the modulus part of your software is never tested, unless you take a modulus calculation, and to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? few hundred milliseconds? for a LL test that takes a month? Don't forget that by squaring, the result doubles every time, so to reach 80 million bits, you only need 26 squares, so you can start with the iteration 25 or 26 and run the other 79 999 9xx iterations instead of 80 M. Big savings, hehe, it would be more complicate to keep track of every exponents to get the starting iteration... plus memory to store them...

Last fiddled with by kriesel on 2019-05-01 at 14:58
LaurV is offline   Reply With Quote
Old 2019-05-04, 13:41   #22
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

16BC16 Posts
Default

Quote:
Originally Posted by LaurV View Post
(In response apparently to https://www.mersenneforum.org/showpo...72&postcount=9

or https://www.mersenneforum.org/showpo...41&postcount=3)
There is no such thing like "iterations independent of exponent".
Yes, there is, at the very beginning; seed 4 LL, iteration one is the same 0xE value for any p>4, for a trivial example, disregarding pseudo random offsets where applicable. Add another iteration at about every doubling of p.
Quote:
The only "independent" iterations are the few steps where the squaring didn't reach m, so x^2 (natural) is equal to (x^2 (mod m)). Once you take the first modulus, the "independence" is gone. So, they can't serve as a "self test" because the modulus part of your software is never tested, unless you take a modulus calculation, and to use them as a start value instead of the classic 4 or 10, to save time, you save... how much? few hundred milliseconds? for a LL test that takes a month? Don't forget that by squaring, the result doubles every time, so to reach 80 million bits, you only need 26 squares, so you can start with the iteration 25 or 26 and run the other 79 999 9xx iterations instead of 80 M. Big savings, hehe, it would be more complicate to keep track of every exponents to get the starting iteration... plus memory to store them...
This may be a case of vigorous agreement re https://www.mersenneforum.org/showpo...41&postcount=3 or incomplete understanding of https://www.mersenneforum.org/showpo...72&postcount=9. The proposed selftest in 9 is admittedly very brief and not comprehensive. Some confirmation of function early could be a good thing. There are periodically novice CUDALucas users unknowingly running for hours or days or longer with fft lengths or other misconfigurations that produce garbage from the start. (For one recent example, see https://www.mersenneforum.org/showpo...&postcount=178)

It is not necessary to "keep track of every exponents" to know the starting iteration possible, or know what self-test iteration is the limit. The usable initial-self-test iteration number can be calculated from the exponent constituting the current assignment. Or the minimum exponent for a given self-test iteration number could be predetermined for LL seed 4, LL seed 10, PRP3 cases, and stored also in very small tables. Then the maximum suitable iteration number could be found from a fast reverse order sequential search.

It is likely that the various applications are programmed so that the iteration loop does the Lucas Lehmer s2-2 carry and modulo, or the PRP3 equivalent loop including modulo, from the very start, whether the s term has grown large enough to actually be affected by the modulo or not. So that modulo related code would be getting somewhat exercised in the very early iterations. As I recall, the modulo is basically for free in the IBDWT representation, so it's hard to avoid doing it. (I think Ernst's explanation at https://www.mersenneforum.org/showpo...3&postcount=10 is applicable and helpful here.) It's probably faster and more reliable to include the modulo and carry from the start. It's unlikely there's any branch included to avoid it, since as both Laurv and I have already said more or less, it's less than 1ppm of the iterations at the beginning, which is beyond trivial. (But detecting early a serious error would not be trivial.)
kriesel is online now   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Mersenne Prime mostly-GPU Computing reference material kriesel kriesel 33 2021-10-24 23:43
P-1 discussion thread Rincewind Five or Bust - The Dual Sierpinski Problem 57 2011-02-06 21:53
Sieving discussion thread jasong Twin Prime Search 311 2010-10-22 18:41
PRP discussion thread philmoore Five or Bust - The Dual Sierpinski Problem 83 2010-09-25 10:20
Theological Discussion Thread clowns789 Soap Box 3 2006-03-09 04:05

All times are UTC. The time now is 19:08.


Thu Oct 28 19:08:35 UTC 2021 up 97 days, 13:37, 0 users, load averages: 1.27, 1.23, 1.21

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.