mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Aliquot Sequences (https://www.mersenneforum.org/forumdisplay.php?f=90)
-   -   Aliqueit.exe discussion (https://www.mersenneforum.org/showthread.php?t=11618)

jasonp 2009-07-29 15:46

There's no free lunch with the number of curves chosen; if you want a higher probability of finding a factor, you have to run more curves. The ECM driver in msieve is meant to filter out composites that don't need the main QS code to run, and there's tuning in place that limits the ECM time to maybe 5-10% of the QS time. Alternatively, if you have a list of B1 sizes and curve counts that maintains an approximately equal probability of finding a factor with an input-size granularity smaller than 5 digits, I'd be happy to add that to the current source.

Mini-Geek 2009-07-29 16:05

[quote=jasonp;183294]There's no free lunch with the number of curves chosen; if you want a higher probability of finding a factor, you have to run more curves. The ECM driver in msieve is meant to filter out composites that don't need the main QS code to run, and there's tuning in place that limits the ECM time to maybe 5-10% of the QS time. Alternatively, if you have a list of B1 sizes and curve counts that maintains an approximately equal probability of finding a factor with an input-size granularity smaller than 5 digits, I'd be happy to add that to the current source.[/quote]
Does Msieve currently factor based on an input-size granularity of 5 digits, or based on an expected-factor-size granularity of 5 digits? The logs and (I think) 10metreh imply the latter; you seem to imply the former.

I think something like what aliqueit.exe uses would be good. It is granular to an input size of 1 digit, and to a factor size as fine as a single curve (limited only by the precision of the multiplier and constant). It could use higher-precision constants and work on the log of the input instead of the factor size to be even more precise, but those refinements are probably unnecessary and lack sufficient supporting data anyway.
In aliqueit, this can be defined by the user (in the .ini settings file):
[code]//Formulae used to determine the maximum factor size we will do ecm to.
//For QS: <qs_k> * input_digits + <qs_m>
qs_k = 0.448
qs_m = -11.26
[/code]and this is in the code (aliqueit.cc to be precise, in the definition of the run_ecm_big function):[code]int ecm_settings[][3] = { // { factor digits, B1, curves }
    { 20, 11000, 74 },
    { 25, 50000, 214 },
    { 30, 250000, 430 },
    { 35, 1000000, 904 },
    { 40, 3000000, 2350 },
    { 45, 11000000, 4480 },
    { 50, 43000000, 7553 },
    { 55, 110000000, 17769 },
    { 60, 260000000, 42017 },
    { 65, 850000000, 69408 }, //quick, someone find a way for this to be useful!
};
float max_factor_size = get_ecm_level( (unsigned int)input_number.size() );

int prev_level_digits = 0;
bool done = false;
for( int level = 0; !done && level < sizeof( ecm_settings ) / ( 3 * sizeof( int ) ); ++level ) {
    int b1 = ecm_settings[level][1];
    int curves = ecm_settings[level][2];
    if( max_factor_size < ecm_settings[level][0] ) {
        curves = (int)( curves * ( max_factor_size - prev_level_digits ) / ( ecm_settings[level][0] - prev_level_digits ) );
        if( !curves ) break; //if curves is 0 we'll ecm forever (or until we find a factor, whichever comes first)
        done = true;
    }
    //proceed to run P-1, P+1, and ECM, clipped from this snippet
    prev_level_digits = ecm_settings[level][0];
}
[/code]e.g. for a 77-digit input, 0.448*77 - 11.26 = 23.236, so run 74 curves at B1=11k and int(3.236/5 * 214) = 138 curves at B1=50k, then move on to SIQS.
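
That interpolation is easy to check standalone. The following is a hypothetical Python re-creation of the scheduling logic, for illustration only -- not aliqueit's actual code; the table and the qs_k/qs_m defaults are taken from the snippets above:

```python
# Hypothetical standalone sketch of aliqueit's ECM scheduling.
# Table entries: (factor digits, B1, curves), copied from aliqueit.cc.

ECM_SETTINGS = [
    (20, 11000, 74),
    (25, 50000, 214),
    (30, 250000, 430),
    (35, 1000000, 904),
    (40, 3000000, 2350),
    (45, 11000000, 4480),
    (50, 43000000, 7553),
    (55, 110000000, 17769),
    (60, 260000000, 42017),
    (65, 850000000, 69408),
]

def ecm_plan(input_digits, qs_k=0.448, qs_m=-11.26):
    """Return the list of (B1, curves) runs to do before handing off to QS."""
    max_factor_size = qs_k * input_digits + qs_m
    plan = []
    prev_digits = 0
    for digits, b1, curves in ECM_SETTINGS:
        if max_factor_size < digits:
            # Scale the final level's curve count by how far into this
            # 5-digit step max_factor_size reaches.
            curves = int(curves * (max_factor_size - prev_digits)
                         / (digits - prev_digits))
            if curves:
                plan.append((b1, curves))
            break
        plan.append((b1, curves))
        prev_digits = digits
    return plan

print(ecm_plan(77))  # → [(11000, 74), (50000, 138)]
```

which reproduces the 77-digit example: 74 curves at 11k, then 138 at 50k.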

mklasson 2009-07-29 17:03

One problem with the way aliqueit does it is that it uses a linear mapping from input size to factor size. This is probably quite wrong if the goal is to do ecm for a fixed ratio of the time qs/nfs takes, as aliqueit's goal is. However, I think it works decently enough over the range where it's applied. Just using lovingly optimised fine-grained tables certainly allows for more flexibility though.

Silverman has pointed out several times around the forum that his joint paper with Wagstaff is worth reading when it comes to ecm thresholds. I have not done so, but definitely should before trying to improve aliqueit's method.

Greebley 2009-07-29 21:10

Unfortunately, his advice is useless to anyone who doesn't have a handy university library in which to actually read it. I really think the copyright of basic research papers is a bit silly: it becomes a bunch of university professors and students talking to one another with no communication with the population at large, but that's not going to change any time soon. For example, he wrote his paper a number of years ago, but no one actually uses the information because they don't have access to it.

It would be much more useful if someone could summarize the results on the web somewhere. Until that happens his research is being wasted. The funny thing is that the 'tone' of his post seems to be 'it's right there (dummy)', without realizing the major disconnect he has with (at least part of) his audience and that the answer is pretty much useless. It's like answering "What is Catalan's conjecture?" with "Catalan conjectured it in 1844 in Crelle's Journal". Not terribly useful.

I mean, here we have two programs, aliqueit and msieve, whose authors have both figured out the values experimentally because the theoretical paper is too difficult to access. The values that are universally used are those that have been made available on the web - the ones that say the best options given the number of digits of the factor.

mklasson 2009-07-29 21:29

[QUOTE=Greebley;183334]Unfortunately, his advice is useless to anyone who doesn't have a handy university library in which to actually read it. I really think the copyright of basic research papers is a bit silly: it becomes a bunch of university professors and students talking to one another with no communication with the population at large, but that's not going to change any time soon. For example, he wrote his paper a number of years ago, but no one actually uses the information because they don't have access to it. [/quote]

I honestly think it is changing as we speak. Many research papers are available online nowadays and my humble guess is that things will only get better. See [url=http://arxiv.org/]arxiv.org[/url] for instance.

[QUOTE=Greebley;183334]I mean, here we have two programs, aliqueit and msieve, whose authors have both figured out the values experimentally because the theoretical paper is too difficult to access.[/quote]

To be fair, my own failure to read it was primarily due to ignorance. But yes, having it freely available online would certainly make it a lot easier to read now that I know about it.

bsquared 2009-07-29 21:30

FWIW, his paper is available through [URL="http://www.jstor.org/stable/2152967?&Search=yes&term=ecm&term=wagstaff&list=hide&searchUri=%2Faction%2FdoBasicSearch%3FQuery%3Dwagstaff%2Becm%26wc%3Don&item=3&ttl=14&returnArticleService=showArticle"]JSTOR[/URL].

I haven't fully grokked it all, but I believe optimality still requires some experimentation in order to determine some implementation-dependent and/or system-dependent constants.

axn 2009-07-29 21:36

[QUOTE=bsquared;183339]I haven't fully grokked it all, but I believe optimality still requires some experimentation in order to determine some implementation-dependent and/or system-dependent constants.[/QUOTE]

The question is, is it relevant in this context? AFAIK, that paper doesn't compare ECM vs QS -- just ECM.

Anyway, there are more questions. First off, is the assumption of a fixed fraction of QS time optimal? Or a fixed fraction of the input size? Or something else entirely?

I personally believe that if you're somewhere near "optimal", further gains from additional tuning will be minimal -- probably not worth it. Either msieve's scheme or aliqueit's should be good enough. Maybe somebody can do some controlled testing and see if there is a noticeable difference in the net throughput.

EDIT:- A test scenario would be: time aliqueit on an aliquot sequence till 90 digits. Then collect all the LHS from that and feed it to msieve as a batch. If the time difference is within +/- 2%, call it a tie.
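
For concreteness, the tie rule could be written down like this (a hypothetical helper; the function name and the 2% default are just stand-ins for the criterion above):

```python
def verdict(aliqueit_seconds, msieve_seconds, tie_threshold=0.02):
    """Compare total wall time of the two drivers on the same batch:
    within the threshold it's a tie, otherwise report the faster one."""
    diff = (msieve_seconds - aliqueit_seconds) / aliqueit_seconds
    if abs(diff) <= tie_threshold:
        return "tie"
    return "aliqueit" if diff > 0 else "msieve"

print(verdict(1000.0, 1015.0))  # 1.5% apart → "tie"
```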

mklasson 2009-07-29 22:10

[QUOTE=axn;183341]The question is, is it relevant in this context? AFAIK, that paper doesn't compare ECM vs QS -- just ECM.[/quote]

I can see it mentions QS on the first page at least. :smile:

[QUOTE=axn;183341]EDIT:- A test scenario would be, time aliqueit on a aliquot sequence till 90-digits. Then collect all the LHS from that and feed it to msieve as a batch. IF the time difference is +/- 2%, call it a tie.[/QUOTE]

And repeat a couple of times to get a grip on the variance. I imagine the runs might swing a good bit more than 2%, but maybe I'm wrong.

axn 2009-07-29 22:16

[QUOTE=mklasson;183343]I can see it mentions QS on the first page at least. :smile:[/quote]
Oh well. I assume that is just an intro to factoring in general, before segueing into ECM. If it does address the optimality of fully factoring a number, I'd be surprised (pleasantly, I might add). Needless to say, I haven't actually read the paper.

[QUOTE=mklasson;183343]And repeat a couple of times to get a grip on the variance. I imagine the runs might swing a good bit more than 2%, but maybe I'm wrong.[/QUOTE]
Well, a decently long aliquot sequence _should_ be representative in and of itself, so I expect the variance to be less. IOW, the difference between aliqueit and msieve (if any) should be consistent. At any rate, that's my story, and I'm stickin' to it :smile:

Mini-Geek 2009-07-29 22:26

It does discuss QS, gives some relative times for various sizes of N, and when to cross over. It does not mention NFS, which I believe was in its infancy at the time of this paper.

jasonp 2009-07-30 00:17

[QUOTE=Greebley;183334]
I mean, here we have two programs, aliqueit and msieve, whose authors have both figured out the values experimentally because the theoretical paper is too difficult to access. The values that are universally used are those that have been made available on the web - the ones that say the best options given the number of digits of the factor.[/QUOTE]
Actually, I have JSTOR access now, and did when I wrote the ECM code. Since Bob's ECM paper was written, the ECM algorithm has gotten a much more powerful stage 2, QS speed now depends completely on memory access time and computers have gotten ~100x faster. I couldn't be sure that the assumptions underlying the paper results were still valid, so to achieve a constant-factor speedup of the combined process there's really no alternative to doing one's own experiments.

It almost doesn't matter right now with msieve, because the only QS timings I have are for a small set of test numbers, and I suspect almost nobody uses the -e flag instead of calling ecm.exe.

PS: More detail about the ECM breakover points [url="http://mersenneforum.org/showpost.php?p=120946&postcount=163"]here[/url]. When factors are found, ECM continues at a given digit level until the remaining cofactor drops below the minimum size for that level of ECM. Then QS starts.
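
That breakover flow can be sketched roughly like so (a hypothetical toy driver, not msieve's actual code: trial division stands in for an ECM attempt, and a single QS cutoff replaces the per-level minimum cofactor sizes):

```python
# Toy sketch of the described breakover logic: keep running "ECM" at each
# level, peeling off factors, until the cofactor is small enough for QS.

def toy_ecm(n, bound):
    """Stand-in for one ECM attempt: find any prime factor <= bound."""
    for p in range(2, bound + 1):
        if n % p == 0:
            return p
    return None

def factor_driver(n, levels, qs_cutoff_digits):
    """Peel factors level by level; once the cofactor drops below the
    cutoff, stop and hand it to QS (here: just return it)."""
    factors, cofactor = [], n
    for bound in levels:  # escalating "digit levels"
        while len(str(cofactor)) >= qs_cutoff_digits:
            f = toy_ecm(cofactor, bound)
            if f is None:
                break  # this level exhausted; escalate to the next
            factors.append(f)
            cofactor //= f
        if len(str(cofactor)) < qs_cutoff_digits:
            break
    return factors, cofactor  # the cofactor would go to QS

print(factor_driver(2 * 3 * 1000003, [10, 100], 5))  # → ([2, 3], 1000003)
```

Here the small factors are stripped at the lowest level, and the remaining 7-digit cofactor, which neither level can crack, is what QS would receive.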

PPS: Like most prestigious math journals, Math. Comp. requires that you release copyright on your paper to them before publishing. This means whatever happens to the published version depends on them, not on you. Occasionally you will find authors who publish a draft of their Math. Comp. papers online, but that may not be the same as the published version. The situation is very different in the biological sciences now.

