mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Aliquot Sequences

Reply
 
Thread Tools
Old 2009-07-29, 15:46   #276
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

354310 Posts
Default

There's no free lunch with the number of curves chosen; if you want a lower probability of finding a factor, then do fewer curves. The ECM driver in msieve is meant to filter out composites that don't need the main QS code to run, and there's tuning in place that limits the ECM time to maybe 5-10% of the QS time. Alternately, if you have a list of B1 sizes and curve counts that maintain an approximately equal probability of finding a factor with an input size granularity smaller than 5 digits, I'd be happy to add that to the current source.
jasonp is offline   Reply With Quote
Old 2009-07-29, 16:05   #277
Mini-Geek
Account Deleted
 
Mini-Geek's Avatar
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts
Default

Quote:
Originally Posted by jasonp View Post
There's no free lunch with the number of curves chosen; if you want a lower probability of finding a factor, then do fewer curves. The ECM driver in msieve is meant to filter out composites that don't need the main QS code to run, and there's tuning in place that limits the ECM time to maybe 5-10% of the QS time. Alternately, if you have a list of B1 sizes and curve counts that maintain an approximately equal probability of finding a factor with an input size granularity smaller than 5 digits, I'd be happy to add that to the current source.
Does Msieve currently factor based on an input size granularity of 5 digits or based on an expected factor size granularity of 5 digits? The logs and (I think) 10metreh imply the latter, you seem to imply the former.

I think something like what aliqueit.exe uses would be good. It is granular to an input size of 1 and to a factor size determined by as small as a single curve, (as limited by the precision of the multiplier and constant) but could have higher-precision constants and work on the log instead of the factor size to be even more precise (but these are probably unnecessary and lacking sufficient data).
In aliquiet, this can be defined by the user (in the .ini settings file):
Code:
//Formulae used to determine the maximum factor size we will do ecm to.
//For QS: <qs_k> * input_digits + <qs_m>
qs_k = 0.448
qs_m = -11.26
and this is in the code: (aliqueit.cc to be precise, in the definition of the run_ecm_big function)
Code:
    int ecm_settings[][3] = {
        { 20, 11000, 74 },
        { 25, 50000, 214 },
        { 30, 250000, 430 },
        { 35, 1000000, 904 },
        { 40, 3000000, 2350 },
        { 45, 11000000, 4480 },
        { 50, 43000000, 7553 },
        { 55, 110000000, 17769 },
        { 60, 260000000, 42017 },
        { 65, 850000000, 69408 },    //quick, someone find a way for this to be useful!
    };
    float max_factor_size = get_ecm_level( (unsigned int)input_number.size() );

    int prev_level_digits = 0;
    bool done = false;
    for( int level = 0; !done && level < sizeof( ecm_settings ) / ( 3 * sizeof( int ) ); ++level ) {
        int b1 = ecm_settings[level][1];
        int curves = ecm_settings[level][2];
        if( max_factor_size < ecm_settings[level][0] ) {
            curves = (int)( curves * ( max_factor_size - prev_level_digits ) / ( ecm_settings[level][0] - prev_level_digits ) );
            if( !curves ) break;    //if curves is 0 we'll ecm forever (or until we find a factor, whichever comes first)
            done = true;
        }
//proceed to run P-1, P+1, and ECM, clipped from this snippet
    }
e.g. for a 77 digit input, 0.448*77-11.26=23.236, so run 74 curves at 11k and int(3.236/5*214)=138 curves at 50k, then move on to SIQS.
Mini-Geek is offline   Reply With Quote
Old 2009-07-29, 17:03   #278
mklasson
 
Feb 2004

2·3·43 Posts
Default

One problem with the way aliqueit does it is that it uses a linear mapping from input size to factor size. This is probably quite wrong if the goal is to do ecm for a fixed ratio of the time qs/nfs takes, as aliqueit's goal is. However, I think it works decently enough over the range where it's applied. Just using lovingly optimised fine-grained tables certainly allows for more flexibility though.

Silverman has pointed out several times around the forum that his joint paper with Wagstaff is worth reading when it comes to ecm thresholds. I have not done so, but definitely should before trying to improve aliqueit's method.
mklasson is offline   Reply With Quote
Old 2009-07-29, 21:10   #279
Greebley
 
Greebley's Avatar
 
May 2009
Dedham Massachusetts USA

3×281 Posts
Default

Unfortunately, his advise is useless to anyone that doesn't have a handy university library to actually read it. I really think the copyright of basic research papers is a bit silly. It becomes a bunch of university professors and students talking to one another with no communication with the population at large, but that's not going to change any time soon. For example, he wrote his paper a number of years ago, but no one actually uses the information because they don't have access to it.

It would be much more useful if someone could summarize the results on the web somewhere. Until this happens his research is being wasted. Funny thing is the 'tone' of his post seems to be 'its right there (dummy)' without realizing the major disconnect he has with (at least part of) his audience and that the answer is pretty much useless. Its like the answer to "What is Catalan's Conjecture" is "Catalan conjectured it in 1887 in the Math Society Journal". Not terribly useful.

I mean here we have two programs, aliqueit and msieve both whose authors have figured out the values experimentally because the theoretical paper is too difficult to access. The values that are universally used are those that have been made available on the web - the ones that say the best option given the number of the digits of the factor.
Greebley is offline   Reply With Quote
Old 2009-07-29, 21:29   #280
mklasson
 
Feb 2004

4028 Posts
Default

Quote:
Originally Posted by Greebley View Post
Unfortunately, his advise is useless to anyone that doesn't have a handy university library to actually read it. I really think the copyright of basic research papers is a bit silly. It becomes a bunch of university professors and students talking to one another with no communication with the population at large, but that's not going to change any time soon. For example, he wrote his paper a number of years ago, but no one actually uses the information because they don't have access to it.
I honestly think it is changing as we speak. Many research papers are available online nowadays and my humble guess is that things will only get better. See arxiv.org for instance.

Quote:
Originally Posted by Greebley View Post
I mean here we have two programs, aliqueit and msieve both whose authors have figured out the values experimentally because the theoretical paper is too difficult to access.
To be fair, my own failure to read it was primarily due to ignorance. But yes, having it freely available online would certainly make it a lot easier to read now that I know about it.
mklasson is offline   Reply With Quote
Old 2009-07-29, 21:30   #281
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

7×503 Posts
Default

FWIW, his paper is available through JSTOR.

I haven't grokked it all fully, but I believe optimality still requires some experimentation in order to determine some implementation dependant and/or system dependant constants.
bsquared is offline   Reply With Quote
Old 2009-07-29, 21:36   #282
axn
 
axn's Avatar
 
Jun 2003

117378 Posts
Default

Quote:
Originally Posted by bsquared View Post
I haven't grokked it all fully, but I believe optimality still requires some experimentation in order to determine some implementation dependant and/or system dependant constants.
The question is, is it relevant in this context? AFAIK, that paper doesn't compare ECM vs QS -- just ECM.

Anyway, there are more questions - first off, is the assumption of a fixed fraction of QS time optimal? Or fixed fraction of the input size? Or whatever?

I personally believe that if you're somewhere near "optimal", further gains will be minimal if you try to optimize it -- probably not worth it. Either msieve's scheme or aliqueit should be good enough. Maybe somebody can do some controlled testing and see if there is noticeable difference in the net thruput.

EDIT:- A test scenario would be, time aliqueit on a aliquot sequence till 90-digits. Then collect all the LHS from that and feed it to msieve as a batch. IF the time difference is +/- 2%, call it a tie.

Last fiddled with by axn on 2009-07-29 at 21:38
axn is online now   Reply With Quote
Old 2009-07-29, 22:10   #283
mklasson
 
Feb 2004

25810 Posts
Default

Quote:
Originally Posted by axn View Post
The question is, is it relevant in this context? AFAIK, that paper doesn't compare ECM vs QS -- just ECM.
I can see it mentions QS on the first page at least.

Quote:
Originally Posted by axn View Post
EDIT:- A test scenario would be, time aliqueit on a aliquot sequence till 90-digits. Then collect all the LHS from that and feed it to msieve as a batch. IF the time difference is +/- 2%, call it a tie.
And repeat a couple of times to get a grip on the variance. I imagine the runs might swing a good bit more than 2%, but maybe I'm wrong.
mklasson is offline   Reply With Quote
Old 2009-07-29, 22:16   #284
axn
 
axn's Avatar
 
Jun 2003

13DF16 Posts
Default

Quote:
Originally Posted by mklasson View Post
I can see it mentions QS on the first page at least.
Oh well. I assume that is just a intro to factoring in general, before segueing into ECM. If it does address the optimality of fully factoring a number, I'd be suprised (pleasantly, I might add). Needless to say, I haven't actually read the paper.

Quote:
Originally Posted by mklasson View Post
And repeat a couple of times to get a grip on the variance. I imagine the runs might swing a good bit more than 2%, but maybe I'm wrong.
Well, a decently long aliquot sequence _should_ be representative in and of itself. So, I expect the variance to be less. IOW, the difference between aliqueit and msieve (if any) should be consistent. Anyrate, that's my story, and I'm stickin' to it

Last fiddled with by axn on 2009-07-29 at 22:18 Reason: grammer
axn is online now   Reply With Quote
Old 2009-07-29, 22:26   #285
Mini-Geek
Account Deleted
 
Mini-Geek's Avatar
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

426710 Posts
Default

It does discuss QS, gives some relative times for various sizes of N, and when to cross over. It does not mention NFS, which I believe was in its infancy at the time of this paper.

Last fiddled with by Mini-Geek on 2009-07-29 at 22:27
Mini-Geek is offline   Reply With Quote
Old 2009-07-30, 00:17   #286
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

354310 Posts
Default

Quote:
Originally Posted by Greebley View Post
I mean here we have two programs, aliqueit and msieve both whose authors have figured out the values experimentally because the theoretical paper is too difficult to access. The values that are universally used are those that have been made available on the web - the ones that say the best option given the number of the digits of the factor.
Actually, I have JSTOR access now, and did when I wrote the ECM code. Since Bob's ECM paper was written, the ECM algorithm has gotten a much more powerful stage 2, QS speed now depends completely on memory access time and computers have gotten ~100x faster. I couldn't be sure that the assumptions underlying the paper results were still valid, so to achieve a constant-factor speedup of the combined process there's really no alternative to doing one's own experiments.

It almost doesn't matter right now with msieve, because the only QS timings I have are for a small set of test numbers, and I suspect almost nobody uses the -e flag instead of calling ecm.exe

PS: More detail about the ECM breakover points here. When factors are found, ECM continues at a given digit level until the remaining cofactor drops below the minimum size for that level of ECM. Then QS starts.

PPS: Like most prestigious math journals, Math. Comp. requires that you release copyright on your paper to them before publishing. This means whatever happens to the published version depends on them, not on you. Occaisionally you will find authors who publish a draft of their Math. Comp. papers online, but that may not be the same as the published version. The situation is very different in the biological sciences now.

Last fiddled with by jasonp on 2009-07-30 at 00:44
jasonp is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Resuming aliqueit johnadam74 Aliquot Sequences 4 2016-03-28 12:32
Apparent aliqueit issue with specifying factors pakaran Aliquot Sequences 2 2015-09-12 23:10
Using Several Instances of Aliqueit for a large gnfs job EdH Aliquot Sequences 6 2011-12-13 18:58
Setting up aliqueit science_man_88 Aliquot Sequences 185 2011-11-08 12:18
Tried out aliqueit.exe: ggnfs failing Greebley Aliquot Sequences 35 2010-02-13 15:23

All times are UTC. The time now is 14:36.


Fri Aug 6 14:36:59 UTC 2021 up 14 days, 9:05, 1 user, load averages: 2.88, 2.96, 2.79

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.