mersenneforum.org > Math > Elliptic Curve Method - the theory

2002-11-02, 00:00 #2
ewmayer

Thanks, Phil, for that nicely readable nontechnical introduction to ECM. I'd like to make a couple of additional small points by way of clarification, mainly for folks who are new to the theory of finite groups:

- When Phil discusses the smoothness property of the p-1 method, the group order he refers to is the order of the multiplicative group in question, whereas in the case of ECM it is the order of the additive group defined by arithmetic on the elliptic curve.

- If the p-1 stage 1 bound is small enough that the product of primes <= B1 fits in one's computer memory, one can premultiply all these primes together to get their product m (which needs only O(m*log(m)) work, if one uses a fast multiply algorithm once the partial products get much larger than a computer word) and use a left-to-right binary exponentiation (LRBE) in stage 1. This needs just 2/3 the work of the one-prime-at-a-time approach. The one disadvantage is that if we run a stage 2, fail to find a factor, and then wish to go back and run to a larger stage 1 bound B1', we can't use LRBE for the extension unless we're willing to start from scratch; if not, we've got to do the continuation via the normal, more expensive powering method. (But we can apply the slower continuation to a stage 1 residue calculated using either method.)

- Stage 2 of p-1 (and of ECM as well, if I recall correctly) lends itself to a distributed approach. One uses a single machine (perhaps a fast multiprocessor) to generate a stage 1 residue, then farms that out to any number of independent machines, each of which processes its own distinct stage 2 interval(s).

- For really large numbers (i.e. those whose computer representation takes up some significant fraction of the machine memory), memory constraints may rule out ECM, because it needs a lot more memory than p-1.
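The premultiply-then-exponentiate stage 1 described above can be sketched in a few lines of Python. This is my own toy illustration, not code from the thread; the base, bounds, and test modulus are arbitrary choices, and Python's built-in three-argument pow supplies the binary exponentiation step:

```python
from math import gcd, isqrt

def pminus1_stage1(N, B1, base=3):
    """Toy p-1 stage 1: premultiply all prime powers <= B1 into a single
    exponent E, do one big modular exponentiation, then take a gcd to
    extract any prime factor p of N for which p-1 is B1-smooth."""
    # simple sieve of Eratosthenes up to B1
    sieve = bytearray([1]) * (B1 + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(B1) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    E = 1
    for p in range(2, B1 + 1):
        if sieve[p]:
            pe = p
            while pe * p <= B1:  # include the largest power of p <= B1
                pe *= p
            E *= pe
    x = pow(base, E, N)  # the single left-to-right binary exponentiation
    return gcd(x - 1, N)

# 9973 - 1 = 2^2 * 3^2 * 277 is 300-smooth, so B1 = 300 finds it:
# pminus1_stage1(9973 * 10007, 300) -> 9973
```

A real implementation would not build E as one giant integer for large B1; it would feed the prime powers to the exponentiation in batches, which is exactly the memory trade-off discussed above.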
Let's call M the amount of computer memory needed to store N, the number to be factored. This will typically be ~4X the number of bits in N itself, since to do a fast multiply modulo N (the operation that dominates the computer work for both p-1 and ECM), we need to break N into smaller chunks that each take up only a fraction of a computer word - a typical implementation puts each ~16-bit chunk of N into a 64-bit word (floating-point or integer). Then, doing stage 1 of p-1 via LRBE needs memory of only M (in the case of numbers like Fermats or Mersennes, where we know how to do the modular multiply sans zero-padding of the input vectors) or 2M in the general case. Stage 2 of p-1 can use anywhere from (roughly) 4 times this much to hundreds of times as much - the latter is more efficient if there is enough RAM to prevent swapping to disk, but the efficiency gain is not overwhelming (in other words, we can still do a decently fast stage 2 even using the low-memory version). By way of contrast, even the lowest-memory version of ECM stage 1 needs about 15X the memory of p-1 stage 1, and stage 2 needs yet more.

- Phil mentions that the largest factors found by p-1 are only ~40 digits, compared to over 50 digits for ECM. But in fact one of the big reasons ECM has found larger factors (and these large factors are all of numbers which are utterly minuscule compared to the ones GIMPSers routinely find 30-40-digit factors of using p-1) is simple: a heck of a lot more machine time has been spent running ECM. This is because the way most people use ECM is to find small (say < 60 digits) factors of just one or a few numbers they're particularly interested in. Since the numbers in question (except for the larger Fermats, starting with F16 or so) are generally quite small, running p-1 first doesn't usually save much machine time - if the factor is small enough that it has a smooth p-1, it will likely also be found quite quickly by running a few ECM curves.
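The memory bookkeeping above is easy to make concrete. The snippet below is a back-of-envelope sketch of the packing Ernst describes (the multipliers for p-1 and ECM are the ones quoted in the post, not measurements):

```python
def transform_memory_bits(n_bits, chunk_bits=16, word_bits=64):
    """Bits of memory for one transform-ready residue vector: each
    ~16-bit chunk of N occupies a full 64-bit word, so M is roughly
    4x the bit length of N itself."""
    n_words = -(-n_bits // chunk_bits)  # ceiling division
    return n_words * word_bits

M = transform_memory_bits(2 ** 20)  # a ~1-Mbit number N
assert M == 4 * 2 ** 20             # M ~ 4x the bits of N

# rough working-set multiples quoted in the post:
pm1_stage1_special = M       # p-1 stage 1 via LRBE, Fermat/Mersenne modulus
pm1_stage1_general = 2 * M   # general modulus (zero-padded transform)
ecm_stage1_minimum = 15 * M  # lowest-memory ECM stage 1
```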
On the other hand, if the factor is decently large (again, this is for a single given number N), the odds of it being very p-1 smooth are quite small, and drop exponentially fast as the factor gets bigger. So if shallow p-1 or ECM has failed to turn up a factor and I want (for a given amount of machine time) to maximize my chances of finding a factor of that number, ECM is by far the better way to go, since as p gets bigger the odds of one of many ECM curves yielding an A-smooth group order are much better than those of p-1 being B-smooth, even though B is quite a bit larger than A.

However, there is a different way to compare the two methods, which is more relevant to the way GIMPS does factoring. If I have a large collection of numbers of a type known to frequently have small factors (i.e. we're not talking about the RSA challenge numbers here - those are designed to have exactly two factors, each about as large as it can be), then, for a given amount of machine time, I will generally find significantly more factors using p-1 than ECM. That is because we can think of a p-1 run as being like a single very fast ECM curve, i.e. for the same CPU time we can run it to significantly deeper bounds than we can an ECM curve. Since p-1 is even more effective if the numbers in question are known to have factors of the form C + 1, where C is a composite with better-than-average chances of being smooth (and by Legendre's theorem, all numbers of the form a^n +- b^n with GCD(a,b) = 1 do in fact have factors of this very form), for numbers like Mersennes or Fermats it is far more effective at finding large numbers of small factors than ECM is. The main reason GIMPS hasn't yet found a 50-digit-plus factor using p-1 is that nearly all the p-1 work has been done on the big monster numbers, where it simply doesn't pay to set the stage bounds large enough to have a good chance of finding such a big factor.
I hope I don't sound like I'm dissing ECM - it is a wonderfully elegant method, and still the only known one which affords us a good chance of finding large factors of any given number N which is too big for the Number Field Sieve to handle - but the idea that ECM is somehow inherently better at finding large factors than p-1 is to some degree a misconception, based on the way the method is normally applied and the relatively larger amount of machine time spent using it to try to find factors of smallish numbers N.

As a concrete example, consider the Fermat numbers. In the paper "Three new factors of Fermat numbers" by Crandall et al. (available at http://www.perfsci.com/free/techpapers/freepapers.html ), they mention a new 27-digit factor of F13, p27 = 319546020820551643220672513. This factor has p-1 = 2^19 * 51309697 * 11878566851267, i.e. p-1 with a stage 1 bound of 10^8 and a stage 2 bound slightly larger than 10^13 (neither unreasonable, especially if Silverman et al.'s FFT-based stage 2 method is used) would have found the factor. Sure, ~10^13 is a deep stage 2, but in one sentence they mention having run ECM for 47 days on dedicated hardware (a Dubner Cruncher) to find the factor, yet in the next breath say "The factorizations of p27+-1 explain why Pollard's p+-1 methods could not find the factor p27 in a reasonable amount of time." (Note that the p+1 method is actually due to Williams.) They of course don't mention whether a similar p+-1 effort was expended, but I guarantee you that with a similarly optimized program running on similarly fast hardware, one could find the same factor using p-1 in roughly the same amount of time. (I've run similarly large Mersenne numbers to B1 = 10^9, B2 = 10^12 in far less machine time, and that program didn't use the FFT stage 2 optimization - the latter would've allowed a stage 2 ~100 times deeper in the same machine time, comfortably past the needed ~1.2x10^13.)
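The arithmetic behind this example can be checked directly. The snippet below (my own check, using only the numbers quoted in the thread) verifies that p27 really divides F13 = 2^8192 + 1 and that p27 - 1 has exactly the stated smooth-plus-one-large-prime shape:

```python
p27 = 319546020820551643220672513

# p27 divides F13 = 2^(2^13) + 1, i.e. 2^8192 == -1 (mod p27)
assert pow(2, 2 ** 13, p27) == p27 - 1

# p27 - 1 factors as quoted: smooth apart from one ~1.2e13 prime
assert p27 - 1 == 2 ** 19 * 51309697 * 11878566851267

# so p-1 needs stage 1 to reach 51309697 (< 10^8) and stage 2 to
# reach 11878566851267 (~1.19e13) to find this factor
```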
It's interesting that a month and a half of dedicated crunching is perfectly acceptable for ECM, but somehow seems "unreasonably large" for the p-1 method. In the same paper, the authors describe a new 27-digit factor of F16, p27 = 188981757975021318420037633, with p27 - 1 = 2^20 * 3^2 * 31 * 37 * 13669 * 1277254085461, found by running ECM on 6 SPARCstations for a month. Again, the same factor could have been found in a comparable amount of machine time via p-1. Lastly, for F15 they also found (in about 120 CPU-days on a P200) a p33 = 168768817029516972383024127016961, with p33 - 1 = 2^17 * 5 * 7 * 53 * 97 * 181 * 199 * 1331471 * 149211834097. Again, a comparable or lesser amount of p-1 work would have found that factor.

Now, of course, for factors > 25 digits or so the odds of a smooth p-1 decomposition drop rapidly, and when the factors in question have no special form, p-1 loses one of its key advantages. But for the aforementioned classes of numbers, if we are trying to maximize our odds of finding 30-40-digit factors, we should definitely give both methods at least a comparable amount of machine time.
2002-11-12, 01:11 #3
philmoore

ECM and P-1 factoring

Thanks, Ernst, for your comments and analysis. Your observation explains very nicely why we do not do ECM on GIMPS candidates. Such Mersenne numbers have already been trial factored to 19 or 20 digits, so we are looking for factors most likely to be in the range of 20 to 25 digits, although sometimes we get lucky and find factors up to or beyond 30 digits. The constraint for GIMPS is that we do not want to spend more time factoring than we will end up saving by eliminating candidates from Lucas-Lehmer testing, so naturally we want to find as many factors as possible in a given amount of time. P-1 is a superior factoring method for this group of numbers for three reasons:

1) P-1 is a simpler algorithm and therefore runs to the same stage 1 and stage 2 bounds faster than ECM.

2) We hope to be lucky enough to find a smooth group order. We know "a priori" that the group order for P-1 for a Mersenne number 2^k-1 (k prime) is divisible by 2k. We only know that the group order for ECM is divisible by 12 (or 8 or 16, for other implementations).

3) We are looking for factors most likely to be < 30 digits, across a large class of numbers.

If we are looking for larger factors of one particular number, or of a small class of numbers, such as the Fermat numbers small enough for ECM/P-1, the advantages of P-1 soon diminish. One always wants to do <some> P-1, and of course I'm not saying you might not get lucky and find a large factor that way, but you certainly would not want to spend a comparable amount of time with P-1 looking for large factors of F12 through F20 (and probably even through F22) - for those numbers, ECM now has a definite edge. Asymptotically, ECM has a complexity estimate of exp(sqrt((2+o(1))(ln f)(ln ln f))) operations to find the factor f. It is the fact that ln f is inside the square root that gives it a big advantage over P-1.
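Reason 2 above - that the p-1 group order for any factor of 2^k - 1 (k prime) is automatically divisible by 2k - is easy to confirm numerically, since it is equivalent to every prime factor q of 2^k - 1 satisfying q == 1 (mod 2k). A quick check on a few small Mersenne numbers (my own illustration; the exponents are arbitrary small choices):

```python
def prime_factors(n):
    """Trial-division factoring - fine for the small Mersennes below."""
    fs, d = [], 2
    while d * d <= n:
        while n % d == 0:
            fs.append(d)
            n //= d
        d += 1
    if n > 1:
        fs.append(n)
    return fs

# every prime factor q of 2^k - 1 (k prime) satisfies q == 1 (mod 2k),
# so q - 1, the p-1 group order, is divisible by 2k "a priori"
for k in (11, 23, 29, 37):
    for q in prime_factors(2 ** k - 1):
        assert q % (2 * k) == 1
```

For example, 2^11 - 1 = 2047 = 23 * 89, and both 23 and 89 are 1 mod 22. This is the head start p-1 gets on Mersenne numbers relative to the generic divisible-by-12 guarantee of an ECM curve.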
In P-1, the size of the largest factor of f-1 determines how much work it is to find f. When f-1 has no special form, we expect the natural log of this largest factor to be, on average, about (1-1/e) ln f; i.e., this factor has, on average, about 63% of the digits of f. We usually must run stage 2 of P-1 up to this factor to find f. For Mersenne numbers, we get to reduce this to (1-1/e) ln(f/2k), so the amount of work is approximately proportional to (f/2k)^0.63.

Look at the 9 factors which have been found by ECM since 1988:

p40 of F10: 2^12*3*5639*8231*433639*18840862799165386003967 + 1
p21 of F11: 2^14*3*373*67003*136752547 + 1
p22 of F11: 2^13*7*677*91722533083549 + 1
p19 of F13: 2^17*19*1069660739759 + 1
p19 of F13: 2^19*3*4363*525050549 + 1
p27 of F13: 2^19*51309697*11878566851267 + 1
p33 of F15: 2^17*5*7*53*97*181*199*1331471*149211834097 + 1
p27 of F16: 2^20*3^2*31*37*13669*1277254085461 + 1
p23 of F18: 2^23*29*293*1259*905678539 + 1

Of these nine factors, three (the p21, the larger p19, and the p23) could have been found fairly easily by P-1, requiring a stage 2 bound of less than 10^9. The p40 is clearly out of reach, and the other five would require a very deep stage 2 factoring effort.

I did an experiment on the 27-digit factor of F13: I set 5 866MHz Pentium-3s to work doing ECM on it, running curves with Prime95 with the stage 1 bound (B1) set to 250,000 and the stage 2 bound (B2) equal to 100 times the stage 1 bound. The computers stopped when they found the factor: in 83 curves, 267 curves, 112 curves, 70 curves, and 205 curves, for an average of 147 curves and an average time of 9 hours, 28 minutes per machine. Then I ran P-1 on this same number to B1=40,000,000 and B2=4,000,000,000. Stage 1 took 1 hour, 5 minutes, while stage 2 took 4 hours, 29 minutes. (Okay, I know Prime95 isn't SSE2-optimized for these numbers, but I thought it was still a good order-of-magnitude estimate.)
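The table of nine factors can be checked mechanically: rebuild each factor from its listed p-1 decomposition, test that it divides the corresponding Fermat number, and read off the largest prime, which is the stage 2 depth p-1 would need. A sketch of that check (my own, using only the decompositions listed above):

```python
from math import prod

# (Fermat index m, prime-power factorization of p-1), from the table above
table = [
    (10, [2**12, 3, 5639, 8231, 433639, 18840862799165386003967]),
    (11, [2**14, 3, 373, 67003, 136752547]),
    (11, [2**13, 7, 677, 91722533083549]),
    (13, [2**17, 19, 1069660739759]),
    (13, [2**19, 3, 4363, 525050549]),
    (13, [2**19, 51309697, 11878566851267]),
    (15, [2**17, 5, 7, 53, 97, 181, 199, 1331471, 149211834097]),
    (16, [2**20, 3**2, 31, 37, 13669, 1277254085461]),
    (18, [2**23, 29, 293, 1259, 905678539]),
]
for m, fac in table:
    p = prod(fac) + 1
    # p divides F_m = 2^(2^m) + 1  <=>  2^(2^m) == -1 (mod p)
    assert pow(2, 2 ** m, p) == p - 1
    # the largest prime in p-1 is the stage 2 bound p-1 would need
    print(f"F{m}: {len(str(p))}-digit factor, needs B2 >= {max(fac)}")
```

Running this confirms the split quoted above: the p21 of F11, the larger p19 of F13, and the p23 of F18 need B2 below 10^9, while the remaining entries need stage 2 bounds from ~10^13 up to ~10^22 for the p40 of F10.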
In order to find the 27-digit factor, I would have had to run stage 2 to 1.18x10^13, which would take around 550 days at that rate. Even with a 100-times improvement in stage 2 speed from an FFT polynomial continuation, that would still be around 5 days. (By the way, I would like to know who has a good implementation of the FFT polynomial stage 2. Does it take a lot of memory?)

I did a little computation with Richard McIntosh's program for figuring the odds of a number being smooth. (By the way, Richard is currently doing distributed stage 2 P-1 work on larger Fermat numbers.) Suppose we have a six-digit Mersenne exponent k, and suppose 2^k - 1 has a 35-digit factor. We would expect to have to run about 1340 ECM curves to B1=1,000,000 and B2=100*B1 to find this factor. Since most of the work in P-1 tends to be in stage 2, we might increase this ratio for P-1 to B2=1000*B1. Then, if we run P-1 to B1=1,000,000 and B2=1000*B1, the odds of a 35-digit factor being smooth enough to turn up are about 1 in 106. Increase B1 to 10,000,000 and the odds improve to 1 in 30. At B1=100,000,000, the odds go to 1 in 13, and at B1=1,000,000,000, to about 1 in 7. This last bound represents about the same amount of work as the 1340 ECM curves, which should have an almost 2/3 chance of finding the factor, yet the P-1 run at B1=10^9, B2=10^12 has only a 1 in 7 chance of finding it. Go up to a 40-digit factor and the odds for P-1 get even worse!

If we are doing ECM for factors of a certain size, we certainly want to do some P-1 work, but the point at which it makes sense to stop is the point where further P-1 work has a smaller chance (per unit of time) of finding the factors than more ECM would. The advantage of trying different group orders in ECM is a powerful one, which is why it makes more sense to run 1800 curves at B1=1,000,000 in ECM than to run one curve at B1=1,800,000,000.
You aren't betting on any one curve being fairly smooth; you are betting on one in 1800 curves being very smooth. By the way, I'm guessing that a fast Athlon could run stage 2 P-1 on F25 up to a stage 2 bound of 500,000,000 within a year. You would probably want to allocate 150MB to 200MB for it. Is anyone game?

Phil
