mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > GMP-ECM

Old 2006-05-14, 08:56   #1
Jushi
Sep 2005, UGent

"Optimal" parameters for ECM

I am wondering how the optimal parameters for ECM were determined. Given the B1 and B2 bounds and the Brent-Suyama polynomial, I suppose the expected number of curves can be found using heuristics.

But this still leaves the question: Why should the given B1 be the optimal one?

I did a little experiment: I looked at how long it would take to find a factor of 40 digits of a C161 (according to ecm -v). It turns out that (for my machine) B1=5e6 was about 1% faster than the recommended 3e6. This difference is very small, but maybe it's larger for other numbers/other machines.
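This kind of comparison can be sketched in a few lines. The per-curve times and expected curve counts below are made-up stand-ins for what one would read off "ecm -v" and a stopwatch, not real measurements:

```python
# Sketch of comparing B1 choices by expected total time:
# (time per curve) * (expected number of curves to find a p40).
# All figures are hypothetical, illustrating the shape of the
# trade-off: larger B1 means slower curves but fewer of them.

def expected_total_time(secs_per_curve, expected_curves):
    """Expected wall-clock time to find a factor at a given B1."""
    return secs_per_curve * expected_curves

# Hypothetical figures for a C161, 40-digit factor:
candidates = {
    3_000_000: (60.0, 2000),   # B1=3e6: 60 s/curve, ~2000 expected curves
    5_000_000: (95.0, 1250),   # B1=5e6: slower curves, but fewer needed
}
best = min(candidates, key=lambda b1: expected_total_time(*candidates[b1]))
print(best)
```

With these made-up numbers the two totals differ by about 1%, which is the same kind of near-tie the experiment found.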
Old 2006-05-14, 11:53   #2
akruppa ("Nancy")
Aug 2002, Alexandria

The commonly used values for particular factor sizes (i.e. B1=1M, 3M, 11M, 44M, etc.) are the result of analysis of the expected time to find a factor of given size (i.e. 35, 40, 45, 50 digits), looking for the minimum expected time. Historically, this analysis has been carried out for an enhanced standard continuation. With some realistic assumptions about the relative speed of the two stages, the above B1 values were found to be approximately optimal. The most detailed analysis I know of is in Silverman and Wagstaff, "A Practical Analysis of the Elliptic Curve Factoring Algorithm".

Many assumptions made in that analysis don't really hold for an FFT stage 2 as GMP-ECM uses, so it is not very surprising that the optimal B1 and especially B2 values will be different. We are sticking with the traditional B1 values anyway, mainly for three reasons:

1. afaik no-one has done a better analysis so far, i.e. one that takes the particulars of the FFT stage 2 into account.

2. most of the book-keeping on how much ECM work has been done on particular numbers lists the number of curves done at the traditional B1 values. Using different B1 values now would cause a huge mess. This problem would vanish if everyone counted the expected number of times a factor of each size should have been found by now, as we are doing in the Cunningham subforum here. Converting the existing tables would be a lot of work, though.

3. it doesn't make a huge difference anyway. The graph of the expected time as a function of B1 is really flat around the minimum, so being somewhat off has only a vanishing effect on overall efficiency. As you observed, increasing the B1 value by 66% only reduced the expected time to find a factor by 1%.

Alex
Old 2006-05-15, 01:40   #3
R.D. Silverman
Nov 2003

Quote:
Originally Posted by akruppa
The commonly used values for particular factor sizes (i.e. B1=1M, 3M, 11M, 44M, etc) are the result of analysis of the expected time to find a factor of given size [...] afaik no-one has done a better analysis so far, i.e. one that takes the particulars of the FFT stage 2 into account.
We can say the following:

(1) Spending an equal amount of time in step 1 and step 2 maximizes the per-unit-time probability of success. This is independent of the step 2 METHOD. If a particular step 2 runs K times as fast as step 1, and we take step 1 to B1, then we should take step 2 to K*B1.

In theory, an ordinary FFT implementation [by, say, a perfect programmer] can take an ordinary FFT step 2 (i.e. without the Brent-Suyama or similar extensions) to B1^2 in the same time it took step 1 to run to B1.

The Brent-Suyama and other "polynomial-type" convolutions complicate matters. I have not analyzed them at all.
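The equal-time rule above can be sketched as follows. The speed ratio K is machine- and implementation-dependent, so the timings used here are purely illustrative:

```python
# Sketch of the equal-time rule: if stage 2 covers primes K times
# faster than stage 1, equal wall-clock time in both stages means
# taking stage 2 to roughly K * B1. All timings are made up.

def suggest_b2(b1, k):
    """B2 implied by the equal-time rule for a given speed ratio K."""
    return k * b1

def measure_k(stage1_secs, stage1_range, stage2_secs, stage2_range):
    """Estimate K from one timed run: primes-per-second in stage 2
    divided by primes-per-second in stage 1 (crude linear model)."""
    return (stage2_range / stage2_secs) / (stage1_range / stage1_secs)

# Example with made-up timings: stage 1 reaches 3e6 in 60 s, while
# stage 2 sweeps a range of 3e8 in the same 60 s  ->  K = 100.
k = measure_k(60.0, 3e6, 60.0, 3e8)
print(suggest_b2(3_000_000, k))
```

In practice K also varies with memory and operand size, so one would re-measure it per machine rather than assume a constant.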
Old 2006-05-16, 17:36   #4
VJS
Dec 2004

Recently I (or my computers) have been working on ECM factoring of numbers in the range of 150-180 digits prior to snfs.

Let me start by saying that ECM is not my field, nor do I have a lot of time to read papers. I'm just starting to understand nfs, although I'm pretty good at monkey-see-monkey-do. So I appreciate the input from experienced users and those with extensive knowledge in the field.

My observation on factoring numbers of 90-180 digits: when following the guidelines and using the 2/7 rule, letting ecm choose its own B2, I generally start snfs too early or miss a lot of the smaller factors with ecm. This somewhat discourages me from performing ECM on these smaller numbers.

Lately I've been using Bob's advice of time(B1) = time(B2) with a lot more success; on my CPUs this generally means that B2 should be a lot higher than recommended by the program.

Example for the levels I've been using:

40-digit: B1=3e6,  B2=1e10
45-digit: B1=11e6, B2=1e11
50-digit: B1=5e7,  B2=1e12

I guess it helps that my machines are not memory bound (2G).

What I have been finding is that if I had started at the 45- or 50-digit level, I probably would have factored the number in less time.

Just wondering about your opinions: if you plan to ecm a number to a particular level prior to NFS, are you better off starting at that level and using all the time you would have invested in the lesser levels at the maximum level, i.e. doing more than the required curves at 50 since you didn't complete 45, 40, 35, etc.?

Or perhaps a small percentage of each level, say 10%, followed by 20% more curves at the maximum level?

Thanks, just interested in some input and suggestions. I'm sorry if this has been discussed before; I didn't search.

Old 2006-05-17, 11:32   #5
R.D. Silverman
Nov 2003

Quote:
Originally Posted by VJS
Just wondering about your opinions, if you plan to ecm a number to a particular level prior to NFS. Are you better off starting at that level and using all the time you would have invested in the lesser levels at the maximum level [...]

My paper with Sam Wagstaff, "A Practical Analysis of ECM...", analyzes when to switch from ECM to QS (NFS was in its infancy when we wrote the paper).

Basically, if NFS/QS can do it faster than the *expected* time for ECM to succeed, then you should switch. How does one compute the *expected* time, when the size of the factor(s) is unknown? One uses Bayes' Theorem combined with the following prior:

The probability that a large integer N has a factor between y and y^(1+e) is e/(e+1).

Running ECM yields a statistical sample. You can compute the probability that a factor of given size exists based on ECM *failures*.

You combine these two with Bayes' Thm, compute the expected value of the resulting posterior, estimate the amount of time ECM will take to find a factor of that size, and compare it with the time NFS/QS will take.

Doing this is somewhat messy. I had code that used Macsyma, but I no longer have that code. It could be redone in Maple or (with more work) in any convenient HLL.
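The Bayesian recipe above can be sketched in a few lines of Python. Everything numeric here is an illustrative assumption (the per-curve success probabilities, and the discretization of the e/(e+1) prior over digit sizes), not values from the Silverman-Wagstaff paper:

```python
# Sketch of the Bayesian update: prior probability of a factor of a
# given size, times the likelihood of having seen only ECM failures,
# normalized. Per-curve success probabilities at some fixed (B1, B2),
# indexed by factor size in digits, are hypothetical.
p_curve = {35: 1 / 1000, 40: 1 / 5000, 45: 1 / 30000}

def prior(d_lo, d_hi, base=30):
    """Prior probability of a factor of d_lo..d_hi digits: with
    y = 10^base and y^(1+e) = 10^d, e = d/base - 1, so the cumulative
    probability up to d digits is e/(1+e); take the difference."""
    def cdf(d):
        e = d / base - 1
        return e / (1 + e)
    return cdf(d_hi) - cdf(d_lo)

def posterior(curves_done):
    """Posterior over factor sizes after `curves_done` failed curves."""
    # Likelihood of all-failures if a d-digit factor exists: (1-p)^t.
    weights = {d: prior(d, d + 5) * (1 - p_curve[d]) ** curves_done
               for d in sorted(p_curve)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

# After many failed curves, the mass shifts toward larger factor sizes,
# which is what drives the "switch to NFS/QS" decision.
print(posterior(0))
print(posterior(20000))
```

The real decision rule then takes the expected ECM time under this posterior and compares it with the NFS/QS time; that last step is the messy part alluded to above.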
