mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2017-09-29, 15:42   #23
Dr Sardonicus
 
Dr Sardonicus's Avatar
 
Feb 2017
Nowhere

7×23×29 Posts
Default

As I mentioned before in a recent thread, I don't know much about ECM. However, I did notice that the paper FACTORIZATION OF THE TENTH FERMAT NUMBER has a discussion of the performance of ECM. Perhaps that will be helpful to keener minds than my own...
Dr Sardonicus is offline   Reply With Quote
Old 2017-09-29, 16:15   #24
alpertron
 
alpertron's Avatar
 
Aug 2002
Buenos Aires, Argentina

101010101102 Posts
Default

Quote:
Originally Posted by rcv View Post
Code:
// Data from gmp-ecm 6.2
// Expected number of curves to find a factor of n digits: (digits >= 1301)
//  B1 20   25     30      35      40      45      50      55      60      65
//  2K 817  56083  6456542
// 11K 76   1589   51736   2379385
// 50K 20   204    3115    65277   1739727
//250K 8    47     401     4553    64814   1115706
//  1M 5    21     123     957     9094    102769  1362692
//  3M 3    12     56      336     2462    21115   210246  2369351
// 11M 3    7      26      124     702     4628    34585   292164  2735028
// 43M 2    5      15      56      253     1323    7837    51385   373591  2960912
//110M 2    4      10      34      132     600     3062    17350   108466  732708
//260M 2    3      8       23      83      336     1525    7670    42167   251141
//850M 1    3      6       15      47      168     661     2868    13625   69478
//2.9G 1    2      4       11      29      91      316     1202    4960    22000
//4.3G 1    2      4       9       25      75      252     925     3672    15664
As you can see from this table, you can get prime factors p with any value of B1 and B2 if you run enough number of curves (given that this number of curves is a lot less than sqrt(p)), but notice that there is a question of performance here.

The execution time for processing a curve is approximately proportional to B1. The value fo B2 depends on B1 and the specific implementation (GMP-ECM is completely different from Prime95, so B2 is a lot greater in the former, thus requiring less curves). Normally the value of B2 is selected so the second step (the one which uses B2) takes about the same time than the first step (the one which uses B1).

If you read the 25-digit column, you can use 56083 curves with B1 = 2000, 1589 curves with B1 = 11K and so on. In order to minimize the timing to find the factor, you have to minimize the product of the number of curves and B1. You can see that the minimum occurs for B1 = 50K. You will find 25-digit prime factors with other values of B1 (greater or smaller) but it will be slower.

Last fiddled with by alpertron on 2017-09-29 at 16:29
alpertron is offline   Reply With Quote
Old 2017-09-29, 17:24   #25
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

486810 Posts
Default

Quote:
Originally Posted by alpertron View Post
Normally the value of B2 is selected so the second step (the one which uses B2) takes about the same time than the first step (the one which uses B1).
I disagree- the default GMP-ECM settings choose B2 such that the second step takes about 40% of the time of the first step. For most B1 settings, this ratio is optimal (in that manually choosing larger B2 will be faster only for some B1s, specifically where k=5 or k=6 by default but increasing B2 to the next-larger size such that k=2). "Equal time for stage 2" is optimal for no implementation of ECM that I've come across (admittedly, that's only P95 and GMP).

Quote:
Originally Posted by alpertron View Post
If you read the 25-digit column, you can use 56083 curves with B1 = 2000, 1589 curves with B1 = 11K and so on. In order to minimize the timing to find the factor, you have to minimize the product of the number of curves and B1. You can see that the minimum occurs for B1 = 50K. You will find 25-digit prime factors with other values of B1 (greater or smaller) but it will be slower.
I disagree here, too; B1 choices are more for tradition than for optimal speed on GMP-ECM 7. Compare B1=20M to B1=11M for t45 time, or B1=60M to B1=43M for t50 time. Not only are those choices a bit faster than the smaller B1 values, they have increased chances to find larger factors "for free". Both 20M and 60M use default B2 values that use k=2 (two phases for stage 2), and are near the smallest B1s that use those B2 values.
A similar effect happens for B1=150M vs B1 = 110M, etc.
VBCurtis is offline   Reply With Quote
Old 2017-09-29, 17:29   #26
xilman
Bamboozled!
 
xilman's Avatar
 
"π’‰Ίπ’ŒŒπ’‡·π’†·π’€­"
May 2003
Down not across

22×5×72×11 Posts
Default

Quote:
Originally Posted by rcv View Post
II think we all know you can find larger factors than the primary target of a B1 level, and you can find smaller factors that were missed by the previous B1 level.
At the risk of being thought condescending, I believe those are consequences of the already acknowledged statement that a test to a Px level has a 1/e probability of missing a factor of x digits. The finding of a smaller factor is (or should be) obvious. It was missed with a probability of 1/e. The probability of finding a larger factor is less obvious. I'll leave finding a (relatively) simple argument as an exercise for the reader.
xilman is offline   Reply With Quote
Old 2017-09-29, 17:50   #27
alpertron
 
alpertron's Avatar
 
Aug 2002
Buenos Aires, Argentina

2·683 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
I disagree- the default GMP-ECM settings choose B2 such that the second step takes about 40% of the time of the first step. For most B1 settings, this ratio is optimal (in that manually choosing larger B2 will be faster only for some B1s, specifically where k=5 or k=6 by default but increasing B2 to the next-larger size such that k=2). "Equal time for stage 2" is optimal for no implementation of ECM that I've come across (admittedly, that's only P95 and GMP).
The exact percentage depends on the implementation. For example, in my old Java applet, setting B2 = 100 B1 gave me step 2 time equal to step 1 time. What is important in the previous discussion is that the time is approximately proportional to B1.

Quote:
Originally Posted by VBCurtis View Post
I disagree here, too; B1 choices are more for tradition than for optimal speed on GMP-ECM 7. Compare B1=20M to B1=11M for t45 time, or B1=60M to B1=43M for t50 time. Not only are those choices a bit faster than the smaller B1 values, they have increased chances to find larger factors "for free". Both 20M and 60M use default B2 values that use k=2 (two phases for stage 2), and are near the smallest B1s that use those B2 values.
A similar effect happens for B1=150M vs B1 = 110M, etc.
What I said is that the product number of curves times B1 has to be minimized. That will mean that the best value of B1 will be slightly different than the ones computed 25 years ago, because the current programs were not available then.
alpertron is offline   Reply With Quote
Old 2017-09-30, 00:27   #28
VictordeHolland
 
VictordeHolland's Avatar
 
"Victor de Hollander"
Aug 2011
the Netherlands

23×3×72 Posts
Default

I'm going to play devils advocate for a second as a layman non-math guy:

Would running P-1 (extenting current B1,B2 bounds) on (almost) 'virgin' ECM targets be a viable option for the purpose of the OP?

Target of the OP is to find many factors with minimal effort (in GHzdays) on <10M exponents right?

Let's take for example:
9105077
https://www.mersenne.org/report_expo...ll=1&ecmhist=1
TF 2^68
P1 (B1=110,000, B2=2,007,500)
3 ECM curves B1=50,000 (so almost no ECM done)

Probably incorrect/nonsence following:
{
I thought P-1 was much faster than ECM (10-100 times in the case of mersenne numbers because of the 'free P' in the calculus?).
So could that work out? Do P-1 with 100x the B1 bounds of ECM before starting ECM??

This of course doesn't take into account the already done 3 ECM curves and P-1 done:
M9105077, factored to 68 bits P-1 with B1=5,000,000 and B2=50,000,000
Probability = 7.755637%
Should take about 2.986382 GHz-days (using FFT size 480K)

This is much better in terms of probability than the P-1 that was already done:
M9105077, factored to 68 bits, with B1=110,000 and B2=2,007,500, using K*B^N+C = 1*2^9105077-1
Probability = 2.587171%
Should take about 0.085515 GHz-days (using FFT size 480K)

This compares to only about 7 ECM curves with B1=50,000 in terms of effort in GHzdays
ECM M9105077 (curves=7, B1=50000, B2=5000000) = 2.94200439 GHz-Days
}

But as I said, that is probably nonsense (to stay in the RDS theme) .
VictordeHolland is offline   Reply With Quote
Old 2017-09-30, 01:33   #29
petrw1
1976 Toyota Corona years forever!
 
petrw1's Avatar
 
"Wayne"
Nov 2006
Saskatchewan, Canada

469310 Posts
Default

Quote:
Originally Posted by VictordeHolland View Post
I'm going to play devils advocate for a second as a layman non-math guy:

Would running P-1 (extenting current B1,B2 bounds) on (almost) 'virgin' ECM targets be a viable option for the purpose of the OP?

Target of the OP is to find many factors with minimal effort (in GHzdays) on <10M exponents right?

. . . . .
Interesting...thx
petrw1 is online now   Reply With Quote
Old 2017-10-02, 11:09   #30
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

25B916 Posts
Default

Quote:
Originally Posted by VictordeHolland View Post
But as I said, that is probably nonsense (to stay in the RDS theme) .
[again, if my memory serves me right, actually, RDS advocated for P-1, as opposed to TF]
LaurV is offline   Reply With Quote
Old 2017-10-02, 14:27   #31
axn
 
axn's Avatar
 
Jun 2003

2×3×7×112 Posts
Default

Quote:
Originally Posted by VictordeHolland View Post
I thought P-1 was much faster than ECM (10-100 times in the case of mersenne numbers because of the 'free P' in the calculus?).
So could that work out? Do P-1 with 100x the B1 bounds of ECM before starting ECM??

This of course doesn't take into account the already done 3 ECM curves and P-1 done:
M9105077, factored to 68 bits P-1 with B1=5,000,000 and B2=50,000,000
Probability = 7.755637%
Should take about 2.986382 GHz-days (using FFT size 480K)

This is much better in terms of probability than the P-1 that was already done:
M9105077, factored to 68 bits, with B1=110,000 and B2=2,007,500, using K*B^N+C = 1*2^9105077-1
Probability = 2.587171%
Should take about 0.085515 GHz-days (using FFT size 480K)
So, doing P-1 to 5e6 over and above P-1 done to 1e5 gives you appr 5% (7.7-2.6) chance of success. Not bad. This is probably not valid for smaller expo (say < 3M) which already would have had large-ish P-1 and/or some/lot of ECM.
axn is offline   Reply With Quote
Old 2017-10-02, 17:06   #32
petrw1
1976 Toyota Corona years forever!
 
petrw1's Avatar
 
"Wayne"
Nov 2006
Saskatchewan, Canada

13×192 Posts
Default

Quote:
Originally Posted by LaurV View Post
[again, if my memory serves me right, actually, RDS advocated for P-1, as opposed to TF]
Did he have a preference for P-1 vs ECM for smaller exponents.

Victor makes a good case for it....unless a LOT of ECM has already been done.
petrw1 is online now   Reply With Quote
Old 2017-10-03, 03:15   #33
WraithX
 
WraithX's Avatar
 
Mar 2006

479 Posts
Default

Quote:
Originally Posted by petrw1 View Post
Did he have a preference for P-1 vs ECM for smaller exponents.

Victor makes a good case for it....unless a LOT of ECM has already been done.
I'm very interested in this topic and I've recently been reading over RDS's paper titled "A Practical Analysis of ECM". As to your P-1 question, at the end of page 7 he writes:
Quote:
Remark. We note, as a practical matter, that if one wants to perform ECM
with just one curve, then one should use the P - 1 algorithm instead, since it
is significantly faster.
I'm still trying to figure out, once we have the probability of success for each digit level for a given set of B1/B2 values, how do the odds change when we run another curve at the same B1/B2 value? Or how do the odds change when we run a curve with a different B1/B2 value?

From the table rcv posted, I understand that the chance of finding a 25-digit factor with B1=50k is 1/204. And the chance of finding a 25-digit factor with B1=250k is 1/47.

The following may be a bit of a stretch, but it is my current understanding of RDS's paper. I see the equation: \(1 - [1 - P(B1,B2)]^L\), where P(B1,B2) is the probability of success and L is the number of curves. If I'm reading this correctly, this gives us our chance of success after running L curves at a particular B1/B2 value.

So, if we run 1 curve at B1=50k, we get:
\(1 - [1 - (1/204)]^1 \approx 0.0049\), ie a 0.49% chance of success, or a 99.51% chance of failure.
If we run 1 curve at B1=250k, we get:
\(1 - [1 - (1/47)]^1 \approx 0.0213\), ie a 2.13% chance of success, or a 97.87% chance of failure.

How do we combine these numbers? Multiplying the successes together doesn't seem right, that seems to give us less chance of success. Multiplying the failures together seems to go in the right direction, since 99.51%*97.87% gives us 97.39%, which means our chance of success is 2.61%? Is that right?

Assuming the above is correct, then running the above like this:
20 curves at B1=50k:
\(1 - [1 - (1/204)]^{20} \approx 0.0936\), ie a 9.36% chance of success, or a 90.64% chance of failure.
20 curves at B1=250k:
\(1 - [1 - (1/47)]^{20} \approx 0.3496\), ie a 34.96% chance of success, or a 65.04% chance of failure.

And to combine those we would multiply 90.64%*65.04% to get 58.95% chance of failure, which means our chance of success is 41.05%. Does this sound right?
WraithX is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Odds of Finding a Factor Gordon Factoring 18 2015-09-20 20:33
how much ECM without finding an existing factor dbaugh PrimeNet 4 2013-01-11 16:31
p-1 testing ram vs finding a factor crash893 Software 11 2006-07-03 21:48
Probability of finding a factor JuanTutors Software 20 2004-09-26 09:47
possibility of finding a factor there_is_no_spoon Math 10 2004-03-11 20:05

All times are UTC. The time now is 16:10.


Mon Aug 2 16:10:52 UTC 2021 up 10 days, 10:39, 0 users, load averages: 3.07, 2.45, 2.27

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.