![]() |
|
|
#12 | |
|
Sep 2003
5·11·47 Posts |
Quote:
|
|
|
|
|
|
|
#13 |
|
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
224058 Posts |
Suppose you are in a completely dark room and you are catching the black cat that is supposedly there.
You randomly grab space in front of you; you make random moves around you. Now, if the cat is there, you may find it after X attempts. Furthermore, if you have established you grab (that should exceed your reach!) and have done this excercise enough times, you can build a model and you can even make it a sophisticated model: it depends on the size of the cat! You can make quite a nice formula for the X, given enough experimentation. That's 'normal' statistics. But all of the above is correct if the cat is actually there. Bayesian statistics deals with the possibility that the cat is not there. As you grab the air for the third hour, your knowledge that the cat is actually in the room continues to change. Bayes would say that your initial belief (that the cat is there and that you can catch it), or in short, a 'prior', changes into your new belief called the 'posterior'.
|
|
|
|
|
|
#14 | ||
|
Feb 2017
Nowhere
4,643 Posts |
Quote:
Quote:
Still -- taking n = 1, the chances that a number has a factor of 1 digit is approximately... 1! Well, 162/210 is pretty close to 1 ![]() However, the sum of the harmonic series between k and round(e*k ) is also about 1 for any positive integer k, so n should have a factor with between k and round(e*k) digits, if round(k) is no larger than the number of digits of n
Last fiddled with by Dr Sardonicus on 2017-09-29 at 02:09 |
||
|
|
|
|
|
#15 | |
|
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts |
Quote:
The expected number of factors up to sqrt(n) is greater than 1, since every composite number has a factor and some have more than one below sqrt(n). |
|
|
|
|
|
|
#16 |
|
Romulan Interpreter
Jun 2011
Thailand
7·1,373 Posts |
One should not forget that all this calculus is just a coarse approximation. Mostly to avoid logarithms and Euler's constant, and complicate formulas which, in fact, are also mostly intuitive and heuristic, and still unproven. Somehow, the number of "bits" in the representation is here (in this discussion) transformed into number of "digits", but well... this is still an approximation. In fact, George does a nice dissection of it, in the math page, where he explains about chances to find a factor. If you clicked the link, then look there for the "Euler's constant" contraption, and click the links to the UTM page around that contraption...
The (n-1)/n stuff comes from the fact that the harmonic series sums to log(n), and from the fact that there are ~n/log(n) primes up to n. Now, of course, if you look for factors of some N, you only have to look below n=sqrt(N), and also, in case of mersenne numbers, ou look for special ("rarer" than normal) primes q=2kp+1, where p is your exponent (of m=2^p-1), and they are which are 1 or 7 (mod 8). [we try hard to stick to this notation along our posts here, hehe]. But this only complicates the calculus, at not much benefit for the accuracy. You can stick to 1/n and so. Last fiddled with by LaurV on 2017-09-29 at 05:14 |
|
|
|
|
|
#17 |
|
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
22×7×167 Posts |
Yeah Yeah I know its just a fluke.
But after 2 exponents in my range of interest and a total of 110 curves between them I have my first factor! |
|
|
|
|
|
#18 |
|
Dec 2011
11·13 Posts |
Going back to petrw1's original question. Assume the people who wrote your favorite ECM factoring program have done all the nasty math involving the Pi and Dickman's function and have come up with the optimal number of curves to run at each B1 level. For example, a certain version of gmp-ecm recommends running 206 curves at the B1=50000 level.
You can think of this very much like a spinner with 206 spaces. 205 spaces say "loser". The final space is secretly marked by a number god to say "winner", but only if there is a 25-digit factor. (To be more precise, I mean if it has a factor in the range of your B1, or about 25-digits.) But, if there is no 25-digit factor, the number god marks that 1 space "loser". Then the number god hides all of the markings. After you expend the work to run an ECM curve, the contents of one random spot on the spinner is revealed to you, and then hidden before the next contestant runs. If the number has no 25-digit factor, it doesn't matter how many curves you run or how many times you spin the spinner, it will never show "winner". Now, let's consider if the number actually has a 25-digit factor The first time you take a crack at the spinner, you really do have a 1/206 chance of finding a 25-digit factor. The other 205 out of 206 times, the spinner shows "loser". Do you want to spin again, or not? Still assuming the number actually has a 25-digit factor, if you spin a second time, your chance of finding a 25-digit factor is 1/206*205/206. (You only got to ask this question if the first spin was a loser.) So, your chance of finding a factor on the 2nd spin is a little lower than your chances on the first spin. Do you spin again? Same assumptions. If you spin a third time, your chance of finding a 25-digit factor is 1/206*(205/206)^2. (You only get to ask this question if the first two spins were losers.) Do you spin again? On a spinner with a large number, n, of positions, the probability that you will have no winner after n spins approaches 1/e. Code:
1 - Sum[1/206*(205/206)^(i - 1), {i, 1, 206}] // N
0.366985
1/Exp[1] // N
0.367879
The common wisdom is that you should generally give up on one B1 level when the probability of missing a factor declines to 1/e. But, it is also well proven that the exact place where you change from one B1 level to the next doesn't really matter very much. And this isn't very relevant to your question, except if you want to do a zillion curves at the B1=50000 level. Now, we can consider the OP's original question. Assuming the OP's goal is to maximize the number of factors he finds per unit time, running only B1=50000 curves, and if all candidates to be tested took the same amount of time to run a single ECM curve, then the OP should run just 1 curve on each number. However, the OP's choice of candidates have varying digit-lengths. And the cost of running an ECM curve is higher as the candidate's length increases. And this is where the practical answer to his question gets messy. Ideally, the OP ("you") should measure the cost of running a single curve at various digit-lengths in your range of interest. Then for each potential candidate, you should divide the time required to run the kth curve for that candidate by the probability (as shown by the spinner formulas above) that the candidate's kth curve will produce a factor. This represents the instantaneous expected value of the "time to find a factor, if it exists" for each candidate. Always pick the candidate whose expected time to find a factor is smallest, among your choice of candidates. And especially realize that this value increases slightly with each curve you run. Finally, to provide an accurate expected "time to find the next factor", you would also need to account for the probability that a candidate actually has a factor in the 25-digit range. (I.e., divide the "time to find a factor, if it exists" by 0.19, if your sources tell you these candidates have a 19% chance of having a factor in the B1=50000 (25-digit) range.) Last fiddled with by rcv on 2017-09-29 at 11:42 |
|
|
|
|
|
#19 | |
|
Aug 2002
Buenos Aires, Argentina
101010101102 Posts |
Quote:
So you can easily select 280 curves, and none of them coincide with the lucky ones. Or you can run 280 curves and hit 4 lucky curves, so you would have found 4 times that number p. So the number of "winner slots" and "loser slots" is not even known. |
|
|
|
|
|
|
#20 |
|
Dec 2011
8F16 Posts |
Of course, you are right. As long as an *average* of 1 in n ECM curves is one of the "lucky curves", and as long as the designer of the ECM program has applied the "common wisdom" in selecting the number of curves for each B1 level (which decision depends on several factors beyond the scope of the OP's question), then I think my spinner analogy holds, and a regular mortal (non PhD mathematician) can do the arithmetic.
|
|
|
|
|
|
#21 |
|
Aug 2002
Buenos Aires, Argentina
25268 Posts |
This is also not correct. Some group orders are smoother than others, so by running 280 curves with B1 = 50000, B2 = 5 million, there is a probability of about 10% to find a 30-digit factor. In this case the largest prime factor of the group order would be less than B2 and the second largest prime factor of the group order would be less than B1.
|
|
|
|
|
|
#22 | |
|
Dec 2011
11·13 Posts |
Quote:
Unless you believe my analogy is practically flawed, further discussion of the details probably belongs in another thread or in an offline discussion. I think we all know you can find larger factors than the primary target of a B1 level, and you can find smaller factors that were missed by the previous B1 level. And that your chance of success at a given B1 level also depends on your choice of B2. The following sample table from an older version of gmp-ecm, and which was produced (if I recall correctly) through gmp-ecm's sophisticated implementation and use of Dickman's functions, provided the user with an quantifiable glimpse of the final effects of all of this voodoo. Would you be happier if I suggested an ECM curve at the B1=50000 level is analagous to simultaneously spinning 5 spinners? Code:
// Data from gmp-ecm 6.2 // Expected number of curves to find a factor of n digits: (digits >= 1301) // B1 20 25 30 35 40 45 50 55 60 65 // 2K 817 56083 6456542 // 11K 76 1589 51736 2379385 // 50K 20 204 3115 65277 1739727 //250K 8 47 401 4553 64814 1115706 // 1M 5 21 123 957 9094 102769 1362692 // 3M 3 12 56 336 2462 21115 210246 2369351 // 11M 3 7 26 124 702 4628 34585 292164 2735028 // 43M 2 5 15 56 253 1323 7837 51385 373591 2960912 //110M 2 4 10 34 132 600 3062 17350 108466 732708 //260M 2 3 8 23 83 336 1525 7670 42167 251141 //850M 1 3 6 15 47 168 661 2868 13625 69478 //2.9G 1 2 4 11 29 91 316 1202 4960 22000 //4.3G 1 2 4 9 25 75 252 925 3672 15664 |
|
|
|
|
![]() |
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Odds of Finding a Factor | Gordon | Factoring | 18 | 2015-09-20 20:33 |
| how much ECM without finding an existing factor | dbaugh | PrimeNet | 4 | 2013-01-11 16:31 |
| p-1 testing ram vs finding a factor | crash893 | Software | 11 | 2006-07-03 21:48 |
| Probability of finding a factor | JuanTutors | Software | 20 | 2004-09-26 09:47 |
| possibility of finding a factor | there_is_no_spoon | Math | 10 | 2004-03-11 20:05 |