#1409

Jun 2003
5,087 Posts
You can say that ECM is 14.81% complete to the 65-digit level, or you could say that ECM is complete to 60.7 digits (60 + (65-60)*14.81%). I am not sure that the second one is the right way to calculate the precise digit level, but it has the advantage of using only one metric (completed digit level) instead of two (target digit level and % complete).
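The interpolation above is trivial to code. A minimal sketch (the helper name is made up for illustration; it is not part of any GIMPS tool):

```python
# Hypothetical helper: convert "x% complete toward the next digit
# level" into a single completed-digit-level metric by linear
# interpolation between the previous and target levels.
def completed_digit_level(prev_level, next_level, fraction_done):
    return prev_level + (next_level - prev_level) * fraction_done

# The example from the post: 14.81% of the way from t60 to t65.
print(round(completed_digit_level(60, 65, 0.1481), 4))  # 60.7405
```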
#1410
"Victor de Hollander"
Aug 2011
the Netherlands
2³·3·7² Posts
Quote:
I prefer the "xx curves tested B1=y" wording, just like the ECM progress report. Or "53,321 of 360,000 curves tested (B1=800,000)".
#1411
"GIMFS"
Sep 2002
Oeiras, Portugal
2·11·67 Posts
ECM is an asymptotic method that gives a certain probability of finding a factor of a given size after running the prescribed number of curves (i.e., even after running 360,000 curves with B1=800,000,000, we can't be sure that no factor smaller than 65 digits exists). So I think the quoted wording is misleading.
Last fiddled with by lycorn on 2014-11-05 at 16:37
#1412
Serpentine Vermin Jar
Jul 2014
3,313 Posts
Quote:
Given that ECM curves at some particular bound aren't really a guarantee of finding any factors in that range, if we stipulate that fact then we could still say "ECM work is xx% complete at B1=xxx (or @ xx digits)", because we're not saying anything about ECM itself, just that we've done x% of the stipulated maximum work we're going to bother with.

If I had the inclination, and perhaps incurable insomnia, I might actually look at why the # of curves for each bound is determined the way it is. I get the bit about the 1/e (~37%) chance of finding a factor; I just don't know how or why that's important. It sounds like you're saying it approaches some asymptotic value, so I'm assuming anything beyond the identified curves/bounds has a REALLY low chance of success.

Anyway, if we can find some way to represent that it's x% done in some way or another, while also conveying that, hey, ECM work isn't really *ever* complete... Maybe "ECM work is X% of the way done to a goal of Y", so we're merely stating that wherever we end up, it's a goal, not the end-all, be-all. :)
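The suggested wording is easy to mock up. A hypothetical sketch (the function name and exact message format are invented for illustration; this is not actual PrimeNet code):

```python
# Report ECM progress against a stipulated goal, rather than against
# "all possible ECM work" -- the framing proposed in the post above.
def ecm_progress_message(curves_done, goal_curves, b1):
    pct = 100.0 * min(curves_done / goal_curves, 1.0)
    return (f"ECM work is {pct:.1f}% of the way done to a goal of "
            f"{goal_curves:,} curves at B1={b1:,}")

print(ecm_progress_message(53321, 360000, 800000))
# ECM work is 14.8% of the way done to a goal of 360,000 curves at B1=800,000
```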
#1413
"Curtis"
Feb 2005
Riverside, CA
1001100001010₂ Posts
While you're correct that doing more curves after the prescribed number has a low chance of success, that alone isn't reason not to do them; it's that once that number of curves is complete, it is more efficient use of computrons to step up to the B1 that's optimal for finding a factor 5 digits larger.
If you want to play with the number of curves for various digit levels and B1 values, I suggest downloading GMP-ECM and using the -v flag; it will spit out the expected number of curves to find a factor of various sizes for any B1 you choose to start with. If you want curve counts as used by GIMPS, you'll have to also specify B2 as 100*B1 (since GMP-ECM uses a different stage-2 algorithm, its default B2 values are much higher than Prime95's). Also, use a small input number to play with so you don't get not-enough-memory faults.

Roughly, it takes 6 times more effort to complete a level 5 digits larger. There is a best B1 value for each digit level, but it is not necessary to use precisely that B1: using one that's too small will take more curves to complete the level, while using one that's too big will take more time per curve without a corresponding reduction in the number of curves needed. Time per curve in stage 1 scales linearly with B1; I'm not sure about stage 2 in Prime95.

For each candidate number, there is a limit to the B1/B2 bounds that fit in the memory allowed for ECM. So one might choose suboptimal B1/B2 combinations due to memory constraints but still do meaningful work toward a digit level. In the case of your B1=800M example, curves run at 400M would still help with the t65 level, but not as well as half an 800M curve would.

For simplicity, GIMPS has chosen to record ECM work as a sum of B1 bound * curves run, as a proxy for digit level; this is not quite accurate, but it isn't far enough off to really matter as long as people aren't doing silly things like reporting thousands of curves at B1=3M when a t60 has already been done.

The README file that comes with GMP-ECM may also help solidify these concepts. I don't know how GMP-ECM calculates the expected curve counts, but I'm pretty sure RDS wrote a paper on it that you could code up, given time and interest. Googling something like "silverman optimal ECM bounds" might find it (sorry, I'm lazy; didn't try myself).

Last fiddled with by VBCurtis on 2014-11-05 at 19:21 Reason: curve time comment
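The "roughly 6 times more effort per 5 digits" rule of thumb is easy to turn into a back-of-the-envelope calculator. A sketch (the 6x ratio is the rough figure quoted above; the true ratio varies by level, and the function is purely illustrative):

```python
# Relative effort to complete each t-level, normalized so a t50
# costs 1 unit, using the rough "6x per 5 digits" rule of thumb.
def relative_effort(t_level, base_level=50, ratio=6.0):
    return ratio ** ((t_level - base_level) / 5)

for t in (50, 55, 60, 65):
    print(f"t{t}: {relative_effort(t):,.0f}x")
# t50: 1x
# t55: 6x
# t60: 36x
# t65: 216x
```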
#1414
Serpentine Vermin Jar
Jul 2014
3,313 Posts
Quote:
I did wonder if extra curves were run at the lower bounds that technically add to that "total ECM effort" number, so the simple arithmetic there doesn't necessarily tell you how many curves have been run at the higher bounds. I guess there's always that chance. I guess when I get my quantum CPU built, I'll go back through and find all the missing factors of the first few million exponents during my lunch break.
#1415
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts
From the GMP-ECM README:
Code:
The ECM method is a probabilistic method, and can be viewed in some sense
as a generalization of the P-1 and P+1 method, where we only require that
P+t+1 is smooth, where t depends on the curve we use and satisfies
|t| <= 2*P^(1/2) (Hasse's theorem). The optimal B1 and B2 bounds have
to be chosen according to the (usually unknown) size of P. The following
table gives a set of nearly optimal B1 and B2 pairs, with the corresponding
expected number of curves to find a factor of given size (column "-power 1"
does not take into account the extra factors found by Brent-Suyama's
extension, whereas column "default poly" takes them into account, with the
polynomial used by default: D(n) means Dickson's polynomial of degree n):
digits D   optimal B1   default B2   expected curves N(B1,B2,D)
                                      -power 1    default poly
   20        11e3         1.9e6          74          74  [x^1]
   25        5e4          1.3e7         221         214  [x^2]
   30        25e4         1.3e8         453         430  [D(3)]
   35        1e6          1.0e9         984         904  [D(6)]
   40        3e6          5.7e9        2541        2350  [D(6)]
   45        11e6         3.5e10       4949        4480  [D(12)]
   50        43e6         2.4e11       8266        7553  [D(12)]
   55        11e7         7.8e11      20158       17769  [D(30)]
   60        26e7         3.2e12      47173       42017  [D(30)]
   65        85e7         1.6e13      77666       69408  [D(30)]

Table 1: optimal B1 and expected number of curves to find a
factor of D digits with GMP-ECM.
After performing the expected number of curves from Table 1, the
probability that a factor of D digits was missed is exp(-1), i.e.,
about 37%. After twice the expected number of curves, it is exp(-2),
i.e., about 14%, and so on.
Example: after performing 8266 curves with B1=43e6 and B2=2.4e11
(or 7553 curves with -dickson 12), the probability to miss a 50-digit
factor is about 37%.
Running a bunch of curves at a given level (e.g. B1=260e6 for t60) does put work in toward the next-highest t level, but running (say) 10x the recommended t60 B1 curves is a lot less efficient at finding a 65-digit factor than running the recommended t65 B1 curves. (Note that the 10x is just a random number -- I don't know "how many t60s is equivalent to a t65".)

More by default than by any substantive reasoning, the current practice for Mersenne numbers -- where ECM is the only plausible method -- is to complete 1 t60, then 1 t65, then 1 t70, etc., until a factor is found. The work increases exponentially of course (or rather, I believe it's subexponential, but still quite sharp).

Last fiddled with by Dubslow on 2014-11-06 at 02:22
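The probability claim in the README quote above is easy to check numerically. A small sketch, assuming the simple model the README states (N = expected curve count for a digit level, so the miss probability after c curves is exp(-c/N)):

```python
import math

# Probability that a factor of a given digit level was missed after
# running `curves_run` curves, when `expected_curves` is the Table 1
# expected count for that level: exp(-c/N) per the GMP-ECM README.
def miss_probability(curves_run, expected_curves):
    return math.exp(-curves_run / expected_curves)

# The 50-digit example from Table 1 (N = 8266 curves at B1=43e6):
print(round(miss_probability(8266, 8266), 2))      # 0.37
print(round(miss_probability(2 * 8266, 8266), 2))  # 0.14
```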
#1416
Einyen
Dec 2003
Denmark
2×1,579 Posts
Quote:
http://www.mersenneforum.org/showthread.php?t=5871 and it all comes back to the often-quoted paper by Silverman and Wagstaff, "A Practical Analysis of the Elliptic Curve Factoring Algorithm": http://www.ams.org/journals/mcom/199...993-1122078-7/

Last fiddled with by ATH on 2014-11-06 at 07:27
#1417
May 2013
East. Always East.
11×157 Posts
Quote:
Is the B2 = 100 × B1 convention something that we have since determined to be optimal, or is that also strictly for bookkeeping?

I have always had a degree of interest in ECM, either for fully factoring some Mersennes or for finding at least one factor for all Mersennes. I would like to someday understand the math (I'm getting there slowly) and run my own B1 and B2, but in the meantime I'll stick to the PrimeNet "optimums", assuming they're designed to maximize the chance of finding a factor given previous effort (previous bounds and/or currently known factors).
#1418
"Curtis"
Feb 2005
Riverside, CA
11412₈ Posts
Quote:
The Wagstaff/RDS paper states that the optimal B2 is the one for which time spent in stage 2 roughly equals time spent in stage 1; that is the source of the 100x standard in Prime95. Current GMP-ECM implementations on modern Core hardware spend a bit under half as much time in stage 2 as in stage 1, but I've found that setting B2 larger to equalize the stage times is not more efficient.

Users with tons of memory (say, 16GB or more) can run GMP-ECM on fairly large Mersenne candidates on one core. Also, note that a GPU can be used for stage 1 of ECM while a single core does large-memory stage 2 work; one can choose B2 such that a single core (or two, I suppose, if memory is sufficient) keeps up with the GPU's stage 1 output. See the GPU-ECM thread in the ECM forum. I don't know how large the inputs can be for GPU-ECM work.

Edit: GPU-ECM has been compiled for 512-bit, 1024-bit, and 4096-bit inputs. I am not sure how large that limit can be made, but at present the max size is hard-coded at compile time.

Last fiddled with by VBCurtis on 2014-11-07 at 04:39
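Balancing the GPU stage-1 / CPU stage-2 split described above is simple arithmetic. A back-of-the-envelope sketch (all timings here are made up for illustration; measure your own hardware before trusting the result):

```python
# How many CPU cores of stage-2 work are needed to keep up with a
# GPU producing stage-1 residues at a given rate?
def stage2_cores_needed(gpu_stage1_curves_per_hour, stage2_secs_per_curve):
    curves_per_core_hour = 3600.0 / stage2_secs_per_curve
    return gpu_stage1_curves_per_hour / curves_per_core_hour

# e.g. a GPU finishing 120 stage-1 curves/hour, with stage 2 taking
# 90 seconds per curve on one core (hypothetical numbers):
print(stage2_cores_needed(120, 90))  # 3.0
```

If the result exceeds the cores (or memory) available, that argues for choosing a smaller B2 so stage 2 stays cheap enough to keep pace.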
#1419
Serpentine Vermin Jar
Jul 2014
3313₁₀ Posts
Quote:
I had a troublesome server a while back with a flaky memory module. The server had a nasty habit of only crashing when some memory-hungry app was running, and only after a while. This was a SQL server (in a cluster, thankfully), and when it was running, SQL would eventually use enough memory that it hit the bad block and crashed the entire server (the ProLiant detected an uncorrectable memory error and helpfully rebooted). I tried memory burn-in tests, both in Windows and with standalone bootable programs, but oddly enough the only thing that ever crashed it was SQL. I eventually found the bad module... the ProLiant logged where the defective one was installed, but I really wanted to run a good test again once I replaced it, to make sure it was truly solved.

I mean, let's say you have a server with 192GB of RAM and 72 cores... having something like Prime95 doing huge memory tests while also fully thrashing the CPU would really be an awesome burn-in. It'd also exercise the cooling system and dual power supplies. :)

I know the Prime95 stress test will do CPU stuff, but if I wanted to get it doing some memory thrashing, I'm not quite sure how to optimize telling it how much memory to use during ECM work, and how many worker threads should be running the curves to make sure it'll hit as much memory as possible. If that were baked into the stress test somehow, it'd be neat.