ECM probabilities and bit levels
Hi all,
I'm getting into doing more ECM work and I'm wondering: if the expected number of curves indicated for Prime95 (e.g. 280 curves at B1=50k, 640 curves at B1=250k) are run and no factor is found, what is the probability that we can exclude a factor in that digit range? In other words, is performing the expected number of curves the equivalent of a "no factor found" TF result for a specific bit range? Also, does anybody have a rough breakdown of what bit level each B1 level corresponds to? I understand from this page [url]https://www.alpertron.com.ar/ECM.HTM[/url] that, for example, B1 = 50000 is for 25 digits and B1 = 250000 is for 30 digits, so is it just as simple as multiplying these values by ln(10)/ln(2), or ~3.32? Thanks!
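The digits-to-bits conversion asked about here is indeed just multiplication by log2(10) ≈ 3.32, since a d-digit factor is below 10^d = 2^(d·log2(10)). A minimal sketch:

```python
import math

# Convert an ECM "digit level" to an approximate bit level.
# A d-digit factor is below 10^d = 2^(d * log2(10)), so the
# conversion factor is log2(10) = ln(10)/ln(2) ~= 3.32.
def digits_to_bits(d):
    return d * math.log2(10)

for digits in (25, 30, 35):
    print(f"{digits} digits ~= {digits_to_bits(digits):.1f} bits")
```

So the 25-digit level corresponds to roughly 83 bits, and the 30-digit level to roughly 100 bits.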
If you do the prescribed number of curves for a certain B1 value (say 640 for B1 = 250,000), the chance that you miss a factor smaller than the corresponding number of digits (should it exist in the first place...) is roughly 1/e ≈ 37%. For B1 = 250,000 that number is 30 digits. So no, getting a NF-ECM result is not the same as a TF "no factor found", as ECM is probabilistic whereas TF is deterministic.
As for your second question: yes, it's as simple as you stated. PS - Welcome to the ECM tribe... :smile: |
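The 1/e figure follows from the standard heuristic: each curve at a given level finds a factor of the target size with probability roughly 1/E, where E is the "expected" curve count for that level, so t independent curves miss with probability (1 - 1/E)^t ≈ e^(-t/E). A quick sketch (the per-curve probability 1/E is the usual idealization, not a measured value):

```python
import math

# Probability of missing a factor of the target size after t curves,
# where each curve succeeds independently with probability 1/E
# (E = expected curve count for that B1 level).
def miss_probability(t, expected_curves):
    return (1 - 1 / expected_curves) ** t

# Running exactly the expected number of curves (t = E) leaves a miss
# probability of (1 - 1/E)^E -> 1/e as E grows.
print(miss_probability(640, 640))    # ~0.368, i.e. ~1/e
print(math.exp(-1))                  # 1/e for comparison
print(miss_probability(1280, 640))   # two full passes: ~1/e^2 ~= 0.135
```

Doubling the curve count squares the miss probability, which is why extra passes at the same B1 still help, just with diminishing returns.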
[QUOTE=lycorn;529297]PS - Welcome to the ECM tribe... :smile:[/QUOTE]
Just out of interest... How big are the ECM checkpoint files for the current (reasonable) "wavefront"? I'm looking at offering Colab / Kaggle TF'ers the option of also doing CPU work in parallel -- some might be interested in doing ECM. P.S. Just recovering from an unscheduled power failure... Grrr... Now I have to reestablish several dozen SSH sessions into servers... |
[QUOTE=chalsall;529298]Just out of interest... How big are the ECM checkpoint files for the current (reasonable) "wavefront"?[/QUOTE]
Not at the current wavefront, but for the two ECM runs that I have (one for M55057 and one for M7777727), the files are 13868 and 1944532 bytes, respectively. There are two backups for each of them. I am not sure whether the lengths depend on the B1 or B2 bounds, however. |
[QUOTE=Dylan14;529304]Not at the current wavefront, but for the two ecm runs that I have (one for M55057 and one for M7777727) the files are 13868 and 1944532 bytes, respectively. There are two backups for each of them. I am not sure if the lengths depend on the B1 or B2 bounds, however.[/QUOTE]
For both of those cases, the exponent-to-size ratio is almost exactly 4:1. To be exact, they're 3.97:1 and 3.99979:1, respectively. Using one of my examples, I have a 51540-byte checkpoint for M205759 (B1=250k), for a ratio of 3.9922:1. |
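The ratios quoted above can be checked directly (file sizes in bytes, values taken from this thread):

```python
# Exponent-to-checkpoint-size ratios for the save files mentioned in
# this thread (exponent : file size in bytes). All come out near 4:1,
# consistent with the file size scaling with the exponent.
cases = {
    55057: 13868,      # M55057
    7777727: 1944532,  # M7777727
    205759: 51540,     # M205759, B1 = 250k
}
for exponent, size in cases.items():
    print(f"M{exponent}: {exponent / size:.4f} : 1")
```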
[QUOTE=Dylan14;529304]Not at the current wavefront, but for the two ecm runs that I have (one for M55057 and one for M7777727) the files are 13868 and 1944532 bytes, respectively.[/QUOTE]
Hmmm... So relatively small. You could put that over HTTP without any problem. |
The size of the save files is related to the size of the exponent under test (which in turn determines the size of the FFT used).
I am currently running curves on exponents in the 29xxxx range (15K FFT) and the save files are 72 KB in size. Two save files (main and backup) are used per test, although you may choose to have two backups. If you let the server decide, it will currently assign exponents in the 13,08x,xxx range (672K FFT) and the save files will grow to 3.1 MB each. |
[QUOTE=lycorn;529313]I am currently running curves on exponents in the 29xxxx range (15k FFT) and the save files are 72Kb in size. Two save files (main and backup) are used per test, although you may choose to have two backups.[/QUOTE]
OK. And am I inferring correctly from your statement that this is "non-nominal" work? With regard to instance design, each save file would get "thrown back" to the server when it was first "noticed". It would then be up to the server to decide how many, and for how long, they would be kept. Is there an option to set the frequency of the save files? [QUOTE=lycorn;529313]If you let the server decide, it will currently assign exponents in the 13 08x xxx range (672 K FFT) and the save files will grow to 3.1 MB each.[/QUOTE] Again, I'm inferring you mean asking Primenet for whatever it thinks is best? Even ~3 MB isn't that big a deal -- we have already established that Colab / Kaggle instances are in no significant way bandwidth constrained. |
[QUOTE=chalsall;529314]Is there an option to set the frequency of the save files?[/QUOTE]
Yes. The option to change is [CODE]DiskWriteTime=30[/CODE] in prime.txt. The units of time here are minutes. The program will also save when stopped gracefully by the user (via Ctrl+C, for instance). |
[QUOTE=lycorn;529297]If you do the prescribed number of curves for a certain B1 value (say 640 for B1 = 250,000), the chance that you miss a factor smaller than the corresponding number of digits (should it exist, in the first place...) is roughly 1/e ~36%. For B1 = 250,000 that number is 30 digits. So no, getting a NF-ECM result is not the same as a TF "no factor found", as ECM is probabilistic whereas TF is deterministic.
As for your second question yes, it´s as simple as you stated. PS - Welcome to the ECM tribe... :smile:[/QUOTE] Just another quick follow-up probability question: say an exponent has had the expected B1=50k and B1=250k curves done. Is the overall probability of missing a factor of fewer than 30 digits still 1/e, or a bit lower? In other words, do the 50k curves contribute some additional effort independent of the 250k curves, or is the 1/e estimate made under the assumption that the appropriate curves have been run at lower B1 values as well? |
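One way to frame this question: if curves at different B1 levels are treated as independent trials, their miss probabilities simply multiply. The sketch below uses that independence assumption with an illustrative, made-up per-curve hit probability for the 50k curves against 30-digit factors (the real value depends on the factor-size distribution and is not given in this thread):

```python
# Combined miss probability under an independence assumption: every
# curve, at any B1, has some (B1-dependent) chance of hitting a factor
# of the target size, and misses multiply across batches.
# The 1/5000 figure for the 50k curves is an ILLUSTRATIVE PLACEHOLDER,
# not a measured value.
def combined_miss(batches):
    """batches: list of (curves_run, per_curve_hit_probability)."""
    miss = 1.0
    for curves, p in batches:
        miss *= (1 - p) ** curves
    return miss

# 640 curves at B1 = 250k alone (per-curve hit chance ~1/640 for 30 digits):
print(combined_miss([(640, 1 / 640)]))
# Adding 280 curves at B1 = 50k with an assumed small hit chance
# lowers the combined miss probability somewhat below 1/e:
print(combined_miss([(640, 1 / 640), (280, 1 / 5000)]))
```

Under this model the lower-B1 curves always help a little, but how much depends entirely on their (much smaller) per-curve probability of catching a 30-digit factor.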
[QUOTE=Dylan14;529321]...by the user (via Ctrl+C, for instance).[/QUOTE]
Or by the instance, for instance... :wink: Thanks for the information. |