mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2016-07-17, 10:01   #12
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

12CF16 Posts
Default

Quote:
Originally Posted by WraithX View Post
Ok, I've created a simple version of a program that gathers timings for running gmp-ecm on various input numbers at various B1 values. I've included the traditional list of B1 values, and my updated list of B1 values. By default the program will use the traditional list of B1 values. There is a a user-defined-variables section where you can specify a computer name, you can pick which B1 list you want, you can specify which B1 "levels" you want to test (say 20 digits to 50 digits, or 35 digits to 80 digits, for example), and you can specify if you are using gmp-ecm 6 or 7, and with 7 you can specify if you want to use param=0 or param=1 (ecm7 with param=1 is fastest for almost all cases). While the script is running, it will show you the equivalent command line for gmp-ecm and the output from the program. At the end, it will print out the results to screen and to a file called 'ecm_times.txt'. One important note, this script is written on the premise that the ecm executable is 1) named ecm and 2) that it is either in the same directory as the script, or somewhere in your current PATH. I can make this more flexible in a future update if that would help anyone out.

Here are two examples of the output:
Code:
param = 0 (this line not currently included in any output, it's just here so you can better understand timing differences)
# computer, digits, 11e3, 50e3, 250e3, 1e6, 3e6, 11e6, 43e6
pc1, 150, 0.031s+0.032s, 0.109s+0.062s, 0.499s+0.250s, 1.841s+1.061s, 5.491s+2.387s, 19.718s+8.034s, 73.928s+23.354s
pc1, 155, 0.031s+0.016s, 0.140s+0.078s, 0.577s+0.249s, 2.199s+1.077s, 6.474s+2.527s, 23.836s+8.471s, 92.305s+25.850s
pc1, 160, 0.062s+0.016s, 0.140s+0.094s, 0.530s+0.265s, 2.106s+1.107s, 6.177s+2.496s, 22.698s+8.361s, 88.140s+25.475s
pc1, 165, 0.031s+0.015s, 0.124s+0.063s, 0.577s+0.265s, 2.091s+1.123s, 6.287s+2.558s, 22.526s+8.642s, 88.312s+26.801s
pc1, 170, 0.031s+0.032s, 0.140s+0.078s, 0.577s+0.265s, 2.121s+1.186s, 6.208s+2.590s, 22.542s+8.564s, 88.514s+26.302s

param = 1 (this line not currently included in any output, it's just here so you can better understand timing differences)
# computer, digits, 11e3, 50e3, 250e3, 1e6, 3e6, 11e6, 43e6
pc1, 150, 0.046s+0.016s, 0.093s+0.078s, 0.390s+0.234s, 1.498s+0.998s, 3.978s+2.277s, 14.555s+7.675s, 57.003s+23.431s
pc1, 155, 0.031s+0.031s, 0.109s+0.078s, 0.468s+0.250s, 1.670s+1.092s, 4.852s+2.511s, 17.722s+8.408s, 69.155s+25.693s
pc1, 160, 0.031s+0.031s, 0.109s+0.078s, 0.468s+0.250s, 1.654s+1.108s, 4.821s+2.496s, 17.628s+8.346s, 69.015s+25.490s
pc1, 165, 0.046s+0.032s, 0.109s+0.093s, 0.468s+0.265s, 1.654s+1.139s, 4.851s+2.543s, 17.675s+8.611s, 68.812s+26.442s
pc1, 170, 0.031s+0.032s, 0.109s+0.078s, 0.452s+0.250s, 1.654s+1.139s, 4.867s+2.558s, 17.628s+8.611s, 68.968s+26.349s
I recommend that we go to this comma-separated-values (csv) format to make the results more compact, easier to read, and easier to parse/search in future scripts. The first line is basically just a column header line, describing the computer name, the number of digits in the tested number, and which B1 value was used to generate the given times. I have split the times up into step1+step2, and I have converted from milliseconds to seconds. It will be easy to write a function to add the step1 and step2 times together to get a total time for a given B1 value.

Please let me know if you have any problems with this script, or would like to see any features added or modified.

fivemack, to your point of:

If you think my list of new B1 values is appropriate, we can generate as continuous a list of B1 values as you want. We could generate B1 values for every digit, or every 0.1 digits, etc. With the formula B1 = 10^(digits/10 + 2.5), we could get a list like:
Code:
digits -    B1
    40 -  3.16e6
    41 -  3.98e6
    42 -  5.01e6
    43 -  6.30e6
    44 -  7.94e6
    45 - 10.00e6
...etc...
That way, we could just pick which "digit levels" to test (which can be either "for d in range(a,b)", or "for d in d_list") and then the formula will give us the corresponding B1 values to use. I was going to implement this in the program, but wanted to ask what you thought of it first.
Code:
luigi@luigi-Aspire-MC605:~/luigi/ecm$ python gen_ecm_times.py 

-> echo "10^150-233" | ecm -sigma 0:3046069392 11e3
Traceback (most recent call last):
  File "gen_ecm_times.py", line 147, in <module>
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

luigi@luigi-Aspire-MC605:~/luigi/ecm$ ll /usr/lib/python2.7/subprocess.py
-rw-r--r-- 1 root root 59046 giu 22  2015 /usr/lib/python2.7/subprocess.py

Last fiddled with by ET_ on 2016-07-17 at 10:02
ET_ is offline   Reply With Quote
Old 2016-07-17, 12:29   #13
WraithX
 
WraithX's Avatar
 
Mar 2006

1110111112 Posts
Default

Quote:
Originally Posted by ET_ View Post
Code:
luigi@luigi-Aspire-MC605:~/luigi/ecm$ python gen_ecm_times.py 

-> echo "10^150-233" | ecm -sigma 0:3046069392 11e3
Traceback (most recent call last):
  File "gen_ecm_times.py", line 147, in <module>
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Hmmm, if your ecm binary was in the same path as the gen_ecm_times.py script, then let's try a different way to launch the binary. I have made an update so that, like the ecm.py program, you can specify an ECM_PATH in the user-defined-section. This should give you more flexibility in specifying where the ecm binary is located. Please let me know how this new version works for you.

And just to let everyone know, I'm running the script on my computer from digit level 30 to 70 (ie, original B1 list: 250e3 <= B1 <= 2.9e9). This will take several days to generate, but I'll post my results when I get them.
Attached Files
File Type: zip gen_ecm_times.zip (2.8 KB, 78 views)
WraithX is offline   Reply With Quote
Old 2016-07-17, 14:31   #14
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

12CF16 Posts
Default

Quote:
Originally Posted by WraithX View Post
Hmmm, if your ecm binary was in the same path as the gen_ecm_times.py script, then let's try a different way to launch the binary. I have made an update so that, like the ecm.py program, you can specify an ECM_PATH in the user-defined-section. This should give you more flexibility in specifying where the ecm binary is located. Please let me know how this new version works for you.

And just to let everyone know, I'm running the script on my computer from digit level 30 to 70 (ie, original B1 list: 250e3 <= B1 <= 2.9e9). This will take several days to generate, but I'll post my results when I get them.
I have "made install" ecm, so I run it without dot-slash prefix.
Unfortuntely, still no success with the new script, but it's no issure in reality, as I have write a script myself to test te speed of gmp-ecm with Fermat numbers.

Thank you for the quick answer!

Luigi
ET_ is offline   Reply With Quote
Old 2016-07-17, 15:58   #15
WraithX
 
WraithX's Avatar
 
Mar 2006

479 Posts
Default

Quote:
Originally Posted by ET_ View Post
I have "made install" ecm, so I run it without dot-slash prefix.
Unfortuntely, still no success with the new script, but it's no issure in reality, as I have write a script myself to test te speed of gmp-ecm with Fermat numbers.

Thank you for the quick answer!

Luigi
Also note, you can set the ECM_PATH to point to the location of your ecm binary. You can use "which ecm" to find out what the path is. For example:
$ which ecm
/usr/local/bin
Then you can set ECM_PATH = '/usr/local/bin/'
WraithX is offline   Reply With Quote
Old 2016-07-17, 16:46   #16
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

32×5×107 Posts
Default

Quote:
Originally Posted by WraithX View Post
Also note, you can set the ECM_PATH to point to the location of your ecm binary. You can use "which ecm" to find out what the path is. For example:
$ which ecm
/usr/local/bin
Then you can set ECM_PATH = '/usr/local/bin/'
Sounds perfect, thanks
ET_ is offline   Reply With Quote
Old 2016-07-17, 22:58   #17
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

23·3·5·72 Posts
Default

Surely all we need to judge the ratio between nfs speed on different pcs is to run a fairly quick sieving test as a benchmark. I suppose different size numbers might have a slightly different ratio(as the ratio of time spend in qs vs sieving etc changes) but it should be fairly good.
henryzz is offline   Reply With Quote
Old 2016-07-23, 03:38   #18
WraithX
 
WraithX's Avatar
 
Mar 2006

479 Posts
Default

Quote:
Originally Posted by WraithX View Post
Hmmm, if your ecm binary was in the same path as the gen_ecm_times.py script, then let's try a different way to launch the binary. I have made an update so that, like the ecm.py program, you can specify an ECM_PATH in the user-defined-section. This should give you more flexibility in specifying where the ecm binary is located. Please let me know how this new version works for you.

And just to let everyone know, I'm running the script on my computer from digit level 30 to 70 (ie, original B1 list: 250e3 <= B1 <= 2.9e9). This will take several days to generate, but I'll post my results when I get them.
Several updates:
1) I was able to run the gen_ecm_times.py script on a linux machine and found the problem that ET_ was running into. I've fixed that in the attached gen_ecm_times zip file.

2) I have finished my run of gen_ecm_times.py with the original B1 from 250e3 to 2.9e9 on all 31 numbers from 150 to 300 digits. This was on my main machine, so my occasional use of the machine may have slightly skewed some of the times, but hopefully not by too much. I'm running another batch with the original B1 from 250e3 to 25e9 on a linux machine that is dedicated to this task only. It will probably take around 20 days to finish.

3) Since the format of ecm_times.txt is different from fivemack's, I've rewritten ecm-toy.py to be able to understand this new data format. I'm attaching a zip of my rewritten ecm-toy.py, which includes my ecm_times.txt and my new calculations for ecm-probabilities.txt.

4) I did some number crunching on the num_curves = a*e^(b*x) equations and came up with some slightly different formulas. It took me a minute to realize that your (fivemack's) multipliers needed to be inverted (1/a, instead of a). Once I accounted for that I was able to compare your curves with mine. I'm including an excel spreadsheet that lists your equations and mine, shows the number of curves recommended by gmp-ecm, and compares that in a couple of ways to the curve counts generated by your equation, and two equations of mine. The gist of the results is that your equations seem to work better for 35-55 digits. Then from 60-80 digits mine take over as more accurate. I've compared the difference in curve counts, and the ratio of the curve counts, which both basically show the same thing. I've highlighted the best curve estimates in green to show which equation was best for that digit level. I'm attaching these results as est_curves.zip. I've put my "m" equations into the ecm-probabilites.txt file.

5) I've formatted the output at the end of the program to make more sense to me. Hopefully others will find it useful too. Here is an example:
Code:
> ecm-toy.py g250 100@260e6 pc1
<...snip...>
Best b1 is 850000000 expected time 810.852 days
Expected NFS time is 880.439 days
And now NFS beats them

Recommend running the following number of ecm curves before switching to NFS:
 4500 curves with B1 = 110.0e6
  300 curves with B1 = 850.0e6

Probability estimates for a factor to be within a given digit range:

Digit Range | Probability | Cumulative
------------|-------------|-----------
  30 -  34  |   0.00000   |   0.00000
  35 -  39  |   0.00000   |   0.00000
  40 -  44  |   0.00000   |   0.00000
  45 -  49  |   0.00021   |   0.00021
  50 -  54  |   0.01955   |   0.01977
  55 -  59  |   0.05703   |   0.07680
  60 -  64  |   0.06893   |   0.14573
  65 -  69  |   0.07094   |   0.21666
  70 -  74  |   0.07123   |   0.28789
  75 -  79  |   0.07126   |   0.35914
  80 -  84  |   0.07125   |   0.43039
  85 -  89  |   0.07123   |   0.50163
  90 -  94  |   0.07122   |   0.57285
  95 -  99  |   0.07121   |   0.64406
 100 - 104  |   0.07120   |   0.71526
 105 - 109  |   0.07119   |   0.78646
 110 - 114  |   0.07119   |   0.85765
 115 - 119  |   0.07118   |   0.92883
 120 - 124  |   0.07117   |   1.00000
 125 - 129  |   0.00000   |   1.00000
6) With all of this information it seems like the program recommends a very small amount of ecm before switching to NFS. I'm not sure how we should account for running multiple ecm curves at the same time. I've made a guess and have included a "num_threads" variable in the ecm-toy.py script. (num_threads=1 by default) When I set it to 4, or 10, or 20, the number of recommended curves go up to (what I believe are) more reasonable numbers. But, there is a chance I've not accounted for something correctly. So, fivemack, what do you think of the num_threads option I've added? Have I missed anything with it? Do we need to add something similar for NFS?
Attached Files
File Type: zip ecm-toy_wx01.zip (6.2 KB, 77 views)
File Type: zip est_curves.zip (8.6 KB, 72 views)
File Type: zip gen_ecm_times_v0.02.zip (2.8 KB, 81 views)

Last fiddled with by WraithX on 2016-07-23 at 04:08
WraithX is offline   Reply With Quote
Old 2016-09-30, 18:47   #19
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

23×3×5×72 Posts
Default

We could do with working out ecm timings and gnfs timings on the same pc. This would make WraithX's version of the script usable. How does pc1 compare to tractor?

I would suggest that we could set a standardised sieving benchmark that takes 1 hour on the baseline pc. Then the gnfs time can be scaled accordingly. Currently the snfs timing has no scaling at all for different pcs.
In general I don't think it matters that much how accurate the timings are as it doesn't change the probability of finding a factor with ecm that much. As long as we aim for a factor of 1.5 we shouldn't be too far off.
henryzz is offline   Reply With Quote
Old 2016-09-30, 19:01   #20
wombatman
I moo ablest echo power!
 
wombatman's Avatar
 
May 2013

29×61 Posts
Default

I'm having some issues with the ecm-toy.py. When I run
Code:
python ecm-toy.py g250 100@260e6 tractor
I get a ZeroDivisionError:
Code:
Traceback (most recent call last):
  File "ecm-toy.py", line 215, in <module>
    t_one_curve = interpolate_list(entry, targdig)
  File "ecm-toy.py", line 82, in interpolate_list
    (M,e)=fitexp_list(listy[1])
  File "ecm-toy.py", line 50, in fitexp_list
    (m,c)=fitlin(dl)
  File "ecm-toy.py", line 32, in fitlin
    ssxx = sxx - sx*sx/s;
ZeroDivisionError: integer division or modulo by zero
Not sure what's up, but any help is appreciated. Python version is 3.3.3, running on the Ubuntu bash shell in Windows 10.
wombatman is offline   Reply With Quote
Old 2016-09-30, 20:57   #21
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

10110111110002 Posts
Default

Which version of the script are you using(Which post did you download it from?)?

The script is written for python 2. I only had to make two changes to WraithX's version to get it to run. xrange to range and R.sort() to R=sorted(R). I also had to give it some gnfs timings. I used the tractor values. I need to work out what scaling factor should be used though.

edit: You seem to almost be using WraithX's cmd line. tractor doesn't have ecm timings in WraithX's format. Use pc1 rather than tractor. It will need gnfs timings to do gnfs. snfs will work straight away.

Last fiddled with by henryzz on 2016-09-30 at 21:00
henryzz is offline   Reply With Quote
Old 2016-09-30, 23:39   #22
wombatman
I moo ablest echo power!
 
wombatman's Avatar
 
May 2013

29·61 Posts
Default

Yeah, I'm using WraithX's from post 18. I'll make the changes you suggest and see if it works.
wombatman is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
Nuke attack - run or hide? MooMoo2 Soap Box 40 2018-01-19 23:48
What Bounds to choose, and what are Bounds 144 Information & Answers 5 2017-03-15 13:36
GPU attack on double Mersennes? Uncwilly GPU Computing 29 2013-09-08 20:53
Attack of the Cosmic Rays S485122 Hardware 3 2010-08-24 01:19
Attack of the Killer Zombies ewmayer Lounge 12 2007-01-30 05:56

All times are UTC. The time now is 17:57.


Fri Jul 16 17:57:44 UTC 2021 up 49 days, 15:44, 1 user, load averages: 1.02, 1.38, 1.45

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.