mersenneforum.org  

2011-04-09, 21:41   #1
WraithX
Mar 2006
2³×59 Posts
A Python Driver for GMP-ECM...

I saw mention over in the "A Python Driver for GGNFS and MSIEVE" thread that Mini-Geek wanted a Python driver that would run multiple copies of gmp-ecm on a single number. I have been wanting this too, and I'd like to try to create that script based on Brian Gladman's factmsieve.py script. I don't need this for aliqueit, as Mini-Geek does; I would just like a way to run many curves simultaneously without having a bunch of command prompts open. [especially when I get a computer with dual 16-core processors ;) ]

For basic logic, I was thinking of the following:
Code:
on startup, check for resume and/or output files

run num_threads instances of gmp-ecm
  split -c over threads
  split -maxmem over threads
  give -save, -savea, -resume, -chkpnt, and -treefile unique filenames
And while it was running I wanted it to give a little bit of output, maybe something like:
Code:
Using B1=3000000000, B2=85024954914916, polynomial Dickson(30), 4 threads
finished 150/1000; avg s/curve: stg1 29188s, stg2 14827s; runtime: 6602250s
Where the first line would be static, and that second line would update "however often".

Is this something that others would be interested in? If so, what do you think of the basic description up above? What sort of features would you like to see in something like this?

If you'd like to see some of the previous talk about this, you can see the above mentioned thread here:
http://www.mersenneforum.org/showthr...t=12981&page=8 (starting at post #181)

I have a question for anyone knowledgeable in python, what would be the best way to monitor the output of each instance of gmp-ecm without blocking the main thread from being able to update its own output? Should I just redirect the output of each gmp-ecm to a file, or can I attach to their stdout and monitor it that way (without blocking)?
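One possible answer to the stdout question, as a rough sketch rather than what ecm.py ended up doing: give each child process a small reader thread that pushes its lines onto a shared queue, so the main thread only ever waits on the queue with a timeout. This assumes Python 3.7 or later (for `text=True`); the names `start_reader` and `run_workers` are made up for illustration, and the demo launches trivial Python children instead of gmp-ecm.

```python
import queue
import subprocess
import sys
import threading


def start_reader(proc, worker_id, out_queue):
    """Forward each stdout line of proc to out_queue from a background thread."""
    def pump():
        for line in proc.stdout:
            out_queue.put((worker_id, line.rstrip("\n")))
        out_queue.put((worker_id, None))  # None marks the end of this worker's output
    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return thread


def run_workers(commands, poll_seconds=1.0):
    """Run every command and collect output lines without blocking on any one pipe."""
    out_queue = queue.Queue()
    procs = []
    for worker_id, cmd in enumerate(commands):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        procs.append(proc)
        start_reader(proc, worker_id, out_queue)

    lines = {i: [] for i in range(len(commands))}
    still_running = len(commands)
    while still_running:
        try:
            worker_id, line = out_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # the main thread is free here to refresh its own status line
        if line is None:
            still_running -= 1
        else:
            lines[worker_id].append(line)
    for proc in procs:
        proc.wait()
    return lines


if __name__ == "__main__":
    # Stand-in child processes; a real driver would launch gmp-ecm here.
    demo = [[sys.executable, "-c", "print('curve done')"] for _ in range(2)]
    print(run_workers(demo))
```

Redirecting each instance to its own file and polling those files is the simpler alternative, and it has the side benefit that the output survives if the driver itself is killed.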

2011-04-19, 02:58   #2
WraithX
Mar 2006
2³×59 Posts

Ok, I have what appears to be a working version of this driver ready. This release will be version 0.01, but hopefully it will function well for everyone. Like factMsieve.py, this requires Python 2.6 or higher to run. Here are some notes about its use. A typical command line will look like one of the following:

D:\Programming\python\ecm_py>c:\python26\python.exe ecm.py -c 100 -one -maxmem 1500 -threads 2 -out all_out.txt 1000000 < test2.txt
or
D:\Programming\python\ecm_py>echo 189137687020123159261852192605885322671854225839439913905089 | c:\python26\python.exe ecm.py -c 100 -one -maxmem 1500 -threads 2 -out all_out.txt 1000000

The command line has to have /path/to/python.exe before ecm.py. This is needed because of a limitation in Windows (possibly Linux also) where just writing ecm.py will not allow redirection to work properly. This seems to be based on a mishandling of starting python scripts through file associations. However, if the program is called as python.exe ecm.py (if python is in your path), or /path/to/python.exe ecm.py, redirection will work correctly. This includes both piping-in (echo ... | python ecm.py) and file redirection (python ecm.py ... < num.txt).
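For readers wondering what the script side of this looks like, here is a minimal sketch of reading numbers from standard input. It is not taken from ecm.py; `read_numbers` and `numbers_from_stdin` are hypothetical names. The same code serves both `echo N | python script.py` and `python script.py < num.txt`, because in both cases the numbers simply arrive on stdin.

```python
import io
import sys


def read_numbers(stream):
    """Return the non-empty lines of stream, one candidate number per line."""
    return [line.strip() for line in stream if line.strip()]


def numbers_from_stdin(stdin=None):
    """Return piped or redirected numbers, or [] when stdin is an interactive terminal."""
    stdin = sys.stdin if stdin is None else stdin
    if stdin.isatty():
        return []  # nothing was piped or redirected in
    return read_numbers(stdin)


# Simulate `echo ... | python script.py` with an in-memory stream.
print(numbers_from_stdin(io.StringIO("12345\n\n67890\n")))
```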

You can use all the standard gmp-ecm command line options.
If you use -c N (run N curves), then each gmp-ecm instance will get -c (N/num_threads) curves to run, with the first few instances running more if that division is not exact.
If you use -maxmem N (limit ecm.exe memory usage in Stage 2 to < N MB of memory), then each instance of gmp-ecm will use only -maxmem (N/num_threads) to make sure that the total memory used by all instances of gmp-ecm doesn't go over your limit.
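The splitting rule described above can be sketched in a few lines. This is an illustration of the arithmetic only, with made-up function names, not the code in ecm.py:

```python
def split_evenly(total, parts):
    """Split total into `parts` integers; the first few get one extra if needed."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def per_instance_maxmem(total_mb, parts):
    """Give each instance an equal integer share so the sum never exceeds total_mb."""
    return total_mb // parts


print(split_evenly(100, 3))          # [34, 33, 33]
print(per_instance_maxmem(1500, 2))  # 750
```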

When this script is factoring a number, it creates a job file representing how much work you've asked to complete on that number. It also keeps track of how many curves have been run on this number, and timings to help you know how long each stage is taking on average. Each instance of gmp-ecm will write to its own file. The script monitors these files to see how much work has been done, and updates a counter on the screen to let you know how far along the job is.
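As a rough idea of how monitoring per-instance output files might work, the sketch below counts finished curves and averages the stage timings. It assumes each gmp-ecm output file contains lines of the form "Step 1 took 100ms" and "Step 2 took 50ms"; that line format, the file names, and the `summarize` helper are assumptions for illustration and not a description of ecm.py's internals.

```python
import os
import re
import tempfile

# Assumed timing line format in the per-instance output files.
STEP_RE = re.compile(r"^Step ([12]) took (\d+)ms")


def summarize(paths):
    """Return (finished curves, avg stage 1 ms, avg stage 2 ms) across all files."""
    totals = {1: 0, 2: 0}
    counts = {1: 0, 2: 0}
    for path in paths:
        with open(path) as handle:
            for line in handle:
                match = STEP_RE.match(line)
                if match:
                    stage = int(match.group(1))
                    totals[stage] += int(match.group(2))
                    counts[stage] += 1
    averages = {s: (totals[s] / counts[s] if counts[s] else 0.0) for s in (1, 2)}
    # A curve is counted as finished once its stage 2 line has appeared.
    return counts[2], averages[1], averages[2]


# Demo with two fake per-instance output files.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "inst0.txt")
    second = os.path.join(tmp, "inst1.txt")
    with open(first, "w") as f:
        f.write("Step 1 took 100ms\nStep 2 took 50ms\nStep 1 took 300ms\n")
    with open(second, "w") as f:
        f.write("Step 1 took 200ms\nStep 2 took 150ms\n")
    print(summarize([first, second]))  # (2, 200.0, 100.0)
```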

The script has an auto-resume feature. If the job is interrupted by a power outage or with Ctrl-C, you don't have to worry about losing work. The next time you start up the job with the same command line on the same number, it will read in the old job files to see how much work had been done, and then pick up where it left off. If you turn off auto-resume, the script won't check for previous runs, it will just start a new job.

Here are the options available that are specific to the script:
-threads N This will run N copies of gmp-ecm on the input number(s) you specify.
-r <file> Resume the specified job file. After finishing this job, the script will stop.
-out <file> Save work done (on all input numbers) to the specified file.

These command line options can be intermixed with regular gmp-ecm command line options. Also, please make sure that B1 is the last option in your ecm command line. Otherwise, the program will not behave as expected. In a future release I think I can report a proper error if B1 isn't the last option, but for now a correct command line should work as expected.
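A check for the "B1 must be last" rule could look something like the sketch below. It is only a guess at one way to do it: `b1_is_last` is a hypothetical helper, and `VALUE_OPTIONS` lists just the value-taking options mentioned in this thread, so it is deliberately incomplete with respect to the full gmp-ecm option set.

```python
# Options from this thread that consume the following argument (partial list).
VALUE_OPTIONS = {"-c", "-maxmem", "-threads", "-out", "-r", "-pollfiles"}


def b1_is_last(args):
    """Return True when the final argument looks like B1, e.g. 1000000 or 1e6."""
    if not args:
        return False
    if len(args) >= 2 and args[-2] in VALUE_OPTIONS:
        return False  # the trailing number belongs to an option, not to B1
    try:
        return float(args[-1]) > 0
    except ValueError:
        return False


print(b1_is_last("-c 100 -one -out all_out.txt 1000000".split()))  # True
print(b1_is_last("1000000 -c 100".split()))                        # False
```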

Before running, please set the path to your gmp-ecm executable. You can also set a default number of threads to use if you don't want to specify that on the command line. Also, you can run multiple copies of this script in the same directory. There shouldn't be any collisions between separate jobs being run.

If anyone is interested in running this, I'd like to hear how well it works for you. If you encounter a problem, I can try to find a solution and post updated versions as I have time. Now, let the fun begin!

2011-04-20, 12:46   #3
WraithX
Mar 2006
472₁₀ Posts

Okay, I have made a couple of changes to ecm.py, which are:

New feature:
- Print more info if a factor is found.
Originally I only printed the 'Factor found' line. Now I include the two lines after that, which include info like primality/compositeness/length of the factor and cofactor.

Bug fix:
- I have just found that once in a while, my script might print the wrong sigma for a factor found. Fixed that.

Here is the updated version 0.02.

(P.S. could a mod please remove the zip file for 0.01 in post #2? Thank you.)

Last fiddled with by fivemack on 2011-04-21 at 19:14

2011-04-21, 12:24   #4
WraithX
Mar 2006
2³×59 Posts

Announcing ecm.py version 0.03, the changes are:

New Features:
- I've added in a -pollfiles N option which will only read data from the job files once every N seconds. This will help the user control the amount of disk I/O based on how long this job will probably run. I currently have the following recommended settings for this option:
# For quick jobs (less than a couple of hours): between 3 and 15 seconds
# For small jobs (less than a day): between 15 and 45 seconds
# For medium jobs (less than a week): between 45 and 120 seconds
# For large jobs (less than a month): between 120 and 360 seconds
(the default value is 15 seconds)

- The program now updates the info line every second, to more easily tell that it is running.
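Running two different cadences in one loop (file polling every N seconds, status line every second) can be sketched with a small timer helper. This is an assumption about structure, not the actual ecm.py loop; the `Every` class and its injectable `clock` parameter are invented here so the behaviour can be checked without sleeping.

```python
import time


class Every:
    """Report True at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last = None  # time of the last True result, None before the first

    def due(self):
        now = self.clock()
        if self.last is None or now - self.last >= self.interval:
            self.last = now
            return True
        return False


poll_files = Every(15)   # expensive: rereads the job files from disk
refresh_line = Every(1)  # cheap: redraws the one-line status display

# Inside a main loop one might then write:
#     if poll_files.due():
#         ...reread job files...
#     if refresh_line.due():
#         ...redraw status line...
#     time.sleep(0.1)
```

Keeping the two timers separate means a long -pollfiles value reduces disk I/O without making the display look frozen.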

Bug fix:
- I had made a mistake in displaying how many curves had been completed. This didn't affect how many curves were actually completed or how long the program would run. Displayed curve counts and times are accurate now.

Last fiddled with by fivemack on 2011-04-30 at 16:26

2011-04-30, 12:56   #5
Mini-Geek
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17×251 Posts

I'm getting this error with version 0.03 of your script running on Python 3.1.3:
Code:
C:\Files\Prime\aliquot>echo 82330147092934135248787768444414746486667369907934935367302358441167097150481130751779642415163252839011 | python \files\prime\bin\ecmpy.py -c 10 -maxmem 500 -out out.txt 1000000
-> ___________________________________________________________________
-> | Running ecm.py, a Python driver for distributing GMP-ECM work   |
-> | on a single machine.  It is Copyright, 2011, David Cleaver and  |
-> | is a conversion of factmsieve.py that is Copyright, 2010, Brian |
-> | Gladman.   Version 0.03 (Python 2.6 or later) 21st Apr 2011.    |
-> |_________________________________________________________________|

-> Number(s) to factor:
-> 82330147092934135248787768444414746486667369907934935367302358441167097150481130751779642415163252839011
Traceback (most recent call last):
  File "\files\prime\bin\ecmpy.py", line 1077, in <module>
    parse_ecm_options(sys.argv, set_args = True)
  File "\files\prime\bin\ecmpy.py", line 1015, in parse_ecm_options
    ecm_args1 += ' -c {0:d}'.format(ecm_c/intNumThreads)
ValueError: Unknown format code 'd' for object of type 'float'
I wrapped the division in that line, and in the three similar lines after it, in int(), and that made it work.
This is my Python version:
Code:
Python 3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)] on win32
Does anyone know how much work it would be to make this aliqueit-compatible, or make aliqueit ecm-py-compatible? I think that could greatly improve its use.

2011-04-30, 14:37   #6
WraithX
Mar 2006
730₈ Posts

Code:
  File "\files\prime\bin\ecmpy.py", line 1015, in parse_ecm_options
    ecm_args1 += ' -c {0:d}'.format(ecm_c/intNumThreads)
ValueError: Unknown format code 'd' for object of type 'float'
Interesting, it looks like Python 3.x is returning a float from that integer division, instead of returning an int like in Python 2.6. Thank you for the report. This has been fixed in ecm.py v0.04.
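The behaviour in question can be reproduced in isolation. In Python 3 the `/` operator always performs true division and returns a float, while `//` performs floor division and returns an int for int operands on both Python 2 and 3. The snippet below is a standalone illustration; `curves_arg` is a made-up helper, not the function from ecm.py.

```python
def curves_arg(ecm_c, num_threads):
    """Format the per-instance -c option using floor division."""
    return ' -c {0:d}'.format(ecm_c // num_threads)


print(curves_arg(10, 4))  # ' -c 2'

try:
    # On Python 3 this is 2.5, and the 'd' format code rejects floats.
    ' -c {0:d}'.format(10 / 4)
except ValueError as err:
    print("ValueError:", err)
```

Using `//` rather than wrapping the result in int() keeps the arithmetic in integers throughout, which also avoids float rounding for very large values.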

As for making it work with aliqueit, I saw in a different thread where you mention that the configuration looks like the following:

--------------------------
"echo " + input_number + " | " + cfg.ecm_cmd + " -one -c " + tostring( curves )
+ " -B2scale " + tostring( cfg.b2scale_ecm ) + " " + tostring( b1 ) + " > " + cfg.ecm_tempfile ).c_str()
e.g.: "echo 123456789000123333 | ecm -one -c 214 -B2scale 1.0 50000 > aliqueit_ecm_temp.log"
--------------------------

If you can set cfg.ecm_cmd to "python.exe ecm.py" or "/path/to/python ecm.py",
and then change the command line to look like the following:

--------------------------
"echo " + input_number + " | " + cfg.ecm_cmd + " -one -c " + tostring( curves )
+ " -B2scale " + tostring( cfg.b2scale_ecm ) + " -out " + cfg.ecm_tempfile + " " + tostring( b1 ) ).c_str()
e.g.: "echo 123456789000123333 | python.exe ecm.py -one -c 214 -B2scale 1.0 -out aliqueit_ecm_temp.log 50000"
--------------------------

I think it might work. Well, it's worth a try, at least. ;)

2011-04-30, 14:47   #7
WraithX
Mar 2006
2³×59 Posts

Announcing ecm.py version 0.04, the changes are:

Bug fix:
- Updated a print statement to be compatible with Python 3.x.
Attached Files
File Type: zip ecm-py_v0.04.zip (9.8 KB, 186 views)

Last fiddled with by fivemack on 2011-04-30 at 16:26

2011-05-01, 18:56   #8
Walter Nissen
Nov 2006
Terra
2·3·13 Posts

Quote:
Originally Posted by WraithX
The command line has to have /path/to/python.exe before ecm.py. This is needed because of a limitation in Windows (possibly Linux also) where just writing ecm.py will not allow redirection to work properly. This seems to be based on a mishandling of starting python scripts through file associations. However, if the program is called as python.exe ecm.py (if python is in your path), or /path/to/python.exe ecm.py, redirection will work correctly.
I haven't run ecm.py , but I simply followed the instructions in
http://gilchrist.ca/jeff/factoring/n...ers_guide.html
for factmsieve.py where I saw nothing about a path to
python.exe , and simply proceeded with :
..\factMsieve.py example

Those instructions lump ggnfs , python , msieve and factmsieve
all together in a single directory which also contains a
working directory .
But when I saw :
..\factMsieve.py example
it didn't seem plausible DOS would find python , but it did .
There's a python27.dll in my path , but no python.exe .

Could the difference between command.com and cmd.exe
account for your experience , and mine ?
I have only the vaguest idea how they differ .
Very early in my Windows XP experience , cmd.exe seemed
better , so I've been using it since , although in a few ways
it's not as good as the command.com in Windows '98 .

2011-05-01, 20:30   #9
mklasson
Feb 2004
258₁₀ Posts

Hi,

Mini-Geek pointed me to this thread and asked for a small fix to aliqueit to make it work with ecm.py. Could someone try it out and see if it seems ok?

Change use_ecmpy to true in aliqueit.ini and make sure ecmpy_cmd is set correctly.

http://mklasson.com/aliqueit110a.zip

2011-05-02, 01:31   #10
WraithX
Mar 2006
730₈ Posts

Quote:
Originally Posted by Walter Nissen
Those instructions lump ggnfs , python , msieve and factmsieve
all together in a single directory which also contains a
working directory .
But when I saw :
..\factMsieve.py example
it didn't seem plausible DOS would find python , but it did .
There's a python27.dll in my path , but no python.exe .
factmsieve.py does not use file redirection or command line piping to get any input. As you have seen, it is run like 'factmsieve.py example'; it then looks for different files that start with the name example (like example.n, example.poly, example.fb, etc.). So, you don't have to put python.exe in front of it because 1) there is a file association built into Windows when you install Python (so you can just type factmsieve.py and it will run) and 2) it does not use redirection.

Now, you can just type ecm.py at the command line, but this starts ecm.py through the file association mechanism, and in that case neither file redirection nor command line piping work correctly. This is why you need to have python.exe in front of ecm.py. This problem may exist in a Linux environment also, but I have no way to test that so I'm not sure.

2011-05-05, 18:22   #11
Walter Nissen
Nov 2006
Terra
2×3×13 Posts

Thank you for this excellent explanation .

Very curious .
Then again , a lot of things are curious in DOS boxes .
Maybe something about Windows features being cobbled onto DOS .

Cheers ,

Walter

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.