mersenneforum.org > EdH How I Run a Larger Factorization Using Msieve, gnfs and factmsieve.py on Several Ubuntu Machines

2018-03-17, 19:38   #1
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

5·17·53 Posts
How I Run a Larger Factorization Using Msieve, gnfs and factmsieve.py on Several Ubuntu Machines

(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

This thread will explain the steps I use to run msieve and gnfs on several computers which are already running Ubuntu and have msieve and the ggnfs package installed* and tested per:

How I Install msieve onto my Ubuntu Machines
and
How I Install ggnfs onto my Ubuntu Machines

*In this instance "install" refers to acquiring and compiling the msieve and ggnfs packages only. The binaries and scripts will have to be called using their respective paths.

I will be creating the folders Math/factorMain and Math/factorMain/factorWork on every machine for this example. All of my machines are able to communicate via ssh and I will be using sshfs to map the factorWork folder of the main machine to the factorWork folders of all the others. You can use other forms of mapping, but basically, you need each machine to see the factorWork folder of the main machine. Again, adjust anything you need to for local folders, etc.

In my case, I have developed scripts for all my machines, but for this thread, I will only supply the command lines that will run everything, because every script is machine specific. A reader can easily build their own scripts from the commands given.

The sieving and Linear Algebra are controlled and driven by separate factmsieve.py scripts on each machine, but the script has some limitations along with its assets. One of the limitations is that it only runs one machine for polynomial pair selection. Assets include aggregation of relations and automatically running the LA and subsequent stages.

Since factmsieve.py only runs a single machine for polynomial pair selection, I use msieve on each machine to generate poly pairs that are later combined, and the best one is selected via a script from user chris2be8 in this thread. I also restrict the poly time instead of letting msieve choose, because msieve isn't aware of the other machines. Note that this means I sometimes don't have the best poly that I might acquire with more time, and this method may not provide the best results for composites of more than 150 decimal digits. Where I use a value of "poly_deadline=300" in my example, a larger value may well be worth using.

First, let's make our folders and sub-folders. Open a terminal on each machine and type:

Code:
mkdir Math/factorMain
mkdir Math/factorMain/factorWork

Now, acquire the factmsieve.py script from:

http://gladman.me.uk/computing/factmsieve.py

Place a copy into the Math/factorMain folder on every machine.

On every machine, using a text editor, open the factmsieve.py file and look for the following section:

Code:
# Set binary directory paths
GGNFS_PATH = '/home//Math/ggnfs/bin/'
MSIEVE_PATH = '/home//Math/msieve/'

# Set the number of CPU cores and threads
NUM_CORES = 4
THREADS_PER_CORE = 2
USE_CUDA = False
GPU_NUM = 0
MSIEVE_POLY_TIME_LIMIT = 0
MIN_NFS_BITS = 264

Make sure that the PATHs, COREs and CUDA are all set properly for each individual machine. Fill in your username in the /home/ paths above. Save/close the factmsieve.py file(s).

Go to the post referenced above and acquire a copy of refindpoly.pl.txt and remove the .txt from the name to leave refindpoly.pl. Place this file in the factorMain folder on the main machine.

Let's choose a composite and run it using three machines.

For comparison, I will use the 94 digit composite chosen for the CADO-NFS multi-machine example:

Code:
1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281

In a terminal, on the main machine, go to the Math/factorMain/factorWork folder and enter:

Code:
echo "n: 1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281" > comp94.n

Now go to the other machines and map the main machine's factorWork directory into the local machine's factorWork directory. I use sshfs in the following manner, with the main machine's user and host name in front of the colon:

Code:
sshfs :Math/factorMain/factorWork ~/Math/factorMain/factorWork

Check to see if the map is working by going to the Math/factorMain folder and typing:

Code:
ls factorWork

If it is working, you should see comp94.n listed. If it isn't, work out the trouble before continuing.

On the main machine, move into the factorWork folder. If you've run anything in this folder previously, there may be a hidden file that will cause trouble later. Use the following command to remove it:

Code:
rm .params

Now, let's start creating poly candidates from all three machines. On the main machine enter:

Code:
../../msieve/msieve -i comp94.n -s comp94.1 -nf comp94-1.fb -t 8 -np "poly_deadline=300 1,3000"

On the other machines, while in the factorWork folders, enter:

Code:
../../msieve/msieve -i comp94.n -s comp94.2 -nf comp94-2.fb -t 8 -np "poly_deadline=300 3001,6000"

and

Code:
../../msieve/msieve -i comp94.n -s comp94.3 -nf comp94-3.fb -t 8 -np "poly_deadline=300 6001,9000"

For more information on the above commands, see the readme.nfs file in the msieve folder.

Five minutes after the start of the last machine, they should all be finished. Now, we're going to want a perl script which can be downloaded from this post by chris2be8.

Place this file in the main machine's factorMain folder and call it while in the factorWork folder:

Code:
perl ../refindpoly.pl comp94

This will choose the best polynomial and create the file comp94.poly.

Now we're ready to run the factmsieve.py script to finish factoring the composite. On the main machine, while in the factorWork folder, enter:

Code:
python ../factmsieve.py comp94.n 1 3

The 1 and 3 signify that this is the first machine of three. It will sieve and perform post sieving activities.

On the two other machines, from within the factorWork folder, enter:

Code:
python ../factmsieve.py comp94.n 2 3

and

Code:
python ../factmsieve.py comp94.n 3 3

They will sieve and stop after completing their last sieving processes. The first machine will continue through the final processing and the factors will be placed in the comp94.log file. You can review the log or you can see the factors with the following command:

Code:
cat comp94.log | grep -i "factor:"

Code:
p45 factor: 179231227423414197451601378315047105853969879
p50 factor: 11022834899950977366949652871606409040980556071039

Last fiddled with by EdH on 2022-01-04 at 04:11
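As a sanity check on the result, the factor lines can be multiplied back together and compared with the composite. The following is only a minimal Python sketch and is not part of factmsieve.py or msieve: the function names are made up for illustration, and it assumes each factor appears in the log as "factor:" followed by the decimal number, as in the grep output shown in the first post. A log that holds more than one run could list the same factor twice, which this sketch does not handle.

```python
import re


def factors_from_log(log_text):
    """Collect the integers that follow 'factor:' anywhere in the log text.

    Assumption: every factor is reported as 'factor: <decimal digits>',
    matching the grep output in the first post.
    """
    pattern = re.compile(r"factor:\s*(\d+)", re.IGNORECASE)
    return [int(match.group(1)) for match in pattern.finditer(log_text)]


def product_matches(n, factors):
    """Return True when the product of the factors equals n."""
    product = 1
    for factor in factors:
        product *= factor
    return product == n


if __name__ == "__main__":
    n = int("1975636228803860706131861386351317508435774072460176838764"
            "200263234956507563682801432890234281")
    # Sample text copied from the post; with a real run you would read
    # comp94.log instead, e.g. open("comp94.log").read().
    sample = (
        "p45 factor: 179231227423414197451601378315047105853969879\n"
        "p50 factor: 11022834899950977366949652871606409040980556071039\n"
    )
    found = factors_from_log(sample)
    print("factors found:", found)
    print("product equals n:", product_matches(n, found))
```

Run from the factorWork folder, the same two functions could be pointed at the real comp94.log instead of the sample text.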
2019-08-19, 10:31   #2
aokle

Aug 2019

3 Posts
I do not know about the max_coeff=9000, Help Me

Hello EdH,

"msieve -i comp94.n -s comp94.3 -nf comp94-3.fb -t 8 -np "poly_deadline=300 6001,9000"".

Can max_coeff be set to any integer?
Is "-t 8" = (NUM_CORES = 4) * (THREADS_PER_CORE = 2)?
When I factor a 512-bit RSA number, is poly_deadline=300 acceptable?

Thank you.

2019-08-19, 13:34   #3
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

10110000011111₂ Posts

Quote:
 Originally Posted by EdH ... This thread will explain the steps I use to run msieve and gnfs on several computers which are already running Ubuntu and have msieve and the ggnfs package installed* and tested per: ...
Excellent!

I've been using factMsieve.pl et al. for years but haven't yet accumulated enough round tuits to automate it to the degree which you have achieved.

2019-08-19, 21:50   #4
EdH

"Ed Hall"
Dec 2009

5×17×53 Posts

Quote:
 Originally Posted by aokle Hello EdH, "msieve -i comp94.n -s comp94.3 -nf comp94-3.fb -t 8 -np "poly_deadline=300 6001,9000"". Can max_coeff be set to any integer? Is "-t 8" = (NUM_CORES = 4) * (THREADS_PER_CORE = 2)? When I factor a 512-bit RSA number, is poly_deadline=300 acceptable? Thank you.
Hi aokle,

RSA-512 would mean a much larger number than my example (about 155 decimal digits versus 94), so you would need to adjust your parameters accordingly. A poly_deadline of 300 (5 minutes) would most assuredly not find a good enough polynomial pair for your number. I'm not sure I understand the other question, but you should adjust all the parameters based on how many machines will be used and what range you would like to use for your polynomial pair search.
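For a sense of scale, the size gap can be worked out directly. This is just a small Python sketch with an illustrative function name, nothing from msieve or factmsieve.py; it counts the decimal digits of the largest 512-bit integer and of the example composite from the first post.

```python
def decimal_digits_for_bits(bits):
    """Decimal digit count of the largest integer that fits in `bits` bits."""
    return len(str(2 ** bits - 1))


if __name__ == "__main__":
    example = int("1975636228803860706131861386351317508435774072460176838764"
                  "200263234956507563682801432890234281")
    print("512-bit modulus digits:", decimal_digits_for_bits(512))  # -> 155
    print("example composite digits:", len(str(example)))
```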

Having said all the above, I've moved to using CADO-NFS across several machines, letting it make "almost" all the parameter choices. All but the Linear Algebra phase will be run pretty much automatically across all machines. LA will be run on the server machine only. I have the setup for CADO-NFS in a similar thread to this one.

Ed

2019-08-19, 21:56   #5
EdH

"Ed Hall"
Dec 2009

5×17×53 Posts

Quote:
 Originally Posted by xilman Excellent! I've been using factMsieve.pl et al. for years but haven't yet accumulated enough round tuits to automate it to the degree which you have achieved.
Thanks xilman. I've never really played with the .pl version. I came on the scene when Brian was just about finalizing his .py code and got more familiar with it. Now I pretty much just use CADO-NFS, although I can factor a large composite quicker using a hybrid CADO-NFS/msieve script I occasionally play with.

2019-08-20, 12:20   #6
aokle

Aug 2019

3 Posts

Hi EdH,

Thank you for your help. I'm a little uncertain about:

"poly_deadline=300 1,3000"
"poly_deadline=300 3001,6000"
"poly_deadline=300 6001,9000"

When I execute "./msieve --help", I see something like this:

poly_deadline=X stop searching after X seconds (0 means search forever)
X,Y same as 'min_coeff=X max_coeff=Y'

but I still do not understand: is the coeff range 1-9000? Can max_coeff (9000) be another integer in your case?

Thank you.

2019-08-20, 13:26   #7
EdH

"Ed Hall"
Dec 2009

10631₈ Posts

Quote:
 Originally Posted by aokle Hi EdH, Thank you for your help. I'm a little uncertain about: "poly_deadline=300 1,3000" "poly_deadline=300 3001,6000" "poly_deadline=300 6001,9000" When I execute "./msieve --help", I see something like this: poly_deadline=X stop searching after X seconds (0 means search forever) X,Y same as 'min_coeff=X max_coeff=Y' but I still do not understand: is the coeff range 1-9000? Can max_coeff (9000) be another integer in your case? Thank you.
Hi aokle,

If you run an instance without the min/max_coeff, the coefficient value is chosen at random from 1 through the max_coeff that is chosen by msieve based on the composite. If you are running multiple machines there is a "slight" chance of more than one running the same coeff. To prevent this in my example, I set each machine to use a unique range. That way all the randoms of one are outside all the randoms of the others. As the composite gets larger, the max_coeff for the entire group of machines gets larger, but it is divided between all the machines using the min_coeff= and max_coeff= values.

The poly_deadline chosen was just for the example size composite. With three machines searching, that was roughly equal to 15 minutes of searching across all three machines. In practice, I choose the total time I want to search and divide that by the number of machines I'll be using, similar to the max/min_coeffs.
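The division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the total search time and the overall max_coeff are values you pick yourself (msieve is not asked for them), and the output is just the quoted string that follows -np in the commands from the first post.

```python
def split_search(total_seconds, max_coeff, machines):
    """Divide a total search time and the range 1..max_coeff among machines.

    Returns one (poly_deadline, min_coeff, max_coeff) tuple per machine.
    The ranges are contiguous and never overlap, so no two machines can
    pick the same coefficient.
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    if max_coeff < machines:
        raise ValueError("max_coeff is too small to split this many ways")
    deadline = total_seconds // machines
    step = max_coeff // machines
    plan = []
    for index in range(machines):
        low = index * step + 1
        # The last machine absorbs whatever is left of the range.
        high = max_coeff if index == machines - 1 else (index + 1) * step
        plan.append((deadline, low, high))
    return plan


def msieve_np_args(plan):
    """Format each tuple as the quoted argument that follows msieve's -np."""
    return ['"poly_deadline=%d %d,%d"' % entry for entry in plan]


if __name__ == "__main__":
    # 15 minutes in total over three machines, coefficients 1 through 9000.
    for argument in msieve_np_args(split_search(900, 9000, 3)):
        print(argument)
```

With 900 seconds, a max_coeff of 9000 and three machines, this gives the same three arguments used in the example: 300 seconds each over 1,3000, 3001,6000 and 6001,9000.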

Ed

2019-08-21, 02:26   #8
aokle

Aug 2019

3₁₀ Posts
Excellent!

Quote:
 Originally Posted by EdH Hi aokle, If you run an instance without the min/max_coeff, the coefficient value is chosen at random from 1 through the max_coeff that is chosen by msieve based on the composite. If you are running multiple machines there is a "slight" chance of more than one running the same coeff. To prevent this in my example, I set each machine to use a unique range. That way all the randoms of one are outside all the randoms of the others. As the composite gets larger, the max_coeff for the entire group of machines gets larger, but it is divided between all the machines using the min_coeff= and max_coeff= values. The poly_deadline chosen was just for the example size composite. With three machines searching, that was roughly equal to 15 minutes of searching across all three machines. In practice, I choose the total time I want to search and divide that by the number of machines I'll be using, similar to the max/min_coeffs. Ed
Excellent!

2022-01-03, 20:43   #9
Yup

Jan 2022

1 Posts

Quote:
 Originally Posted by EdH (Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.) .... Factoring Large Composite Numbers
Hi Edh and everybody!
Please send a link where I can get factmsieve.py for factoring large numbers, because this link doesn't work.

Last fiddled with by Uncwilly on 2022-01-04 at 00:41

2022-01-07, 16:31   #10
EdH

"Ed Hall"
Dec 2009

5·17·53 Posts

Quote:
 Originally Posted by Yup Hi Edh and everybody! Please send a link where I can get factmsieve.py for factoring large numbers, because this link doesn't work.
Thanks for the note. The link has been updated.
