2018-03-17, 19:38   #1
EdH
 
 
"Ed Hall"
Dec 2009
Adirondack Mtns

How I Run a Larger Factorization Using Msieve, gnfs and factmsieve.py on Several Ubuntu Machines

(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

This thread will explain the steps I use to run msieve and gnfs on several computers which are already running Ubuntu and have msieve and the ggnfs package installed* and tested per:

How I Install msieve onto my Ubuntu Machines
and
How I Install ggnfs onto my Ubuntu Machines

*In this instance "install" refers only to acquiring and compiling the msieve and ggnfs packages. The binaries and scripts will have to be called using their respective paths.

I will be creating the folders Math/factorMain and Math/factorMain/factorWork on every machine for this example. All of my machines are able to communicate via ssh and I will be using sshfs to map the factorWork folder of the main machine onto the factorWork folder of each of the others. You can use other forms of mapping, but basically, each machine needs to see the factorWork folder of the main machine. Again, adjust anything you need to for local folders, etc.

In my case, I have developed scripts for all my machines, but for this thread, I will only supply the command lines that will run everything because every script is machine specific. A reader can build their own scripts easily from the commands given.

The sieving and Linear Algebra (LA) are controlled and driven by a separate copy of the factmsieve.py script on each machine. The script has some limitations along with its assets. One of the limitations is that it only runs polynomial pair selection on one machine. Assets include aggregating the relations and automatically running the LA and subsequent stages.

Since factmsieve.py only runs polynomial pair selection on a single machine, I use msieve on each machine to generate poly pairs. These are later combined, and the best one is selected via a script from user chris2be8 in this thread. I also restrict the poly time instead of letting msieve choose, because msieve isn't aware of the other machines. Note that the poly I end up with may not be the best one I could find with more time, and this method may not provide the best results for composites larger than 150 decimal digits. Where I use a value of "poly_deadline=300" in my example, a larger value may well be worth using.

First, let's make our folders and sub-folders. Open a terminal on each machine and type:
Code:
mkdir Math/factorMain
mkdir Math/factorMain/factorWork
Now, acquire the factmsieve.py script from:

Factoring Large Composite Numbers

Place a copy into the Math/factorMain folder on every machine. On each machine, open its factmsieve.py file in a text editor and look for the following section:
Code:
# Set binary directory paths
GGNFS_PATH = '/home/<user>/Math/ggnfs/bin/'
MSIEVE_PATH = '/home/<user>/Math/msieve/'

# Set the number of CPU cores and threads
NUM_CORES = 4
THREADS_PER_CORE = 2

USE_CUDA = False
GPU_NUM = 0
MSIEVE_POLY_TIME_LIMIT = 0
MIN_NFS_BITS = 264
Make sure that the PATHs, COREs and CUDA are all set properly for each individual machine. Replace <user> above with your username.

Save/close the factmsieve.py file(s).
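
If you would rather not edit each copy by hand, the same changes can be scripted. The following is only a sketch: configure_factmsieve is a made-up helper name, it assumes the settings lines look exactly like the excerpt above, and it relies on the GNU sed that ships with Ubuntu. It only touches the two paths and NUM_CORES; the CUDA settings still need a manual check.

```shell
# Hypothetical helper (not part of factmsieve.py): patch the settings block
# of one factmsieve.py copy in place. Assumes the variable lines start at
# column 0 and look like the excerpt above.
configure_factmsieve() {
    local script="$1"   # path to this machine's factmsieve.py
    local user="$2"     # username that owns the Math folder
    local cores="$3"    # number of CPU cores on this machine
    sed -i \
        -e "s|^GGNFS_PATH = .*|GGNFS_PATH = '/home/${user}/Math/ggnfs/bin/'|" \
        -e "s|^MSIEVE_PATH = .*|MSIEVE_PATH = '/home/${user}/Math/msieve/'|" \
        -e "s|^NUM_CORES = .*|NUM_CORES = ${cores}|" \
        "$script"
}
```

A call such as configure_factmsieve ~/Math/factorMain/factmsieve.py "$USER" 4 would then set both paths and the core count in one step. Lines that do not match, such as THREADS_PER_CORE, are left alone.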

Go to the post referenced above and acquire a copy of refindpoly.pl.txt and remove the .txt from the name to leave refindpoly.pl. Place this file in the factorMain folder on the main machine.

Let's choose a composite and run it using three machines. For comparison, I will use the 94-digit composite chosen for the CADO-NFS multi-machine example:
Code:
1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281
In a terminal, on the main machine, go to the Math/factorMain/factorWork folder and enter:
Code:
echo "n: 1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281" > comp94.n
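
The .n file is just one line: the text "n: " followed by the number. For other composites, a small helper can guard against typos and stray characters. This is a sketch only; write_n_file is a hypothetical name and is not part of msieve or ggnfs.

```shell
# Hypothetical helper: write a composite to a .n file in the
# "n: <number>" form used in this thread.
write_n_file() {
    local number="$1" outfile="$2"
    # Refuse anything that is not a plain string of decimal digits.
    case "$number" in
        ''|*[!0-9]*)
            echo "not a decimal integer: $number" >&2
            return 1
            ;;
    esac
    printf 'n: %s\n' "$number" > "$outfile"
    # Report the digit count so it can be compared with the expected size.
    echo "wrote ${#number}-digit number to $outfile"
}
```

Called as write_n_file followed by the composite and comp94.n, it writes the file and reports how many digits it saw, which is a quick check that nothing was lost in a copy and paste.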
Now go to the other machines and map the main machine's factorWork directory into the local machine's factorWork directory. I use sshfs in the following manner:
Code:
sshfs <mainmachine@IP>:Math/factorMain/factorWork ~/Math/factorMain/factorWork
Check to see if the map is working by going to the Math/factorMain folder and typing:
Code:
ls factorWork
If it is working you should see comp94.n listed. If it isn't, work out the trouble before continuing.
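
The same check can be put into a script so that nothing starts on a machine whose mapping has failed. A minimal sketch, where check_shared_folder is a hypothetical helper name and the marker file is simply comp94.n from this example:

```shell
# Hypothetical helper: confirm that a file created on the main machine
# is visible through the local factorWork folder.
check_shared_folder() {
    local dir="$1"      # local factorWork folder (the sshfs mount point)
    local marker="$2"   # file that should exist on the main machine
    if [ -f "$dir/$marker" ]; then
        echo "ok: $marker is visible in $dir"
    else
        echo "problem: $marker not found in $dir; check the mapping" >&2
        return 1
    fi
}
```

For example, check_shared_folder ~/Math/factorMain/factorWork comp94.n returns a non-zero status when the file is missing, so a script can stop before running msieve.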

On the main machine, move into the factorWork folder. If you've run anything in this folder previously, there may be a hidden file that will cause trouble later. Use the following command to remove it:
Code:
rm -f .params
Now, let's start creating poly candidates from all three machines. On the main machine enter:
Code:
../../msieve/msieve -i comp94.n -s comp94.1 -nf comp94-1.fb -t 8 -np "poly_deadline=300 1,3000"
On the other machines, while in the factorWork folders, enter:
Code:
../../msieve/msieve -i comp94.n -s comp94.2 -nf comp94-2.fb -t 8 -np "poly_deadline=300 3001,6000"
and
Code:
../../msieve/msieve -i comp94.n -s comp94.3 -nf comp94-3.fb -t 8 -np "poly_deadline=300 6001,9000"
For more information on the above commands see the readme.nfs file in the msieve folder.
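
The three commands differ only in the machine number and the search range, so they can be generated instead of typed. The sketch below prints the command for a given machine without running it. The poly_cmd name and the idea of deriving the range from the machine number are assumptions for illustration; the 3000-wide step, the thread count and the deadline are copied from the commands above.

```shell
# Hypothetical helper: print (not run) the msieve poly search command
# for machine number i, using non-overlapping ranges of width 3000.
poly_cmd() {
    local i="$1"                # machine number: 1, 2, 3, ...
    local name="${2:-comp94}"   # job name, comp94 in this example
    local step=3000
    local lo=$(( (i - 1) * step + 1 ))   # 1, 3001, 6001, ...
    local hi=$(( i * step ))             # 3000, 6000, 9000, ...
    echo "../../msieve/msieve -i ${name}.n -s ${name}.${i} -nf ${name}-${i}.fb -t 8 -np \"poly_deadline=300 ${lo},${hi}\""
}
```

Running poly_cmd 2 prints the second command shown above. Adjust -t and the paths for each machine before using the output.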

Since poly_deadline=300 limits each search to 300 seconds, all the machines should be finished about five minutes after the last one was started. Now we want the refindpoly.pl perl script by chris2be8 that was placed in the main machine's factorMain folder earlier. Call it while in the factorWork folder:
Code:
perl ../refindpoly.pl comp94
This will choose the best polynomial and create the file comp94.poly. Now we're ready to run the factmsieve.py script to finish factoring the composite. On the main machine, while in the factorWork folder, enter:
Code:
python ../factmsieve.py comp94.n 1 3
The 1 and 3 signify that this is the first machine of three. It will sieve and perform post sieving activities. On the two other machines, from within the factorWork folder, enter:
Code:
python ../factmsieve.py comp94.n 2 3
and
Code:
python ../factmsieve.py comp94.n 3 3
They will sieve and stop after completing their last sieving processes. The first machine will continue through the final processing and the factors will be placed in the comp94.log file. You can review the log or you can see the factors with the following command:
Code:
cat comp94.log | grep -i "factor:"
Code:
p45 factor: 179231227423414197451601378315047105853969879
p50 factor: 11022834899950977366949652871606409040980556071039
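
If only the factor values are wanted, for example to feed them to another script, the last field of each matching line can be picked out. A minimal sketch, assuming each result line ends with the factor itself as in the output above; list_factors is a hypothetical helper name.

```shell
# Hypothetical helper: print only the factor values found in a log file.
# Assumes the factor is the last field on each "factor:" line; any
# timestamp or label in front of it is ignored.
list_factors() {
    grep -i "factor:" "$1" | awk '{ print $NF }'
}
```

Calling list_factors comp94.log prints one factor per line.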

Last fiddled with by EdH on 2019-08-31 at 21:15