2020-06-19, 18:47 #2
EdH ("Ed Hall")

How I install Aliqueit onto my Ubuntu Machines (continued)

Setting up Aliqueit to distribute both ECM and GNFS across several machines

For this section, if you don't already have a working ecmpi cluster, you should follow:
How I Install and Run ecmpi Across Several Ubuntu Machines

Also make sure your Aliqueit (server) main machine can freely communicate with the main ecmpi node. You will need to send (scp) files back and forth between the two.

You will also need a working set of CADO-NFS machines to distribute the work beyond the local machine. If you set up the CADO-NFS main machine, it can factor by itself, but you should also easily be able to add clients. Again, if you need to set up a CADO-NFS system, you can review the information here:
How I Run a Larger Factorization Via CADO-NFS on Several Ubuntu Machines

I'll cover ecmpi first. The suggested use for the ECM programs is to pipe the composite into the ECM program, with various switches to provide additional control. All that is needed to redirect this call to an ecmpi cluster is a script that captures the call, transforms it into what ecmpi needs and writes any factors found somewhere Aliqueit can recognize them when the script returns control.

In practice, I have two scripts and they might not be well formed. But they work for me as is, and I adjust them if needed. I would consider them a starting point that you can build on for your own needs.

The first script (aliECM.sh) works directly with Aliqueit. It captures the call for ECM and translates it into composite, curves and B1 value, which are then written to an intermediate file (aliqueitECM). The intermediate file is sent (via scp) to the ecmpi controlling node. Then aliECM.sh waits for the results file (aliECMresults) from the ecmpi run. When the results are returned, the script looks in the last line for an asterisk.
If one is found, this signifies a factor was returned. In actuality, at least two are returned if any are found, but I just work with the last one. This factor is reformatted into lines that Aliqueit can recognize and placed in a temp file (the name of which was also provided in the ECM call) for Aliqueit to find.

Both scripts use inotifywait for their waiting. You will need to install inotify-tools to use it.

The script that calls the ecmpi process (aliqueitECM.sh) sits in a loop waiting for the aliqueitECM file to appear. When it does, the values from the intermediate file are used to call the ecmpi cluster to action and the output is redirected into aliECMresults. When the full ecmpi run is complete, the result file is sent to the Aliqueit machine, where it is processed as described in the previous paragraph.

The first thing you'll want to do is to change the following in aliqueit.ini:

Code:
ecmpy_cmd = bash /Math/Aliqueit/aliECM.sh

aliECM.sh:

Code:
#!/bin/bash
# delete results file so inotifywait will wait
rm -f aliECMresults
# capture composite from pipe
read lastpipe in
# send composite to aliqueitECM file
echo "N: $lastpipe" >aliqueitECM
# send # of curves to run to aliqueitECM file
# The actual number run will be the next multiple
# of cores above the # of curves requested
echo "curves: $3" >>aliqueitECM
# send B1 value to aliqueitECM
echo "B1: $6" >>aliqueitECM
# send aliqueitECM to the master node of the
# ecmpi cluster
scp aliqueitECM mpi@:ecmpi/work/.
# wait for aliECMresults
echo "Waiting for results. . ."
while [ ! -e aliECMresults ]
do
  inotifywait -qq ./
done
# check aliECMresults to see if factors were returned
check=$(grep "\*" aliECMresults)
# if yes, process the last line into Aliqueit familiar
# lines and tack them to the end of the temp filename
# provided by Aliqueit with the call
if [ ${#check} -gt 1 ]
then
  # position of the asterisk that marks a factor
  ind=$(expr index "$check" \*)
  factor=${check:$ind+1}
  echo "********** Factor found in step 2: $factor" >>"$5"
  echo "Found prime factor of ${#factor} digits: $factor" >>"$5"
fi
# give everything a chance to settle before returning control
# to Aliqueit. (not really needed)
sleep 1

Here's aliqueitECM.sh, the script that handles the ecmpi process:

Code:
#!/bin/bash
# this finds out how many cores (threads, actually) are available
# by summing the last character of each hostfile line
function countcores {
  ccount=0
  exec <"../hostfile"
  while read line
  do
    let ccount=${ccount}+${line:${#line}-1}
  done
  # echo "$ccount cores available"
}

# This provides an "endless" loop
# (create a file named aliqueitECMstop to end it)
while [ ! -e aliqueitECMstop ]
do
  # This waits for aliqueitECM to show up with the values
  while [ ! -e aliqueitECM ]
  do
    inotifywait -qqt 600 ./
  done
  # clear the previous results
  rm -f aliECMresults
  # another check that aliqueitECM has appeared
  if [ -e aliqueitECM ]
  then
    # This extracts the composite, curves and B1 to use
    exec <"aliqueitECM"
    while read line
    do
      case $line in
        "curves:"*) curves=${line:8};;
        "N: "*) comp=${line:3};;
        "B1: "*) b1=${line:4};;
      esac
    done
    countcores
    echo "Running $curves curves with $ccount cores @ $b1 on ${comp:0:7}...${comp:${#comp}-2}<${#comp}>"
    # This invokes ecmpi and sends the output to the result file
    mpirun -np $ccount --hostfile ../hostfile ../ecmpi -N $comp -nb $curves -B1 $b1 >>aliECMresults
    # This sends the result file back to the Aliqueit machine
    scp aliECMresults :Math/GamAli/.
    # Clear current aliqueitECM to wait for the next one
    rm aliqueitECM
  fi
done

Let's move to the GNFS process now.
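Before leaving the ECM side, it can help to exercise the intermediate-file handoff on one machine, without scp or a cluster. The following is only a minimal sketch, not part of my working scripts: write_request and read_request are hypothetical helper names, and the sketch assumes the three-line "N:", "curves:" and "B1:" layout that aliECM.sh writes. It strips the label by prefix rather than by a fixed character offset, so it accepts the label with or without a trailing space.

```shell
#!/bin/bash
# Sketch only: write_request and read_request are hypothetical names,
# not functions used by aliECM.sh or aliqueitECM.sh.

# Write a request file in the three-line layout aliECM.sh produces.
# usage: write_request <file> <composite> <curves> <B1>
write_request() {
    printf 'N: %s\ncurves: %s\nB1: %s\n' "$2" "$3" "$4" >"$1"
}

# Parse a request file back into the variables comp, curves and b1.
# usage: read_request <file>
read_request() {
    comp="" curves="" b1=""
    while IFS= read -r req_line
    do
        case $req_line in
            "N:"*)      comp=${req_line#N:};           comp=${comp# } ;;
            "curves:"*) curves=${req_line#curves:};    curves=${curves# } ;;
            "B1:"*)     b1=${req_line#B1:};            b1=${b1# } ;;
        esac
    done <"$1"
}
```

For example, running write_request on a scratch file with a composite, 24 curves and a B1 of 1000000, then read_request on the same file, should leave those three values in comp, curves and b1. If they do not match, the problem is in the file layout rather than in scp or ecmpi.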
Aliqueit uses a call to either factmsieve.py or factMsieve.pl to perform GNFS. We're going to use that call to invoke CADO-NFS instead. In this case, Aliqueit creates a directory of the form "ggnfs_" and places a file (test.n) in the directory. The file test.n has only "n: " in it. Then it changes to the ggnfs... directory and calls the chosen program with the name "test" as part of the call. Once control is returned, Aliqueit looks for factors in the ggnfs... directory, in a file called "ggnfs.log". The script alitest.sh will be used to call CADO-NFS with the composite, then retrieve the factors from the CADO-NFS log file and place them in ggnfs.log.

First, you need a working setup of CADO-NFS. If you need to review anything to get your setup running, you can check:
How I Install CADO-NFS onto my Ubuntu Machines
and
How I Run a Larger Factorization Via CADO-NFS on Several Ubuntu Machines

Next, edit the aliqueit.ini file:

Code:
ggnfs_cmd = bash /Math/Aliqueit/alitest.sh

CADO-NFS, by default, uses a temporary directory, which it creates with a random name. It also takes the base name for all the files from the name given in the appropriate parameters file. Keeping jobs separated this way helps CADO-NFS run smoothly and keeps it from crashing, but it makes it challenging for an external program to harvest data. To make retrieval of the factors easier, we will provide the working directory name and base name for our runs ourselves. Because every run then reuses the same names, we also need to delete whatever is left from the previous run before it can corrupt a new one.
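The handoff described above, with the composite read from test.n and the factors written to ggnfs.log, can be tried on its own before CADO-NFS is involved. This is a rough sketch under stated assumptions, not the script I actually run: read_composite and write_ggnfs_log are hypothetical helper names, test.n is assumed to hold a single line starting with "n: ", and the "Number:", "N =" and "factor:" lines simply mirror what alitest.sh writes. I have not checked which of those lines Aliqueit strictly requires.

```shell
#!/bin/bash
# Sketch only: read_composite and write_ggnfs_log are hypothetical names.

# Print the composite stored in <basename>.n, dropping the "n: " label.
# usage: read_composite <basename>
read_composite() {
    rc_line=$(head -n 1 "$1.n")
    printf '%s\n' "${rc_line#n: }"
}

# Write ggnfs.log in the current directory, one "factor:" line per factor.
# usage: write_ggnfs_log <composite> <factor>...
write_ggnfs_log() {
    wl_comp=$1
    shift
    {
        echo "Number: test"
        echo "N = $wl_comp"
        for wl_factor in "$@"
        do
            echo "factor: $wl_factor"
        done
    } >ggnfs.log
}
```

In a scratch directory holding a test.n with "n: 8633", read_composite test prints the bare number, and write_ggnfs_log 8633 89 97 produces a ggnfs.log with two factor lines, which is a quick way to see exactly what Aliqueit will be given.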
The alitest.sh script takes care of that cleanup, along with calling CADO-NFS and harvesting the factors:

alitest.sh:

Code:
#!/bin/bash
#retrieve the composite from the call
comp=$(cat $1.n)
comp=${comp:3}
#remove work from prior number so we can use the same names
rm -rf /tmp/alicado
#move into CADO-NFS directory
cd $HOME/Math/cado-nfs
#run CADO-NFS (change options as desired)
#CADO-NFS will provide a line to use to attach clients
./cado-nfs.py $comp server.whitelist= server.port= tasks.workdir=/tmp/alicado name=alicado
#harvest factors from CADO-NFS's log file
factors=$(grep "Factors: " /tmp/alicado/alicado.log)
factors=${factors:54}
IFS=' ' read -r -a farray <<< "$factors"
numfacts=${#farray[@]}
cd $HOME/Math/GamAli/ggnfs_${comp}
#create ggnfs.log file for Aliqueit to find the factors
echo "Number: test" >ggnfs.log
echo "N = $comp" >>ggnfs.log
fcount=1
while [ $fcount -lt $numfacts ]
do
  echo "factor: ${farray[$fcount]}" >>ggnfs.log
  let fcount=${fcount}+1
done

At this point you should have a fully working setup and can set Aliqueit to work. Note that using several machines in the above fashion means that a failure of any of the machines during ecmpi or the polyselect portion of CADO-NFS can interrupt the factoring process.

Last fiddled with by EdH on 2020-08-09 at 15:54