2020-06-19, 18:47   #2
EdH
 
 
"Ed Hall"
How I install Aliqueit onto my Ubuntu Machines (continued)

Setting up Aliqueit to distribute both ECM and GNFS across several machines

For this section, if you don't already have a working ecmpi cluster, you should follow:

How I Install and Run ecmpi Across Several Ubuntu Machines

Also make sure your main Aliqueit (server) machine can freely communicate with the main ecmpi node. You will need to send (scp) files back and forth between the two.

You will also need a working set of CADO-NFS machines to distribute work beyond the local machine. Once the CADO-NFS main machine is set up, it can factor by itself, but you should also easily be able to add clients. Again, if you need to set up a CADO-NFS system, you can review the information here:

How I Run a Larger Factorization Via CADO-NFS on Several Ubuntu Machines

I'll cover ecmpi first. The suggested use for the ECM programs is to pipe the composite into the ECM program, with various switches to provide additional control. All that is needed to redirect this call to an ecmpi cluster is a script that captures the call, transforms it into what ecmpi needs and writes any factors found to a place where Aliqueit can recognize them when the script returns control. In practice, I have two scripts, and they might not be well formed. But they work for me as is, and I adjust them when needed. Consider them a starting point that you can build on for your own needs.

The first script (aliECM.sh) works directly with Aliqueit. It captures the call for ECM and translates it into composite, curves and B1 value, which are then written to an intermediate file (aliqueitECM). The intermediate file is sent (via scp) to the ecmpi controlling node. Then aliECM.sh waits for the results (aliECMresults) from the ecmpi run. When the results are returned, the script looks in the last line for an asterisk. If one is found, a factor was returned. In actuality, at least two are returned if any are found, but I just work with the last one. This factor is reformatted into lines that Aliqueit can recognize and placed in a temp file (the name of which was also provided in the ECM call) for Aliqueit to find.

Both scripts use inotifywait for their waiting. You will need to install inotify-tools to use it. The script that calls the ecmpi process (aliqueitECM.sh) sits in a loop waiting for the aliqueitECM file to appear. When it does, the values from the intermediate file are used to call the ecmpi cluster to action and the output is redirected into aliECMresults. When the full ecmpi run is complete, the result file is sent to the Aliqueit machine, where it is processed as described in the previous paragraph.
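
To make the hand-off format concrete, here is a minimal sketch of writing and reading the intermediate file. The three keys (N, curves, B1) are the ones my scripts exchange, but the function names write_request and read_request are only illustrative and are not part of the scripts below.

```shell
#!/bin/bash
# Hypothetical helpers; only the "N:", "curves:" and "B1:" keys come
# from the scripts in this post.

# Write composite, curve count and B1 to a request file.
write_request() {
  local file=$1 comp=$2 curves=$3 b1=$4
  {
    printf 'N: %s\n' "$comp"
    printf 'curves: %s\n' "$curves"
    printf 'B1: %s\n' "$b1"
  } > "$file"
}

# Read a request file back into the variables comp, curves and b1.
read_request() {
  local key value
  comp= curves= b1=
  while IFS=' ' read -r key value || [ -n "$key" ]; do
    case $key in
      N:)      comp=$value ;;
      curves:) curves=$value ;;
      B1:)     b1=$value ;;
    esac
  done < "$1"
}
```

Calling write_request aliqueitECM 1234567 100 1000000 would produce a three-line file, and read_request aliqueitECM would set comp, curves and b1 from it.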

The first thing you'll want to do is to change the following in aliqueit.ini:
Code:
ecmpy_cmd = bash <whatever $HOME shows>/Math/Aliqueit/aliECM.sh
aliECM.sh:
Code:
#!/bin/bash

# delete results file so inotifywait will wait
rm -f aliECMresults
# capture composite from pipe
read lastpipe in
# send composite to aliqueitECM file
echo "N: $lastpipe" >aliqueitECM
# send # of curves to run to aliqueitECM file
# The actual number run will be the next multiple
# of cores above the # of curves requested
echo "curves: $3" >>aliqueitECM
# send B1 value to aliqueitECM
echo "B1: $6" >>aliqueitECM
# send aliqueitECM to the master node of the
# ecmpi cluster
scp aliqueitECM mpi@<IP address>:ecmpi/work/.
# wait for aliECMresults
echo "Waiting for results. . ."
while [ ! -e aliECMresults ]
  do
    inotifywait -qq ./
  done
# check aliECMresults to see if factors were returned
check=$(grep "\*" aliECMresults)
# if yes, process the last line into Aliqueit familiar
# lines and tack them to the end of the temp filename
# provided by Aliqueit with the call
if [ ${#check} -gt 1 ]
  then
    ind=$(expr index "$check" \*)
    factor=${check:$ind+1}
    echo "********** Factor found in step 2: $factor" >>"$5"
    echo "Found prime factor of ${#factor} digits: $factor" >>"$5"
fi
# give everything a chance to settle before returning control
# to Aliqueit. (not really needed)
sleep 1
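
If the result file ever contains more than one line with an asterisk, the substring arithmetic above gets fragile. The following is only a sketch of an alternative, assuming the factor is whatever follows the last asterisk on the last such line. The function name extract_factor is hypothetical, and the sample line in the usage note is an assumed shape, not actual ecmpi output. Note that it differs from the script above, which cuts after the first asterisk.

```shell
#!/bin/bash
# Hypothetical helper: print the text after the last "*" on the last
# line that contains one. Returns 1 if no line has an asterisk.
extract_factor() {
  local line last="" factor
  while IFS= read -r line; do
    case $line in
      *'*'*) last=$line ;;
    esac
  done <<< "$1"
  [ -n "$last" ] || return 1
  # read trims the surrounding whitespace from the extracted text
  read -r factor <<< "${last##*\*}"
  printf '%s\n' "$factor"
}
```

For example, extract_factor "$(cat aliECMresults)" would print 143 if the last asterisk line were "1001 = 7 * 143".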
Here's aliqueitECM.sh, the script that handles the ecmpi process:
Code:
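#!/bin/bash

# this finds out how many cores (threads, actually) are available
# by summing the last character of each hostfile line, so each
# per-host count must be a single digit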
function countcores {
  ccount=0
  exec <"../hostfile"
    while read line
      do
        let ccount=${ccount}+${line:${#line}-1}
      done
#  echo "$ccount cores available"
}

# This provides an "endless" loop
while [ ! -e aliqueitECMstop ]
  do
# This waits for aliqueitECM to show up with the values
    while [ ! -e aliqueitECM ]
      do
        inotifywait -qqt 600 ./
      done
# clear the previous results
    rm -f aliECMresults
# another check that aliqueitECM has appeared
    if [ -e aliqueitECM ]
      then
# This extracts the composite, curves and B1 to use
        exec <"aliqueitECM"
          while read line
            do
              case $line in
                "curves:"*) curves=${line:8};;
                "N: "*) comp=${line:3};;
                "B1: "*) b1=${line:4};;
              esac
            done
#        echo "comp is ${comp:0:7}...${comp:-2}<${#comp}>"
        countcores
        echo "Running $curves curves with $ccount cores @ $b1 on ${comp:0:7}...${comp:${#comp}-2}<${#comp}>"
# This invokes ecmpi and sends the output to the result file
        mpirun -np $ccount --hostfile ../hostfile ../ecmpi -N $comp -nb $curves -B1 $b1 >>aliECMresults
# This sends the result file back to the Aliqueit machine
        scp aliECMresults <Aliqueit machine>:Math/GamAli/.
# Clear current aliqueitECM to wait for the next one
        rm aliqueitECM
    fi
  done
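
The countcores function above only reads the last character of each hostfile line, so a machine with 10 or more threads would be miscounted. As a hedged sketch, assuming an Open MPI style hostfile with "slots=N" entries (your hostfile may use another format), a count that handles multi-digit values could look like this. The name count_slots is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: sum the slots=N values in a hostfile.
# Assumes lines such as "node1 slots=4"; lines without "slots=" are skipped.
count_slots() {
  local total=0 line n
  while read -r line || [ -n "$line" ]; do
    case $line in
      *slots=*)
        n=${line##*slots=}     # text after the last "slots="
        n=${n%%[!0-9]*}        # keep only the leading digits
        total=$(( total + 10#${n:-0} ))
        ;;
    esac
  done < "$1"
  printf '%s\n' "$total"
}
```

It could be used as ccount=$(count_slots ../hostfile) in place of the countcores call.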
Let's move to the GNFS process now. Aliqueit uses a call to either factmsieve.py or factMsieve.pl to perform GNFS. We're going to use that call to invoke CADO-NFS instead. In this case, Aliqueit creates a directory of the form "ggnfs_<composite>" and places a file (test.n) in the directory. The file test.n has only "n: <composite>" in it. Then it changes to the ggnfs... directory and calls the chosen program with the name "test" as part of the call. When control returns, Aliqueit looks for the factors in a file called "ggnfs.log" in the ggnfs... directory. The script alitest.sh will be used to call CADO-NFS with the composite, retrieve the factors from the CADO-NFS log file and place them in ggnfs.log.

First, you need a working setup of CADO-NFS. If you need to review anything to get your setup running, you can check:

How I Install CADO-NFS onto my Ubuntu Machines
and
How I Run a Larger Factorization Via CADO-NFS on Several Ubuntu Machines

Next, edit the aliqueit.ini file:
Code:
ggnfs_cmd = bash <whatever $HOME shows>/Math/Aliqueit/alitest.sh
CADO-NFS, by default, uses a temporary directory, which it creates with a random name. It also takes the base name for all the files from the name given in the appropriate parameters file. This is very useful in keeping jobs separated, but it makes it challenging for an external program to harvest data. On the positive side, keeping jobs separated helps CADO-NFS perform smoothly and avoid crashing. To make retrieval of the factors easier, we will provide the working directory name and base name for our runs, but we also need to delete whatever may be left over that could corrupt a new run. The alitest.sh script will take care of this:

alitest.sh:
Code:
#!/bin/bash

#retrieve the composite from the call
comp=$(cat "$1.n")
comp=${comp:3}
#remove work from prior number so we can use the same names
rm -rf /tmp/alicado
#move into CADO-NFS directory
cd $HOME/Math/cado-nfs
#run CADO-NFS (change options as desired)
#CADO-NFS will provide a line to use to attach clients
./cado-nfs.py $comp server.whitelist=<IP addresses of potential clients> server.port=<port#> tasks.workdir=/tmp/alicado name=alicado
#harvest factors from CADO-NFS's log file
factors=$(grep "Factors: " /tmp/alicado/alicado.log)
factors=${factors:54}
IFS=' ' read -r -a farray <<< "$factors"
numfacts=${#farray[@]}
cd $HOME/Math/GamAli/ggnfs_${comp}
#create ggnfs.log file for Aliqueit to find the factors
echo "Number: test" >ggnfs.log
echo "N = $comp" >>ggnfs.log
fcount=1
while [ $fcount -lt $numfacts ]
  do
    echo "factor: ${farray[$fcount]}" >>ggnfs.log
    let fcount=${fcount}+1
  done
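
The fixed offset of 54 characters in the script above depends on the exact length of the log line prefix. Here is a sketch of an extraction that does not depend on the prefix length, assuming only that the log line contains "Factors: " followed by the space-separated factors (the same text the grep above looks for). The function name harvest_factors is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one "factor: <value>" line per factor
# found on the last log line containing "Factors: ".
# Returns 1 if the log has no such line.
harvest_factors() {
  local line last="" f
  local -a farray
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *"Factors: "*) last=$line ;;
    esac
  done < "$1"
  [ -n "$last" ] || return 1
  # drop everything up to and including "Factors: ", then split on spaces
  read -r -a farray <<< "${last##*Factors: }"
  for f in "${farray[@]}"; do
    printf 'factor: %s\n' "$f"
  done
}
```

The output could be appended directly, for example harvest_factors /tmp/alicado/alicado.log >>ggnfs.log.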
At this point a fully working setup should have been achieved and you can set Aliqueit to work.

Note that using several machines in the above fashion means that a failure of any of the machines during ecmpi or the polyselect portion of CADO-NFS can interrupt the factoring process.

Last fiddled with by EdH on 2020-08-09 at 15:54