mersenneforum.org  

Old 2013-11-04, 23:25   #23
EdH

Quote:
Originally Posted by fivemack
You definitely need a gigabit switch! Each iteration on a 2M matrix involves transferring about sixteen megabytes, twice.

In fact, gigabit isn't really fast enough, and 10Gbit is still distinctly expensive - half the price of a Haswell machine for each network adaptor and the price of a whole Haswell machine for the cheapest 8-port switch.

Sadly nobody seems to have decommissioned a supercomputer with DDR Infiniband interconnect and put the bits up on eBay recently; there is a decent-priced 8-port 10Gbit Infiniband switch at http://www.ebay.co.uk/itm/Flextronic...item1e818160c7 and 10Gbit cards at http://www.ebay.co.uk/itm/Mellanox-M...item27d7741abc
Bummer that gigabit is still not enough. I have an inexpensive one on the way to play with, since all three of the machines I wish to tie in already support gigabit on board. The switch claims 10 gigabit (but, only 2 per port). But, the machines only claim 1 gigabit capacity anyway. This is all kind of why I was hoping to get multiple threads across two or (optimally) three machines. If I gain any time in LA, it's an overall success, because when I'm working on an aliquot sequence, after the sieving is completed, all the other machines are idle until the LA and square root steps finish.
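
For a rough sense of scale, the figure quoted above (about sixteen megabytes, twice per iteration, for a 2M matrix) can be turned into wire time per iteration. This is only a back-of-envelope sketch: it assumes decimal megabytes, a link running at its full nominal rate, and no protocol or MPI overhead, and `transfer_ms` is a helper name made up for the illustration. The matrix size for the c138 discussed here is not stated in the thread, so the numbers apply to the quoted 2M case only.

```shell
#!/bin/sh
# Back-of-envelope wire time for one iteration's network traffic.
# Assumptions (not from the thread): 1 MB = 1,000,000 bytes, the link runs at
# its full nominal rate, and protocol/MPI overhead is ignored.

# transfer_ms BYTES LINK_MBIT -> prints milliseconds (integer, rounded down)
transfer_ms() {
    # bytes * 8 = bits; a 1 Mbit/s link moves 1000 bits per millisecond
    echo $(( $1 * 8 / ($2 * 1000) ))
}

per_iteration=$(( 2 * 16 * 1000000 ))   # "about sixteen megabytes, twice"

echo "10/100 link:  $(transfer_ms "$per_iteration" 100) ms per iteration"
echo "gigabit link: $(transfer_ms "$per_iteration" 1000) ms per iteration"
```

Under these assumptions a gigabit link spends roughly a quarter of a second per iteration on the wire, and a 100 Mbit link about ten times that.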

On a side note, I'm not sure what version of msieve was running for my previous testing (this was before the SVN 923), but with it, I had to reduce my threads from 4 to 2 to gain the best LA. With SVN 947 I see I can run all 4 threads for the best ETA.

Thanks for putting up with my antics.

and, all the help...

Old 2013-11-05, 08:10   #24
fivemack

Yes, SVN947 has really very much better multi-threading.

My solution to idleness when running aliquot sequences is simply to run multiple aliquot sequences, with lots of scripts of the form

Code:
~/msieve/trunk/msieve -v -nc -t 4 & while [ ! -e msieve.dat.mat ]; do sleep 120; done; killall -STOP python gnfs-lasieve4I14e; while [ ! -e msieve.dat.dep ]; do sleep 120; done; killall -CONT gnfs-lasieve4I14e python
(i.e., let the sieving on something else run except when msieve wants all four processors)
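
The one-liner above may be easier to adapt when spread over several lines. What follows is only a sketch of the same idea, not a replacement for it: `wait_for_file` and `run_postprocessing` are hypothetical helper names introduced here, the file and process names are the ones from the one-liner, and the comments on what each file signals are a reading of the one-liner rather than documented msieve behaviour.

```shell
#!/bin/sh
# Sketch of the same idea as the one-liner: suspend another sieving job while
# msieve's linear algebra wants all the cores, then resume it afterwards.

# wait_for_file PATH [SECONDS]: block until PATH exists, polling every
# SECONDS seconds (default 120, as in the one-liner).
wait_for_file() {
    while [ ! -e "$1" ]; do
        sleep "${2:-120}"
    done
}

run_postprocessing() {
    "$HOME/msieve/trunk/msieve" -v -nc -t 4 &

    # Assumed meaning: the matrix file appears once the matrix is built.
    wait_for_file msieve.dat.mat
    killall -STOP python gnfs-lasieve4I14e   # suspend the other sieving job

    # Assumed meaning: the dependency file appears once the LA has finished.
    wait_for_file msieve.dat.dep
    killall -CONT gnfs-lasieve4I14e python   # let the sieving continue
}

# Only start msieve when called with "run", so the file can also be sourced
# just to reuse wait_for_file.
if [ "${1:-}" = "run" ]; then
    run_postprocessing
fi
```

Saved under a name of your choosing, it would be started with `sh scriptname run` from the directory holding msieve.dat.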

Last fiddled with by fivemack on 2013-11-05 at 08:10

Old 2013-11-05, 14:38   #25
EdH

Quote:
Originally Posted by fivemack
Yes, SVN947 has really very much better multi-threading.

My solution to idleness when running aliquot sequences is simply to run multiple aliquot sequences, with lots of scripts of the form

Code:
~/msieve/trunk/msieve -v -nc -t 4 & while [ ! -e msieve.dat.mat ]; do sleep 120; done; killall -STOP python gnfs-lasieve4I14e; while [ ! -e msieve.dat.dep ]; do sleep 120; done; killall -CONT gnfs-lasieve4I14e python
(i.e., let the sieving on something else run except when msieve wants all four processors)
Actually, it's my own fault for waiting for the next number in the sequence. I even have a separate machine that runs the tasking for ECM - a P-III that won't run gmp-ecm. And, there is plenty of ECM to be done.

Right now I'm getting some timings for the recent c138 with all kinds of various settings, so I can compare ETAs when my new switch comes in.

Thanks for all.

Old 2013-11-05, 15:01   #26
bsquared

Quote:
Originally Posted by EdH
Actually, it's my own fault for waiting for the next number in the sequence. I even have a separate machine that runs the tasking for ECM - a P-III that won't run gmp-ecm. And, there is plenty of ECM to be done.


Maybe you should turn off all of those PIII's and PIV's for a few months, and with the electrical bill savings buy a Haswell system. It will be as fast as all of the rest of your PCs combined.

Old 2013-11-05, 17:26   #27
EdH

Quote:
Originally Posted by bsquared


Maybe you should turn off all of those PIII's and PIV's for a few months, and with the electrical bill savings buy a Haswell system. It will be as fast as all of the rest of your PCs combined.
It's quite possible that you are correct, but the power isn't too outrageous here, yet. And, if I were to "turn them off," I might pursue my other interests more and not return to "pester" all of you.

Old 2013-11-05, 17:41   #28
bsquared

Quote:
Originally Posted by EdH
And, if I were to "turn them off," I might pursue my other interests more and not return to "pester" all of you.
My intent was not at all to shoo you away. And I appreciate the desire to keep perfectly functional hardware around - my only TV is still an old tube-based one. But a PIII! Your phone could run circles around it.

Old 2013-11-06, 02:24   #29
EdH

Quote:
Originally Posted by bsquared
My intent was not at all to shoo you away. And I appreciate the desire to keep perfectly functional hardware around - my only TV is still an old tube-based one. But a PIII! Your phone could run circles around it.
Indeed! Its only function currently is to control ECM tasks for all the others and pause while a separate one handles gnfs taskings. I do have another probable purpose for the P-III, though, and that will mean adding the ECM assignment scripts to one of the ECM machines...

Old 2013-11-07, 22:36   #30
EdH
gigabit comaprisons

I have installed the gigabit switch and have done a bunch of testing. In case anyone is interested in my results, I have put together the following summary and added the measured timings for each test below.

Background:
machine 1: core2 quad no threading 2.67GHz 4GB
machine 2: core2 quad no threading 2.33GHz 2GB
machine 3: i7 dual core 4 hyperthreads 2.67GHz 4GB
all three machines have on board gigabit networking
10/100 tests were done with a simple 10/100 switch
gigabit testing was done with the three machines connected to a 5-port gigabit switch. The switch was linked via 10/100 to its source router and supplied other 10/100 switches. The relations set was from a c138 run and prior to each test the existing .dat.mat and .dat.mat.idx files were removed.

Discussion:
The gigabit switch definitely made a difference, but in the long run, the best improvement was only about 20% over running the LA on a single machine, with the best times (as expected) resulting from max thread use with minimum mpi comm. Oddly, I had almost identical ETAs with 4 threads on 3 machines as 4 threads on 2 machines. I suspected this might be due to the third machine having only two cores, so I ran a test with two threads and the mpi hostfile adjusted to 2,2,1 slots. This showed no real improvement. I then ran the same test with the 1,5 grid parameters swapped for 5,1 to see if it showed any change - Nope. Currently, I am running the exact same msieve instance on all three machines. It may be possible to adjust further by altering the individual parameters for the three machines. I will probably be testing this somewhat later.

Test runs:
Code:
./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2
10/100:   ETA 6h30m (only run once, at the very start of testing)
gigabit:  ETA 6h53m (this is unexplained - I ran it more times with the same result)

./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 4 -nc2
10/100:   ETA 5h 3m
gigabit:  ETA 5h 3m

mpirun --hostfile mpiHosts -n 2 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -nc2 1,2
host slots=1
slave1 slots=1
10/100:   ETA 18h24m
gigabit:  ETA 7h35m

mpirun --hostfile mpiHosts -n 2 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 1,2
host slots=1
slave1 slots=1
10/100:   ETA 15h57m
gigabit:  ETA 5h 4m

mpirun --hostfile mpiHosts -n 2 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 4 -nc2 1,2
host slots=1
slave1 slots=1
10/100:   ETA 15h12m
gigabit:  ETA 3h59m

mpirun --hostfile mpiHosts -n 4 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -nc2 2,2
host slots=2
slave1 slots=2
10/100:   ETA 15h55m
gigabit:  ETA 4h42m

mpirun --hostfile mpiHosts -n 4 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 2,2
host slots=2
slave1 slots=2
10/100:   ETA 15h13m
gigabit:  ETA 4h19m

mpirun --hostfile mpiHosts -n 4 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 4 -nc2 2,2
host slots=2
slave1 slots=2
10/100:   ETA 15h32m
gigabit:  ETA 4h42m

mpirun --hostfile mpiHosts -n 8 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -nc2 2,4
host slots=2
slave1 slots=2
10/100:   ETA 22h27m
gigabit:  ETA 5h47m

mpirun --hostfile mpiHosts -n 3 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -nc2 1,3
host slots=1
slave1 slots=1
slave2 slots=1
10/100:   ETA 22h54m
gigabit:  ETA 6h 9m

mpirun --hostfile mpiHosts -n 3 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 1,3
host slots=1
slave1 slots=1
slave2 slots=1
10/100:   ETA 21h21m
gigabit:  ETA 4h42m

mpirun --hostfile mpiHosts -n 3 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 4 -nc2 1,3
host slots=1
slave1 slots=1
slave2 slots=1
10/100:   ETA 21h21m
gigabit:  ETA 3h58m

mpirun --hostfile mpiHosts -n 6 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -nc2 2,3
host slots=2
slave1 slots=2
slave2 slots=2
10/100:   ETA 39h49m
gigabit:  ETA 5h24m

mpirun --hostfile mpiHosts -n 6 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 2,3
host slots=2
slave1 slots=2
slave2 slots=2
10/100:   ETA 39h48m
gigabit:  ETA 4h41m

mpirun --hostfile mpiHosts -n 6 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 4 -nc2 2,3
host slots=2
slave1 slots=2
slave2 slots=2
10/100:   ETA 39h 1m
gigabit:  ETA 5h 3m

mpirun --hostfile mpiHosts -n 5 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 1,5
host slots=2
slave1 slots=2
slave2 slots=1
10/100:   n/a
gigabit:  ETA 5h 4m

mpirun --hostfile mpiHosts -n 5 ./msieve -i number.ini -s number.dat -l number.log -nf number.fb -t 2 -nc2 5,1
host slots=2
slave1 slots=2
slave2 slots=1
10/100:   n/a
gigabit:  ETA 5h 4m
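
To put the ETAs above on a common footing, they can be converted to minutes and compared. The following is a small sketch for doing that; `eta_to_minutes` and `percent_faster` are helper names made up here, and the parsing assumes the "NhMMm" / "Nh Mm" layout used in the listing, with no leading zeros. The two values compared are the single-machine `-t 4` run and the best three-machine gigabit run from the list.

```shell
#!/bin/sh
# Convert ETAs written as "5h 3m" or "3h58m" to minutes and compare two runs.
# Assumes the "<hours>h<minutes>m" layout from the listing, no leading zeros.

# eta_to_minutes "5h 3m" -> prints total minutes
eta_to_minutes() {
    eta=$(printf '%s' "$1" | tr -d ' ')   # drop the padding space, if any
    hours=${eta%%h*}                      # text before the 'h'
    minutes=${eta#*h}                     # text after the 'h' ...
    minutes=${minutes%m}                  # ... without the trailing 'm'
    echo $(( hours * 60 + minutes ))
}

# percent_faster BASELINE_MIN NEW_MIN -> whole-percent reduction, rounded down
percent_faster() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

single=$(eta_to_minutes "5h 3m")   # one machine, -t 4
best=$(eta_to_minutes "3h58m")     # three machines, -t 4, grid 1,3, gigabit

echo "single machine: ${single} min, best MPI run: ${best} min"
echo "reduction: $(percent_faster "$single" "$best")%"
```

With these two ETAs the reduction comes out at 21%, in line with the "about 20%" mentioned in the discussion.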

Old 2013-11-08, 02:27   #31
LaurV

coma? prisons?

Old 2013-11-08, 03:11   #32
EdH

Quote:
Originally Posted by LaurV
Someone does read the titles! (But, I guess it wasn't me.) And, I wouldn't change it, if I could. back at ya'

Old 2013-11-08, 17:57   #33
EdH

Can someone tell me why running two machines via mpi vs. three machines via mpi, with all else the same, gives "virtually" no difference in ETA?
Code:
mpirun --hostfile mpiHosts -n 2 ./msieve ... -t 4 -nc2 2,1
linear algebra at 0.1%, ETA 3h57m

mpirun --hostfile mpiHosts -n 3 ./msieve ... -t 4 -nc2 3,1
linear algebra at 0.1%, ETA 3h58m
It would seem hard to imagine that the communications/added computer so closely offset each other. Is it something to do with the matrix size? Maybe I can realize a difference with a larger matrix?

I will play eventually with a larger composite, but maybe not for a while...

Thanks for all.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.