EMG-C188 factored
1 Attachment(s)
[code]
Sat Nov 28 13:19:23 2015  p91 factor: 1455229648108768594552694966205142453019168989838313852033836936726828258873327877743335007
Sat Nov 28 13:19:23 2015  p98 factor: 60531294671960669735077411626867118292354916934357577613958209636477806174025280318238132795431621
[/code]
About 212 hours on 7 threads of a Xeon E5-2650v2 for a 21.1M matrix (the machine had to be rebooted a few times). Log attached.
Taking [B]C154_P182_plus_1[/B], [B]C155_P209_plus_1[/B], and [B]C153_P233_plus_1[/B] for post-processing. They'll probably take only about a day each.
You have decent horsepower, so I'd say much less than a day for each of them, though one day for all three might be a stretch.
W_790
1 Attachment(s)
[CODE]prp88 factor: 7672236958518363816567697109832643079838636749631210749387737419588642636816013574367653
prp100 factor: 6562282240936037358815422484225511323795135516019451448000923739689561470840832334996806359588917617[/CODE]
C184_HP2_4496
The C184 blocking HP2(4496) splits as:
[CODE]
prp83 factor: 18629234615651511444939975064252892061546608854100819750912782705794351301276248439
prp101 factor: 90074244593568724732372840988999455713334654010427512152214073432054138168692117445761801991568405847
elapsed time 110:46:38
[/CODE]
The elapsed time is misleading, as there was a power outage midway through; the actual total was somewhere on the order of 120-140 hours. The matrix was 15331633 x 15331859 with TD=128.
Running 4261-67; ETA is 21 hours, but that's the weekend so I won't see the answer until Monday.
Also taking 2269-67 and 2789-67, which should fit in over the weekend.
What are the best parameters for running the MPI version of msieve on a single computer? I have a dual Xeon E5-2620, so 2 x (6 cores / 12 threads). I compiled msieve v1.52 with OpenMPI 1.8.1 and the newest GMP 6.1.0.
I tried the MPI version on a C165 GNFS job with many different options (-bind-to-core/-bind-to-socket, -bycore/-bysocket, -cpu-set, etc., running with and without taskset), and the best I could get was 60 hours:
[code]taskset -c 0-11 mpirun -np 12 -bind-to-core msieve -t 12 -nc2 2,6[/code]
Running without MPI gave much better results:
[code]taskset -c 0-11 msieve -t 12      -> 36 hours (running on one CPU only)
taskset -c 0-5,12-17 msieve -t 12 -> 40 hours (running on both CPUs)
msieve -t 12                      -> 43 hours (without taskset)
taskset -c 0-5 msieve -t 6        -> 63 hours (only 6 threads)[/code]
I'm disappointed with these results; what am I doing wrong?
Try, perhaps with -bysocket if it helps:
[code]mpirun -np 2 msieve -nc2 1,2 -t 12[/code]
On a 48-core (4 sockets x 2 chips per socket x 6 cores per chip) Opteron machine I found that it was very helpful to have a 'numactl -l' in the command line, as well as the taskset, to ensure that the memory was allocated on the node on which the process was running. I got mpirun to run a script which contained a taskset command, rather than trying to taskset the mpirun itself - see next post.
I am slightly surprised that you're finding -t 12 faster than -t 6 on a hyperthreaded system; I should redo that measurement with the next linear-algebra job I run.
For the two-layer approach I did something like
[code]mpirun -n 8 run.2,4.6.sh[/code]
where run.2,4.6.sh was
[code]msieve_real='/home/nfsworld/msieve-svn-again-mpi/trunk/msieve -v'
CPUL=$[6*$OMPI_COMM_WORLD_RANK]
CPUR=$[6*$OMPI_COMM_WORLD_RANK+5]
taskset -c $CPUL-$CPUR numactl --cpunodebind=$OMPI_COMM_WORLD_RANK -l $msieve_real -t 6 -nc2 2,4[/code]
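For anyone adapting that wrapper: the per-rank core range is just arithmetic over the MPI rank. A standalone sketch of the same calculation (assuming 6 cores per rank, as in the script; the default rank of 3 here is only for illustration, since OMPI_COMM_WORLD_RANK is set by mpirun):
[code]#!/bin/sh
# Sketch: compute the taskset core range for one MPI rank,
# assuming 6 consecutive cores per rank as in run.2,4.6.sh.
rank=${OMPI_COMM_WORLD_RANK:-3}   # rank 3 used as an example when run outside mpirun
CPUL=$((6 * rank))                # first core of this rank's block
CPUR=$((6 * rank + 5))            # last core of this rank's block
echo "rank $rank -> cores $CPUL-$CPUR"[/code]
For rank 3 this gives cores 18-23, i.e. the fourth 6-core block, which is what keeps each rank's threads (and, via numactl -l, its memory) on one NUMA node.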
[QUOTE=unconnected;418280]taskset -c 0-11 mpirun -np 12 -bind-to-core msieve -t 12 -nc2 2,6[/quote]
I'm surprised that worked at all in as little as 60 hours; it causes mpirun to start up 12 copies of msieve, [I]each of which[/I] tries to use twelve threads, so I'd have thought you'd see the machine load average going into the hundreds.
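The back-of-the-envelope arithmetic behind that worry (nothing msieve-specific, just ranks times threads per rank versus available cores):
[code]# 12 MPI ranks, each spawning 12 msieve threads, on a 12-core box:
ranks=12
threads_per_rank=12
cores=12
total=$((ranks * threads_per_rank))
echo "$total software threads competing for $cores cores"[/code]
144 runnable threads on 12 cores means roughly 12-fold oversubscription, which is why -np 2 with -t 12 (or -np 12 with -t 1) is the sane split.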