#1 |
(loop (#_fork))
Feb 2006
Cambridge, England
2·7·461 Posts |
I recall some reports of enormously large second-stage B values for p±1 factorisations using multi-CPU machines.
The B2 values for sixty-digit ECM are so large that I don't have enough memory to run one copy per CPU. Can the second stage of ECM be made to use more than one CPU?
#2 |
Einyen
Dec 2003
Denmark
19×181 Posts |
I learned recently from Alex that stage 2 of ECM, P-1 and P+1 can all be done in several pieces and on different machines.
If you do stage 1 with the -chkpnt or -save option to make a checkpoint file at the end of stage 1, you can distribute that file to several machines. Then you can do stage 2 like this, assuming B1=85e7:
Code:
ecm.exe -resume chkpnt.txt 85e7 85e7-1e14 < number.txt
ecm.exe -resume chkpnt.txt 85e7 1e14-2e14 < number.txt
ecm.exe -resume chkpnt.txt 85e7 2e14-3e14 < number.txt
. . .
and so on.
Last fiddled with by ATH on 2010-08-20 at 16:51
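The pattern above can be generated rather than typed by hand. This is only a sketch: the function name `stage2_commands`, the slice width and the slice count are illustrative assumptions, the binary is assumed to be on the path as `ecm`, and the file names are the ones from the post. It only prints the command lines, so nothing is run until you copy a line to a machine.

```shell
#!/bin/sh
# Print one stage-2 command per B2 slice (hypothetical helper, not part of GMP-ECM).
# Bounds are written out as plain integers because shell arithmetic
# does not understand notation such as 85e7 or 1e14.
stage2_commands() {
    b1=$1        # stage-1 bound; the first slice starts here
    width=$2     # width of each B2 slice
    slices=$3    # number of slices, e.g. one per machine
    lo=$b1
    hi=$width
    i=0
    while [ "$i" -lt "$slices" ]; do
        echo "ecm -resume chkpnt.txt $b1 $lo-$hi < number.txt"
        lo=$hi
        hi=$((hi + width))
        i=$((i + 1))
    done
}

# B1 = 85e7, slices of width 1e14, three slices
stage2_commands 850000000 100000000000000 3
```

Each printed line covers one B2 interval and can be run on a different machine that has a copy of the checkpoint file.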
#3 | |
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
1166810 Posts |
Quote:
Paul
#4 |
(loop (#_fork))
Feb 2006
Cambridge, England
2·7·461 Posts |
Thanks. Now I can run sixteen copies at a time, each taking 6800+2800 seconds, so 600s per curve aggregated, rather than six copies each taking 3550+675 seconds, so 704s per curve aggregated.
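The aggregated figures are just (stage 1 time + stage 2 time) divided by the number of copies running at once. A small sketch of that arithmetic, where `per_curve` is a hypothetical helper name and the timings are the ones quoted in the post:

```shell
#!/bin/sh
# Aggregated seconds per curve = (stage 1 + stage 2) / simultaneous copies.
# Shell arithmetic is integer-only, so the result is rounded down.
per_curve() {
    echo $(( ($1 + $2) / $3 ))
}

per_curve 6800 2800 16   # 9600 / 16 = 600
per_curve 3550 675 6     # 4225 / 6  = 704 (rounded down)
```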
#5 |
Sep 2009
22·607 Posts |
I've found a way to overlap stage 1 and stage 2 of ECM. From a shell script:
Code:
mkfifo ecm.fifo
# Stage 2: resume curves from the FIFO as they arrive (runs in the background).
ecm -resume ecm.fifo -nn 11e6 | tee -a "$LOG.2" | grep '[Ff]actor' &
# Stage 1 only (B2=1): each finished curve is appended to the FIFO by -savea.
echo "$N" | ecm -c 4480 -savea ecm.fifo -nn 11e6 1 | tee -a "$LOG" | grep '[Ff]actor'
I've tested it on Linux. It should work on other UNIX variants. Porting to Windows would be "an exercise for the reader".
On a single core with hyperthreading it's saving about 1/3 of the time it took to do stage 2 when running only 1 thread (I don't have enough RAM on this box to run two stage 2s at once). It would work better if I set B2 to a value that made stage 2 take nearly as long as stage 1, but I haven't worked that out yet.
How does run time for stage 2 vary with B2 for different numbers? And how does it vary when maxmem becomes significant?
It should also work for p-1 against a list of numbers, and for p+1, but I've not tested them.
Chris K
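The FIFO mechanics can be tried without ecm installed. In this sketch `stage1` and `stage2` are hypothetical stand-in functions, not ecm invocations: the writer emits one line per "curve" and the reader handles each line as it arrives, which is the same producer/consumer arrangement as the script above. The temporary directory and log name are also assumptions made for the demonstration.

```shell
#!/bin/sh
# Stand-in for the stage-1 process: writes one "residue" line per curve.
stage1() {
    for curve in 1 2 3; do
        echo "residue $curve"
    done
}

# Stand-in for the stage-2 process: consumes residues as they arrive.
stage2() {
    while read -r line; do
        echo "stage 2 processed: $line"
    done
}

workdir=$(mktemp -d)
fifo="$workdir/ecm.fifo"
mkfifo "$fifo"

# Start the reader first; it blocks until a writer opens the FIFO.
stage2 < "$fifo" > "$workdir/stage2.log" &
# The writer runs in the foreground; closing the FIFO gives the reader end-of-file.
stage1 > "$fifo"
# Wait for the background reader to finish before looking at its log.
wait

cat "$workdir/stage2.log"
rm -r "$workdir"
```

Starting the reader in the background before the writer mirrors the order in the original script; with a FIFO either side simply blocks until the other end is opened.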
#6 |
May 2008
21078 Posts |