20040405, 06:56  #1 
Mar 2003
New Zealand
485_{16} Posts 
ECM using gmpecm and mprime on a P4
(This post follows from the discussion in the Hardware forum, see http://www.mersenneforum.org/showthread.php?t=2307).
These are the times I got for curves on the C204 of M787, the C169 of P787 and their product C373 of M1574 using the precompiled Debian gmpecm package (gmpecm 5.0.3): Code:
Number  Stage 1  Stage 2  Total per  Peak  Time for  B1=43e6  B2=180e9  curve  memory  9000 curves +++++ M787 (C204)  2290 sec.  1899 sec.  4189 sec.  188 MB  436 days. P787 (C169)  2112 sec.  1380 sec.  3492 sec.  159 MB  364 days. M1574 (C373)  5855 sec.  3798 sec.  9653 sec.  317 MB  1006 days. Code:
Number  Stage 1  Stage 2  Total per  Peak  Time for  B1=44e6  B2=4.29e9  curve  memory  19300 curves +++++ M787  541 sec.  246 sec.  787 sec.  50 MB  176 days. P787  1354 sec.  491 sec.  1845 sec.  50 MB  412 days. M1574  684 sec.  304 sec.  988 sec.  88 MB  221 days. 
20040405, 20:26  #2 
"Phil"
Sep 2002
Tracktown, U.S.A.
2^{5}·5·7 Posts 
Thanks for the interesting data. I don't know the answer to your question and hope someone else can answer it. Your mprime data is much like my prime95 data, in that a curve on M1574 does not take much more time than a curve on M787. So we really do get a big boost when we can work on pairs of numbers simultaneously. I found the memory usage info interesting. gmpecm uses a memory intensive second stage, so the large memory usage there was no surprise, but I didn't expect mprime's memory usage to be quite so high. What were your memory settings? Sometimes with small exponents it is possible to reach a point of diminishing returns where additional memory does little to make the program run faster.
Comparing your data to that of wblipp, it almost looks like the best solution is to run stage 1 on a P4 and switch to an Athlon for stage 2 with gmpecm! Of course the ideal solution would be to combine the two programs... 
20040405, 23:49  #3 
Mar 2004
3×127 Posts 
If you are using the gmpecmstage1hook of prime 95, prime95 works on the mersenne/fermat number but writes just the composite (in hexadecimal) into the results.txt file from which you can resume with GMPECM.
In case gmpecm is faster with the whole mersenne/fermat number instead of just the composite, it might notice that it is a divisor of the mersenne/fermat and use that one instead. Actually I couldn's manage to make gmpecm 5.1 beta run with that file. I does not accept a hexadecimal input number, and the variable with the intermediate result has the wrong name (QX instead of Q, I think). It is possible to correct that but unwieldy with large files. 
20040406, 07:59  #4 
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
You can do the stage 1 on M1574 and then stage 2 in GMPECM on M787 and P787 individually, but it's a little awkward. Produce the results.txt file with stage 1 residues as usual, then copy it to two files (i.e. results.m787 and results.p787). Get the remaining cofactors of M787 and P787 in decimal or hexadecimal form (GMPECM can parse either) and in a text editor, substitute in results.m787 the "N=0x....;" by "N=<cofactor of M787 here>;", similarly for results.p787. GMPECM will then reduce the stage 1 residue modulo the cofactor and all should work out nice.
Better try on a number with a know factor and accompanying curve parameters first, though. Alex 
20040406, 10:14  #5  
Mar 2003
New Zealand
13×89 Posts 
Here are the times using mprime for stage 1 and gmpecm for stage 2, the command line was 'ecm resume results.txt 1 44e64.29e9'.
Code:
Number  Stage 1  Stage 2  Total per  Peak  Time for  B1=44e6  B2=4.29e9  curve  memory  19300 curves +++++ M787  541 sec.  114 sec.  655 sec.  28 MB  146 days. P787  1354 sec.  89 sec.  1443 sec.  23 MB  322 days. M1574  684 sec.  244 sec.  928 sec.  46 MB  207 days. I will try to use this Maple program http://www.mersenneforum.org/showpo...06&postcount=19 to work out what the best B2 bound is and post the results for comparison. Quote:


20040407, 14:46  #6 
Mar 2003
New Zealand
2205_{8} Posts 
The following table shows the time to do N 50 digit curves with B1 fixed at 43e6. Stage 2 times are for gmpecm on a single curve. The P787 total is calculated using the stage 1 from M1574 since this is always faster than doing stage 1 for P787. The M+P787 column is calculated using stage 1 of M1574 plus stage 2 for both M787 and P787.
Where f(B1,B2) is the probability of success at the 50 digit level defined by running the Maple program with parameters p=50, B1, B2, I calculated the number N of curves needed at bounds B1,B2 to be N = ceil(9000*f(43e6,180e9)/f(B1,B2)). The times for stage 1 (mprime) were: M787=530 sec. M1574=674 sec. Code:
Stage 2 time (sec.) Time for N curves (days) B2 N  M787 P787 M1574  M787 P787 M1574 M+P787 ++ 4.29e9 19073  114 89 244  142 168 203 194 6e9 17625  154 120 326  140 162 204 193 8e9 16512  201 158 419  140 159 209 197 10e9 15719  224 176 468  137 155 208 195 12e9 15112  237 187 491  134 146 204 192 14e9 14626  274 212 586  136 150 213 196 16e9 14223  286 221 613  134 147 212 194 18e9 13880  329 255 699  138 149 221 202 20e9 13584  351 273 745  139 149 223 204 30e9 12525  486 380 1008  147 153 244 223 180e9 9000  1899 1380 3798  253 214 466 412 Since the chance of finding a factor in M787 or P787 is equal while the P787 stage 2 computation is cheaper, I think this could be improved by using split B2 bounds to make the stage 2 times equal. I plan to do these calculations for a few other exponents for comparison, but first I will compile optimised gmpecm and libgmp binaries to see how much difference that makes over the generic Debian binaries. 
20040408, 16:25  #7 
Mar 2003
New Zealand
13·89 Posts 
I recompiled libgmp with P4 specific optimisations and it has made a huge difference to the gmpecm times. This should make a larger B2 worthwhile now.
Code:
Generic P4 Improvement  Stage 1 B1=43e6 M787 2290s 1202s 91% P787 2112s 1041s 103% M1574 5855s 2842s 106% Stage 2 B2=180e9 M787 1899s 1088s 75% P787 1380s 817s 69% M1574 3798s 2015s 88% 
20040408, 17:51  #8 
"Nancy"
Aug 2002
Alexandria
9A3_{16} Posts 
I think your expected number of curves do not account for the effect of the BrentSuyama extension. I've run a program to count how many smooth number there are in the interval [2.5e48, 2.5e48+1e8], these are the results.
Code:
B2 deg B1sm. B2sm. Br.Su. N 4.29e9 12 51 464 61 17361 1e10 12 51 579 64 14409 2e10 12 51 669 79 12500 18e10 12 51 1019 107 8491 18e10 30 51 1019 151 8188 Good luck! Alex 
20040409, 08:11  #9  
"Sander"
Oct 2002
52.345322,5.52471
29×41 Posts 
Quote:


20040410, 02:01  #10  
Mar 2003
New Zealand
13×89 Posts 
Quote:
Quote:


20040410, 15:42  #11  
"Sander"
Oct 2002
52.345322,5.52471
29×41 Posts 
Quote:


Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
mprime  Unregistered  Information & Answers  3  20110805 09:17 
mPrime over SSH  Vincanis  Information & Answers  4  20110630 13:36 
./mprime B in 257  Lazlow  PrimeNet  3  20081029 20:40 
64 bit mprime?  aaronl  Linux  1  20051110 16:50 
Problem with mprime (Fixed with mprime d)  antiroach  Software  2  20040719 04:07 