2005-04-19, 15:16  #1 
"William"
May 2003
New Haven
2373_{10} Posts 
How Much Memory at Various Sizes?
How much memory does GMPECM use for default stage 2 for ECM at sizes from 50 digits to 70 digits? How about P1 and P+1? Is there a simple way to find this out?

2005-04-23, 18:47  #2 
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
Paul Zimmermann wrote an estimate for the memory use which is in the CVS version. It looks like this:
Code:
lgk = ceil_log2 (dF);
mem = 9.0 + (double) lgk;
#if (MULT == KS)
  mem += 24.0; /* estimated memory for kronecker_schonhage */
  mem += 1.0;  /* for the wrap case in PrerevertDivision */
#endif
mem *= (double) dF;
mem *= (double) mpz_size (modulus->orig_modulus);
mem *= (double) mp_bits_per_limb / 8.0;

B2 - B2min = k * dF * d, and d/dF ~= 11...12 for large B2. Let's assume 12. So dF ~= sqrt((B2 - B2min)/k/12).

I'm a little surprised that the memory requirements for Kronecker-Schönhage are that high...

Alex 
2005-04-23, 23:21  #3 
"William"
May 2003
New Haven
100101000101_{2} Posts 
Hmmm.
Could I get a worked example to help me understand how to apply this? I'm considering plans to work on the 244-digit composite of 5^{349}-1 and the 246-digit composite of 19^{193}-1, possibly at B1=260M or 850M. In preparation for this, I'm considering upgrading my PC to have 1, 2, or 3 Gigabytes of RAM. How much is "enough"? William 
2005-04-24, 09:03  #4 
May 2004
2^{4}·5 Posts 
Something I've been thinking about:
The -treefile option stores the product tree of F on disk; this reduces the memory consumption from O(dF * log(dF) * log(n)) to O(dF * log(n)) at the cost of more expensive I/O. However, with a threaded approach we could run a new stage 1 at low priority during a (treefile) stage 2; this should 'use up' most of the cycles otherwise wasted on disk seeks.

So IMO, ECM probably doesn't require lots of memory, it's just easiest to implement that way.

Dave 
2005-04-24, 17:42  #5 
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
Since B2min = B1 << B2, we can take B2 - B2min ~= B2 and
dF ~= sqrt(2.4e11/k/12) ~= 447213/sqrt(k), where 2.4e11 is the default B2 for B1=260M. Then ceil(log_2(dF)) ~= 19 - k/2. So you need dF * (9 + 25 + 19 - k/2) residues in memory. Each number has 26 dwords plus 3 for the mpz_t, or 116 bytes. Hence you can expect something like 447213/sqrt(k) * (53 - k/2) * 116 bytes, or about 50 * (53 - k/2)/sqrt(k) Megabytes.

For k=2, that is ~1.8 GB. With k=4, 1.3 GB, and with k=8, 866 MB. With the treefile option (this eliminates the ceil(log_2(dF)) ~= 19 - k/2 term) you'd need 1.2 GB, 850 MB or 600 MB respectively.

For B1=260M, 2 GB should let you use default settings. With 1 GB, you'll have to increase the number of blocks k and perhaps use treefile, which will cost some speed. Disabling Kronecker-Schönhage multiplication would save a lot of memory (the +25 term), but also cost a lot of speed.

BTW, George mentioned in an email that he plans to look into DWT for bases != 2. This will greatly speed up stage 1 for your numbers.

Alex 