mersenneforum.org  

Old 2003-12-07, 13:00   #1
andi314
 
Nov 2002

2·37 Posts
Default FFT-Size

How can I calculate the FFT size that I have to use for the different exponents?

andi314
Old 2003-12-08, 00:09   #2
ewmayer
2ω=0
 
Sep 2002
República de California

265518 Posts
Default

http://www.mersenneforum.org/attachments/pdfs/F24.pdf

Equation 8.
Old 2003-12-08, 14:35   #3
Reboot It
 
Aug 2002
London, UK

5B16 Posts
Default

On a somewhat related note, has anybody produced a table of RAM used by Prime95 during an LL test vs. FFT size?
Old 2003-12-08, 20:22   #4
ewmayer
2ω=0
 
Sep 2002
República de California

3×53×31 Posts
Default

Not sure exactly what auxiliary arrays (besides the main residue array) Prime95 allocates, but for a length-N FFT (that is, N real elements) a typical number would be N*8 bytes, plus an additional 10-50% of that for the auxiliary-data arrays.
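
To make that estimate concrete, here is a minimal Python sketch (my own, not Prime95 code); the 30% auxiliary overhead is just an assumed midpoint of the 10-50% range above, not a measured figure.
Code:
# Rough LL-test memory estimate from the rule of thumb above: N doubles for
# the main residue array, plus 10-50% on top for auxiliary data. The 0.3
# overhead factor is an assumed midpoint, not a Prime95 constant.
def ll_memory_estimate_bytes(fft_length, aux_fraction=0.3):
    main_array = fft_length * 8              # N real elements, 8 bytes each
    return main_array * (1.0 + aux_fraction)

for n in (262144, 1048576, 4194304):         # 256K, 1M and 4M FFT lengths
    print(f"{n:>8} -> ~{ll_memory_estimate_bytes(n) / 2**20:.1f} MB")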

Last fiddled with by ewmayer on 2003-12-08 at 20:23
Old 2003-12-09, 03:25   #5
markhl
 
Apr 2003
California

22·23 Posts
Default Re: FFT-Size

Quote:
Originally posted by andi314
How can I calculate the FFT size that I have to use for the different exponents?

andi314
http://www.mersenne.org/bench.htm
Old 2003-12-09, 15:34   #6
ewmayer
2ω=0
 
Sep 2002
República de California

3×53×31 Posts
Default Re: Re: FFT-Size

Quote:
Originally posted by markhl
http://www.mersenne.org/bench.htm
I thought Andi meant it in an a priori fashion - Prime95's tables are a posteriori data, based on trial and (roundoff) error. Andi, which way did you intend your question?
Old 2003-12-09, 17:57   #7
hbock
 
Feb 2003

163 Posts
Default

I think a table with FFT sizes for smaller exponents would also be helpful for people doing ECM factoring on those numbers, just to estimate the minimum amount of RAM needed. Usually the RAM spent on ECM should be at least 192 times the FFT size (see the prime95 help).

In what way does the RAM required for P-1 also depend on the FFT size? I mean especially for higher B1/B2 values.
Old 2003-12-10, 16:42   #8
ewmayer
2ω=0
 
Sep 2002
República de California

3×53×31 Posts
Default

Quote:
Originally posted by hbock
I think a table with FFT sizes for smaller exponents would also be helpful for people doing ECM factoring on those numbers, just to estimate the minimum amount of RAM needed. Usually the RAM spent on ECM should be at least 192 times the FFT size (see the prime95 help).

In what way does the RAM required for P-1 also depend on the FFT size? I mean especially for higher B1/B2 values.
Since most PCs nowadays have at least 256MB of RAM, 192 times the FFT size only becomes an issue when the number being factored takes up > 1MB in floating-point form, which translates to exponents above roughly 2.5M.
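
A quick sanity check on that figure (my arithmetic, not part of the original post), assuming roughly 19-20 exponent bits are stored per 8-byte FFT element - the ratio implied by the crossover table later in the thread:
Code:
# 1 MB of doubles is 131072 FFT elements; at ~19-20 exponent bits per element
# (an assumed figure, read off George's crossover table below), that
# corresponds to an exponent of roughly 2.5-2.6 million.
elements = (1 * 2**20) // 8        # 131072 eight-byte elements in 1 MB
bits_per_word = 19.8               # assumption: ~crossover exponent / FFT length
print(elements * bits_per_word)    # ~2.6 million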

Both p-1 and ECM are similar in terms of their stage-2 memory-usage vs. efficiency trends (though I believe stage 2 ECM to a given bound needs appreciably more memory than stage 2 p-1 to the same bound), so I'll use p-1 to explain it: in stage 2 of p-1 you need to power the stage 1 residue (call it R) to the product of the various stage 2 primes, and to do so efficiently you need to calculate R raised to various even powers corresponding to the gaps between the stage 2 primes. Ideally you'd simply precalculate R raised to all the prime gaps <= B2 and store those in memory, but for large B2 the gaps can be so large as to make this practically impossible. However, since the larger gaps occur quite rarely, it costs very little to multiply together a few smaller prestored powers to cover the larger gaps when they are encountered.

For CPU-efficiency reasons we want all our precomputed powers to fit into RAM, but we also want enough of them that we can cover nearly all the prime gaps that actually occur using the stored powers alone, i.e. without needing extra multiplies. Practically speaking, a few dozen precomputed powers suffice to cover all but a very small percentage of the gaps, and beyond that you get diminishing returns. In other words, there's a big difference between running p-1 stage 2 on a ~20M-sized exponent with just 64MB of RAM (which would only allow the active stage 2 residue and 7 precomputed powers R^2, R^4, ..., R^14 to be stored - and since there are many prime gaps > 14, we'd be doing a lot of extra muls, e.g. R^2*R^14 to get R^16) and doing stage 2 with 256MB of RAM, which would allow ~32 precomputed powers, i.e. R^2 through R^64. There will only be a few percent added speedup in going from 256MB to 512MB, and even less in going from 512MB to 1024MB.
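
To illustrate the bookkeeping described above, here is a toy Python sketch (my own, not Prime95's code - Prime95 works on huge FFT-based residues, whereas this just uses ordinary integers mod n): precompute R raised to even powers up to some stored-gap limit, bridge each prime gap with one stored power, and chain stored powers for the rare gaps that exceed the limit.
Code:
# Toy sketch of p-1 stage 2 with precomputed gap powers (illustration only).
# R is assumed to be the stage 1 residue a^E mod n, with E the product of
# prime powers <= B1; gcd(acc, n) may then reveal a factor of n.
def isprime(q):
    return q > 1 and all(q % d for d in range(2, int(q**0.5) + 1))

def pminus1_stage2(R, n, B1, B2, max_stored_gap=14):
    # Precomputed table: gap -> R^gap (mod n), for even gaps up to the limit
    # (gaps between odd primes are always even).
    powers = {g: pow(R, g, n) for g in range(2, max_stored_gap + 1, 2)}

    primes = [q for q in range(B1 + 1, B2 + 1) if isprime(q)]
    Rq = pow(R, primes[0], n)            # R^q for the current stage 2 prime q
    acc = (Rq - 1) % n                   # accumulates the product of (R^q - 1)
    extra_muls = 0                       # multiplies beyond one per prime

    for prev, q in zip(primes, primes[1:]):
        gap = q - prev
        while gap > max_stored_gap:      # rare large gap: chain stored powers
            Rq = Rq * powers[max_stored_gap] % n
            gap -= max_stored_gap
            extra_muls += 1
        Rq = Rq * powers[gap] % n        # one multiply per typical gap
        acc = acc * (Rq - 1) % n

    return acc, extra_muls
With a small stored-gap limit (the 64MB case above) extra_muls comes out sizeable; raising the limit, i.e. throwing more RAM at it, drives it toward zero, which is exactly the diminishing-returns trade-off described above.
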
Old 2003-12-10, 19:06   #9
andi314
 
Nov 2002

2×37 Posts
Default

I meant the FFT size concerning an LL test.
Old 2003-12-11, 03:16   #10
garo
 
Aug 2002
Termonfeckin, IE

53138 Posts
Default

This is a post by George on the mailing list the day before these forums opened :)


This table shows FFT size, v21 x87 crossover, v22.8 x87 crossover,
v21 SSE2 crossover, v22.8 SSE2 crossover. As you can see v22
is more liberal with x87 crossovers and more conservative with
SSE2 crossovers.
Code:
FFT size        v21 x87         v22.8 x87       v21 SSE2        v22.8 SSE2
262144          5255000         5255000         5185000         5158000
327680          6520000         6545000         6465000         6421000
393216          7760000         7779000         7690000         7651000
458752          9040000         9071000         8970000         8908000
524288          10330000        10380000        10240000        10180000
655360          12830000        12890000        12720000        12650000
786432          15300000        15340000        15160000        15070000
917504          17850000        17890000        17660000        17550000
1048576         20400000        20460000        20180000        20050000
1310720         25350000        25390000        25090000        24930000
1572864         30150000        30190000        29920000        29690000
1835008         35100000        35200000        34860000        34560000
2097152         40250000        40300000        39780000        39500000
2621440         50000000        50020000        49350000        49100000
3145728         59400000        59510000        58920000        58520000
3670016         69100000        69360000        68650000        68130000
4194304         79300000        79300000        78360000        77910000
Now the gotcha. In v22.8, FFT crossovers are flexible. If you test an
exponent within 0.2% of a crossover point, then 1000 sample iterations
are performed using the smaller FFT size and the average roundoff
error calculated. If the average is less than 0.241 for a 256K FFT or
0.243 for a 4M FFT, then the smaller FFT size is used.
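
For anyone who wants that in code form, here is a Python sketch (mine, not George's) of the selection logic, using the v22.8 SSE2 column from the table above. The average_roundoff callback is purely hypothetical - in Prime95 this step means actually running the 1000 sample iterations - and the per-size limit is simplified to a single value in the 0.241-0.243 range.
Code:
# Sketch of the v22.8 FFT-size selection described above. Crossovers are the
# v22.8 SSE2 column from George's table; average_roundoff is a hypothetical
# callback standing in for the 1000 sample iterations at the smaller size.
CROSSOVERS = [                           # (FFT length, v22.8 SSE2 crossover)
    (262144, 5158000), (327680, 6421000), (393216, 7651000),
    (458752, 8908000), (524288, 10180000), (655360, 12650000),
    (786432, 15070000), (917504, 17550000), (1048576, 20050000),
    (1310720, 24930000), (1572864, 29690000), (1835008, 34560000),
    (2097152, 39500000), (2621440, 49100000), (3145728, 58520000),
    (3670016, 68130000), (4194304, 77910000),
]

def pick_fft_length(p, average_roundoff=None):
    for n, crossover in CROSSOVERS:
        if p <= crossover:
            return n                     # exponent fits at this FFT size
        if p <= crossover * 1.002 and average_roundoff is not None:
            # Within 0.2% above the crossover: sample the roundoff error at
            # the smaller size and keep it if the average stays under the
            # ~0.241-0.243 limit (simplified here to a single 0.242).
            if average_roundoff(p, n) < 0.242:
                return n
    raise ValueError("exponent is beyond this table")

# e.g. pick_fft_length(17500000) -> 917504, pick_fft_length(17776357) -> 1048576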

Brian Beesley has been a great help in investigating revised crossover
points and analyzing the distribution of round off errors. We noticed
that consecutive exponents can have a pretty big difference in average
roundoff error (e.g. one exponent could be 0.236 and the next 0.247).
This is why I elected to try the flexible approach described above. The
0.241 to 0.243 average was chosen hoping for about 1 iteration in a
million generating a roundoff error above 0.4. We might change the 0.241
to 0.243 constants with more data - it is hard to get enough data points
to accurately measure 1 in a million occurrences.

One downside is the server does not know which FFT size is used and
will credit you based on the v21 x87 crossovers. Thus, if you are a lucky
person, you might get "bonus" CPU credit where you test the exponent
at a smaller FFT size and the server credits you based on the larger
FFT's timing.

Last fiddled with by garo on 2003-12-11 at 03:17
Old 2006-11-24, 01:26   #11
jinydu
 
Dec 2003
Hopefully Near M48

2·3·293 Posts
Default

I'm using version 24.13, and judging by an increase in the per-iteration time of well over 10%, there is an FFT length change somewhere between 17440543 and 17776357.

Unfortunately, I suspect that Primenet may not recognize this jump and could underestimate my credit by over 10% when M17,776,357 finally finishes (it just started less than an hour ago). Time will tell...

EDIT: After playing around with the Time function in the Advanced menu, I have found that the jump occurs at precisely 17.55M. Oddly enough, the per-iteration times I got during the Time test were far better than my usual times. Then, when I resumed the usual test, my per-iteration time improved dramatically!
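
For what it's worth, 17.55M is exactly the v22.8 SSE2 crossover for the 917504 (896K) FFT in George's table above, so the jump is presumably 896K -> 1M. A rough estimate of the expected slowdown, assuming per-iteration cost scales roughly as N*log2(N) (a back-of-the-envelope model, not Prime95 timing data):
Code:
# Back-of-the-envelope slowdown for an 896K -> 1M FFT jump, assuming the
# per-iteration cost scales roughly as N*log2(N).
from math import log2

n_small, n_large = 917504, 1048576
ratio = (n_large * log2(n_large)) / (n_small * log2(n_small))
print(f"expected slowdown: ~{(ratio - 1) * 100:.0f}%")   # roughly 15%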

Last fiddled with by jinydu on 2006-11-24 at 02:01