![]() |
|
|
#1 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
2·53·71 Posts |
A recent bug report has revealed an error in my calculation for the expected number of bits in an FFT result word prior to the carry propagation step.
When working on Mersenne numbers, 2^p-1, we put n=p/FFTlen bits in each FFT input word. Since we use balanced representation each result word is the sum of many (n-1) times (n-1) bit values. Some of these "cancel-out" as the quantities are both positive and negative. Lots of data has shown that I can expect the result words to have at most (n-1) + (n-1) + 0.6 * log2 (FFTlen) bits. When I test k*2^p-1 numbers, there an extra log2(k)/2 bits in each input word. This adds an extra log2(k) bits to each result word. Tests have shown this formula works quite well. The problem arises when I test numbers 2^p-c. In the Mersenne case the weights for each input word are between 1 and 2 (or is it 0.5 and 1.0 - I forget). In the 2^p-c case the input words are weighted with multipliers between 1 and c. In the Mersenne case every input word has roughly the same number of bits. In the 2^n-c case, the input words have anywhere between n and n+log2(c) bits. Any ideas on what my formula should be in this case? Ernst, help! |
|
|
|
|
|
#2 |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
Well, assuming that the input-word-multipliers in the mod-(2n-c) DWT are approximately randomly [or uniformly] distributed in the interval [1,c], this leads to a multiplier (1+c)/2 on the average input wordsize W. It seems like you should just be able to rework formulae 5-8 in the F24 paper by replacing W with W*(1+c)/2 [which reduces to W in the Mersenne-mod case, by way of a basic sanity check] to get the generalized error estimate you need. I have a simple C routine which I've been using [starting with the Mlucas 3.0 beta] that implements formula (8) in the above paper, and which has saved me much grief in terms of calculating FFT-length breakover points. Here it is - feel free to use and customize as desired. let me know if this, together with the above W-fiddling works for you:
Code:
/*
For a given FFT length, estimate maximum exponent that can be tested.
This implements formula (8) in the F24 paper (Math Comp. 72 (243), pp.1555-1572,
December 2002) in order to estimate the maximum average wordsize for a given FFT length.
For roughly IEEE64-compliant arithmetic, an asymptotic constant of 0.6 (log2(C) in the
the paper, which recommends something around unity) seems to fit the observed data best.
*/
uint32 given_N_get_maxP(uint32 N)
{
const double Bmant = 53;
const double AsympConst = 0.6;
const double ln2inv = 1.0/log(2.0);
double ln_N, lnln_N, l2_N, lnl2_N, l2l2_N, lnlnln_N, l2lnln_N;
double Wbits, maxExp2;
ln_N = log(1.0*N);
lnln_N = log(ln_N);
l2_N = ln2inv*ln_N;
lnl2_N = log(l2_N);
l2l2_N = ln2inv*lnl2_N;
lnlnln_N = log(lnln_N);
l2lnln_N = ln2inv*lnlnln_N;
Wbits = 0.5*( Bmant - AsympConst - 0.5*(l2_N + l2l2_N) - 1.5*(l2lnln_N) );
maxExp2 = Wbits*N;
/* 3/10/05: Future versions will need to loosen this p < 2^32 restriction: */
ASSERT(HERE, maxExp2 <= 1.0*0xffffffff,"given_N_get_maxP: maxExp2 <= 1.0*0xffffffff");
return (uint32)maxExp2;
}
|
|
|
|
|
|
#3 |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
It seems the suitably modified inequality (8) leads to a generalized version of the above function of this form: I've also added a super-basic test utility, you should be able to copy this into a .c file and build:
Code:
#include <stdio.h>
#include <math.h>
typedef unsigned int uint32;
/*
For a given FFT length N [= number of floats in the input data array],
estimate maximum exponent P that can be tested in a DWT modulo (2^n - C).
*/
uint32 given_N_and_C__get_maxP(uint32 N, uint32 C)
{
const double Bmant = 53;
const double AsympConst = 0.6;
const double ln2inv = 1.0/log(2.0);
double ln_N, lnln_N, l2_N, lnl2_N, l2l2_N, lnlnln_N, l2lnln_N, l2cplus1;
double Wbits, maxExp2;
ln_N = log(1.0*N);
lnln_N = log(ln_N);
l2_N = ln2inv*ln_N;
lnl2_N = log(l2_N);
l2l2_N = ln2inv*lnl2_N;
lnlnln_N = log(lnln_N);
l2lnln_N = ln2inv*lnlnln_N;
l2cplus1 = ln2inv*log(1.0 + C);
Wbits = 1.0 - l2cplus1 + 0.5*( Bmant - AsympConst - 0.5*(l2_N + l2l2_N) - 1.5*(l2lnln_N) );
maxExp2 = Wbits*N;
return (uint32)maxExp2;
}
int main()
{
uint32 c, i, n;
printf("Enter subtractive constant C in modulus (2^n - C) >\n");
scanf("%u", &c);
printf("Max p for various power-of-2 FFT lengths:\n");
for(i=10;i<=25;i++)
{
n = 1 << i;
printf("N =2^%2d, Max p = %10u\n",i,given_N_and_C__get_maxP(n,c));
}
return 0;
}
|
|
|
|
|
|
#4 | |
|
"Lucan"
Dec 2006
England
145128 Posts |
Quote:
Doesn't a simple application of the normal (Gaussian) distribution inform the argument a bit? David Last fiddled with by davieddy on 2007-10-19 at 13:29 |
|
|
|
|
|
|
#5 | |
|
∂2ω=0
Sep 2002
República de California
265678 Posts |
Quote:
Sans DWT weighting - or at least with the simpler kinds of DWT weights we use to effect negacyclics as for Fermat numbers [weights are the set of Nth complex roots of unity] and Mersennes [weights are fractional powers of 2 roughly randomly in [1, 2] things seem to follow the predictions of the random-walk heuristic very well. What George is saying that in his more-general case something seems to go awry with the above approach, which may require deeper analysis to understand. |
|
|
|
|
|
|
#6 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
2×53×71 Posts |
Quote:
I'm not sure I agree that plugging the (1+c)/2 average weighting in your formula is OK. Just doing a little thought experiment: If I have a length 1024 FFT, and each FFT word has 10 bits after weighting in a Mersenne-mod scenario. Then each output word is the sum of 1024 20-bit products. We add 0.6 * log2 (1024) to hold carry bits as the 1024 values are both positive and negative. Thus, we feel safe with 26 bits of precision. Now lets say c is such that the DWT weights now add 0,1,...,8 bits to each input word (so the input words are now 10-18 bits). If we use the average bit length of inputs, you'd think we'd need 14*2+6 = 34. However, occasionally we are adding 18-bit times 18-bit products, so this is clearly insufficient! |
|
|
|
|
|
|
#7 |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
Hmm, yes, perhaps the dynamic range of the input multipliers is key here, as you say - for Fermat-mod the DWT weights are all of complex norm 1, so as uniform as can be, and for Mersenne-mod the dynamic range is from 1 to 2, which is also very small, so perhaps neglecting it [or treated it by way of an average] doesn't hurt us - at least a whole lot of computation tells us that seems to be the case. You could always try C rather than (1+C)/2 as the multiplier and see which of the two models seems to give a more-reasonable ROE size - the "right" formula should give a nearly-constant max. ROE with increasing FFT length, for moduli around the max-p at each length.
|
|
|
|
![]() |
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Longest Worm | mfgoode | Lounge | 108 | 2020-04-20 22:06 |
| Random bit in a word | Prime95 | Programming | 19 | 2012-10-06 09:34 |
| My last name is a word. I had no idea. | jasong | jasong | 3 | 2007-10-05 18:02 |
| Word Problem | mfgoode | Lounge | 11 | 2006-04-01 16:16 |
| The only word | mfgoode | Puzzles | 11 | 2004-09-14 19:08 |