mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Math (https://www.mersenneforum.org/forumdisplay.php?f=8)
-   -   FFT explanation for non math guys (https://www.mersenneforum.org/showthread.php?t=120)

creative1 2002-09-25 10:27

FFT explanation for non math guys
 
Hi!

I'm looking for a simple FFT explanation for guys that have no idea about math. That is, 13/14 year old guys.

As a programmer I can read the source code and understand every line, but I can't grasp the general concept.

I tried searching for a simple explanation and I got algorithms 10 lines long, but I couldn't find anything that gives a simple example of what's happening. What I mean is: let's try to multiply 2 small ints,
23 and 12 for example. What happens in a simple FFT algorithm?
I tried to follow the steps on paper with a pencil and I couldn't get it.

any help?
Creative1

Prime95 2002-09-25 20:01

Re: fft explanation for non math guys
 
[quote="creative1"]I'm looking for a simple FFT explanation for guys that have no idea about math. That is, 13/14 year old guys.[/quote]

I hate to be skeptical, but I'm not aware of any texts or web pages that explain FFTs in extremely elementary terms. It took me a long time to grasp FFT concepts well enough to code one. FFTs are complex beasts. Most FFT articles and textbooks quickly get sidetracked with optimization techniques which contribute nothing to your understanding of what they are and why they work.

You really want to find a text that talks about Fourier Transforms, rather than Fast Fourier Transforms. Good luck in your search; hopefully someone can make a recommendation.

xtreme2k 2002-09-26 03:19

FT is at least a first-year uni subject. I think it is rather difficult to explain to someone with the mathematical knowledge of a 13/14 year old, though :)

Prime Monster 2002-09-26 13:04

There are some fairly practical engineering maths books that are not too bad. They do not go too far into theory, but more into application. That makes it a tiny bit easier to understand. ;)

Alf

ewmayer 2002-09-26 17:08

Re: fft explanation for non math guys
 
[quote="creative1"]Hi!

I'm looking for a simple FFT explanation for guys that have no idea about math. That is, 13/14 year old guys.

As a programmer I can read the source code and understand every line, but I can't grasp the general concept.

I tried searching for a simple explanation and I got algorithms 10 lines long, but I couldn't find anything that gives a simple example of what's happening. What I mean is: let's try to multiply 2 small ints, 23 and 12 for example. What happens in a simple FFT algorithm? I tried to follow the steps on paper with a pencil and I couldn't get it.

any help?
Creative1[/quote]

Hmm, it's difficult to explain the deeper theory underlying transforms to even a good student at that age level, but let me try to illustrate the basic features with an analogy you should be able to grasp. The basic idea behind most (if not all) mathematical transforms is to take some problem that appears (or simply is) difficult in its original form and turn it into a problem that is easier (or even trivial) to solve when transformed.

A simple example is vector arithmetic in two dimensions. Vectors (x1,y1) and (x2,y2) are easily added in x,y coordinates - simply add their components to get (x1+x2, y1+y2) - but are harder to multiply together. (I'm thinking here of multiply as defined for complex numbers, i.e. with product = (x1*x2-y1*y2, x1*y2+x2*y1).) OTOH, if I transform to polar coordinates and write each one as a pair (r, theta) (with r = sqrt(x^2+y^2) and theta = atan(y/x)), then they are easy to multiply - simply multiply the r-parts together, and add the theta-parts together to get the product (still in polar coordinates), (r1*r2, theta1+theta2). By using polar form I've reduced the cost of getting the product from 4 (scalar) multiplies and 2 adds to just 1 multiply and 1 add. Of course I've also made it harder to do further vector additions - to do that efficiently I'll want to transform the result back to (x,y) coordinates - but that's the tradeoff.
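
The polar-form shortcut is easy to check numerically. Here is a minimal Python sketch using only the standard library's cmath; the example values 3+4i and 1+2i are my own, not from the post:

```python
import cmath

# Two vectors as complex numbers: (x1,y1) = 3+4i, (x2,y2) = 1+2i.
z1, z2 = complex(3, 4), complex(1, 2)

# Rectangular multiply: 4 scalar multiplies and 2 adds.
prod_rect = complex(z1.real*z2.real - z1.imag*z2.imag,
                    z1.real*z2.imag + z1.imag*z2.real)

# Polar multiply: 1 multiply (the radii) and 1 add (the angles).
r1, t1 = cmath.polar(z1)
r2, t2 = cmath.polar(z2)
prod_polar = cmath.rect(r1*r2, t1 + t2)

print(prod_rect)   # (-5+10j)
print(prod_polar)  # same value, up to floating-point rounding
```

Both routes give the same product; the polar route trades a cheap multiply against the cost of converting in and out of polar form, which is exactly the transform tradeoff described above.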

Now consider multiplying together two numbers A and B, each having N digits. The way one learns to do this in grammar school is to multiply each digit of B by the rightmost digit of A, then shift B to the left one place (i.e. add a 0 at the right end) and multiply that by the next-higher-order digit of A, and so forth, then sum all the N intermediate products. That needs on the order of N^2 digit multiplications and a similar number of digit additions - in mathematical terms, it's an O(N^2) ("big oh of N squared" or "order of N squared") algorithm. Now this type of procedure has another name: discrete convolution. It's also used a lot in completely different fields, for instance signal processing - your typical cell phone does a lot of it. Now there's a famous transform which allows us to do discrete convolutions really fast, and it's called the Fast Fourier Transform. If we treat each of our input numbers A and B as a vector with N components (the digits of the number) and do an FFT on each of them, we get a pair of vectors A^ and B^ (that generally look nothing like the inputs) which have a very special property:
[b][i]
in Fourier space (which is where we consider the transform to have taken our input vectors), convolution looks just like digit-by-digit multiplication.
[/i][/b]
In other words, if we multiply each individual component (just a number) of A^ with the corresponding one of B^, the result is guaranteed to be the Fourier-transformed version of the convolution of A and B. That is, in Fourier space, convolution costs just O(N) operations, much better than our original O(N^2). To get back the result we want (i.e. convolution(A, B), not in Fourier-transformed form) we do an inverse transform on the single vector resulting from the digit-by-digit multiply of A^ and B^. (Using our x,y-versus-polar analogy, this is analogous to converting the polar-form product of our two vectors back to x,y coordinates, using the inverse transform x = r*cos(theta), y = r*sin(theta).)
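
This convolution-becomes-pointwise-multiplication property is easy to verify numerically. A small sketch with numpy's FFT (the random length-8 vectors are my own illustration, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, 8).astype(float)
b = rng.integers(0, 10, 8).astype(float)

# Direct O(N^2) cyclic convolution, the "grammar school" way.
n = len(a)
direct = np.array([sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)])

# Via the transform: FFT both, multiply component-by-component, inverse FFT.
via_fft = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

print(np.allclose(direct, via_fft))  # True
```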

Now, it's an interesting feature of the FFT approach that doing the transform actually costs more than doing the digit-by-digit multiply form of convolution that it enables. But doing the transform costs O(N*log2(N)) operations (and the "oh of" notation hides a constant of proportionality which is larger than one - typically around 5 in a good implementation), and of course we need to do two such transforms, so a reasonable estimate of the FFT cost of multiplication is around 10*N*log2(N). But for N sufficiently large, this is cheaper than the grammar-school way. And as N continues to get larger, the speed advantage continues to grow, since the ratio N/log2(N) grows without bound as N tends to infinity. In mathematical terms, we say that FFT-based multiply is "asymptotically faster" than grammar school. To put numbers on this: let's consider the grammar-school way to cost 2N^2 operations (N^2 digit multiplies and roughly the same number of adds, plus the carries in the output digits, which only cost O(N) work, i.e. which don't contribute appreciably to the operation count.) Here is a small table of 2N^2 vs. the opcount for FFT:

[code]
N       2N^2      10*N*log2(N)
----    -------   ------------
10      200       332
100     20000     6640
10^3    2*10^6    ~10^5
10^6    2*10^12   ~2*10^8
10^9    2*10^18   ~3*10^11
[/code]

In words:

For 10-digit numbers, FFT is only around 60% as fast as GS.
For 100-digit numbers, FFT is about 3 times faster.
For 1000-digit numbers, FFT is about 20 times faster.
For million-digit numbers, FFT is about 10,000 times faster.
For billion-digit numbers, FFT is nearly ten million times faster.
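
The table entries above can be reproduced with a couple of lines of Python (a sketch; the constant 10 in the FFT cost is the rough figure quoted above, not a measured value):

```python
import math

for N in (10, 100, 10**3, 10**6, 10**9):
    gs  = 2 * N**2                 # grammar-school operation count
    fft = 10 * N * math.log2(N)    # rough FFT-multiply operation count
    print(f"N={N}: grammar-school/FFT cost ratio ~ {gs/fft:.3g}")
```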

Now to your small example: let's multiply 12 and 23 the FFT way, representing them as base-10 vectors, i.e. we form our input vectors simply by separating out the digits of the above base-10 representation of the numbers. One important practical detail here, which I glossed over in my earlier discussion: if we want to FFT-multiply two numbers, each having N digits, we expect the product to have as many as 2*N digits, so we actually need to do a length-2N FFT to leave room for the digits at the high end that will "fill in" when we do the multiply. That means we need to pad our input vectors with N zeros at the upper end. Thus, our input vectors are A = (2,1,0,0) and B = (3,2,0,0), where the least-significant digit of each number is leftmost. The Fourier transform of a length-4 vector (a,b,c,d) is given in matrix-multiply form as

[code]
|+1 +1 +1 +1| |a| a+b+c+d
|+1 +i -1 -i|*|b|=(a-c)+i*(b-d)
|+1 -1 +1 -1| |c| a-b+c-d
|+1 -i -1 +i| |d| (a-c)-i*(b-d)
[/code]

where i = sqrt(-1) is the usual imaginary constant. Doing this for our two input vectors, our transformed vectors are A^ = (3, 2+i, 1, 2-i) and B^ = (5, 3+2i, 1, 3-2i). The Fourier-transformed discrete convolution of A and B is then simply A^*B^, where the '*' means component-by-component (the fancy word is 'dyadic') multiply, which gives (15, 4+7i, 1, 4-7i). To get the digits of the product we're after, we need to inverse-Fourier-transform this vector. For length-4 vectors, the IFT looks just like the FT, but with the signs on the i-terms in the above matrix switched and a factor of one-fourth multiplying the whole thing. (We always have the factor of 1/N for a length-N inverse transform, since we require that doing an FT of a vector followed by an IFT simply give us back our original vector).

That is, the length-4 IFT of our vector (15, 4+7i, 1, 4-7i) has entries

[15+(4+7i)+1+(4-7i)]/4 = 24/4 = 6,
[(15-1)-i*(4+7i-4+7i)]/4 = [14+14]/4 = 7,
[15-(4+7i)+1-(4-7i)]/4 = 8/4 = 2,
[(15-1)+i*(4+7i-4+7i)]/4 = [14-14]/4 = 0.

Since all the output digits happen to be less than 10 and nonnegative, we don't need to do any carry step, and since the outputs are least-significant first, our result (written in normal decimal order) is 276.
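
The whole length-4 computation above can be checked with numpy. One caveat (my note, not the post's): numpy's FFT uses the e^(-2*pi*i/n) sign convention, so its two middle transform entries come out conjugate-swapped relative to the matrix above (2-i before 2+i), but the round trip yields the same real convolution:

```python
import numpy as np

A = np.array([2, 1, 0, 0], dtype=float)   # 12, least-significant digit first
B = np.array([3, 2, 0, 0], dtype=float)   # 23

# Transform, dyadic multiply, inverse transform, round to integer digits.
product_digits = np.rint(np.fft.ifft(np.fft.fft(A) * np.fft.fft(B)).real).astype(int)
print(product_digits)        # [6 7 2 0]

value = sum(int(d) * 10**k for k, d in enumerate(product_digits))
print(value)                 # 276
```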

Hope this helped,
-Ernst

p.s.: I was rather cavalier in my use of the term "FFT" above - the proper name for the transform we use is the Discrete Fourier Transform (DFT); FFT is simply a way of efficiently effecting a DFT. For example, a non-FFT version of the above calculation would use a standard O(N^2) matrix multiply algorithm to do the DFT. But if you look at the right-hand side of the matrix multiply, you see many repeated terms. To get the RHS, all we need to do is the following sequence:

w = a+c
x = a-c
y = b+d
z = b-d

Then,

output 1 = w+y
output 2 = x+i*z
output 3 = w-y
output 4 = x-i*z

Exploiting those types of symmetries in the DFT matrix is what the FFT does.
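
As a sketch, the four-step sequence above (with output 4 reading x - i*z, matching the fourth row of the DFT matrix) looks like this in Python:

```python
def dft4(a, b, c, d):
    # Two-stage butterfly for the length-4 DFT with root +i
    # (the matrix whose second row is +1, +i, -1, -i).
    w, x = a + c, a - c
    y, z = b + d, b - d
    return (w + y, x + 1j*z, w - y, x - 1j*z)

print(dft4(2, 1, 0, 0))   # (3, (2+1j), 1, (2-1j))
```

Eight additions instead of the sixteen multiplications of the naive matrix-vector product; applying this idea recursively is what turns the O(N^2) DFT into the O(N*log2(N)) FFT.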

creative1 2002-10-27 18:36

sorry I delayed so much with this answer but I just realized that there was an answer back here...
thanks a lot for the explanation but there's still one thing I couldn't grasp

how did you create the 'i' matrix to be able to make the transform? everything was very nicely explained but I couldn't get how you filled that matrix with 1, -1, i, -i values
where did that come from?

and in case the numbers weren't 23 and 12 and there was a carry on the multiplication... how would I handle that?

Creative1

wpolly 2002-10-28 14:18

[quote="creative1"]
how did you create the 'i' matrix to be able to make the transform? everything was very nicely explained but I couldn't get how you filled that matrix with 1, -1, i, -i values
where did that come from?
[/quote]
[code:1]
|+1 +1 +1 +1| |i^0 i^0 i^0 i^0|
|+1 +i -1 -i|=|i^0 i^1 i^2 i^3|
|+1 -1 +1 -1| |i^0 i^2 i^4 i^6|
|+1 -i -1 +i| |i^0 i^3 i^6 i^9|
[/code:1]
For FFT size 2N, just replace the i with cos(pi/N) + i sin(pi/N) (that is, a 2Nth root of 1) and construct the matrix as above...
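
wpolly's recipe - entry (j,k) of the matrix is the root of unity raised to the power j*k - can be written as a short numpy sketch (my own illustration):

```python
import numpy as np

def dft_matrix(n):
    # M[j][k] = w^(j*k), where w = e^(2*pi*i/n) is an nth root of 1.
    w = np.exp(2j * np.pi / n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    return w ** (j * k)

# For n = 4 the root w is i, and we recover the matrix used earlier:
# rows (1,1,1,1), (1,i,-1,-i), (1,-1,1,-1), (1,-i,-1,i).
M = dft_matrix(4)
```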


P.S. I am only 14 years old!

creative1 2002-10-28 14:35

[quote="wpolly"]
For FFT size 2N, just replace the i with cos(pi/N) + i sin(pi/N) (that is, a 2Nth root of 1) and construct the matrix as above...
[/quote]

I didn't understand that last part... the FFT that we were using was size 2N, right?
and by 'replace the i with cos(pi/N)+i sin(pi/N)' you mean:
cos(pi/N) + sqrt(-1) sin(pi/N) ?

Kevin 2002-10-28 20:24

I believe for the example given, an FFT size of 2 was used. The general formula uses cos(pi/n) + i sin(pi/n), which reduces to i for an FFT size of 2.

P.S. I'm only 15. Guess I'm not the only young cruncher around here.

ewmayer 2002-10-28 21:03

[quote="Kevin"]I believe for the example given, an FFT size of 2 was used. The general formula uses cos(pi/n) + i sin(pi/n), which reduces to i for an FFT size of 2.
[/quote]

No, the (n)th complex roots of unity are given by
exp(i*2*pi/n) = cos(2*pi/n)+i*sin(2*pi/n). The reason one sees the i's
in my example is that to DFT-multiply two general length-2 numbers, one
must zero-pad the input vectors to double the number of input digits, i.e.
one must use a length-4 DFT. Note that for [i]modular multiplication[/i]
for certain moduli of special form (which happen to include the Mersenne
numbers), one can use additional clever tricks to avoid the zero-padding.
That's because a length-n discrete cyclic convolution (working with base-X
inputs, where X need not be 10) is really a polynomial multiplication
modulo X^n - 1. That means that in my example, if I had used a length-2
DFT instead of length-4 (and recall that I was using X = 10), I would have
still obtained the correct result, but only in the sense that it is correct
modulo 10^2 - 1 = 99. Let's try it:

We again DFT-multiply 12 and 23 using base-10 arithmetic,
but this time we don't pad our input vectors with zeros.
Thus, our input vectors are A = (2,1) and B = (3,2), where the
least-significant digit of each number is leftmost. The Fourier transform
of a length-2 vector (a,b) is given in matrix-multiply form as

[code:1]
|+1 +1| |a| a+b
|+1 -1|*|b|=a-b .
[/code:1]

Doing this for our two input vectors, our transformed vectors are
A^ = (3, 1) and B^ = (5, 1). The Fourier-transformed discrete
convolution of A and B is then simply A^*B^, where the '*' means dyadic
multiply, which gives (15, 1). To get the digits of the product
we're after, we need to inverse-Fourier-transform this vector. For length-2
vectors, the IFT looks just like the FT, but with a factor of one-half
multiplying the whole thing.

That is, the length-2 IFT of our vector (15, 1) has entries (16, 14)/2 = (8, 7).

Since all the output digits happen to be < 10 and nonnegative, we don't
need to do any carry step, and since the outputs are least-significant
first, our result (written in normal decimal order) is 78. The exact
(full) product is 12*23 = 276, which indeed == 78 modulo 99.
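
The unpadded length-2 experiment takes only a few lines in numpy, and shows the mod-99 wraparound directly (a sketch of mine, not from the post):

```python
import numpy as np

A = np.array([2.0, 1.0])   # 12, least-significant digit first, NO zero-padding
B = np.array([3.0, 2.0])   # 23

digits = np.rint(np.fft.ifft(np.fft.fft(A) * np.fft.fft(B)).real).astype(int)
print(digits)                      # [8 7]
print(digits[0] + 10 * digits[1])  # 78
print((12 * 23) % (10**2 - 1))     # 78: the length-2 result is only correct mod 99
```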

-Ernst

wpolly 2002-10-29 05:12

[quote="creative1"][quote="wpolly"]
For FFT size 2N, just replace the i with cos(pi/N) + i sin(pi/N) (that is, a 2Nth root of 1) and construct the matrix as above...
[/quote]

I didn't understand that last part... the FFT that we were using was size 2N, right?
and by 'replace the i with cos(pi/N)+i sin(pi/N)' you mean:
cos(pi/N) + sqrt(-1) sin(pi/N) ?[/quote]

Yes.
A 2Nth root of 1 is cos(2pi/2N) + i*sin(2pi/2N).
For the example given, we were using FFT size 4, so we have to use the fourth root of 1: that is, i.

MAD-ness 2002-11-06 20:40

I have 10 years on you guys and I don't understand any of that, so congrats.

:)

Deamiter 2002-11-06 21:54

lol... it's mostly experience. The longer you've been playing with numbers (especially primes and how to use mods to test for primes) the better you'll get it. My (now retired) physics prof is around 85, and he's amazing at some of this stuff even though he can't multiply in his head any more (due to a form of Parkinson's, I think).

PageFault 2002-11-09 01:00

ahhh, experience ....

About two years ago I programmed a PLC to perform a linear regression. I needed a sturdy tool, able to withstand life on the mill floor, so I could process data while recording it. So, I wrote this crazy algorithm to do the regression without overflowing (the Allen-Bradley CPU instruction set is pretty weak) and brought it to the instrumentation shop and told them to insert it as one of the final rungs.

The electrical engineer in charge of the shop nearly threw a rod when he realized what I was doing with the PLC :shock:

PiepPiep 2002-11-10 18:40

Thanks for the info!
I've really searched a lot for some nice information like this, but it was
always either difficult formulas I didn't understand or some
100 kB source code files.
I'm going to give implementing an FFT a try now.
The only thing I need now is some big numbers to test with.
I can multiply some 100-digit numbers myself with just a 'normal'
multiply in a little program, but where can I find some test cases with
10,000-digit numbers or much bigger?

flava 2003-02-10 22:39

So you want to multiply big numbers? If you have a Windows-based system you might be lucky. I just compiled the latest GMP library (as a lib for Windows) and it allows you to play with really big integer numbers (I tested it over 10M digits). It comes with a nice header (for C). Another thing: I compiled it for Athlon (the blended one is slower), so I have no idea if it works on other CPUs...
BTW, I wanted to see how fast the integer FFT code in GMP is, so I wrote an LL and... it's 6X slower than Prime95 :oops:

ET_ 2003-02-11 16:24

[quote] I just compiled (a lib for windows) the latest GMP library ant it allows you to play with really big integer numbers[/quote]

Where is it? :-D

Luigi

cheesehead 2003-02-11 19:04

Where to find the GMP library
 
[quote="ET_"][quote] I just compiled (a lib for windows) the latest GMP library ant it allows you to play with really big integer numbers[/quote]
Where is it? :-D
Luigi[/quote]
Try The GNU MP Home Page at [url]http://www.swox.com/gmp/[/url]

flava 2003-02-11 21:42

[quote]
Where is it? :-D

Luigi[/quote]

I can upload the compiled lib on my web "site" if someone wants it. It's about 1 MB and works fast on Athlon and Duron.
The source code can be found at http://www.swox.com/gmp/ but you will need a Unix emulator and a bit of patience to compile it on Windows.

mephisto 2003-02-15 00:20

Thanks to ewmayer for a readable intro to Fourier/DFT/FFT - I've also looked for that, and failed.

I have a Master's in AI/pattern recognition, and I've actually used Fourier. I never understood the math, though, and I couldn't really see any natural connection between "split up a signal into its constituents, and use the resulting distribution to compress and/or do pattern recognition on the original signal" and the "find large primes" problem.

I have too little math, of course - even six years at university didn't give me enough math to understand the Fourier transform. Shame on me, perhaps, but it's not just "barely 14's" who benefit from a readable explanation of "what's happening here?". I'm happy to finally understand what I did at university :)

Citrix 2004-07-02 04:46

ewmayer,

A quick question.
If I wanted to multiply two 3-digit numbers, then how do I do the transform? What is the general formula to do transforms for two N-digit numbers? Also, what about the inverse? What is the general formula for this?

Also, as I understand it, after multiplying the numbers you have to take the mod. How does one achieve this really fast? Is there a faster algorithm than dividing and then looking at the remainder?

Also, let me tell you that your explanation is really good, the best I can find on the web.

Thanks,
Citrix
:cool: :cool: :cool:

Citrix 2004-07-02 05:12

I am also curious: if someone found a faster algorithm to multiply numbers, would it have any applications in other fields like FFT, or would it only help mathematics (number theory)?

Citrix
:cool: :cool: :cool:

Templus 2004-07-08 20:45

Thank you ewmayer for your great explanation, though it's an old topic :smile: . I'm astonished by how "easy" it is to calculate in Fourier space (with your example).

@Citrix:

If the numbers are 3-digit, then let the vectors A and B have length 6 (2N)

make the 'i'-matrix as:

2N*2N matrix
[CODE]
|+1 +1 +1 +1 +1 +1| |i^0 i^0 i^0 i^0 i^0 i^0|
|+1 +i -1 -i +1 +i|=|i^0 i^1 i^2 i^3 i^4 i^5|
|+1 -1 +1 -1 +1 -1| |i^0 i^2 i^4 i^6 i^8 i^10|
|+1 -i -1 +i +1 -i| |i^0 i^3 i^6 i^9 i^12 i^15|
|+1 +1 +1 +1 +1 +1| |i^0 i^4 i^8 i^12 i^16 i^20|
|+1 +1 +1 -i +1 +i| |i^0 i^5 i^12 i^15 i^20 i^25|
[/CODE]
or: M(j,k) = i^(j*k), counting rows and columns from 0.
j = row number
k = column number

And then follow the procedures (and divide by 6, the length of the vectors A and B).

Am I right? :rolleyes:

Templus 2004-07-08 22:14

some stupid mistakes:
M(5,1) = i^5 = +i
M(5,3) = i^10 = -1

Sorry, I couldn't find how to edit my post...

ewmayer 2004-07-15 00:06

[QUOTE=Templus]Am I right ? :rolleyes:[/QUOTE]

No, for length-6 DFT-based multiply your fundamental root of unity is not e^(i*2*pi/4) = i but rather e^(i*2*pi/6) = (1 + i*sqrt(3))/2. Call this quantity E and replace all the i's in your matrix by E's and that's the desired DFT matrix. Of course there are several symmetries you can exploit to simplify the structure of the matrix. If E := e^(i*2*pi/n), then

1) E^j = E^(j%n), e.g. E^25 = E^(25%6) = E^1 = E;

2) E^3 = -1.

Using these, the 6x6 DFT matrix can be written as

[code]
| +1 +1 +1 +1 +1 +1 |
| +1 E E^2 -1 -E -E^2 |
| +1 E^2 -E +1 E^2 -E |
| +1 -1 +1 -1 +1 -1 |
| +1 -E E^2 +1 -E E^2 |
| +1 -E^2 -E -1 E^2 E |
[/code]
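
Rather than writing out the 6x6 matrix multiply by hand, the whole length-2N scheme (zero-pad, transform, dyadic multiply, inverse transform, then carry) can be sketched in a few lines of numpy. The function name and the carry loop are my own illustration, not from the posts:

```python
import numpy as np

def fft_multiply(x, y, base=10):
    # Digits least-significant first, zero-padded to length 2N.
    dx = [int(c) for c in str(x)[::-1]]
    dy = [int(c) for c in str(y)[::-1]]
    n = 2 * max(len(dx), len(dy))
    a = np.fft.fft(dx + [0] * (n - len(dx)))
    b = np.fft.fft(dy + [0] * (n - len(dy)))
    conv = np.rint(np.fft.ifft(a * b).real).astype(int)
    # Carry step: convolution "digits" can exceed base-1, so propagate carries.
    out, carry = [], 0
    for d in conv:
        carry, r = divmod(int(d) + carry, base)
        out.append(r)
    while carry:
        carry, r = divmod(carry, base)
        out.append(r)
    return sum(d * base**k for k, d in enumerate(out))

print(fft_multiply(123, 456))   # 56088
print(fft_multiply(12, 23))     # 276
```

For two 3-digit inputs this runs exactly the length-6 transform discussed above (numpy just uses the conjugate root e^(-i*2*pi/6), which makes no difference for real inputs), and the final loop answers the earlier question about carries: they are propagated once, after the inverse transform.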

xenon 2004-07-21 01:28

lucas-lehmer with dft/fft
 
Can we go one step closer to LL testing using the multiplication methods described above? What is FFT length, e.g. 1024K, 1792K? What is radix?

jfollas 2004-07-29 13:58

I'm still trying to get my head around multiplying using FFT's, but I have a couple questions that popped up:

As I understand, to multiply using FFT, you need to do a transform, perform the multiplication math, and then do an inverse transform to get the results back into a "real number".

What if I wanted to do repeated squarings? Would I need to perform the inverse transform and then another transform before multiplying the second and sequential times? Or, once I have the number in the frequency domain, can I keep performing the multiplication math for as many squarings as I need, and then do the inverse transform?

Also along the same lines, does Prime95 need to perform the inverse transform before subtracting 2 each iteration of the LLT?

xenon 2004-07-30 01:09

Performing multiple multiplications in frequency domain
 
I believe it is very logical to perform the multiplications without inverse transforms in between. But not for additions (or subtractions).

ColdFury 2004-07-30 04:07

You need to do the inverse transform and rounding step each time if you want to keep the precision errors under control.
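
To make that point concrete, here is a hedged sketch (my own; real LL code such as Prime95 is far more elaborate) of one FFT squaring step with the round-and-carry that keeps floating-point errors from accumulating across iterations:

```python
import numpy as np

def square_once(digits, base=10):
    # One FFT-based squaring: transform, dyadic square, inverse transform.
    f = np.fft.fft(digits)
    conv = np.fft.ifft(f * f).real
    # Round back to integers and measure the floating-point error; a sound
    # implementation insists this stays well below 0.5.
    err = np.max(np.abs(conv - np.rint(conv)))
    out = np.rint(conv).astype(np.int64)
    carry = 0
    for k in range(len(out)):
        carry, out[k] = divmod(int(out[k]) + carry, base)
    return out, carry, err

digits, carry, err = square_once(np.array([2.0, 1.0, 0.0, 0.0]))  # 12, zero-padded
print(digits, carry)   # [4 4 1 0] 0  -- i.e. 12^2 = 144
```

Skipping the rounding and staying in the frequency domain would let the error term grow with every squaring, which is why each iteration transforms back, rounds, and carries.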

jfollas 2004-07-30 15:11

[QUOTE=ColdFury]You need to do the inverse transform and rounding step each time if you want to keep the precision errors under control.[/QUOTE]
Is there a FFT implementation that would be better suited for repeatedly performing the multiplication step while remaining in the transform (i.e., where precision errors are not an issue), or at least not requiring an inverse transformation with every iteration? An Integer FFT perhaps?

R.D. Silverman 2004-07-30 19:24

[QUOTE=jfollas]Is there a FFT implementation that would be better suited for repeatedly performing the multiplication step while remaining in the transform (i.e., where precision errors are not an issue), or at least not requiring an inverse transformation with every iteration? An Integer FFT perhaps?[/QUOTE]

My integer FFT (actually polynomial convolution) code from
"AN FFT Extension to P-1" did exactly that.

The code won't do anyone much good now; it was written in ALLIANT
Fortran.

xenon 2004-12-16 03:01

Can anyone explain "irrational base" as in IBDWT? I really don't understand the modulo step "for free".

The maths page of mersenne.org has dead links.
Also, I can't find the original paper: Crandall, R. E. and Fagin, B. (1994), "Discrete weighted transforms and large-integer arithmetic", Math. Comp. 62(205), 305-324 (January).
Why is it not available for download?

Guilherme 2004-12-27 16:44

This page provides an Excel spreadsheet that shows FFT multiplication:

[URL=http://www.gweep.net/%7Eshifty/portfolio/fftmulspreadsheet/index.html]http://www.gweep.net/%7Eshifty/portfolio/fftmulspreadsheet/index.html[/URL]

Chaos 2005-09-20 22:53

I have written a program that takes O(N*log2(N)) bit operations to multiply two N-digit numbers. My question is: what else does Prime95 use to multiply so fast? Could someone explain how to make the program a bit faster? What are the techniques called, and are there any good resources?

DJones 2006-11-03 19:38

[QUOTE=ewmayer;35266]No, for length-6 DFT-based multiply your fundamental root of unity is not e^(i*2*pi/4) = i but rather e^(i*2*pi/6) = (1 + i*sqrt(3))/2. Call this quantity E and replace all the i's in your matrix by E's and that's the desired DFT matrix. Of course there are several symmetries you can exploit to simplify the structure of the matrix. If E := e^(i*2*pi/n), then

1) E^j = E^(j%n), e.g. E^25 = E^(25%6) = E^1 = E;
2) E^3 = -1.

Using these, the 6x6 DFT matrix can be written as
[code]
| +1 +1 +1 +1 +1 +1 |
| +1 E E^2 -1 -E -E^2 |
| +1 E^2 -E +1 E^2 -E |
| +1 -1 +1 -1 +1 -1 |
| +1 -E E^2 +1 -E E^2 |
| +1 -E^2 -E -1 E^2 E |
[/code][/QUOTE]

I'm new to all this and never had the option of studying natural logs, imaginary numbers, and so on - I know what they all are but can't explain them and have no comprehension about how they interact. I'm currently trying to track down good information on the web, but most of it seems to assume that you know all about it already, know what all the symbols they are using actually mean, and just need to be reminded of the formulas again. I will be resorting to getting some decent books at some point, but money's tight. Anyway, back to the point.
Does the above generation of the transform matrix need to be altered for working in a different base, [b]or[/b] is it simply a case of converting your decimal number to the desired base and taking the length of the number in the new base as the defining factor* for the size of the transform matrix which in turn is still generated as above?

*If you'll pardon the pun.

ewmayer 2006-11-03 19:58

[QUOTE=DJones;90581]Does the above generation of the transform matrix need to be altered for working in a different base, [b]or[/b] is it simply a case of converting your decimal number to the desired base and taking the length of the number in the new base as the defining factor for the size of the transform matrix which in turn is still generated as above?[/QUOTE]

The transform part of the FFT-based multiply is base-independent. The base appears in the transform-based multiply in two ways:

1) The base appears explicitly in the carry step following the IFFT, and

2) The base appears implicitly in the precision-dependent limits on the sizes of inputs to the FFT.

But the actual core FFT routines are base-independent.

DJones 2006-11-04 10:09

Marvellous, thank you. Now any chance of being pointed at a suitable site / book (preferably the former) which explains why powers of e^(i * 2 * pi / n) work as a Fourier Transform matrix. Yes, I appreciate I could google this and have done so, but mathematical functions and definitions tend to be very flexible on the web, which isn't surprising when you take into account the fact that any idiot can put a webpage up.

I also appreciate I'm coming at this from the wrong end - I stumbled across the Mersenne Primes when looking for a DC project to get involved in, and have tried to backtrack through the proofs and methods so that I understand what's going on. Unfortunately backtracking through proofs, as opposed to building on proofs you already have, is proving to be rather convoluted.

Prime95 2006-11-04 13:54

[QUOTE=DJones;90620]Now any chance of being pointed at a suitable site / book (preferably the former) which explains why powers of e^(i * 2 * pi / n) work as a Fourier Transform matrix. Yes, I appreciate I could google this and have done so, but mathematical functions and definitions tend to be very flexible on the web.[/QUOTE]

Try googling "roots of unity" instead.

fetofs 2007-06-16 22:51

I tried to use the fft function that came with the matrix package (numpy) of Python (if I can't even use a pre-made one, how can I write one?). First I made a stupid error, and then I got it to work. I then realized it returned results like:

[code]
myfft(987654321, 123456789) -> 121932631112626272
987654321*123456789 -> 121932631112635269[/code]

Are the imprecisions normally that big? If so, how do you fix it? Or does it just not matter?

fivemack 2011-02-19 14:37

[QUOTE=fetofs;108402]I tried to use the fft function that came with the matrix package (numpy) of Python (if I can't even use a pre-made one, how can I write one?). First I made a stupid error, and then I got it to work. I then realized it returned results like:

[code]
myfft(987654321, 123456789) -> 121932631112626272
987654321*123456789 -> 121932631112635269[/code]

Are the imprecisions normally that big? If so, how do you fix it? Or it just doesn't matter?[/QUOTE]

No, the calculation is generally exact - if any digit in the output ever comes out as further than 0.4 from its nearest integer then you start again with a longer FFT (that is, a smaller base).

I wonder whether you're transforming back from an array of digits to an integer using (double-precision) FP rather than exact integer arithmetic; could you post your code?

You should be doing the FFT on an array of at least eighteen entries if you're working with nine-digit inputs, set up with contents like
[0,0,0,0,0,0,0,0,0,9,8,7,6,5,4,3,2,1]
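
That suggestion is easy to check end-to-end. In the sketch below (my own; digits stored least-significant first, the mirror of the ordering shown above, since only consistency matters) the convolution digits round cleanly to integers and the carried-out product is exact:

```python
import numpy as np

# 18-entry arrays for two nine-digit inputs, least-significant digit first.
a = np.array([1,2,3,4,5,6,7,8,9] + [0]*9, dtype=float)   # 987654321
b = np.array([9,8,7,6,5,4,3,2,1] + [0]*9, dtype=float)   # 123456789

conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real
print(np.max(np.abs(conv - np.rint(conv))))   # tiny: far below the 0.4 danger line

digits, carry = [], 0
for d in np.rint(conv).astype(np.int64):
    carry, r = divmod(int(d) + carry, 10)
    digits.append(r)
value = sum(d * 10**k for k, d in enumerate(digits)) + carry * 10**len(digits)
print(value)   # 121932631112635269, the exact product
```

The key point is that the rounding and carrying happen in exact integer arithmetic; only the transform itself lives in floating point.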

science_man_88 2011-02-19 15:08

I said I couldn't understand, but I've heard of FFT and the next thing I know it's not even here. I'd love to learn it, but I fear any explanation is above me. This thread already shows up 7th on Google for "fft explanation".

science_man_88 2011-02-19 15:19

best thing I've found for visually seeing it is [url]http://www.gweep.net/~shifty/portfolio/fftmulspreadsheet/index.html[/url]

ciic 2011-08-28 15:34

Books 'N Bagels
 
As bagels usually have one - a hole - I might think there's also one in my own knowledge about the notation of the math.

'Cause: I got me two books, first the famous one authored by Richard Crandall & Carl Pomerance, "Prime Numbers - A Computational Perspective", and second a rather old book authored by Bruce, J.W., Giblin, P.J., Rippon, P.J., "Microcomputers and Mathematics" - which I found rather helpful and interesting.

My problem comes more with the first-mentioned book by Crandall et al.: it uses mathematical notations which I do not understand, e.g. an "=" with three horizontal lines, and sum-signs and pi-signs and brackets...

Could someone please give me a hint where I could probably get information on this? Some "search strings" for an extended google/amazon/wikipedia search would also be very helpful to me.

Cheers, ciic.

axn 2011-08-28 16:59

[url]http://en.wikipedia.org/wiki/List_of_mathematical_symbols[/url]

See: congruence, summation, product.

Mini-Geek 2011-08-28 17:11

[QUOTE=ciic;270250]My problem comes more with the first-mentioned book by Crandall et al.: it uses mathematical notations which I do not understand, e.g. an "=" with three horizontal lines, and sum-signs and pi-signs and brackets...

Could someone please give me a hint where I could probably get information on this? Some "search strings" for an extended google/amazon/wikipedia search would also be very helpful to me.

Cheers, ciic.[/QUOTE]

[url]http://en.wikipedia.org/wiki/List_of_mathematical_symbols[/url] is a useful reference. And, e.g., if you saw a pi-sign used in a way you don't understand, you can look up other uses of [URL="http://en.wikipedia.org/wiki/Pi_(disambiguation)"]Pi[/URL] on Wikipedia.

ciic 2011-09-09 14:17

[B]Thank you!
I found the fitting keyword for me: "Set Theory".[/B]

Will go with "Goldrei, Derek (1996), [I]Classic Set Theory[/I], London: [URL="http://en.wikipedia.org/wiki/Chapman_and_Hall"]Chapman and Hall[/URL], p. 3, [URL="http://en.wikipedia.org/wiki/International_Standard_Book_Number"]ISBN[/URL] [URL="http://en.wikipedia.org/wiki/Special:BookSources/0-412-60610-0"]0-412-60610-0[/URL]" ...

Cheers, ciic.

---

P.S.:
"Them ain't Bagels, they doughnuts": See [URL]http://en.wikipedia.org/wiki/Bagels[/URL], please.
"PS Can't fool me: there ain't no sanity clause.": Can't fool me either!

R.D. Silverman 2011-09-09 16:00

[QUOTE=DJones;90620]Marvellous, thank you. Now any chance of being pointed at a suitable site / book (preferably the former) which explains why powers of e^(i * 2 * pi / n) work as a Fourier Transform matrix. Yes, I appreciate I could google this and have done so, but mathematical functions and definitions tend to be very flexible on the web, which isn't surprising when you take into account the fact that any idiot can put a webpage up.

I also appreciate I'm coming at this from the wrong end - I stumbled across the Mersenne Primes when looking for a DC project to get involved in, and have tried to backtrack through the proofs and methods so that I understand what's going on. Unfortunately backtracking through proofs, as opposed to building on proofs you already have, is proving to be rather convoluted.[/QUOTE]

Read Nussbaumer's book. It should be required reading.

Maximus 2011-11-06 20:46

I found an algorithm for multiplying polynomials. The algorithm is related to the FFT and has a simple explanation. Is anybody interested?

LaurV 2011-11-07 02:20

[QUOTE=Maximus;277383]I found an algorithm for multiplying polynomials. The algorithm is related to the FFT and has a simple explanation. Is anybody interested?[/QUOTE]

I am confused here. Did you find an algorithm to multiply polynomials (in whatever complexity time) or did you find an algorithm to multiply integers in polynomial time?

Christenson 2011-11-07 02:43

Let's have it, with a small example. Be mathematically precise in the sense you mean it...and post in the homework thread, since your post (and my reply) is off-topic, and it will draw fewest flames there.

henryzz 2012-01-29 18:36

I have been fiddling around trying to make small FFTs optimal. Here is my pseudocode for 4x4 multiplication using FFT.

Input: (a,b,c,d,0,0,0,0)*(e,f,g,h,0,0,0,0)

FFT:
[CODE]brt=b*1/sqrt(2)
drt=d*1/sqrt(2)
mtmp1=a+c
mtmp2=b+d
mtmp3=brt-drt
mtmp4=brt+drt
m1=mtmp1+ntmp2
m2=a+mtmp3
m3=c+mtmp4
m4=a-c
m5=b-d
m6=a-mtmp3
m7=c-mtmp4
m8=mtmp1-mtmp2

frt=f*1/sqrt(2)
hrt=h*1/sqrt(2)
ntmp1=e+g
ntmp2=f+h
ntmp3=frt-hrt
ntmp4=frt+hrt
n1=ntmp1+ntmp2
n2=e+ntmp3
n3=g+ntmp4
n4=e-g
n5=f-h
n6=e-ntmp3
n7=g-ntmp4
n8=ntmp1-ntmp2[/CODE]This makes up the parts of the FFTs. Each of the 2 FFTs is 6 adds, 6 subs and 2 muls. 1/sqrt(2) must also have been precalculated.

The multiplication can be done by:
[CODE]r1=m1*n1
r2=m2*n2-m3*n3
r3=m4*n4-m5*n5
r4=m6*n6-m7*n7
r5=m8*n8
r6=m2*n3+m3*n2
r7=m4*n5+m5*n4
r8=m6*n7+m7*n6[/CODE]This is 3 adds, 3 subs and 14 muls.

Inverse FFT:
[CODE]tmp1=(r1+r5)/8
tmp2=r3/4
tmp3=tmp1+tmp2
tmp4=tmp1-tmp2
tmp5=(r2+r4)/4
tmp6=(r6+r8)/4
tmp7=(r1-r5)/8
tmp8=r7/4
tmp9=tmp7+tmp8
tmp10=tmp7-tmp8
tmp11=(r2-r4)*sqrt(2)/8
tmp12=(r6-r8)*sqrt(2)/8
output1: tmp3 + tmp5
output2: tmp9 + tmp11 + tmp12
output3: tmp4 + tmp6
output4: tmp10 - tmp11 + tmp12
output5: tmp3 - tmp5
output6: tmp9 - tmp11 - tmp12
output7: tmp4 - tmp6
output8: tmp10 + tmp11 - tmp12[/CODE]This is 11 adds, 11 subs, 2 muls and 6 shifts. sqrt(2)/8 must also have been precalculated.

Total for 4x4 multiplication by FFT is 26 adds, 26 subs, 20 muls and 6 shifts. This is 78 operations which is slightly less than the 10*N*log2(N) estimate of 80.

Can anyone see any mistakes or any improvements? I am going to move onto 8x8 multiplication next.
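The end-to-end pipeline this pseudocode implements (forward FFT both inputs, multiply pointwise, inverse FFT, normalize and round) can be sanity-checked against the 23 * 12 example from the first post. Below is a minimal, unoptimized Python sketch using a textbook recursive radix-2 FFT, not the op-count-optimized routine above; the function names are just illustrative:

```python
import cmath

def fft(x, sign=+1):
    # Recursive radix-2 Cooley-Tukey; len(x) must be a power of 2.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def multiply(x, y, base=10):
    # Pad to twice the length so the cyclic convolution equals the acyclic one.
    n = 2 * max(len(x), len(y))
    a = fft(list(x) + [0] * (n - len(x)))
    b = fft(list(y) + [0] * (n - len(y)))
    conv = fft([u * v for u, v in zip(a, b)], sign=-1)  # inverse, up to 1/n
    digits = [round(c.real / n) for c in conv]          # normalize and round
    return sum(d * base**i for i, d in enumerate(digits))

print(multiply([3, 2], [2, 1]))   # 23 * 12 = 276 (digits least significant first)
```

Note that all the normalization (the 1/n) is applied once on the convolution outputs, as ewmayer recommends below, rather than split across the forward and inverse transforms.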

ewmayer 2012-01-29 20:58

Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)

2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.

3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls-by-inverse-powers-of-2, and these should never appear explicitly: they should be absorbable into the other multiplicative constants appearing in the algorithm.

4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small DFTs surrounding the dyadic-mul step, but large FFTs will need more generic radix passes (mostly small powers of 2, perhaps one odd-radix pass per fFFT and iFFT) preceding and following the dyadic-mul step. Those passes need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.

R.D. Silverman 2012-01-30 00:39

[QUOTE=ewmayer;287684]Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)

2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.

3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls-by-inverse-powers-of-2, and these should never appear explicitly: they should be absorbable into the other multiplicative constants appearing in the algorithm.

4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small DFTs surrounding the dyadic-mul step, but large FFTs will need more generic radix passes (mostly small powers of 2, perhaps one odd-radix pass per fFFT and iFFT) preceding and following the dyadic-mul step. Those passes need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.[/QUOTE]

Read Nussbaumer's book!!! It contains a lot of optimized small FFT's
for various domains.

henryzz 2012-01-30 00:58

[QUOTE=ewmayer;287684]Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)
[/QUOTE]
All inputs and outputs should be real. All the variables are real; however, there are some implied imaginary parts. I just changed the sign when necessary when using them later on. I started from your example in the fifth post and used e^(i*2*pi/8)=1/sqrt(2)+1/sqrt(2)*i as E in an 8x8 matrix. Once I had done that I just looked for where there were repeated operations and optimised them out.
[QUOTE]
2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.
[/QUOTE]Are you referring to the 1/sqrt(2)s in the code or to something about my iFFT? N is 4 in my case and I am doing a length-2N FFT.
[QUOTE]
3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls-by-inverse-powers-of-2, and these should never appear explicitly: they should be absorbable into the other multiplicative constants appearing in the algorithm.
[/QUOTE]I didn't realize they don't exist for floating point. Previously I have almost exclusively used integer arithmetic, which does support shifts.
[QUOTE]
4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small-DFTs surrounding the dyadic-mul step, but large FFTs will need more-generic-radix passes (mostly small powers of 2, perhaps one odd-radix pass per fFFT and iFFT) preceding and following the dyadic-mul step. Those will have variable strides, and need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.[/QUOTE]Not sure what you mean by generic-radix and odd-radix though it might be the names fooling me. Do you mean that calulating r1 and r5 are odd-radix steps and the others generic radix steps?
I realize that the optimisations will get more complex. With a length 8 FFT E was 1/sqrt(2)+1/sqrt(2)*i and real(E)=im(E). With higher powers real(E)!=im(E).
The code above worked once I had changed m1=mtmp1+ntmp2 to m1=mtmp1+mtmp2.
I have gleaned most of my information on FFTs from this thread. I have studied basic linear transformations at uni but nothing like this. I am currently halfway through my first year studying maths at Leicester University. I will look at this again tomorrow. Need to get to sleep :smile:
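The value of E used above can be checked in a couple of lines (a quick sanity check with Python's cmath):

```python
import cmath

E = cmath.exp(2j * cmath.pi / 8)   # primitive 8th root of unity, e^(i*2*pi/8)
rt = 1 / 2 ** 0.5                  # 1/sqrt(2)

# e^(i*2*pi/8) = cos(pi/4) + i*sin(pi/4) = 1/sqrt(2) + 1/sqrt(2)*i
assert abs(E - complex(rt, rt)) < 1e-12
print(E.real, E.imag)              # both approximately 0.7071067811865476
```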

ewmayer 2012-01-30 20:40

[QUOTE=R.D. Silverman;287698]Read Nussbaumer's book!!! It contains a lot of optimized small FFT's for various domains.[/QUOTE]
It was one of my main references when I was writing my own early-stage FFT, although I found it most useful for the optimized short-length-DFT algos rather than the overall implementation of a large-signal FFT. Or were you talking to Henry?

[QUOTE=henryzz;287699]All inputs and outputs should be real. All the variables are real however there are some implied imaginary parts. I just changed the sign when necessary when using them later on. I started from your example in the fifth post and used e^(i*2*pi/8)=1/sqrt(2)+1/sqrt(2)*i as E in a 8x8 matrix. Once I had done that I just looked for where there were repeat operations and optimised them out.
Are you refering to the 1/sqrt(2)s in the code or something about my iFFT? N is 4 in my case and I am doing a length 2N FFT.[/QUOTE]
Now that you have made clear you are writing a real-vector FFT, the reason for the sqrt(2)s is clear to me. I misread that part of your code, partly because I didn't have time to delve deeply into your implementation yesterday and instead focused on big-picture stuff.

[QUOTE]Not sure what you mean by generic-radix and odd-radix though it might be the names fooling me. Do you mean that calulating r1 and r5 are odd-radix steps and the others generic radix steps?[/QUOTE]
No, I mean that in general one seeks to write code that will be able to handle a wide variety of transform lengths. So e.g. if length-64 is too small to permit exact multiply of one's target inputs, one needs to be able to go easily to a larger transform length. One generally assembles these out of highly optimized smaller-DFT routines. For example, in my code I found it best to support power-of-2 complex-DFT-pass routines of lengths 8, 16 and 32. Lengths < 8 are inefficient because they lead to too many passes through the data array, but lengths > 32 yield little further gain in that regard. So for n=64 I would use 2 radix-8 passes; for n=128 I would do one pass at radix 8 and another at radix 16.

Once one needs to get seriously efficient, one also needs to support non-power-of-2 intermediate run lengths, e.g. n=80, which one could do via radix [5,16] or [8,10] combos. That gets nontrivial because one needs to consider how to generalize the concept of "bit-reversal reordering" to such run lengths.

But you seem like you're on the right track, so let us know how things go.

emily 2012-03-01 17:45

Well, I also want to read more about FFTs. It helps to keep in mind that the Fast Fourier Transform is simply an optimization of the real thing: the Discrete Fourier Transform (DFT). If it's difficult to find material in English, imagine how difficult it is to find good material in Greek!

jasonp 2012-03-01 18:05

Lately I've found [url="http://cnx.org/content/col10550/latest/"]this free book[/url] a fascinating read on FFTs, since I've become interested in number-theoretic transforms and so need to re-apply all of the theory it describes but in domains without complex numbers. The web site has other works by Burrus as well, that complement topics that the book glosses over a little.

It's really surprising how difficult it is to get the original papers mentioned in the bibliography. Most are from the 1980s; only Burrus' notes are more recent, as far as I could find. Either you find a copy of Nussbaumer's book in a university library or you pay the IEEE $10 a page for a PDF. At that rate my collection of hobby papers and books would have cost me $100,000.

legendarymudkip 2014-12-03 23:27

[QUOTE=ewmayer;1478]Since all the output digits happen to be less than 10 and nonnegative, we don't need to do any carry step[/QUOTE]Is the carry step like with the GS method? So say if you had (48,52,52,52), would you then get the result 52+520+5200+48000=53772? If not, how else would you do it?

ewmayer 2014-12-06 03:15

[QUOTE=legendarymudkip;389049]Is the carry step like with the GS method? So say if you had (48,52,52,52), would you then get the result 52+520+5200+48000=53772? If not, how else would you do it?[/QUOTE]

Assuming digit significance (power of 10) decreases from left to right, yes. But in practice we would start with the 0-index output term, the rightmost 52 in your notation, and work our way upward (leftward) like so, with the left-column index indicating power of 10, /= denoting integer (truncating) divide and %= indicating (nonnegative-output) modulo:

0: 52, /= 5 (carry), %= 2;
1: 5+52, /= 5 (carry), %= 7;
2: 5+52, /= 5 (carry), %= 7;
3: 5+48, /= 5 (carry), %= 3;
4: 5+0, no carry, hence done.

Now *really* in practice we would use a modulo based on nearest-int rounding of the DIV result, yielding a balanced-digit representation of the result, 53772, which has much better numerical properties as far as the next FFT-squaring (assuming one occurs) is concerned. Same as above, but every time the %= result is > 5 (i.e. half the base) we subtract 10 from the current digit and increment the carry by 1 to account for the -10:

0: 52, /= 5 (carry), %= 2;
1: 5+52, /= 6 (carry), %= -3;
2: 6+52, /= 6 (carry), %= -2;
3: 6+48, /= 5 (carry), %= 4;
4: 5+0, no carry, hence done.

Check the result: 2 + 10*( -3 + 10*( -2 + 10*( 4 + 10*(5)))) = 53772, as expected.
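Both carry schemes can be sketched in a few lines of Python (the helper names here are illustrative, not from any package; digits are stored least significant first, as in the walkthrough above):

```python
def carry_standard(digits, base=10):
    """Propagate carries from least to most significant digit."""
    out, carry = [], 0
    for d in digits:
        carry, r = divmod(d + carry, base)   # truncating divide / nonneg. mod
        out.append(r)
    while carry:                             # flush any remaining carry
        carry, r = divmod(carry, base)
        out.append(r)
    return out

def carry_balanced(digits, base=10):
    """Same, but keep each output digit in the balanced range (-base/2, base/2]."""
    out, carry = [], 0
    for d in digits + [0] * 2:               # extra zeros flush the carry
        carry, r = divmod(d + carry, base)
        if r > base // 2:                    # digit too big: subtract base...
            r -= base
            carry += 1                       # ...and bump the carry to compensate
        out.append(r)
    while out and out[-1] == 0:              # trim leading (most significant) zeros
        out.pop()
    return out

conv = [52, 52, 52, 48]                      # the (48,52,52,52) example, LSD first
print(carry_standard(conv))                  # [2, 7, 7, 3, 5]   -> 53772
print(carry_balanced(conv))                  # [2, -3, -2, 4, 5] -> 53772
```

Evaluating either digit list at the base, sum(d * 10**i), gives 53772, matching the hand computation above.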

In the case where the "coin lands on its edge" (%= 5 in base 10) you can either use an IEEE-compliant round-to-nearest-even (if your hardware can do that efficiently), or just take whichever direction your preferred NINT emulation gives (e.g. NINT(x) = (x + c) - c, where the "magic constant" c = 0.75*2^b and b = #bits in the mantissa of your floating-point type); which of the 2 you choose is not crucial, in my experience with these types of computations.
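Since Python floats are IEEE doubles (b = 53), the magic-constant trick can be tried directly. Note this sketch assumes the default round-to-nearest-even mode and only works for |x| well below 2^51:

```python
C = 0.75 * 2.0 ** 53        # = 1.5 * 2^52, the "magic constant" for IEEE doubles

def nint(x):
    # Adding C pushes x into a range where the spacing between adjacent
    # doubles is exactly 1, so the hardware's round-to-nearest-even does
    # the rounding for us; subtracting C recovers the rounded value.
    return (x + C) - C

print(nint(2.4), nint(2.5), nint(3.5))   # 2.0 2.0 4.0 (halfway cases go to even)
```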

CadeBrown 2016-04-06 14:44

How exactly does the FFT reduce to order N log N? It seems as though you are still doing an NxN matrix multiplied by a vector, which seems to be at least N^2. Is there something most programs do to cut it down?

ewmayer 2016-04-07 04:04

[QUOTE=CadeBrown;430862]How exactly does the FFT reduce to order N log N? It seems as though you are still doing an NxN matrix multiplied by a vector, which seems to be at least N^2. Is there something most programs do to cut it down?[/QUOTE]

The core aspect of the FFT is that no matrix multiplies are done ... consider the length-4 vector Fourier transform example I give in post #5, in matrix-multiply form. See the obvious redundant operations? Adding a few parentheses and rearranging the inputs to make things more obvious, our 4 DFT outputs are

out0 = (a+c) + (b+d)
out1 = (a-c)+i*(b-d)
out2 = (a+c) - (b+d)
out3 = (a-c)-i*(b-d)

So instead of a naive matrix multiply, we first compute the following intermediates - these are the famous radix-2 "butterflies":

y0 = (a+c)
y1 = (a-c)
y2 = (b+d)
y3 = (b-d) ,

which we can do in-place, overwriting the original inputs. (Though there are intricacies such as bit-reversal reordering and schemes for avoiding the need for it involved in the deployment of an in-place FFT scheme.)

Then combine these to obtain the outputs:

out0 = y0 + y2
out1 = y1+i*y3
out2 = y0 - y2
out3 = y1-i*y3 .

Thus radix-4 needs 2 passes through the data, each pass doing just linear work, i.e. O(n). For length n = 2^k we need k = lg(n) (lg = base-2 logarithm) such passes, each of O(n) cost, hence O(n log n) overall. For n not a power of 2 things are a bit more involved, but one can still prove the O(n log n) property, just with a higher implied constant of proportionality. The procedure can be done recursively - it is perhaps most naturally explained that way - or not. Any decent FFT reference has more details, though I found such small worked examples very useful way back when I was first learning this stuff, especially when working out the mechanics of non-power-of-2-length FFTs, which relatively few references cover in any useful fashion.
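The length-4 example above can be checked against the naive O(N^2) sum in a few lines (note the +i sign convention, matching E = e^(+2*pi*i/N) used earlier in the thread; the function names are just for illustration):

```python
W = [1, 1j, -1, -1j]   # powers of w = e^(+2*pi*i/4) = i

def dft4_naive(x):
    # Direct O(N^2) evaluation: out_k = sum_n x_n * w^(k*n)
    return [sum(x[n] * W[(k * n) % 4] for n in range(4)) for k in range(4)]

def dft4_butterfly(x):
    a, b, c, d = x
    # First pass: radix-2 butterflies (can be done in place)
    y0, y1, y2, y3 = a + c, a - c, b + d, b - d
    # Second pass: combine, with the single twiddle factor i on y3
    return [y0 + y2, y1 + 1j * y3, y0 - y2, y1 - 1j * y3]

x = [1.0, 2.0, 3.0, 4.0]
print(dft4_naive(x))       # same outputs as the butterfly version
print(dft4_butterfly(x))
```

The naive version does 16 multiply-adds; the butterfly version does 8 additions and no general multiplies, which is the per-pass saving that compounds to O(n log n).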

tServo 2019-04-05 16:39

YouTube: FFT by hand!
 
FWIW, here is a Youtube video I found posted Autumn 2018 in which a guy multiplies 41 * 37 by hand with paper and pencil using FFT!
How cool-y retro.
It's 7 minutes long and has just the basics.
Enjoy:
[URL="https://www.youtube.com/watch?v=YDhsLhTK3Bs"]https://www.youtube.com/watch?v=YDhsLhTK3Bs[/URL]



Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.