Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
The Arcane Algorithm Archive (algorithm-archive.org)
88 points by pplonski86 on Dec 1, 2018 | hide | past | favorite | 7 comments


Since this is on HN, I'll add a request to anyone that is able and willing. I want to understand the Cooley-Tukey FFT [0]. This algorithm holds a special place in my heart because I just don't _get_ it. I've used it and implemented it. So I suppose I understand how it works, but I don't understand _why_ it works. There's something deeper and elegant about the bit reversals that I think will reveal something to me about what an FFT is. Anyways, if you've never looked into it, it's beautiful.

[0] https://en.m.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_alg...


The Cooley-Tukey algorithm is nicely understood as a "4 step FFT". Both the decimation in time and decimation in frequency algorithms, and the "bit reversals" they use, are really just special cases of the more general (and simpler) thing.

Say you have an array of size N, where N = P*Q. You can treat your 1D array as a 2D array in row major form. Then the 4 steps are:

    do vertical FFTs of the columns
    "twiddle" the data
    do horizontal FFTs of the rows
    transpose the array
The twiddle step is really just multiplying each element in the array by a complex number. See:

    https://en.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_algorithm#General_factorizations
This breaks the N-size larger FFT into smaller P and Q sized FFTs. If P or Q is factorable, then you can recursively apply the algorithm to perform each of the smaller FFTs. Eventually, you'll reach a base case where you need a different algorithm. For instance, if you get to a factor of 2 or 3, you'll just do the simple DFT. If you get a large prime, there are other approaches.

If N is a power of 2, then you can always choose one of P or Q to be 2. These are the decimation in time or decimation in frequency algorithms that most people are taught. I think restricting it to this special case, and jumbling it all up with the "butterfly" diagrams commonly shown really obscures an elegant and easily understood algorithm. If you save all the transposes, and the transposes in the recursions until the end, it ends up being the bit (or digit) reversal you're familiar with. The reason bit reversal seems magical is because it hides the fact that it was really just a bunch of transposes.


You probably already know that the Fourier transform can be seen as a polynomial evaluation: The nth coefficient f_n of the Fourier transform of a series (x)_k can be obtained by seeing the series (x)_k as the coefficients of a polynomial, and f_n is the value of this polynomial at ω_n, the nth root of unity.

If done in a naive way this takes on the order of m^2 time if your series is m samples long. By cleverly re-using results, the evaluation can be done in the order of m log(m) time.

The re-use of intermediate results can be done by observing that P(x) = P_even(x^2) + x P_odd(x^2), and using this recursively. The FFT of a sequence can now be computed by dividing the sequence into even and odd coefficients, taking the FFT of those subsequences, and re-combining them into the FFT of the whole sequence. The combination of the FFT's to the FFT of the whole sequence can be done in O(n) time, and total time is then O(n log(n)). The recursion is similar to that done in mergesort. One important difference is that it is essential that the length of the sequence is a power of two.

If P(x) = a + bx + cx^2 + dx^3 + ex^4 + fx^5, then P_odd(x) = b + dx + fx^2 and P_even(x) = a + cx + ex^2

This is the basis of the algorithm. The bit reversals are only needed when you do everything in place. And honestly, I don't understand that either.


I had the same question when i was undergrad and it led me to do masters! It works because of symmetry, if calculations are symmetric you don't need to perform them twice which results in "fast computation". DFT matrix can be decomposed into multiple matrices in such a way that most of its elements are zeros and this representation is what is called FFT. This can be done because DFT structure has certain symmetries which are exploited to decompose its matrix it is also the reason why any arbitrary matrix cannot be made fast. Why DFT has this structure comes from the theory of linear shift invariant systems.

search for "pushchel" and "symmetry" if you want to go deeper and to understand what is meant by symmetry in mathematical terms.



How is this different from Rosetta Code?





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: