It is well known that AVX-capable processors depend heavily on memory bandwidth, especially for larger FFTs. My 3 GHz Ivy Bridge can do a 448K FFT of similar bit length (SR5) in just over half the time (about 13,500 s).

My best guess is that your memory is not running in dual-channel mode, or that it is just very slow. What is your memory spec/configuration?
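
For a rough sense of how much dual-channel mode matters, here is a minimal sketch of the theoretical peak bandwidth calculation. It assumes DDR3-1600 with a 64-bit bus per channel; that speed is only a placeholder since I don't know your actual memory, and the function name is just for illustration. Real sustained bandwidth will be lower than these peak figures.

```python
def peak_bandwidth_gb_per_s(mega_transfers_per_s, channels, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    mega_transfers_per_s: memory data rate, e.g. 1600 for DDR3-1600 (assumed here)
    channels: number of populated memory channels (1 = single, 2 = dual)
    bus_width_bits: width of one channel; 64 bits is standard for DDR3 DIMMs
    """
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# Assumed data rate: DDR3-1600. Substitute your own memory's rating.
single = peak_bandwidth_gb_per_s(1600, channels=1)
dual = peak_bandwidth_gb_per_s(1600, channels=2)

print(f"single-channel peak: {single:.1f} GB/s")
print(f"dual-channel peak:   {dual:.1f} GB/s")
```

Dropping to single-channel halves the peak figure, which is about the size of the slowdown you are seeing, though that alone does not prove it is the cause.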

