Old 2002-09-14, 04:52   #7
dswanson
 
Aug 2002

20010 Posts

Nick, you're right, the new ranges were in the mailing list. Here's George's message from that list, for those who don't subscribe...

[code:1]This table shows, for each FFT size, the v21 x87 crossover, the v22.8
x87 crossover, the v21 SSE2 crossover, and the v22.8 SSE2 crossover.
As you can see, v22.8 is more liberal with x87 crossovers and more
conservative with SSE2 crossovers.

FFT size   v21 x87  v22.8 x87  v21 SSE2  v22.8 SSE2
262144     5255000    5255000   5185000     5158000
327680     6520000    6545000   6465000     6421000
393216     7760000    7779000   7690000     7651000
458752     9040000    9071000   8970000     8908000
524288    10330000   10380000  10240000    10180000
655360    12830000   12890000  12720000    12650000
786432    15300000   15340000  15160000    15070000
917504    17850000   17890000  17660000    17550000
1048576   20400000   20460000  20180000    20050000
1310720   25350000   25390000  25090000    24930000
1572864   30150000   30190000  29920000    29690000
1835008   35100000   35200000  34860000    34560000
2097152   40250000   40300000  39780000    39500000
2621440   50000000   50020000  49350000    49100000
3145728   59400000   59510000  58920000    58520000
3670016   69100000   69360000  68650000    68130000
4194304   79300000   79300000  78360000    77910000

Now the gotcha. In v22.8, FFT crossovers are flexible. If you test an
exponent within 0.2% of a crossover point, then 1000 sample iterations
are performed using the smaller FFT size and the average roundoff
error is calculated. If the average is less than 0.241 for a 256K FFT
or 0.243 for a 4M FFT, then the smaller FFT size is used.

Brian Beesley has been a great help in investigating revised crossover
points and analyzing the distribution of roundoff errors. We noticed
that consecutive exponents can have a pretty big difference in average
roundoff error (e.g. one exponent could be 0.236 and the next 0.247).
This is why I elected to try the flexible approach described above. The
0.241 to 0.243 average was chosen hoping for about 1 iteration in a
million generating a roundoff error above 0.4. We might change the 0.241
to 0.243 constants with more data - it is hard to get enough data points
to accurately measure 1 in a million occurrences.

One downside is that the server does not know which FFT size was used
and will credit you based on the v21 x87 crossovers. Thus, if you are
lucky, you might get "bonus" CPU credit when you test the exponent
at a smaller FFT size and the server credits you based on the larger
FFT's timing.[/code:1]
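For anyone who wants to play with the numbers, here is a rough Python sketch of the selection rule George describes. It is only my reading of the message, not Prime95's actual code. The assumptions: a crossover is taken to be the largest exponent that still uses that FFT size; the roundoff limit is interpolated between 0.241 (256K) and 0.243 (4M) on a log2 scale, since the message only gives the two endpoints; failing the sample test is assumed to mean the next larger size is used; and `avg_roundoff` is a hypothetical callback standing in for the 1000 sample iterations. The function names are made up for illustration.

```python
import math

# Columns: FFT size, v21 x87, v22.8 x87, v21 SSE2, v22.8 SSE2 (copied from the table above).
CROSSOVERS = [
    (262144, 5255000, 5255000, 5185000, 5158000),
    (327680, 6520000, 6545000, 6465000, 6421000),
    (393216, 7760000, 7779000, 7690000, 7651000),
    (458752, 9040000, 9071000, 8970000, 8908000),
    (524288, 10330000, 10380000, 10240000, 10180000),
    (655360, 12830000, 12890000, 12720000, 12650000),
    (786432, 15300000, 15340000, 15160000, 15070000),
    (917504, 17850000, 17890000, 17660000, 17550000),
    (1048576, 20400000, 20460000, 20180000, 20050000),
    (1310720, 25350000, 25390000, 25090000, 24930000),
    (1572864, 30150000, 30190000, 29920000, 29690000),
    (1835008, 35100000, 35200000, 34860000, 34560000),
    (2097152, 40250000, 40300000, 39780000, 39500000),
    (2621440, 50000000, 50020000, 49350000, 49100000),
    (3145728, 59400000, 59510000, 58920000, 58520000),
    (3670016, 69100000, 69360000, 68650000, 68130000),
    (4194304, 79300000, 79300000, 78360000, 77910000),
]
V21_X87, V22_X87, V21_SSE2, V22_SSE2 = 1, 2, 3, 4  # column indexes


def nominal_fft_size(exponent, column):
    """First FFT size whose crossover is at or above the exponent (assumed meaning)."""
    for row in CROSSOVERS:
        if exponent <= row[column]:
            return row[0]
    raise ValueError("exponent is beyond the last crossover in the table")


def roundoff_threshold(fft_size):
    """0.241 at 256K and 0.243 at 4M; the interpolation in between is a guess."""
    span = math.log2(4194304) - math.log2(262144)
    frac = (math.log2(fft_size) - math.log2(262144)) / span
    return 0.241 + 0.002 * frac


def choose_fft_size(exponent, avg_roundoff, sse2=False):
    """Flexible v22.8 rule. avg_roundoff(fft_size) is a hypothetical callback that
    returns the average roundoff error over 1000 sample iterations at that size."""
    column = V22_SSE2 if sse2 else V22_X87
    for i, row in enumerate(CROSSOVERS):
        crossover = row[column]
        if abs(exponent - crossover) <= 0.002 * crossover:  # within 0.2%
            smaller = row[0]
            if avg_roundoff(smaller) < roundoff_threshold(smaller):
                return smaller
            if i + 1 == len(CROSSOVERS):
                raise ValueError("no larger FFT size in the table")
            return CROSSOVERS[i + 1][0]  # assumption: fall back to the next size up
    return nominal_fft_size(exponent, column)


def credited_fft_size(exponent):
    """FFT size the server assumes for credit: always the v21 x87 crossovers."""
    return nominal_fft_size(exponent, V21_X87)
```

With these assumptions, an x87 machine testing exponent 6530000 gets 327680 from `choose_fft_size` (it is below the v22.8 crossover and outside the 0.2% window), while `credited_fft_size` returns 393216. That is the "bonus" credit case from the last paragraph of the message.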