mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Software

Reply
 
Thread Tools
Old 2005-03-02, 23:29   #1
Unregistered
 

22·5·73 Posts
Default FFT size question

I've been wondering recently about the ranges covered by the various FFT sizes. On the benchmarks page the same FFT size increases between columns are colour coded i.e. green for 64K increase between ranges, yellow for 128K and lilac for 256K - is there any easy way to increase the number of ranges i.e. decrease the stepping size between ranges as exponents get bigger to effect an overall speed increase for the Gimps project? (I'm guessing there isn't a way to implement this, but just thought I'd ask) Perhaps there is a problem with accuracy - maybe a certain margin of safety is necessary at higher exponents? (then again - is accuracy/error rate a function of FFT size only? or is time taken also involved, as a slightly smaller FFT relatively would be faster).

I don't know much about FFT's themselves, so can't really say much, but it would seem that much smaller stepping (if technically possible), would give a real boost to project speed particularly at the higher exponents and also bearing in mind that Prime95 on a P4 is probably maxed out now in terms of pure processing power. (I'm doing a 34 million exponent, I've been at it over 2 months when the PCs on anyway, and I'm only 4 million in!)
Anyway, here's to M43, wherever it may lie
  Reply With Quote
Old 2005-03-03, 00:35   #2
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

25·367 Posts
Default

There's no mathematical reason one can't decrease the FFT stepsize as you propose - my Mlucas code uses 8 equal-size steps between adjacent powers of 2, for example. But some notes about this:

1) The relative payoff decreases with decreasing stepsize - pictorially, think of approximating the area under the slope-1 line segment (y = x) from x = N to x = 2N with a series of equal-width rectangles. The powers-of-2-only FFT case corresponds to approximating the line segment with a single rectangle of height 2N, which has an area 2*N^2, compared to the exact area (N x N square plus right triangle with short sides of length N) of 1.5*N^2, for an excess (dividing out the common N^2) of 0.5. Using two rectangles of width N/2 and heights 1.5*N and 2*N, respectively, corresponds to allowing FFT lengths of form {2,3}*2^k, and has a total area 1.75*N^2, for an excess of 0.25. Each added halving of the approximating-rectangle width (i.e. of the FFT-length stepsize) only saves half the total labor of the previous halving. For GIMPS, going from 4 steps between adjacent powers of 2 (as Prime95 currently allows) to 8 steps might save on the order of 10% total labor. And that estimate is likely generous, because...

2) Supporting 4 equal-length steps between adjacent powers of 2 means supporting FFT lengths of form {1,3,5,7}*2^k. Those odd multipliers are small enough that the corresponding short-length DFT (discrete Fourier transform) algorithms can be efficiently implemented - for Prime95 that also means without horrendous amounts of ugly x86-style assembly code, and without needing overly large numbers of floating-point registers to store intermediate quantities in the DFT computation. Going to 8 equal-length steps between adjacent powers of 2 means supporting FFT lengths of form {1,3,5,7,9,11,13,15}*2^k. Those added larger DFT radices are trickier to code, generally less relatively efficient in terms of average floating-point opcount-per-input (especially for the prime radices 11 and 13), and need lots of FP registers to store temporaries. They'd also be a bitch to implement in assembly code.
ewmayer is offline   Reply With Quote
Old 2005-03-03, 12:38   #3
Unregistered
 

2×3×389 Posts
Default

Thanks very much for taking the time to go through the FFT stepping stuff - it's really appreciated, and can give us all confidence that prime95 is pretty much on the limit in terms of FFT related project throughput. I did a quick check of likely benefit to exponent test time and it was about 6% at most (for half the step size at the top end, and that's ignoring the difficulties of point 2).

Something that prompted me to post was that the (SSE2) 34.56 million crossover is rapidly approaching and would mean for me that an exponent would take 4.5 days more (over 100 hours at full speed) for a couple of exponents over the limit as opposed to under - it's something that seems a little crazy at first. Now there's an incentive to grab one of the 10 million digit exponents soon if ever I heard one!
Thanks again
  Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Max FFT Size? aketilander Software 44 2018-12-19 17:58
Pi(x) value for x at 10^16 size edorajh Computer Science & Computational Number Theory 6 2017-03-08 20:28
Exponent Size Gap Mini-Geek PrimeNet 8 2007-03-25 07:29
FFT-Size andi314 Lounge 14 2007-01-22 00:21
Newbie question:Changing size of text in Mozilla Firefox jasong Linux 2 2006-02-24 03:55

All times are UTC. The time now is 14:11.


Mon Oct 3 14:11:46 UTC 2022 up 46 days, 11:40, 1 user, load averages: 1.38, 1.64, 1.66

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔