IIUC, you want to map the product in Z/(2^{[I]N[/I]}+1)Z to Z[x]/(x^{[I]l[/I]}+1) and then not to (Z/(2^{[I]n[/I]}+1)Z)[x]/(x^{l}+1) with n ≥ 2N/l+log_{2}(l), but to (R(Z/(2^{[I]n[/I]}+1)Z))[x]/(x^{[I]l[/I]}1), where R is some other ring that supports a lengthl transform, is large enough that the condition n ≥ 2N/l suffices and the correct product in Z[x]/(x^{[I]l[/I]}+1) can be reconstructed quickly by a CRT from separate convolutions in R[x]/(x^{[I]l[/I]}+1) and (Z/(2^{[I]n[/I]}+1)Z)[x]/(x^{[I]l[/I]}+1). In your FGT suggestion, R = GF(M_{[I]p[/I]}^{2}) for some Mersenne prime M_{[I]p[/I]}. Sounds perfectly feasible.
In fact, in a manuscript of the paper we mentioned an idea Robert Harley told us, but we had to remove it for the final paper due to the page limit. If I understood it correctly, he suggested using e.g. R=Z/2^{32}Z and doing some convolution algorithm (grammarschool, ToomCook, whatever), but of course there's no reason why we shouldn't use a ring that supports an FFT of the required length. According to Bernstein's "Multidigit Multiplication for Mathematicians" page 11, Karp came up with a similar idea, though I don't quite understand his description so I'm not sure he's talking about the same thing...
So basically, yeah, I think your idea should work.
Btw, if you do a 4step transform of length ≥2^{12}, actual shifts only occur in the "middle step" of applying twiddle factors between the row and columntransforms, and since the twiddle factor is ω^{ij} for matrix entry i,j, you often get an "even enough" exponent that you don't have to do an actual shift again  actual bitshifts are relatively rare in long SchönhageStrassen transforms.
Alex
Last fiddled with by akruppa on 20090302 at 14:13
