 mersenneforum.org Other mersenne progs, why so slow?
 Register FAQ Search Today's Posts Mark Forums Read  2003-03-20, 22:13   #23
pakaran

Aug 2002

3×83 Posts Re: Prime95 secret

Quote:
 Originally Posted by ewmayer 3) I'm not sure what you mean here. If you're referring to doing all the iterations entirely in Fourier space, that has been much discussed, but simply doesn't work - you need to come out of Fourier space to do error removal (in a floating-point-FFT-based scheme), normalization of digits and carry propagation. Then you need to take the resulting new set of input digits and get back into Fourier space to do the squaring, and so on. There's your two FFTs - no way around that, I'm afraid.
Has anyone considered doing a pure-integer implementation that stayed in Fourier space? Would this make the subtract-two difficult?   2003-03-21, 01:17 #24 cheesehead   "Richard B. Woods" Aug 2002 Wisconsin USA 170148 Posts Re: Prime95 secret Edited -- Well, it seems I made several mistakes in what I posted here, so I've erased it all to avoid misleading anyone.   2003-03-21, 02:02   #25
ewmayer
2ω=0

Sep 2002
República de California

32·1,297 Posts Re: Prime95 secret

Quote:
Quote:
 Originally Posted by pakaran Has anyone considered doing a pure-integer implementation that stayed in Fourier space?
Yes, there are pure-integer implementations, but they can't stay in Fourier space forever -- they still have to come out to do at least the subtraction-of-2 and carry propagation, for the same basic reason as the floating-point implementations have to: In Fourier space, multiplication becomes simpler, but addition/subtraction becomes so complicated that it's faster to come out of Fourier space to do them then go back in.
Actually, addition/subtraction of a scalar is easy in Fourier space: since the length-N DFT of a scalar c is just a vector of N copies of c, to add c to the DFT of a vector x (let x^ = DFT(x)), we simply add c to every element of x^.

Quote:
 FP implementations also have other reasons to go out of and back into Fourier space, but share some reasons with pure-integer implementations.
The main one of these being the fact that if we stay in Fourier space, the convolution coefficients (elements of our DFT vector) grow quadratically with every iteration, and rapidly overflow the bitfields used to store them (whether these are floating-point mantissas or integers.) The only way to avoid this is to make the input terms sufficiently small that we can stay in Fourier space for several iterations before risking overflow, but that means using a longer vector length. For instance, if we wanted to stay in Fourier space for 2 iterations rather than the usual one, we'd have to use double the vector length. That would (because of real-world effects like memory bandwidth and cache misses) more than double our runtime, wiping out the benefits of doing half as many transforms.   2003-03-21, 02:23   #26
ewmayer
2ω=0

Sep 2002
República de California

32·1,297 Posts Re: IBWDT

Quote:
 Originally Posted by Cyclamen Persicum *** ewmayer Would you be so kind as explaining me how IBWDT works?
Unfortunately I don't have time at the moment to write a detailed post about this, so I'll have to punt and refer you (a) to the original Crandall/Fagin paper (Mathematics of Computation 62 (205), pp.305-324, January 1994), or (b) Colin Percival's Math. Comp. paper, "Rapid multiplication modulo the sum and difference of highly composite numbers," to which Colin has provided a link (the one with www.sfu.ca in the URL.) All I'll say for now is:

1) The term "irrational base" is somewhat misleading, since one isn't actually using fractional bases; rather, one is using a mixture of two whole-number bases, which together have an effect equivalent to the use of a fractional base. In your base-10 example, if we were using IBDWT we might be using a mixture of bases 10 and 100, for example.

2) If you read my post in the thread "FFT explanation for non-math guys," you'll see the example where by simply doing a length-N DFT-based squaring with a fixed base X, we automatically get a square modulo X^N - 1. This is what I refer to as the "folding" property of DFT-based squaring: it has the same effect as if you wrote out your two input vectors as polynomials over X (i.e. the elements of the vectors would be the coefficients of the various powers of X, from 0 to N-1), calculated the full polynomial product (which would have powers of X ranging from 0 to 2*N-1, with the topmst coefficient, the one multiplying X^(2*N-1) being zero), then "folded" the upper N-1 coefficients in with the lower N-1 ones. What the IBDWT allows one to do is to use a DFT to accomplish this folding, EVEN WHEN THE POWER N DOES NOT DIVIDE THE DFT LENGTH. For Mersenne numbers, the exponents p are primes, so no possible DFT length &lt; p could divide p, yet we can still proceed as if it did, with a little bit of extra work (the pre-and-postmultpilication by the DWT weight factors, which needs just O(N) operations, i.e. is asymptotically cheaper than the DFT itself.)   2003-03-21, 03:31 #27 cperciva   Oct 2002 43 Posts It's worth noting that the "right-angle" FFT for multiplying two N bit integers over Z is exactly the same as multiplying modulo 2^N+(-1)i using the DWT I describe in my paper.   2003-03-21, 16:26   #28
Cyclamen Persicum

Mar 2003

8110 Posts Quote:
 Meaning "fourth grade" multiplication, or did you mean "scalar" multiplication?
Excuse my English as foreign languige ;)
I have ment "school-boy-multiplication",
in other words, "long multiplication"   2003-03-21, 17:36   #29
Cyclamen Persicum

Mar 2003

1218 Posts Quote:
 The term "irrational base" is somewhat misleading
Have you enjoyed my indeed-irrational-numeration-scale example?
It DOES really work!!!

Quote:
As far as I have understood, there was an example module 99,
but that is trivial and not pure the light about "additional clever tricks"

Please, listen to me. I'm quite stupid poor guy, even not American :(
Would you give me a short NUMERICAL example, for two digits or three,
module 13 or 127, or non-prime module at all?

I have the FFT procedure, and I can multiply by means of it,
but cannot multiply modularly. I dont want touch my FFT procedure
and build something in! I only want to call it without zero-padding
to achieve modular multiplication. Is it possible? Sure...
The gramm of practice is better than the tone of theory ;)
It wouldn't take you a lot of time to give a short numerical example,
would it? You needn't explain me what is FFT and you may write:
"convolution goes here". In principle, one can make i,j-stupid convolution
and summing two halves. The result will be the same as it were FFT-iFFT.
The most interesting
example probably is how to calculate
123*876 (mod 1000) . The FFT contains four digits (with one leading zero),
and the virtual base is 1001^(1/4), about 5.6................
Plzzz, give me this example.   2003-03-23, 14:29 #30 Cyclamen Persicum   Mar 2003 34 Posts *** 2 awmayer There is no answer... You probably is very busy, sorry. I have downloaded "Rapid Multiplication Modulo..." and I'm reading this article now. It's not very difficult to understand how vectors are weighted. But what do upper-angular brackets mean? ._ _. | j/N| - j/N Thank you in advancе.   2003-03-23, 18:04 #31 Cyclamen Persicum   Mar 2003 34 Posts Probably, it means [] - integer part. But if so, this DWT is useful for mersenne numbers, and useless for other modulo. For instance, I want multiply modulo 101, but for two digits [0/2]=0 and [1/2]=0. How can I build the polynomial? How can I use *truly irrational base* that is not drawback since we use floating point?   2003-03-24, 06:31 #32 Cyclamen Persicum   Mar 2003 34 Posts I want to say modulo (101 - 1)...   2003-03-24, 10:56 #33 Cyclamen Persicum   Mar 2003 34 Posts At the end of all ends I have understood how to perform IBWDT. Let's multiply modulo 100 by 2-element FFT, 12 by 97 mod 100. For instance, 100 = 128 - 28 = 2^7 - 2^2*7. Radix is ( 1 2^3*2^(-1) ) Weight is ( 1 2^(-0.5)*7^0.5 ) Our numbers in radix-form are (0 3) and (1 24) Convolution gives (252 3), that is indeed 64 mod 100 But it seems WDT is a piece-goods and it needs a quite difficult tuning for each particular modulo.   Thread Tools Show Printable Version Email this Page Similar Threads Thread Thread Starter Forum Replies Last Post Unregistered Information & Answers 19 2012-04-17 03:12 R.D. Silverman GMP-ECM 55 2011-10-16 17:28 petrw1 Hardware 13 2008-11-10 23:25 Housemouse Hardware 7 2008-02-15 18:18 Doorbasher Hardware 5 2004-08-23 22:18

All times are UTC. The time now is 06:28.

Sun Nov 28 06:28:44 UTC 2021 up 128 days, 57 mins, 0 users, load averages: 0.91, 0.95, 1.00