mersenneforum.org > Math Quick mod function ?
 User Name Remember Me? Password
 Register FAQ Search Today's Posts Mark Forums Read

 2003-12-31, 19:23 #1 dsouza123     Sep 2002 10100101102 Posts Quick mod function ? In writing a program that tests a factor for a mersenne number, in implementing the powering function the only issue left is a quick mod function. Using: P as the exponent R as the result from the previous iteration. N as the new number before the mod. F as the factor to mod by. The first attempt was repeated subtractions of F from N but is too slow for anything but the smallest numbers. An example using small number P = 11, 2^11 - 1 and F = 23 iteration 3 result is 9 iteration 4 9^2 = 81; bit = 1 so times 2 81*2= 162; N =162, N mod F = 1, 162 mod 23 = 1 Is there a short cut using the previous iterations result R = 9 ? Or the F - R, 23 - 9 ? Is there a way to get a rough estimate of a multiplier of F that would put it very close to N. Example: 23 - 9 = 14 14 / 2 = 7 F * 7 = 161 N = 9*9*2 = 162 162 - 161 = 1 162 mod 23 = 1 F - R = x x / 2 = y F * y = z N - z = new result = N mod F Is this workable (useable math) or just a fluke ? Only some of the basic operations are available square, addition, saturated subtraction ( small - big = 0 ), times 2, multiplication. A divide by 2 can/will be written. Only positive integers are used. The numbers are encoded as an array of longwords ( 32 bit unsigned whole numbers). The idea of repeatedly ( shifting F until just less than N, then subtracting ) until N is less than F is likely too slow. Any other ideas/math are welcome.
 2003-12-31, 22:18 #2 Cyclamen Persicum     Mar 2003 34 Posts As far as I have understood you need a^b mod c? have you ever heard about Montgomery modular reduction without any devision? The other effective way to calc a residue is a magic table (just like in CRC32 algorithm, but in p-field instead of GF(2))
 2003-12-31, 23:29 #3 dsouza123     Sep 2002 2·331 Posts Thanks for replying. Really b mod c at this point in the powering algorithm but with large positive integers ( upto about 512 bits ). Overall yes, a^b mod c that is why the powering algorithm is being used, the previous algorithm is too slow ( 30 seconds to minutes for larger numbers). I have seen Montgomery math mentioned but I haven't figured it out, yet. I don't have a background in number theory but I am trying to learn parts of it to use in factoring code but for now I'm using basic math with any simple speedup techniques. Would the magic field from a table work with different factors or would it have to be calculated for each one ? Is a p-field related to the values that a mod function can produce? My understanding of mod is it's the remainder after an integer division. What if division isn't available, then how can it be quickly done? Are there methods in modular math that are require very little computation, using some special properties ? I hope someone can verify or find a flaw in the method mentioned for quickly estimating a multiplier of the factor so a single multiplication and subtraction will give the mod. given n -> n*n*2 -> n*n*2 mod f => r will (f - n) / 2 => m -> ( n*n*2 ) - ( f *m) => r or when it is only n*n instead, what is needed ? Does anyone understand modular math well or have a link that explains/teaches it well ? The program I'm working on, both the source and executable are attached in a thread in the programming section. So far only the version with the very slow powmod function is posted. When this part of the powering function is working well the new version will be posted. I hope within a day or so.
2004-01-01, 00:12   #4
Wacky

Jun 2003
The Texas Hill Country

32·112 Posts

Quote:
 My understanding of mod is it's the remainder after an integer division.
That is correct.

Let me make a few observations that may help.

Strictly, the mod function need not be applied until the last step. In theory, you could calculate N and then do all the mod F calculations. I recognize that it is highly desirable to do reductions earlier in the calculation in order to limit the size of the intermediate results.

However, it is not necessary that they be accurate "mod F" calculations. At any point, you can replace "M" with "M-kF", rather than "M mod F" and still have a valid result.

Therefore, I suggest that you consider a cheap approximation to "M mod F" that reduces the value without trying to be exact.

In particular, we can note that numbers that require multi-precision can be approximated by only the most significant part of the number and an exponent that reflects the number of "words" required to express the number.

In particular, if a non-negative number is C(n)*b^(n)n + C(n-1)*b^(n-1) + ... + C(0), then C(n)*b^(n) is a lower bound and (C(n)+1)*b^(n) is an upper bound.

In order to reduce an intermediate result, compute (M(n)+1)/F(r) as an approximation to M/F and subtract that *F. You can easily check if the result uses more words than F. If not, don't bother with further reduction in the current step.

Go on to the next step with a number that is a bit larger than F, but still bounded to a reasonable level.

At the final step, you will have to do tthe last refinement accurately, but until then, be content to keep the intermediate results bounded to a size not much greater than F.

 2004-01-01, 01:34 #5 dsouza123     Sep 2002 2×331 Posts Wow, thanks I'll try that, using an approximate mod that is within reason for the intermediate steps. That is an important point, only the last mod must be exact. The others are done to keep the number from getting large ( or at least larger than the factor ).
 2004-01-01, 09:39 #6 Wacky     Jun 2003 The Texas Hill Country 32×112 Posts Correction I was a bit hasty in my previous post. We want to use M - F*(M(n)/(F(r)+1)*b^(n-r)) as the approximation to M mod F. Note that it is useful to have MP routines not only for (MP op MP) but also (MP op SP).
 2004-01-01, 11:19 #7 Cyclamen Persicum     Mar 2003 5116 Posts Well, guys, now I am gonna to explain you how to perform Montgomery modular reduction without any division and repetitive subtraction which to be ever worse =))). Did you ask for 2^11 mod 23? OK. The computation will be in the decimal numeration within two digits precision for demonstrativeness. Of course, on practice you should perform your computations in 2^32-ary numeration. 1) Let's calculate an inverse element, namely z = - 23^(-1) mod 100 = -87 = 13 2) Instead 2 we should take (2*100) mod 23 which is 16 - "the substitution of a variable" 3) Here all the precomputations are completed! Begin with powering by 11 which is 1011B(inary). Further, S are squares, A is accumulator, k is an auxiliary variable. 4) A=(1*100) mod 23 =100 - 23 = 77 (the substitution for UNIT, the full modulo is not required) 5) S=16^2=256; k=S*z mod 100 = 3328 mod 100 = 28; S=(S+k*23) Div 100 = (256 + 28*23) Div 100 = 900 Div 100 = 9. 6) A=A*S=77*9 = 693; k=A*z mod 100 = 9009 mod 100 = 9; A=(A+k*23) Div 100 = (693+9*23) Div 100 = 900 Div 100 = 9; 7) S=9^2=81; k=S*z mod 100 = 1053 mod 100 = 53; S=(S+k*23) Div 100 = (81 + 53*23) Div 100 =1300 Div 100 = 13. 8) A=A*S=9*13 = 117; k=A*z mod 100 = 1521 mod 100 = 21; A=(A+k*23) Div 100 = (117+21*23) Div 100 = 600 Div 100 = 6; 9) S=13^2=169; k=S*z mod 100 = 2197 mod 100 = 97; S=(S+k*23) Div 100 = (169 + 97*23) Div 100 = 2400 Div 100 = 24. 10) Accumulator update is not required here because of ZERO in 1011B 11) S=24^2=576; k=S*z mod 100 = 7488 mod 100 = 88; S=(S+k*23) Div 100 = (576 + 88*23) Div 100 = 2600 Div 100 = 26. 12) A=A*S=6*26 = 156; k=A*z mod 100 = 2028 mod 100 = 28; A=(A+k*23) Div 100 = (156+28*23) Div 100 = 800 Div 100 = 8; So, 8 is the right answer, but we still require to make "the inverse substitution of the variables": k=A*z mod 100 = 104 mod 100 = 4; A=(A+k*23) Div 100 = (8+4*23) Div 100 = 100 Div 100 = 1; Finally, 2^11 mod 23 = 1 The PowerMod() or other Mod problem is easy!!! 1) Substitution of variables 2) Calculation of FULL problem as it were modulo 100..00 without any division or actual remainder requirement (by cost of two long mults); 3) Inverse substitution of variables to obtain the final result But is the inverse element's start-up auxiliary task z = - m^(-1) mod 10....0 VERY-VERY expensive? The Evangel is that we indeed require ONLY THE INVERSE ELEMENT OF THE VERY SMALL DIGIT!!! Here above 3^(-1) mod 10 = 7; Think hard about it! I use the following procedure in my computations upon 4G-ary numeration: function Inverse32(a:Cardinal):Cardinal; var q,r,a1,a2,b1,b2,tmp:Cardinal; begin a1:=a; b1:=1; a2:=0-a1; b2:=0-b1; if a1
2004-01-01, 14:26   #8
Wacky

Jun 2003
The Texas Hill Country

32×112 Posts

Quote:
 Originally posted by Cyclamen Persicum without any division { snip } q:=a1 Div a2;
And just what do you think the "Div" operator does?

 2004-01-01, 14:49 #9 Cyclamen Persicum     Mar 2003 34 Posts And just what do you think the "Div" operator does? Such a clever guy! "Div" is division within machine precision and worth nothing. When we had made "subst of all vars" the LONG DIVISION is not required completely anymore.
 2004-01-01, 16:01 #10 dsouza123     Sep 2002 2×331 Posts Cyclamen, thanks for the detailed explanation and fully worked example of Montgomery modular reduction without any division and repetitive subtraction. I am sure there are more people who want to understand Montgomery modular math, I certainly do. I'm always glad to learn a new algorithm/ math technique. Coding in pascal is appreciated, the code I've been working in is in Delphi/pascal. I will code the powering using the 2^32 ary numbers, array of longword/cardinal numbers. Wacky, thanks I did code using your suggestion for the approximation of M but haven't finished testing it. When the code is working the program source/executable will be posted.
 2004-01-03, 04:07 #11 dsouza123     Sep 2002 2·331 Posts Wacky, tested the approximate mod function, my code has some bugs ( only works for values that fit in a single 32-bit longword ) but still working on it. The bugs may be in the functions and procedures that the modlw calls and not the modlw itself. Code: Procedure modlw(var r:lwarr; rst,fff:lwarr); // mod of binary array var i,c : integer; k : integer; f,t : lwarr; // lwarr = array[0..lwhi] of longword; lwhi = 31 hr,hf : integer; sld : integer; reps : longword; begin hr := 0; hf := 0; for i := 0 to lwhi do begin r[i] := rst[i]; if r[i] > 0 then hr := i; f[i] := fff[i]; if f[i] > 0 then hf := i; t[i] := 0; end; sld := 0; if (hr > hf) then begin // r larger if (r[hr] > f[hf]) then // r high > f high sld := hr - hf; if (r[hr] = f[hf]) then // r high = f high sld := hr - hf; if (r[hr] < f[hf]) then // r high < f high sld := hr - hf - 1; end; if (hr = hf) then // equal size sld := 0; if (hr < hf) then // r smaller sld := 0; if (sld > 0) then begin for i := 1 to sld do shiftleft32(f); if (r[hr] = f[hf+sld]) then begin if (compare(r,f) = -1) then begin // problem *f larger than r div2(f); // must reduce f if (f[hf+sld] > 0) then sld := sld else sld := sld - 1; end; end; end; reps := 0; // reps is the approximate times f is multiplied by i := 9; // junk init value valid 1, 0, - 1 > = < if (r[hr] > f[hf+sld]) then begin reps := r[hr] div (f[hf+sld]+1); i := 1; end; if (r[hr] = f[hf+sld]) then begin reps := 1; i := 0; end; if (r[hr] < f[hf+sld]) then begin reps := longword( ( (int64(r[hr]) * \$100000000) + int64(r[hr-1]) ) div int64(f[hf+sld]+1) ); i := -1; end; if (reps = 2) then begin times2(f); end; if (reps = 3) then begin for i := 0 to lwhi do t[i] := f[i]; times2(f); // f := f * 2; addition(f,f,t); // f := f + t; end; if (reps > 3) then begin for i := 0 to lwhi do t[i] := f[i]; mult(f,f,t); end; c := compare(r,f); // 1, 0, -1 > = < if (c > 0) then satsubtract(r,r,f); if (c = 0) then // r = f so r := 0 for i := 0 to lwhi do r[i] := 0; // if (c = -1) then // no need value already in mod range end;

 Similar Threads Thread Thread Starter Forum Replies Last Post Pegos Information & Answers 6 2016-08-11 14:39 Dubslow GPU Computing 2 2011-10-27 04:49 alkirah Msieve 2 2009-12-30 14:00 storm5510 Programming 37 2009-09-08 06:52 Unregistered Software 8 2006-10-13 23:35

All times are UTC. The time now is 09:57.

Tue Jan 26 09:57:58 UTC 2021 up 54 days, 6:09, 0 users, load averages: 2.41, 2.05, 1.89