20120729, 20:32  #1 
"Gang aft agley"
Sep 2002
2·1,877 Posts 
Lurid Obsession with Mod Inverse
E. Mayer is working on a paper "Efficient Long Division Via Montgomery Multiply." Among the things presented is an efficient Mod Inverse algorithm.
I've been interested in this algorithm and wanted to think aloud about it without being a serious distraction so I am moving my lurid obsession into the Miscellaneous Math forum. Here I will indulge myself with muttering and pattern idolotry and welcome people pointing out mistakes I make or offering suggestions. 
20120729, 20:56  #2 
"Gang aft agley"
Sep 2002
2×1,877 Posts 
For a given odd number q, what is the inverse within a power of two modulus?
q * qinv ≡ 1 (mod 2^{b}) for b = 1: q = 1, qinv = 1 for b = 2: q = 1, qinv = 1 ; the inverse 1 is always 1 q = 3, qinv = 3 ; 2^{b}  1 is ≡  1 mod (2^{b}) 1 is its own inverse; 1 is its own inverse for b = 3: q = 1, qinv = 1 q = 3, qinv = 3 q = 5, qinv = 5 q = 7, qinv = 7 These are the smallest valid modular inverses and can also be trivially explained by noting that any odd number squared is ≡ 1 (mod 8) The mod operation is intended to take advantage an intrinsic computer data width as part of the computation. Any reduced residue system should work, so many different values of qinv might achieve the desired result using a particular bit width to truncate multiplication results. 
20120729, 21:11  #3 
"Gang aft agley"
Sep 2002
2·1,877 Posts 
for a bit width (b) of 4 bits, I calculated the q,qinv to be
(1,2),(3,11),(5,13),(7,7),(9,9),(11,3),(13,5),(15,15) looking at q * qinv in binary, the results are: 0000 0001 0010 0001 0100 0001 0011 0001 0101 0001 0010 0001 0100 0001 1110 0001 Looking at the least significant bits we are preparing to begin discussion of an algorithm presented in E. Mayer's paper that over successive iterations converges the q * qinv value to 1 for a modulo a particular bit width. 
20120729, 21:36  #4 
"Forget I exist"
Jul 2009
Dumbassville
8384_{10} Posts 
I made a pari code for the brute force method to get these values:
Code:
for(b=4,4,forstep(q=1,2^b1,2,print(((2^b+1)/q)%(2^b)","b","q))) 
20120729, 21:39  #5 
"Gang aft agley"
Sep 2002
2×1,877 Posts 
In the last post, 1 will always be the correct value of inverse for 1.
The above 8 qinv values are valid inverses for varying maximum bit widths: ∞, 5, 6, 4, 4, 5, 6, 5. One point to note is that these widths are from the q * qinv result. E. Mayer's paper presents an algorithm that will always converge on a proper qinv value as long as the initial guess has at least the least significant bit correct. Since these are all odd numbers, that bit is always 1. Considering the valid bits to be the bits of the product q * qinv that end in 0...01, the algorithm is quadratic in the sense that each iteration doubles number of valid bits. Each qinv value is calculated in successive approximation qinv_{new} = q * (2qinv_{old}) He does an optimization on the last step but I skip talking about that for now. Last fiddled with by only_human on 20120729 at 22:05 Reason: Deleted a simplification that I want to double check 
20120729, 21:41  #6 
"Gang aft agley"
Sep 2002
2·1,877 Posts 

20120729, 21:57  #7 
"Gang aft agley"
Sep 2002
2×1,877 Posts 
Now, even if the number of converged bit doubles per iteration, the first few iterations seem particularly pointless because only a small number of bits will be calculated. Usually in these cases, a table lookup might do the trick. Picking a particular size table lookup is a bit problematic because doubling the table size only saves one iteration of the algorithm and the algorithm is already very fast)
E. Mayer presents a good initial guess method that he received in a private communication from P. Montgomery. Using qinv= (3 * q) XOR 2, values will be valid for at least 5 bits. That saves a couple of iterations from the onset (if it were good for 8 bits it would save another iteration). I find the pattern of this interesting and these aspects are definitely stepping into the subforum of Miscellaneous Math's navel gazing but I intend to look a bit further and entertain myself without looking too stupid (if possible). That is the direction I am following at the moment. Last fiddled with by only_human on 20120729 at 21:59 
20120729, 22:40  #8 
"Gang aft agley"
Sep 2002
2×1,877 Posts 
Ok, I have just been working on paper and won't get much further that way but I thought I would lay down the hen scratching of what the initial guess looks like using that 3q xor 2 trick.
Here are the initial values for 5 bit q values. (binary for q*qinv, (q, qinv), with of value of 1 at the end of product): Code:
0000 0000 0001 (1,1),∞ 0000 0010 0001 (3,11),5 0000 0100 0001 (5,13),6 0000 1010 0001 (7,23),5 0000 1110 0001 (9,25),5 0001 1000 0001 (11,35),7 0001 1110 0001 (13,37),5 0010 1100 0001 (15,47),6 0011 0100 0001 (17,49),6 0100 0110 0001 (19,59),5 0101 0000 0001 (21,61),8 0110 0110 0001 (23,71),5 0111 0010 0001 (25,73),5 1000 1100 0001 (27, 83),6 1001 1010 0001 (29,85),5 1011 1000 0001 (31,95),7 Looking at the top half of the table to compare qinv calculations to those I calculated in earlier messages for a q value limited to 4 bits, it can be compared to the inverse calculations I did in posts 3 and 5 of this thread. The width of the 1 value from the product of q * qinv for the first 8 values of this table (using the 3q xor 2 trick) are: ∞, 5, 6, 5, 5, 7, 5, 6 versus the values I calculated earlier: ∞, 5, 6, 4, 4, 5, 6, 5. This is interesting because, using the trick, half of the values are 5 bits and each subsequent higher bit with has half as many occurrences as the previous lower bit width. That is not true for the for the q * qinv products I calculated in earlier messages for a four bit q value. Last fiddled with by only_human on 20120729 at 23:13 Reason: deleted word converged. No iterations have be run from initial guess as yet 
20120729, 23:39  #9  
"Gang aft agley"
Sep 2002
2·1,877 Posts 
Now I reach the the motivation for spawning this thread. From E. Mayer's thread about his paper, I have these things that I want to consider:
Quote:
Quote:
Also it has been noticed that after iterating the algorithm a few times, a region of 1 bits accumulates immediately to the left of the converged 0..01 value. I think that this region of 1s grows 1 bit wider per iteration. I want to look at that too. My thinking is that these 1s accumulate as a sort of overflow which shouldn't optimally happen. Now the single bit immediately to the left of the 0..01 value has to be 1 because otherwise it would be part of the 0...01 value. But, if this region of 1s is growing 1 bit wider per iteration, then from a crude hand wave, each bit in the q * qinv product corresponds to half a bit of q or qinv because log_{2}(q *qinv) = log_{2}q +log_{2}qinv. Not really because q is fixed but the idea is that a nbit * nbit multiply produces a 2nbit result; so if the 2nbit result has a region of 1s that is 1 bit wider per iteration, something is at least half a bit wrong (or could be improved perhaps by rounding) in the qinv value per iteration. Last fiddled with by only_human on 20120730 at 00:09 

20120729, 23:49  #10  
"Gang aft agley"
Sep 2002
2·1,877 Posts 
Quote:
Quote:
Last fiddled with by only_human on 20120730 at 00:05 

20120729, 23:56  #11  
"Forget I exist"
Jul 2009
Dumbassville
2^{6}·131 Posts 
Quote:
one thing that crossed my mind is that q*qinv=1 mod 2^b can be rewritten as q*qinv = z*2^b+1 for known q and b is there a way to find something about the other values ? edit 2: I put in another mod in my printing in a newer version since it the values get over a certain value they no longer look like mod values Last fiddled with by science_man_88 on 20120730 at 00:04 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Inverse of Smoothness Probability  paul0  Math  6  20170725 16:41 
mi64: inverse does not exist  wreck  NFS@Home  1  20160508 15:44 
Inverse of functions  Raman  Math  5  20110413 23:29 
Inverse Laplace Transform  flouran  Math  1  20100118 23:48 
Google.com's unhealthy obsession with famous mathematical constants  ewmayer  Lounge  3  20050821 18:09 