20200805, 16:26  #12 
"Robert Gerbicz"
Oct 2005
Hungary
58B_{16} Posts 
https://www.youtube.com/watch?v=EY27...e=youtu.be&t=9
I'm at set bits=644353 for an equivalent defined problem AND still counting. 
20200805, 16:47  #13  
Jun 2003
2·5·479 Posts 
Quote:
BTW, Interested in the your problem and solution approach. 

20200805, 19:12  #14 
"Robert Gerbicz"
Oct 2005
Hungary
3×11×43 Posts 
OK, here it is:
Using an improved method Houston, reached 111.181 set bits (!), and with lower limits of the solution this could work easily on GPU also. [note that we started from 720737 ones of N !!!]. See the history here: https://mersenneforum.org/showpost.p...8&postcount=21 why we need this for P1 test (from another topic) The problem is to compute x^N mod M given x^(2^e) mod M for all (small values). It is similar to addition chain problem but here the 2^e is given free. Use the sliding window method, you could ask why a left to right method works here, the short reason is that because the "bases" x^(2^e) is given. So let N=sum(i=0,t,2^e[i]*u[i]), where for a fixed E>0: 0<u[i]<2^(E+1) and u[i] is odd, and in each possible choice we choose the maximal u[] that still fits in (0,2^(E+1)) [this choice is unique]. You need to compute x^N mod M what is x^sum(i=0,t,2^e[i]*u[i]) mod M= prod(i=1,2^(E+1),2,(x^(2^e[i]))^u[i]) = = prod(i=1,2^(E+1)1,2,(prod(x^(2^e[j]) : u[j]=i))^i) mod M = = prod(i=1,2^(E+1)1,2,y[i]^i) mod M, Here for each i we defined y[i]=prod(x^(2^e[j]) : u[j]=i) And can still compute that final product with a chain method: Say for E=2: you need R=y[1]*y[3]^3*y[5]^5*y[7]^7 a0=y[7] res=y[7] a1=y[5]*a0 (=y[5]*y[7]) res=res*a1 (=y[5]*y[7]^2) a2=y[3]*a1 (=y[3]*y[5]*y[7]) res=res*a2 (=y[3]*y[5]^2*y[7]^3) a4=y[1]*a3 (=y[1]*y[3]*y[5]*y[7]) res=res^2 res=res*a4 (=y[1]*y[3]^3*y[5]^5*y[7]^7) then R=res. So basically you'd need only 2^(E+1) mulmods in this last part. Returning to our original problem, the optimal solution is using E=13, with O(2^E*length(M)) memory, not store the x^(2^e) numbers in memory just the y[] array. Use such number that is still optimal/fits in GPU memory. Last fiddled with by R. Gerbicz on 20200806 at 14:15 Reason: minor typos (in one place there was N instead of M) 
20200805, 21:53  #15 
Jan 2017
79 Posts 
Hmm what do you mean by bit count here? Are you talking about the same thing (number of 1 bits in N*m for some integer m)? The algorithm you describe seems to be about computing x^N mod M more cheaply, I don't see how it'd apply to finding an m such that N*m has fewer set bits...
Last fiddled with by uau on 20200805 at 21:53 
20200805, 22:20  #16  
"Robert Gerbicz"
Oct 2005
Hungary
3·11·43 Posts 
Quote:
In my method I'm using much smaller number of multiplications to compute this x^N mod M, if all x^(2^e) mod M is given. Note that this is a nonconstant improvement, in general we can reach O(b/log(b)) mulmods, what is much better than the average b/2. You could still use the k*N idea here but as in above that gives very small gain. 

20200805, 22:38  #17 
Jan 2017
79 Posts 
Yes I got that part, what I meant to ask was what the "reached 111.181 set bits" was measuring  in context it first seemed to be about the N*m thing, but then your algorithm was about something else. So you mean something like your algorithm computing the value using 111181 multiply operations?
Last fiddled with by uau on 20200805 at 22:38 
20200805, 22:43  #18  
"Robert Gerbicz"
Oct 2005
Hungary
10110001011_{2} Posts 
Quote:
Written a small Pari code to get that number for E=13 for the given N. 

20200806, 06:14  #19 
"Mihai Preda"
Apr 2015
531_{16} Posts 

20200806, 17:11  #20 
"Ben"
Feb 2007
2^{3}×419 Posts 
If it is possible/cheap to compute 3^1 mod M, then it is possible to do even a little better by first recoding the exponent in signed representation {1, 0, 1}.
Code:
powersmooth(1000000) has 1442080 bits exp has 720737 nonzeros recoded exp has 480780 nonzeros sliding windowed recoded exp needs 100622 muls Also, all of the signed exponent words are odd. Storage of the precomputed window terms doubles because you need to precompute and store 2^E through 2^E, but is then halved because you only need to precompute/store half of them (the odd ones), so you still need storage of O(2^E*length(M)). Last fiddled with by bsquared on 20200806 at 17:16 
20200806, 17:54  #21  
"Robert Gerbicz"
Oct 2005
Hungary
58B_{16} Posts 
Quote:
But I've just revisited my pari code and forget to subtract one, so my updated code gives only 104341 multiplications for E=13 (and that is still the optimal). In general for random N, and if we fill up the whole y[] array, then we can expect log2(N)/(E+2)+2^E multiplications (in our case this formula gives 104330 as approx. so quite close to the real number). 

20200806, 18:44  #22  
"Ben"
Feb 2007
6430_{8} Posts 
Quote:
In the handbook of applied cryptography, chapter 14 algorithm 14.85 we can just use lefttoright sliding windows to get the number of multiplications down to 102989 plus precomputing 2^12 odd multipliers x^b, 1 <= b < 2^k (b odd), using the (optimal) max window size k=13. Algorithm 14.121 describes how to recode the exponent in signed digits. For example e = 2^9 + 2^8 + 2^6 + 2^5 + 2^4 + 2^2 + 2 + 1 = 2^10  2^7  2^3  1. Doing the recoding first on our 1442080bit exponent reduces the number of nonzero bits from to 720737 (one bits) to 480780 (one or minusone bits). Then doing the sliding window algorithm on the recoded exponent results in 100622 muls (plus precomputation). Those multiplications will range over elements x^b, 2^13 < b < 2^13, b odd. So, now that I've written all this down, I see that recoding probably doesn't save much over just doing sliding window. Especially since you need x^1 mod M with recoding. 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Some puzzle  Harrywill  Puzzles  4  20170503 05:10 
Four 1's puzzle  Uncwilly  Puzzles  75  20120612 13:42 
a puzzle  bcp19  Puzzles  18  20120302 04:11 
Dot puzzle  nibble4bits  Puzzles  37  20060227 09:35 
now HERE'S a puzzle.  Orgasmic Troll  Puzzles  6  20051208 07:19 