Is this a puzzle?
Consider the number PowerSmooth(1000000) with the 2 factors removed; e.g. in parigp:
[CODE] PowerSmooth(B1)= result=1; forprime(p=3, B1, result *= p^floor(log(B1)/log(p))); return(result) PowerSmooth(1000000) [/CODE] This number, in binary, has 720737 bits set to 1 ("binary hamming weight", hammingweight() in parigp). The puzzle is: find a multiple of this number that has a Hamming Weight <= 710000. (or, How low a HW multiple can you find?) Thanks! :) 
Iterating thru all odd numbers, starting from 3,
[CODE]3:720673 9:720566 21:720169 59:720143 89:719926 127:719857 237:719477 261:719250 817:719025 6923:718588 229095:718410 276755:718351 322621:718138[/CODE] Using the multiplicative inverse of lower order n bits: [CODE]10:720468 12:719859 242:719603 607:719455 1945:719364 1954:718901 5343:718576[/CODE] Here 5343 means the multiplier is Mod( N, 2^5343)^1 Using random odd numbers < 2^64 [CODE]10719928016004921607:720163 9421399378650073937:719987 9838800997419312683:719646 9031419907504264227:719565 12801182192819992405:719465 11276055256514179587:719415 7687617259492261083:719397 3594335502178041641:719255 6046992651811463807:718858 12624414163832553385:718786 4750919058227643243:718688 18130704136173611049:718654 5224850625010740453:718528 1147537420811012579:718472 7233411429689023907:718469 1993933662070155813:718441 6862363197769235605:718113[/CODE] 
I broke through 718K:
multiplying with 1999099*3684553 produces HW=717965 still a long way to go to 710K.. it looks like the "brute force search" is not working very well. 
Brute force probability analysis.
Hitting < 718K should have a probability around 1 in 2e7, so still achievable by brute force, hitting < 710K is much less likely with 1 in 8e76. Formula used (please correct me if my approach is not applicable):
[$$]p = \frac{\sum_{k=0}^{x}{\binom{\lceil log2(n) \rceil}{k}}}{n}[/$$] [LIST][*][$]p[/$] is the probability.[*][$]n[/$] in the number we want to "depopulize".[*][$]x[/$] is the target popcount. If our target popcount would be greater than [$]\lceil log2(n) \rceil[/$], take the sum from [$]x[/$] to [$]\lceil log2(n) \rceil[/$].[/LIST] Since we are multiplying [$]n[/$] with another value to "depopulize" it, the final result number is getting larger. This is not displayed by this formula, but should not make a huge difference, since our value for multiplication is really small in comparison. 
[QUOTE=kruoli;552531]Hitting < 718K should have a probability around 1 in 2e7, so still achievable by brute force, hitting < 710K is much less likely with 1 in 8e76. Formula used (please correct me if my approach is not applicable):
[$$]p = \frac{\sum_{k=0}^{x}{\binom{\lceil log2(n) \rceil}{k}}}{n}[/$$] [LIST][*][$]p[/$] is the probability.[*][$]n[/$] in the number we want to "depopulize".[*][$]x[/$] is the target popcount. If our target popcount would be greater than [$]\lceil log2(n) \rceil[/$], take the sum from [$]x[/$] to [$]\lceil log2(n) \rceil[/$].[/LIST] Since we are multiplying [$]n[/$] with another value to "depopulize" it, the final result number is getting larger. This is not displayed by this formula, but should not make a huge difference, since our value for multiplication is really small in comparison.[/QUOTE]There appears to be a typo. Should the exceptional sum be taken from 0, not x? 
Do you mean my third bullet point? In that case the sum really should still go from [$]x[/$] to [$]\lceil log2(n) \rceil[/$]. But maybe I'm missing something here. :confused2:

Another possibly interesting question: the quantity hammingweight(PowerSmooth(N))/N appears to converge to a value close to 0.72 can one prove that it converges and can one find a relatively simple expression for its value?

[QUOTE=xilman;552537]Another possibly interesting question: the quantity hammingweight(PowerSmooth(N))/N appears to converge to a value close to 0.72 can one prove that it converges and can one find a relatively simple expression for its value?[/QUOTE]
I assume that half the bits are ones, so that's about equivalent to length of the number x being 1.44N bits, or log(x) = 1.44N*log(2) = N. The prime number theorem says the sum of log(p) for primes p less than L is about L. The PowerSmooth function multiplies the largest power of the prime less than L instead of just the prime itself, but that shouldn't make too much difference. So the 0.72 value is probably 1/log(2)/2, where the 1/log(2) comes from the size of the product as predicted by the prime number theorem, and an additional /2 for only half the bits being ones. 
[QUOTE=xilman;552537]Another possibly interesting question: the quantity hammingweight(PowerSmooth(N))/N appears to converge to a value close to 0.72 can one prove that it converges and can one find a relatively simple expression for its value?[/QUOTE]
This should follow a binomial distributionI suspect. Though no proof as bits are not independent. 
[QUOTE=preda;552502]I broke through 718K:
multiplying with 1999099*3684553 produces HW=717965 still a long way to go to 710K.. it looks like the "brute force search" is not working very well.[/QUOTE] Minor improvements from brute force: [CODE]27275863:717813 93899447:717739[/CODE] 
[CODE]136110417:717547
173564051:717463[/CODE] 
[QUOTE=axn;552604][CODE]136110417:717547
173564051:717463[/CODE][/QUOTE] [url]https://www.youtube.com/watch?v=EY27lgnPKWI&feature=youtu.be&t=9[/url] I'm at set bits=644353 for an equivalent defined problem AND still counting. 
[QUOTE=R. Gerbicz;552648][url]https://www.youtube.com/watch?v=EY27lgnPKWI&feature=youtu.be&t=9[/url]
I'm at set bits=644353 for an equivalent defined problem AND still counting.[/QUOTE] LOL. More like too dumb, too slow :smile: BTW, Interested in the your problem and solution approach. 
[QUOTE=axn;552651]BTW, Interested in the your problem and solution approach.[/QUOTE]
OK, here it is: Using an improved method Houston, reached 111.181 set bits (!), and with lower limits of the solution this could work easily on GPU also. [note that we started from 720737 ones of N !!!]. See the history here: [url]https://mersenneforum.org/showpost.php?p=552398&postcount=21[/url] why we need this for P1 test (from another topic) The problem is to compute x^N mod M given x^(2^e) mod M for all (small values). It is similar to addition chain problem but here the 2^e is given free. Use the sliding window method, you could ask why a left to right method works here, the short reason is that because the "bases" x^(2^e) is given. So let N=sum(i=0,t,2^e[i]*u[i]), where for a fixed E>0: 0<u[i]<2^(E+1) and u[i] is odd, and in each possible choice we choose the maximal u[] that still fits in (0,2^(E+1)) [this choice is unique]. You need to compute x^N mod M what is x^sum(i=0,t,2^e[i]*u[i]) mod M= prod(i=1,2^(E+1),2,(x^(2^e[i]))^u[i]) = = prod(i=1,2^(E+1)1,2,(prod(x^(2^e[j]) : u[j]=i))^i) mod M = = prod(i=1,2^(E+1)1,2,y[i]^i) mod M, Here for each i we defined y[i]=prod(x^(2^e[j]) : u[j]=i) And can still compute that final product with a chain method: Say for E=2: you need R=y[1]*y[3]^3*y[5]^5*y[7]^7 a0=y[7] res=y[7] a1=y[5]*a0 (=y[5]*y[7]) res=res*a1 (=y[5]*y[7]^2) a2=y[3]*a1 (=y[3]*y[5]*y[7]) res=res*a2 (=y[3]*y[5]^2*y[7]^3) a4=y[1]*a3 (=y[1]*y[3]*y[5]*y[7]) res=res^2 res=res*a4 (=y[1]*y[3]^3*y[5]^5*y[7]^7) then R=res. So basically you'd need only 2^(E+1) mulmods in this last part. Returning to our original problem, the optimal solution is using E=13, with O(2^E*length(M)) memory, not store the x^(2^e) numbers in memory just the y[] array. Use such number that is still optimal/fits in GPU memory. 
[QUOTE=R. Gerbicz;552676]OK, here it is:
Using an improved method Houston, reached 111.181 set bits (!), and with lower limits of the solution this could work easily on GPU also. [note that we started from 720737 ones of N !!!].[/QUOTE] Hmm what do you mean by bit count here? Are you talking about the same thing (number of 1 bits in N*m for some integer m)? The algorithm you describe seems to be about computing x^N mod M more cheaply, I don't see how it'd apply to finding an m such that N*m has fewer set bits... 
[QUOTE=uau;552695]Hmm what do you mean by bit count here? Are you talking about the same thing (number of 1 bits in N*m for some integer m)? The algorithm you describe seems to be about computing x^N mod M more cheaply, I don't see how it'd apply to finding an m such that N*m has fewer set bits...[/QUOTE]
See: the original task for P1 method is to get x^N mod M, but with multiplying N with a "small" number is still good for the method, so when you compute r(k)=x^(k*N) mod M [and then gcd(r(k)1,M)]. In our case x^(2^e) is given free, because you're computing the prp residue for M=2^p1. The idea from Preda was to reuse these residues to compute the required residue for the P1 test cheaper than the standard way, and it is really possible you need just hamming(N) mulmods if log2(N)=b this is roughly b/2 for random N, and here you can still use a small multiple of N to lower this, what this topic started. In my method I'm using much smaller number of multiplications to compute this x^N mod M, if all x^(2^e) mod M is given. Note that this is a nonconstant improvement, in general we can reach O(b/log(b)) mulmods, what is much better than the average b/2. You could still use the k*N idea here but as in above that gives very small gain. 
[QUOTE=R. Gerbicz;552699]In my method I'm using much smaller number of multiplications to compute this x^N mod M, if all x^(2^e) mod M is given.[/QUOTE]
Yes I got that part, what I meant to ask was what the "reached 111.181 set bits" was measuring  in context it first seemed to be about the N*m thing, but then your algorithm was about something else. So you mean something like your algorithm computing the value using 111181 multiply operations? 
[QUOTE=uau;552703]Yes I got that part, what I meant to ask was what the "reached 111.181 set bits" was measuring  in context it first seemed to be about the N*m thing, but then your algorithm was about something else. So you mean something like your algorithm computing the value using 111181 multiply operations?[/QUOTE]
Yes, so used the "number of set bits"="number of multiplications" to describe the solution's goodness. Written a small Pari code to get that number for E=13 for the given N. 
[QUOTE=R. Gerbicz;552676]OK, here it is:
[/QUOTE] Nice Robert, thank you! I look forward to implementing this, I have a feeling P1 firststage just got a speed boost :) (disclaimer: when used in conjuction with PRP) 
If it is possible/cheap to compute 3^1 mod M, then it is possible to do even a little better by first recoding the exponent in signed representation {1, 0, 1}.
[CODE] powersmooth(1000000) has 1442080 bits exp has 720737 nonzeros recoded exp has 480780 nonzeros sliding windowed recoded exp needs [B][U]100622[/U][/B] muls [/CODE] Edit: Also, all of the signed exponent words are odd. Storage of the precomputed window terms doubles because you need to precompute and store 2^E through 2^E, but is then halved because you only need to precompute/store half of them (the odd ones), so you still need storage of O(2^E*length(M)). 
[QUOTE=bsquared;552785]If it is possible/cheap to compute 3^1 mod M, then it is possible to do even a little better by first recoding the exponent in signed representation {1, 0, 1}.
[CODE] powersmooth(1000000) has 1442080 bits exp has 720737 nonzeros recoded exp has 480780 nonzeros sliding windowed recoded exp needs [B][U]100622[/U][/B] muls [/CODE] [/QUOTE] Really don't understand, could you explain it? But I've just revisited my pari code and forget to subtract one, so my updated code gives only 104341 multiplications for E=13 (and that is still the optimal). In general for random N, and if we fill up the whole y[] array, then we can expect log2(N)/(E+2)+2^E multiplications (in our case this formula gives 104330 as approx. so quite close to the real number). 
[QUOTE=R. Gerbicz;552786]Really don't understand, could you explain it?
But I've just revisited my pari code and forget to subtract one, so my updated code gives only 104341 multiplications for E=13 (and that is still the optimal). In general for random N, and if we fill up the whole y[] array, then we can expect log2(N)/(E+2)+2^E multiplications (in our case this formula gives 104330 as approx. so quite close to the real number).[/QUOTE] I'm just looking for ways to minimize the number of multiplications in a modular exponentiation x^N mod M. In the [URL="http://cacr.uwaterloo.ca/hac/about/chap14.pdf"]handbook of applied cryptography[/URL], chapter 14 algorithm 14.85 we can just use lefttoright sliding windows to get the number of multiplications down to 102989 plus precomputing 2^12 odd multipliers x^b, 1 <= b < 2^k (b odd), using the (optimal) max window size k=13. Algorithm 14.121 describes how to recode the exponent in signed digits. For example e = 2^9 + 2^8 + 2^6 + 2^5 + 2^4 + 2^2 + 2 + 1 = 2^10  2^7  2^3  1. Doing the recoding first on our 1442080bit exponent reduces the number of nonzero bits from to 720737 (one bits) to 480780 (one or minusone bits). Then doing the sliding window algorithm on the recoded exponent results in 100622 muls (plus precomputation). Those multiplications will range over elements x^b, 2^13 < b < 2^13, b odd. So, now that I've written all this down, I see that recoding probably doesn't save much over just doing sliding window. Especially since you need x^1 mod M with recoding. 
Maybe also relevant to mention that the problem of finding the shortest addition chain for a given binary string is known to be NPhard (section 14.102 in HOAC). But you'd only have to do it once for a fixed B1.

[QUOTE=bsquared;552789]
Doing the recoding first on our 1442080bit exponent reduces the number of nonzero bits from to 720737 (one bits) to 480780 (one or minusone bits). Then doing the sliding window algorithm on the recoded exponent results in 100622 muls (plus precomputation). Those multiplications will range over elements x^b, 2^13 < b < 2^13, b odd. So, now that I've written all this down, I see that recoding probably doesn't save much over just doing sliding window. Especially since you need x^1 mod M with recoding.[/QUOTE] Searched today exactly that crypto paper, thanks. I'll check it. 
[QUOTE=bsquared;552789]we can just use lefttoright sliding windows to get the number of multiplications down to 102989 plus precomputing 2^12 odd multipliers x^b, 1 <= b < 2^k (b odd), using the (optimal) max window size k=13.
[/QUOTE] Yes, I've used the popular sliding window name, but here there is no such precomputation here. Yeah it is hard to believe there is such an algorithm, because in that sliding window you are actually doing many squaring, so you could not reach less than log2(N)=b multiplications, that is a complexity barrier here. But as I've written with the given x^(2^e) you can reach ~b/2 multiplications if you are using only the bits=1, what is now lower than the barrier. But there is another type of "precomputation" when in y[k] you're accumulating the products you needed to raise to the kth power. To get an inside where I've gotten the idea, when you want to multiple by x^63 with the traditional method you need 6 multiplications but with: 1. multiple x and x^4 to get x^5 2. multiple x^5 and x^16 to get x^21 3. square x^21 to get x^42 4. multiple x^21 and x^42 to get x^63 You need only 5, because you need to multiple the partial product by x^63. But now it is already a gain over the classical method. Found this in my head, without computer, and actually this is the smallest counterexample. You can do even better multiple by x^64 and divide by x. Using only 3 mults total. Similar to your method. And here x is NOT fixed, but if you extract those x bases using the given N exponent/residues and see what you need to raise to the k=63 or any power in the window you can save lots of work, what we reached with the optimal E=13. 
I see that you are finding a short addition chain using the x^(2^k) elements that will be computed anyway during the test. That's a very nice approach. As mentioned, the short addition chain problem is hard, so I was looking at how other known methods compared. I had thought that exponent recoding gave a larger benefit but after looking closer it is not much better than a standard sliding window. Both of those known methods are not any better than what you have found (but close).

[QUOTE=bsquared;552795]That's a very nice approach. As mentioned, the short addition chain problem is hard.[/QUOTE]
What addition chains here is now obsolete in the method. See one of my post, the required final task for E=2 to get: R=y[1]*y[3]^3*y[5]^5*y[7]^7 mod M. 
[QUOTE=bsquared;552789]
Doing the recoding first on our 1442080bit exponent reduces the number of nonzero bits from to 720737 (one bits) to 480780 (one or minusone bits). Then doing the sliding window algorithm on the recoded exponent results in 100622 muls (plus precomputation). Those multiplications will range over elements x^b, 2^13 < b < 2^13, b odd. [/QUOTE] Checked those 480780, 100622 numbers. Turned out that it is using more multiplications, I counted every mults, and to get the final product from the y[] array you also need mults. I mean this is not that a big surprise, for the old/known powmod problem your signed representation could be a winner, but this is a slightly different problem. Counted 104979 mults for your method but using E=13, and 106085 for E=12. [I have 104341 for my method with E=13]. Actually when you first hit a given entry in the y[] array then you can do only a (fast) set and not a mult, so that is smaller than 100622 mults (in my method you can also do this trick). And for the exponents (for E=12) yes it is true that  2^(E+1)< exponent <2^(E+1) but since in the signed representation you don't see two consectuive nonzero term the sharper inequality holds: abs(exponent)<=2^E*4/3 (just sum 2^E,2^(E2),2^(E4) etc.) even with these tiny improvements you lost (at least for this given fixed N), but likely in the general case also. 
I've found another algorithm to compute g^exponent once you are given all g^(2^k) [mod N]. Turned out it is using more mulmods, but this still depends how efficiently you partition the bits=1 in the exponent.
The problem with the fixed window size is that you are killing those ones not that efficiently (because only half of the bits is one), what about using not fixed size windows? Fix an offset vector: offset=[0,...,], and we're searching shifts t, for that for all i index at offset[i]+t there is a one in exponent, and to get many hits say size(offset)<=14 for our N. We can use an abnormally large window, say offset[i]<2000. Accumulate the product of these x(k)=prod(g^(2^t)), at the end you need to calculate prod(x(k)^e(k)), do now the standard way, collect the product for each 2^h in the exponent e(k), note that e(k)=prod(2^offset[]), so e(k) contains very few bits=1.The complexity of each offset is: count1+size(offset) mulmods, where you have count different t shits for the offset, hence its rate: rate=(count1+size(offset))/(size(offset)*count) [you need to lower the rate]. 
All times are UTC. The time now is 11:09. 
Powered by vBulletin® Version 3.8.11
Copyright ©2000  2021, Jelsoft Enterprises Ltd.