mersenneforum.org "Better than pseudo-random" iterator for Pollard's Rho?
 Register FAQ Search Today's Posts Mark Forums Read

 2014-12-11, 13:23 #1 mickfrancis   Apr 2014 Marlow, UK 1110002 Posts "Better than pseudo-random" iterator for Pollard's Rho? I wonder if anyone could answer a question I have regarding iterator functions used in Pollard's Rho algorithm: As I understand it, the aim is to use a function that is as psuedo-random as possible; this is assumed in the complexity analyses that have been carried out. It should also be computationally cheap for obvious reasons. However, there are functions that we could consider that have shorter cycle (and tail) lengths than a perfectly (psuedo-)random one. An absurd example is the function that generates numbers spaced at a multiple of a factor (p) of N: Code:  p = getSmallestFactor(N); function f(x) = (x + p) % N; This would loop (mod p) immediately and Pollard's Rho would find the factor. This would, of course completely defeat the object, as we are factorizing in order to find a factor. My question is this: Are there 'better than psuedo-random' iterators that could be used, whose cycle-shortening benefits out-weigh their additional cost? I have been playing with a class of function (that doesn't know about factors of N) that does produce shorter cycles. Unfortunately, the cost outweighs the benefits: Suppose we have an array of distinct non-zero values C[0], C[1], ... and corresponding functions f(0), f(1), ... where f(i) is the 'standard' iterator using C[i] as the constant: Code:  function f(i)(x) = (x^2 + C[i]) % N; If o represents function composition and we define function F(m) by: Code:  function F(m) = (m == 0) ? f(0) : f(m) o F(m-1); then Code:  F(0) is the 'standard' iterator using C[0] as constant: x -> (x^2 + C[0]) % N F(1) is the function that iterates twice (with different C values): x -> ((x^2 + C[0])^2 + C[1]) % N = (x^4 + 2C[0]x^2 + C[0]^2 + C[1]) % N F(m)(x) generates a monic polynomial in x of degree 2^(m+1), with coefficients of x^i being non-zero iff i is even. In tests, with N = 18868629790280349989518009992987136785227164415079137 (C53) with factor 7777774777777 (P13), I get the cycle lengths and times shown below. This is a Java implementation, so not that fast. The right-most set of values used x^1024 rather than x^2. Code: +------------+---------------------++---------------------++---------------------------++---------------------------+ | | C[i] = i + 1 || C[i] = RANDOM() || C[i] = 16384 + 16777216 i || C[i] = 16384 + 16777216 i | | | || || || exp=1024 | | +-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+ | m | cycle len | time (s)|| cycle len | time (s)|| cycle len | time (s) || cycle len | time (s) | +------------+-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+ | 0 | 2537770 | 4 || 1742273 | 9 || 9408 | 5 || 6073387 | 10 | | 1 | 1149581 | 4 || 559885 | 25 || 784504 | 5 || 2370108 | 8 | | 2 | 85333 | 10 || 168663 | 1 || 1217979 | 7 || 1852565 | 7 | | 3 | 1254046 | 7 || 2800001 | 26 || 573250 | 4 || 411681 | 2 | | 4 | 437527 | 10 || 135296 | 7 || 203384 | 10 || 1012270 | 11 | | 5 | 1218369 | 10 || 682778 | 9 || 3235934 | 28 || 226494 | 5 | | 6 | 1026505 | 7 || 437593 | 21 || 365988 | 8 || 195009 | 12 | | 7 | 750226 | 8 || 451631 | 13 || 229190 | 2 || 503277 | 4 | | 8 | 418334 | 5 || 294776 | 26 || 1294949 | 18 || 1409182 | 17 | | 9 | 349321 | 35 || 375245 | 7 || 52603 | 9 || 861531 | 10 | | 10 | 567977 | 9 || 1936414 | 37 || 26064 | 10 || 391137 | 5 | | 11 | 156450 | 3 || 1551018 | 37 || 1326637 | 24 || 1158178 | 20 | | 12 | 24626 | 10 || 187018 | 5 || 165718 | 13 || 297729 | 5 | | 13 | 121306 | 12 || 1002300 | 25 || 683933 | 14 || 251780 | 13 | | 14 | 442266 | 7 || 181412 | 20 || 85678 | 14 || 521311 | 8 | | 15 | 156963 | 13 || 183218 | 12 || 946 | 29 || 93879 | 3 | | 16 | 249666 | 16 || 662852 | 25 || 692360 | 17 || 1103360 | 29 | | 20 | 126940 | 5 || 43376 | 7 || 1379429 | 43 || 114094 | 9 | | 24 | 314699 | 11 || 1007216 | 44 || 282941 | 11 || 278688 | 10 | | 28 | 537623 | 24 || 347046 | 21 || 7142 | 7 || 150583 | 6 | | 32 | 62009 | 4 || 423423 | 26 || 93590 | 31 || 666143 | 29 | | 48 | 223585 | 25 || 235338 | 72 || 28398 | 23 || 529759 | 40 | | 64 | 11518 | 1 || 41043 | 44 || 28635 | 8 || 163760 | 14 | | 96 | 16999 | 2 || 272321 | 63 || 427018 | 53 || 81186 | 10 | | 128 | 258409 | 34 || 137880 | 170 || 310206 | 63 || 561129 | 107 | | 196 | 317769 | 84 || 130109 | 85 || 66443 | 46 || 63363 | 46 | | 256 | 77296 | 27 || 68387 | 84 || 37436 | 62 || 62412 | 111 | | 512 | 57461 | 115 || 139949 | 169 || 104195 | 71 || 98073 | 121 | | 1024 | 213690 | 250 || 33953 | 168 || 24878 | 247 || 831 | 52 | | 2048 | 33468 | 215 || 10758 | 183 || 78044 | 261 || 4163 | 13 | | 8192 | 42665 | 456 || 31186 | 435 || 23467 | 274 || 46184 | 463 | | 16384 | 6590 | 877 || 30331 | 862 || 5191 | 528 || 1481 | 107 | | 65536 | 6861 | 1020 || 2597 | 371 || 1293 | 517 || 4511 | 438 | | 132772 | 5542 | 929 || 3656 | 868 || 6285 | 1166 || 4351 | 876 | | 524288 | 327 | 234 || 2439 | 2870 || 3967 | 2575 || 31 | 445 | | 1048576 | 537 | 1740 || 6 | 5535 || 2699 | 4414 || 1708 | 4057 | | 2097152 | 1344 | 4125 || 935 | 3611 || 361 | 2343 || 1518 | 3862 | | 4194304 | 330 | 3793 || 10 |15137 || 746 | 8931 || 776 | 3921 | | 8388608 | 885 |18768 || 503 | 5366 || 167 | 4477 || | | | 16777216 | 320 | 8347 || | || 134 | 17180 || | | +------------+-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+  (I haven't recorded rho tail length, which is why times appear erratic.) Is there a well-known reason why cycle lengths tend to decrease as m increases? If all C[i] values are the same, the iterator F(m) degenerates to repeated application of the 'standard' iterator, and will (I think) lead to shorter cycle length iff GCD(m+1,cycleLength) != 1, but this is a different effect. Thanks in advance for any feedback on this, Mick.
2014-12-11, 14:43   #2
R.D. Silverman

"Bob Silverman"
Nov 2003
North of Boston

165108 Posts

Quote:
 Originally Posted by mickfrancis I wonder if anyone could answer a question I have regarding iterator functions used in Pollard's Rho algorithm: As I understand it, the aim is to use a function that is as psuedo-random as possible; this is assumed in the complexity analyses that have been carried out. It should also be computationally cheap for obvious reasons. However, there are functions that we could consider that have shorter cycle (and tail) lengths than a perfectly (psuedo-)random one. An absurd example is the function that generates numbers spaced at a multiple of a factor (p) of N: Code:  p = getSmallestFactor(N); function f(x) = (x + p) % N; This would loop (mod p) immediately and Pollard's Rho would find the factor. This would, of course completely defeat the object, as we are factorizing in order to find a factor. My question is this: Are there 'better than psuedo-random' iterators that could be used, whose cycle-shortening benefits out-weigh their additional cost? I have been playing with a class of function (that doesn't know about factors of N) that does produce shorter cycles. Unfortunately, the cost outweighs the benefits: Suppose we have an array of distinct non-zero values C[0], C[1], ... and corresponding functions f(0), f(1), ... where f(i) is the 'standard' iterator using C[i] as the constant: Code:  function f(i)(x) = (x^2 + C[i]) % N; If o represents function composition and we define function F(m) by: Code:  function F(m) = (m == 0) ? f(0) : f(m) o F(m-1); then Code:  F(0) is the 'standard' iterator using C[0] as constant: x -> (x^2 + C[0]) % N F(1) is the function that iterates twice (with different C values): x -> ((x^2 + C[0])^2 + C[1]) % N = (x^4 + 2C[0]x^2 + C[0]^2 + C[1]) % N F(m)(x) generates a monic polynomial in x of degree 2^(m+1), with coefficients of x^i being non-zero iff i is even. In tests, with N = 18868629790280349989518009992987136785227164415079137 (C53) with factor 7777774777777 (P13), I get the cycle lengths and times shown below. This is a Java implementation, so not that fast. The right-most set of values used x^1024 rather than x^2. Code: +------------+---------------------++---------------------++---------------------------++---------------------------+ | | C[i] = i + 1 || C[i] = RANDOM() || C[i] = 16384 + 16777216 i || C[i] = 16384 + 16777216 i | | | || || || exp=1024 | | +-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+ | m | cycle len | time (s)|| cycle len | time (s)|| cycle len | time (s) || cycle len | time (s) | +------------+-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+ | 0 | 2537770 | 4 || 1742273 | 9 || 9408 | 5 || 6073387 | 10 | | 1 | 1149581 | 4 || 559885 | 25 || 784504 | 5 || 2370108 | 8 | | 2 | 85333 | 10 || 168663 | 1 || 1217979 | 7 || 1852565 | 7 | | 3 | 1254046 | 7 || 2800001 | 26 || 573250 | 4 || 411681 | 2 | | 4 | 437527 | 10 || 135296 | 7 || 203384 | 10 || 1012270 | 11 | | 5 | 1218369 | 10 || 682778 | 9 || 3235934 | 28 || 226494 | 5 | | 6 | 1026505 | 7 || 437593 | 21 || 365988 | 8 || 195009 | 12 | | 7 | 750226 | 8 || 451631 | 13 || 229190 | 2 || 503277 | 4 | | 8 | 418334 | 5 || 294776 | 26 || 1294949 | 18 || 1409182 | 17 | | 9 | 349321 | 35 || 375245 | 7 || 52603 | 9 || 861531 | 10 | | 10 | 567977 | 9 || 1936414 | 37 || 26064 | 10 || 391137 | 5 | | 11 | 156450 | 3 || 1551018 | 37 || 1326637 | 24 || 1158178 | 20 | | 12 | 24626 | 10 || 187018 | 5 || 165718 | 13 || 297729 | 5 | | 13 | 121306 | 12 || 1002300 | 25 || 683933 | 14 || 251780 | 13 | | 14 | 442266 | 7 || 181412 | 20 || 85678 | 14 || 521311 | 8 | | 15 | 156963 | 13 || 183218 | 12 || 946 | 29 || 93879 | 3 | | 16 | 249666 | 16 || 662852 | 25 || 692360 | 17 || 1103360 | 29 | | 20 | 126940 | 5 || 43376 | 7 || 1379429 | 43 || 114094 | 9 | | 24 | 314699 | 11 || 1007216 | 44 || 282941 | 11 || 278688 | 10 | | 28 | 537623 | 24 || 347046 | 21 || 7142 | 7 || 150583 | 6 | | 32 | 62009 | 4 || 423423 | 26 || 93590 | 31 || 666143 | 29 | | 48 | 223585 | 25 || 235338 | 72 || 28398 | 23 || 529759 | 40 | | 64 | 11518 | 1 || 41043 | 44 || 28635 | 8 || 163760 | 14 | | 96 | 16999 | 2 || 272321 | 63 || 427018 | 53 || 81186 | 10 | | 128 | 258409 | 34 || 137880 | 170 || 310206 | 63 || 561129 | 107 | | 196 | 317769 | 84 || 130109 | 85 || 66443 | 46 || 63363 | 46 | | 256 | 77296 | 27 || 68387 | 84 || 37436 | 62 || 62412 | 111 | | 512 | 57461 | 115 || 139949 | 169 || 104195 | 71 || 98073 | 121 | | 1024 | 213690 | 250 || 33953 | 168 || 24878 | 247 || 831 | 52 | | 2048 | 33468 | 215 || 10758 | 183 || 78044 | 261 || 4163 | 13 | | 8192 | 42665 | 456 || 31186 | 435 || 23467 | 274 || 46184 | 463 | | 16384 | 6590 | 877 || 30331 | 862 || 5191 | 528 || 1481 | 107 | | 65536 | 6861 | 1020 || 2597 | 371 || 1293 | 517 || 4511 | 438 | | 132772 | 5542 | 929 || 3656 | 868 || 6285 | 1166 || 4351 | 876 | | 524288 | 327 | 234 || 2439 | 2870 || 3967 | 2575 || 31 | 445 | | 1048576 | 537 | 1740 || 6 | 5535 || 2699 | 4414 || 1708 | 4057 | | 2097152 | 1344 | 4125 || 935 | 3611 || 361 | 2343 || 1518 | 3862 | | 4194304 | 330 | 3793 || 10 |15137 || 746 | 8931 || 776 | 3921 | | 8388608 | 885 |18768 || 503 | 5366 || 167 | 4477 || | | | 16777216 | 320 | 8347 || | || 134 | 17180 || | | +------------+-----------+---------++-----------+---------++-----------+---------------++-----------+---------------+  (I haven't recorded rho tail length, which is why times appear erratic.) Is there a well-known reason why cycle lengths tend to decrease as m increases? If all C[i] values are the same, the iterator F(m) degenerates to repeated application of the 'standard' iterator, and will (I think) lead to shorter cycle length iff GCD(m+1,cycleLength) != 1, but this is a different effect. Thanks in advance for any feedback on this, Mick.
(1) You will not do better than cycles of length sqrt( pi/2 * p) ON AVERAGE with any prng. This is the expected
(asymptotic) cycle length based on the birthday "paradox".

(2) Even if you were to succeed in your goal, the effort would be pointless, IMO because you would be
finding a small improvement in an OBSOLETE algorithm. For numbers that are within reach of the method,
ECM and other methods will be faster.

2014-12-11, 14:57   #3
bsquared

"Ben"
Feb 2007

372310 Posts

Quote:
 Originally Posted by R.D. Silverman (2) Even if you were to succeed in your goal, the effort would be pointless, IMO because you would be finding a small improvement in an OBSOLETE algorithm....
But IMO the effort is not pointless with respect to furthering his interest in and understanding of number theory, programming, and factorization methods.

2014-12-11, 15:14   #4
mickfrancis

Apr 2014
Marlow, UK

23·7 Posts

Quote:
 Originally Posted by R.D. Silverman (1) You will not do better than cycles of length sqrt( pi/2 * p) ON AVERAGE with any prng. This is the expected (asymptotic) cycle length based on the birthday "paradox". (2) Even if you were to succeed in your goal, the effort would be pointless, IMO because you would be finding a small improvement in an OBSOLETE algorithm. For numbers that are within reach of the method, ECM and other methods will be faster.

I take your first point about psuedo-random number generators, of course - that is why I was wondering about "better than psuedo random" generators.

I also understand that Pollard's Rho is of relatively little use these days, and take your point that small improvements would be of no interest. However, if a generator could be found that reduced the cycle length by orders of magnitude (my F(4194304) does this, for example - expected cycle length of 3.5M goes down pretty consistently by a factor of more than 1000), would this not be of academic interest at least? What is it about these functions that shortens the cycle length?

I do appreciate that as a non-mathematician, I am probably barking up a whole forest of wrong trees, and really appreciate that you took the time to reply.

Kind regards,

Mick.

2014-12-11, 15:57   #5
mickfrancis

Apr 2014
Marlow, UK

3816 Posts

Quote:
 Originally Posted by bsquared But IMO the effort is not pointless with respect to furthering his interest in and understanding of number theory, programming, and factorization methods.
Absolutely right - I'm having a great time, regardless of the utility of my efforts!

 2014-12-11, 17:36 #6 jasonp Tribal Bullet     Oct 2004 2·52·71 Posts The object is not to find a sequence of iterates that are 'more' random than some other sequence; Rho chooses a sequence whose members we hope are each a product of many primes, leading to a collision mod N between two such members. If you know something about the structure of the factors of N then rho can be modified to incorporate that structure; the best example is the factorization of the 8th Fermat number.
2014-12-11, 19:44   #7
xilman
Bamboozled!

"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across

22×3×5×193 Posts

Quote:
 Originally Posted by jasonp The object is not to find a sequence of iterates that are 'more' random than some other sequence; Rho chooses a sequence whose members we hope are each a product of many primes, leading to a collision mod N between two such members. If you know something about the structure of the factors of N then rho can be modified to incorporate that structure; the best example is the factorization of the 8th Fermat number.
I'm well aware that this viewpoint is heretical, but the steps of each of ECM, P-1 and P+1 produces a sequence of a product of (many) primes leading to a collision mod N.

Another, so far unrealised, algorithm would compute n! mod N in polynomial time (or, at least, in sub-exponential time) which would immediately give a fast factoring algorithm . I, and many others, have been thinking along these lines for many years. Needless to say, we've had no success.

Last fiddled with by xilman on 2014-12-11 at 19:45 Reason: Clarification

2014-12-11, 22:32   #8
mickfrancis

Apr 2014
Marlow, UK

23×7 Posts

Quote:
 Originally Posted by jasonp The object is not to find a sequence of iterates that are 'more' random than some other sequence; Rho chooses a sequence whose members we hope are each a product of many primes, leading to a collision mod N between two such members. If you know something about the structure of the factors of N then rho can be modified to incorporate that structure; the best example is the factorization of the 8th Fermat number.
Thanks Jason. I've just found Pollard and Brent's paper "Factorization of the Eighth Fermat Number", which looks very interesting - I'll give it a read and see if it helps me in my understanding of this.

 Similar Threads Thread Thread Starter Forum Replies Last Post wildrabbitt Miscellaneous Math 11 2015-03-06 08:17 markg Software 17 2008-05-24 18:03 ColdFury Math 4 2007-05-29 21:30 nitai1999 Software 7 2004-08-26 18:12

All times are UTC. The time now is 21:14.

Thu Dec 1 21:14:45 UTC 2022 up 105 days, 18:43, 0 users, load averages: 0.67, 0.88, 0.93