mersenneforum.org > Factoring Projects > Factoring
Old 2005-04-06, 14:43   #1
akruppa
 
 
"Nancy"
Aug 2002
Alexandria

2,467 Posts
Knuth-Schroeppel analysis

Hi,

I've been wondering how the technique of estimating root properties for NFS, or Knuth-Schroeppel factors for MPQS, actually works. According to various sources (e.g., TAOCP 4.5.4), they compare the average log contribution of small primes to the factorization of the numbers that are tested for smoothness with the average log contribution in uniformly, randomly chosen integers.

Let f(p,S) be the average exponent of the prime p in values chosen uniformly at random from the set S. For S the natural numbers, we have f(p,\N) = 1/(p-1).

If S is the set of values we test for smoothness, and s \in S chosen at random, then the log of the residual after dividing out primes p<k can be expected to be

log(s)-\sum_{p \in \Primes, p<k} f(p,S)*log(p)

Comparing the size of the residual to that for S=\N, we find that the residual should be smaller by
\alpha = \sum_{p \in \Primes, p<k} (f(p,S) - 1/(p-1))*log(p)

Now, apparently, they argue that because of this the probability that an s \in S is smooth is about the same as the probability that a random integer of size s/exp(\alpha) is smooth (to the same bound k).
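As a quick sanity check of my own (not from the sources above), f(p,S) and \alpha can be estimated empirically for the quadratic-sieve case S = { x^2 - n : x just above sqrt(n) }; the n below is an arbitrary example:

```python
# My own sketch, not from TAOCP: estimate f(p, S) and alpha empirically
# for S = { x^2 - n : x just above sqrt(n) }.  For odd p with (n/p) = 1
# the congruence x^2 = n (mod p^k) has two roots, so f(p, S) should be
# about 2/(p-1); for non-residues it is 0; for random integers 1/(p-1).

from math import isqrt, log

def avg_exponent(p, values):
    """Average exponent of p in the factorizations of the given values."""
    total = 0
    for v in values:
        while v % p == 0:
            total += 1
            v //= p
    return total / len(values)

def alpha(n, primes, samples=10000):
    """Estimated log-size advantage of S over random integers."""
    x0 = isqrt(n) + 1
    vals = [x * x - n for x in range(x0, x0 + samples)]
    return sum((avg_exponent(p, vals) - 1 / (p - 1)) * log(p)
               for p in primes)

n = 1000003                     # arbitrary example, a prime
print(alpha(n, [3, 5, 7, 11, 13]))
```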

Is this about right? Has anything been proven about this estimate? Especially the last step seems like a leap of faith to me... numerical evidence supports the idea, as shown in several papers (I remember something in Lenstra/Bernstein's NFS paper), but I was wondering if there's more theoretical foundation to it. Any comments or pointers to sources would be greatly appreciated.

Thanks,
Alex
akruppa is offline   Reply With Quote
Old 2005-04-06, 15:06   #2
R.D. Silverman
 
 
Nov 2003

1D24₁₆ Posts

Quote:
Originally Posted by akruppa
Is this about right? Has anything been proven about this estimate?

Yes, it is about right. Yes, it is an assumption; but "leap of faith" is too strong. There is substantial supporting numerical evidence. I characterise something as a "leap of faith" when there is no supporting evidence. There are also some probability arguments to suggest that the assumption is correct (or at most off by a constant factor). These arguments [for at least quadratic extension fields] are discussed in the "Cohen-Lenstra" heuristics.
R.D. Silverman is offline   Reply With Quote
Old 2005-04-12, 15:21   #3
akruppa
 
 
"Nancy"
Aug 2002
Alexandria

100110100011₂ Posts

I looked at Cohen and Lenstra's paper and some follow-up papers (e.g., Williams and te Riele), but I'm afraid they require much more algebra than I know. I'm clueless about character groups, Dedekind domains and whatnot.

I think a vague argument that the analysis is about right could be made by supposing that the Dickman rho function is "linear enough" under small variations of its argument that a weighted sum can be pulled inside the argument. I'll look at this idea more closely sometime later.

Thanks for your reply,
Alex

Last fiddled with by akruppa on 2005-04-12 at 15:21
akruppa is offline   Reply With Quote
Old 2009-12-19, 21:14   #4
ThiloHarich
 
 
Nov 2005

3²·11 Posts

I think the question akruppa raised is valid.

I analyzed the smooth values and found that the probability of a prime p appearing in a smooth value is much higher than for a random number (which should be clear).
For SIQS (without large primes) I estimated a value of

2.2 / (p - 1)^0.72

for the expected log-contribution of a factor-base prime p (other than 2) in a smooth decomposition.
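For comparison, the two weightings can be tabulated side by side (a quick sketch of my own; 2.2/(p-1)^0.72 is the empirical fit above, 2*log(p)/(p-1) the classical weight for a prime with two sieve roots):

```python
from math import log

# Empirical fit from above vs. the classical weight 2*log(p)/(p-1).
for p in [3, 5, 7, 11, 13, 17, 19, 23]:
    fit = 2.2 / (p - 1) ** 0.72
    classical = 2 * log(p) / (p - 1)
    print(f"{p:3d}  fit={fit:.3f}  classical={classical:.3f}")
```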

So I used this approximation for determining the multiplier. Whenever the choice differs, it always gives a better runtime than the approximation log(p)/p or log(p)/(p-1).
Here are some examples:

N = 25249581771989830594180551024377089571(38)

my multiplier: 59 time : 0.24417041800000003
knuth-schroeppel: 3 time : 0.520334009

N = 24339015700034049398642312663507799431(38)

my multiplier: 11 time: 0.283217839
knuth-schroeppel: 7 time: 0.45216473


N = 15028821219978351294914980921150425953(38)
my multiplier: 2 time: 0.241755027
knuth-schroeppel: 1 time: 0.6213507580000001

From my point of view the p-1 approximation is better than the p one, for the following reason:

A random number is divisible by a prime p with probability 1/p, and by p^2 with probability 1/p^2, and so on. Each power of p that divides contributes log(p) to the length, so the expected length of the contribution of p is
log(p)/p + log(p)/p^2 + log(p)/p^3 + ... =
log(p)/p * (1 + 1/p + 1/p^2 + ...) =
log(p)/p * (1/(1-1/p)) = log(p)/(p-1)
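This can be checked numerically (a quick sketch of my own): the geometric series converges to log(p)/(p-1), and counting exponents exactly over a block 1..N gives an average exponent close to 1/(p-1):

```python
from math import log

p = 7
# Partial sums of log(p)/p + log(p)/p^2 + ... converge to log(p)/(p-1).
series = sum(log(p) / p**k for k in range(1, 50))

# Exact average exponent of p over 1..N: each k contributes floor(N/p^k)
# divisibility hits, so the average is their sum divided by N.
N = 7**7
avg = sum(N // p**k for k in range(1, 50)) / N

print(series, log(p) / (p - 1), avg, 1 / (p - 1))
```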

I will try running the sieve with different multipliers, say k_1, k_2, k_3, with polynomials including the factors k_2*k_3, k_1*k_3, k_1*k_2. The congruences are then all mod k_1*k_2*k_3. Will this work?
ThiloHarich is offline   Reply With Quote
Old 2009-12-19, 21:54   #5
jasonp
Tribal Bullet
 
 
Oct 2004

3,529 Posts

Do you mean 2.2*log(p)?

Msieve picks the same multipliers your code does, and I thought it uses the Knuth-Schroeppel algorithm. A factor of p contributes 2*log(p)/(p-1) if it is not part of the multiplier, and 1*log(p)/(p-1) if it is.
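A sketch of the scoring this describes (my own simplified version, odd primes only; real implementations also treat p = 2 specially depending on kn mod 8, and penalize the multiplier's size):

```python
# My own simplified sketch of the weighting described above: a prime p
# contributes 2*log(p)/(p-1) when kn is a quadratic residue mod p (two
# sieve roots), and log(p)/(p-1) when p divides the multiplier k (one
# root).  p = 2, whose treatment depends on kn mod 8, is omitted here.

from math import log

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def ks_score(n, k, odd_primes):
    score = -0.5 * log(k)       # multiplier inflates the sieve values by sqrt(k)
    for p in odd_primes:
        if k % p == 0:          # p divides the multiplier: one sieve root
            score += log(p) / (p - 1)
        elif legendre(k * n, p) == 1:   # two sieve roots
            score += 2 * log(p) / (p - 1)
    return score

n = 24339015700034049398642312663507799431   # example n from the post above
odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
best = max([1, 3, 5, 7, 11, 13], key=lambda k: ks_score(n, k, odd_primes))
print(best)
```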
jasonp is offline   Reply With Quote
Old 2009-12-19, 23:11   #6
ThiloHarich
 
 
Nov 2005

3²·11 Posts

No, I really mean something O(p^c) where c is around -0.72, instead of 2*log(p)/(p-1).
You can see it as increasing the Knuth-Schroeppel weights 2*log(p)/(p-1) by a factor (p-1)^0.37 / log(p).
I always used 2*log(p)/(p-1); I did not see why a factor of 1 should work better if p divides a.
This is a simple approximation of the values I observed, and it worked for my sieve.
ThiloHarich is offline   Reply With Quote
Old 2009-12-20, 00:38   #7
jasonp
Tribal Bullet
 
 
Oct 2004

DC9₁₆ Posts

Using 1 instead of 2 when p divides the multiplier is a trick to account for the factor base containing only one sieve root for p in that case, instead of the usual two roots. I think Contini's thesis first documents the idea.
jasonp is offline   Reply With Quote
Old 2009-12-20, 10:05   #8
ThiloHarich
 
 
Nov 2005

1100011₂ Posts

Ah, this is true. I added this correction to the approximation, but since it only applies to a few small primes it does not influence the overall result much.
Since the larger primes are weighted much more heavily in my formula, we should not stop the sum at some bound, as one can with the Knuth-Schroeppel formula.
ThiloHarich is offline   Reply With Quote
Old 2010-01-08, 17:01   #9
ThiloHarich
 
 
Nov 2005

3²·11 Posts

I found the reason why I was getting larger and better multipliers.

For the multiplier k I used a penalty term -log(k), but since q(x) = (sqrt(kn) + x)^2 - kn grows like O(sqrt(k*n)*x), a penalty of -log(k)/2 is more suitable.

So even though my approximation of the resulting hit rate (or the resulting length) of the factors in the factor base is better than the other approximations, the resulting multipliers are mostly the same as in the other implementations, such as alpertron's :(
ThiloHarich is offline   Reply With Quote