mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)

 fivemack 2009-08-23 21:29

Counting duplicates

I'm wondering what can sensibly be done to predict the number of duplicate relations that a large sieve run will produce.

My initial attempt ignores the fact that lattices are involved, and make the false assumption that sieving with some special-Q gives all the relations from some Platonic fixed set of relations which have that Q dividing the polynomial value.

In which case, to estimate the number of unique relations that you'd get sieving from Q0 to Q1, sieve short intervals at q=Q0, q=(Q0+Q1)/2, q=Q1-epsilon, measuring sieving time, and then either

* count only relations containing no factor between q+1 and Q1.
* count relations with one factor between Q0 and Q1 as having weight 1, relations with two factors as weight 1/2, with three as weight 1/3 ...

This gets you a unique-yield-per-q-range figure at three points, and you integrate it by the trapezium rule; also integrate the raw-yield-per-q and the sieve-time-per-q that you have measured, because that's easy, and the sieve time gives some idea how long the full run will take.

Measuring the timings means this does account for large Q taking longer than small Q; on the other hand, a large Q does sample a significantly larger space: I sieved ranges R1=25M..25M+10k and R2=75M..75M+10k, there was one relation Z in R1 with a factor in R2, and five relations in R2 with a factor in R1 (obviously including the relation Z). So this technique gives an underestimate of the number of unique relations, which isn't terribly helpful.

I have enormous amounts of data from many large sieving jobs to play with; what should I do next?

 jasonp 2009-08-24 02:26

Maybe you should contact Arjen Lenstra or Joppe Bos; IIRC they are studying the prediction problem. Willemein Ekkelkamp published a paper at ANTS 8 that developed accurate estimates for line sieving to find enough relations, assuming no duplicates, but I don't know if she's still working in this area.

 All times are UTC. The time now is 23:24.