mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)

 mickfrancis 2016-03-30 13:47

I've been experimenting with a non-sieving version of the Quadratic Sieve algorithm. To date, the performance is nowhere near as good as the basic Quadratic Sieve, let alone SIQS; gathering relations to factor a C80 takes around 14 hours with single-threaded Haskell over GMP on a standard desktop machine, using the Single Large Prime variation.

However, I am not a proper mathematician, so I am almost certainly missing some obvious algorithmic improvements (I guess I would need a couple of orders of magnitude to make it interesting).

In brief, rather than sieving consecutive x values for smooth f(x) values, 'candidate' x values are generated that have a better than average chance of producing smooth f(x) values. Smoothness testing is then carried out using a remainder tree before factoring the values that pass the test, to produce the relations.
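The remainder-tree smoothness test mentioned above is in the spirit of Bernstein's batch smoothness detection: build a product tree over the candidates, push P (the product of the factor base) down a remainder tree, then square each residue a few times. A minimal Python sketch of that idea, not the poster's Haskell (function names are my own):

```python
from math import prod

def batch_smooth(candidates, factor_base):
    """Batch smoothness test (Bernstein-style sketch).

    v is smooth over `factor_base` iff v divides P^(2^e),
    where P = product of the factor base and 2^e >= log2(v)."""
    P = prod(factor_base)
    # Product tree: leaves are the candidates, root is their product.
    tree = [list(candidates)]
    while len(tree[-1]) > 1:
        level = tree[-1]
        tree.append([level[i] * level[i + 1] if i + 1 < len(level) else level[i]
                     for i in range(0, len(level), 2)])
    # Remainder tree: reduce P modulo each node, top down,
    # yielding P mod v for every candidate v with one big reduction shared.
    rems = [P % tree[-1][0]]
    for level in reversed(tree[:-1]):
        rems = [rems[i // 2] % v for i, v in enumerate(level)]
    smooth = []
    for v, r in zip(candidates, rems):
        # Square r until the implicit exponent 2^e exceeds log2(v);
        # then r == 0 iff v divides P^(2^e), i.e. v is smooth.
        e = 1
        while (1 << e) < v.bit_length():
            e += 1
        for _ in range(e):
            r = r * r % v
        if r == 0:
            smooth.append(v)
    return smooth
```

The point of the trees is that one expensive reduction of P at the root is shared by all candidates, so the per-candidate cost is small.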

Would anyone be interested in discussing this with a view to algorithmic refinements?

 jasonp 2016-03-30 14:40

The expected performance of this is worse than that of CFRAC with the same optimizations, since at least in that case the numbers being batch factored are all of size near sqrt(N), whereas with QS the candidates grow in size.

'Early abort' is helpful in both these algorithms, since we expect a lot of small factors in relations but only a few large factors, so it would make sense to do a little sieving and reject sieve values unless only a little of the sieve value is unfactored. In that sense batch cofactorization can be seen as a memory-efficient method for avoiding having to sieve most of the traditional factor base. Making a small sieve run fast is quite easy, using a large sieve much less so.
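A toy illustration of the early-abort idea: strip the small primes from a sieve value and discard it unless the unfactored cofactor is small enough to be worth passing to the expensive batch stage. Here trial division stands in for the small sieve, and the bit-length threshold is an illustrative tuning knob, not a value from any real implementation:

```python
def early_abort(value, small_primes, threshold_bits):
    """Divide out the small primes; return the remaining cofactor
    if it is small enough to keep, else None (abort this candidate).
    `threshold_bits` is a hypothetical tunable cutoff."""
    cofactor = value
    for p in small_primes:
        while cofactor % p == 0:
            cofactor //= p
    return cofactor if cofactor.bit_length() <= threshold_bits else None
```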

Bernstein wrote around 2004 that it was a myth that sieving was always optimal; needless to say it was a bit controversial :)

 mickfrancis 2016-03-30 15:24

[QUOTE=jasonp;430385]The expected performance of this is worse than that of CFRAC with the same optimizations, since at least in that case the numbers being batch factored are all of size near sqrt(N), whereas with QS the candidates grow in size.

'Early abort' is helpful in both these algorithms, since we expect a lot of small factors in relations but only a few large factors, so it would make sense to do a little sieving and reject sieve values unless only a little of the sieve value is unfactored. In that sense batch cofactorization can be seen as a memory-efficient method for avoiding having to sieve most of the traditional factor base. Making a small sieve run fast is quite easy, using a large sieve much less so.

Bernstein wrote around 2004 that it was a myth that sieving was always optimal; needless to say it was a bit controversial :)[/QUOTE]

Thanks Jason. Interestingly, because of the way the candidate x values are chosen, subsequent factoring is on values less than sqrt(N), sometimes by a factor of more than 10. Also, the values being factored remain the same order of magnitude throughout.

This is because the candidate x values are chosen such that f(x) has known large(ish) factors from the factor base; sieving would eventually mark these x values, whereas I select them explicitly. The f(x) values can therefore be divided by these known factors before smoothness testing and factorisation.

Suppose I want f(x) to have factors p1, p2 and p3 from the factor base; f generally has 2 roots modulo each of these, giving 8 x values (via the CRT) between 0 and p1*p2*p3 - 1 for which f(x) is divisible by p1, p2 and p3. I use the least of these x values and also the greatest minus p1*p2*p3 (which is negative). This gives the smallest (?) abs(x) values, and hence f(x) values, such that f(x) has all 3 factors.
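The CRT construction described above can be sketched as follows; this is a minimal Python illustration under my own reading of the post, and the names `crt_pair` and `candidate_x_values` are made up for it:

```python
from itertools import product as cartesian

def crt_pair(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) for coprime m1, m2."""
    m = m1 * m2
    # pow(m1, -1, m2) is the modular inverse of m1 mod m2 (Python 3.8+).
    x = (r1 + m1 * ((r2 - r1) * pow(m1, -1, m2) % m2)) % m
    return x, m

def candidate_x_values(roots_mod_p):
    """roots_mod_p: list of (p, [roots of f mod p]) for the chosen primes.
    Enumerates every root combination (2^k values for two roots per
    prime) and returns the least x and the greatest minus p1*...*pk,
    i.e. the two x of smallest absolute value with all primes dividing f(x)."""
    xs = []
    for combo in cartesian(*[[(r, p) for r in roots] for p, roots in roots_mod_p]):
        x, m = combo[0]
        for r, p in combo[1:]:
            x, m = crt_pair(x, m, r, p)
        xs.append(x)
    M = 1
    for p, _ in roots_mod_p:
        M *= p
    return min(xs), max(xs) - M  # second value is negative
```

For example, with two primes and two roots each this enumerates the 4 combined residues mod p1*p2 and keeps the pair of smallest |x|.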

I do use early-exit; this typically happens when the number of relations reaches the high 80s % of the way to the factor base size (plus margin).

 mickfrancis 2016-03-30 20:57

[QUOTE=jasonp;430385]The expected performance of this is worse than that of CFRAC with the same optimizations[/QUOTE]

Can CFRAC really factor a C80 (in less than 14 hours)?

 jasonp 2016-03-30 23:16

It's unlikely, but your original description didn't mention that sieving was used, so I also doubt that non-sieving QS would be able to factor a C80 :)

 mickfrancis 2016-03-31 06:21

[QUOTE=jasonp;430406]It's unlikely, but your original description didn't mention that sieving was used, so I also doubt that non-sieving QS would be able to factor a C80 :)[/QUOTE]
Ah, well, this is my point you see :) I have a non-sieving implementation, in Haskell, that [B]does[/B] generate enough relations to factor a C80 in 14 hours. As far as I can tell, this is currently the fastest non-sieving, congruence-based method for numbers of this size (a claim I make with some trepidation...). Of the non-congruence-based algorithms, ECM is faster, I think, even when both factors are of similar magnitude, but then rather a lot of R&D has gone into that!

What needs refinement, I think, is the "candidate generation" stage (see earlier post). For example, at the moment, the "known large factors" of f(x) are from the factor base, but it is also possible to have a factor p[SUP]2[/SUP], where p is not in the factor base (using Hensel lifting). So far this doesn't seem to be an improvement, but it gives another dimension for exploration.
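The Hensel-lifting step mentioned here, taking a root of f mod p to a root mod p[SUP]2[/SUP], might look like the following. This is the standard textbook lift for a simple root, not the poster's code:

```python
def hensel_lift(f, df, r, p):
    """Lift a simple root r of f mod p to a root mod p^2
    (Hensel's lemma; requires f'(r) not divisible by p).
    f and df are Python callables for the polynomial and its derivative."""
    assert f(r) % p == 0 and df(r) % p != 0
    # Newton step: r' = r - f(r) * f'(r)^(-1)  (mod p^2)
    return (r - f(r) * pow(df(r), -1, p * p)) % (p * p)
```

E.g. for f(x) = x^2 - 17 the root 2 mod 13 lifts to a root mod 169, giving a candidate x for which p^2 = 169 divides f(x) even though 13 need not be in the factor base.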

 All times are UTC.