mersenneforum.org  

Old 2019-05-18, 08:55   #23
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

592210 Posts

I implemented mine based on https://www.rieselprime.de/ziki/Self...uadratic_sieve rather than Contini's thesis. That page might be slightly clearer.
Old 2019-05-18, 10:45   #24
tabidots
 
May 2019

23×3 Posts

Quote:
Originally Posted by henryzz View Post
I implemented mine based on https://www.rieselprime.de/ziki/Self...uadratic_sieve rather than Contini's thesis. That page might be slightly clearer.
Thanks—I wish I had found that before Contini's thesis! (By the way, that page preserves Contini's typo—does it affect your program at all?)

Well, I am still tinkering with my SIQS, because I feel I am really close. Definitely feeling the sunk-cost effect at this point, as I have been slaving away at this for a couple of weeks now.

I have figured out the whole sign issue. It affects the calculation of the subsequent soln values, not the subsequent b values.

So, this is correct, as per Contini:

b = b + 2 \times (-1)^{\lceil\frac{i}{2^{\nu}}\rceil} \times B_{\nu}

But the solns involve subtracting from, not adding to, the previous soln:

soln_p := soln_p - (-1)^{\lceil\frac{i}{2^{\nu}}\rceil} \times Bainv2_{\nu,p} \bmod p

After I incorporated this, I was able to successfully verify:
  • b^2 \equiv N \pmod{A} \;\forall b
  • ax^2 + 2bx + c \equiv 0 \pmod{p},\; p \in FB,\; x \in \{soln1_p, soln2_p\}
  • u^2 \equiv v \pmod{n}, that is, (ax + b)^2 \equiv a(ax^2 + 2bx + c) \pmod{n} for each sieved x
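To make the corrected update concrete, here is a toy Python sketch of Contini-style initialization and the b-update with the sign handled as above. The helper names, the sample N, and the q values below are illustrative choices of mine, not from this thread:

```python
def sqrt_mod(n, p):
    """Brute-force square root of n mod a small odd prime p (demo only)."""
    n %= p
    for t in range(p):
        if t * t % p == n:
            return t
    raise ValueError(f"{n} is not a quadratic residue mod {p}")

def init_first_b(N, qs):
    """Contini-style initialization: a = prod(qs), the B_l values, and b_1."""
    a = 1
    for q in qs:
        a *= q
    Bs = []
    for q in qs:
        aq = a // q
        gamma = sqrt_mod(N, q) * pow(aq, -1, q) % q
        if gamma > q // 2:              # canonical choice: gamma <= q/2
            gamma = q - gamma
        Bs.append(aq * gamma)
    return a, Bs, sum(Bs) % a

def next_b(i, b, Bs, a):
    """b_{i+1} = b_i + 2 * (-1)^ceil(i / 2^nu) * B_nu, where 2^nu || 2i."""
    nu = (i & -i).bit_length()          # v2(2i) = v2(i) + 1
    ceil_term = -(-i // (1 << nu))      # ceil(i / 2^nu)
    sign = -1 if ceil_term % 2 else 1
    return (b + 2 * sign * Bs[nu - 1]) % a   # Bs is 0-indexed
```

Every b produced this way satisfies b^2 \equiv N \pmod{a}, and the 2^{s-1} values are pairwise distinct mod a; the soln arrays are then stepped by subtracting sign \times Bainv2_{\nu,p} mod p, as in the formula above.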

It picks up relations quite quickly compared to before, and they pass the tests above. But I am still not finding factors, which is really puzzling—after the polynomials, I'm using the exact same code as for MPQS (being sure to re-factor in the factored-out a values at the end).

Last fiddled with by tabidots on 2019-05-18 at 10:45
Old 2019-05-18, 11:53   #25
tabidots
 
May 2019

2410 Posts

Here is something really bizarre. I sprinkled some assertions in my MPQS functions to ensure that
  1. For every sieved x, u^2 \equiv v \pmod{n}. (That is, (ax + b)^2 \equiv g_{a,b}(x) \pmod{n}.)
  2. For every x, y pair used to try finding a factor, x^2 \equiv y^2 \pmod{n}.

My MPQS fails the first assertion and passes the second, succeeding in finding a factor.
My SIQS passes the first assertion and fails the second, failing to find a factor.

A real head-scratcher!
Old 2019-05-18, 14:05   #26
tabidots
 
May 2019

23×3 Posts

I've discovered the (a?) reason my SIQS currently struggles to find a factor is that it produces an abundance of duplicate relations. I'll have to rework the A-generating algorithm. Too bad—the Monte Carlo code was so clean.
Old 2019-05-18, 17:29   #27
Till
 
 
"Tilman Neumann"
Jan 2016
Germany

7318 Posts

Quote:
Originally Posted by tabidots View Post
I found the typo that was being referred to in your superbly documented code. I had actually been referring to your code previously as well (to the extent that I can read Java), but for some reason I did not think to hunt for the sign-switch typo in it.
Congrats and thanks. Actually, I think the documentation could be better, but if you know Contini's paper you will find it helpful. I tried to document all the things that were hard for me to grasp.

Quote:
Originally Posted by tabidots View Post
So, here's something strange—in my code I have z, which is what you call -1^CeilTerm.

Previously, when I simply changed the line

Code:
b_new = b + 2 * z * Bs[nu - 1]
(paraphrasing from Clojure) to

Code:
b_new = b - 2 * z * Bs[nu - 1]
it didn't do anything, which is why I wasn't getting anywhere before.

After looking at your code, I tried changing z to be the negative of itself before setting b_new, and now it picks up smooths at what I would consider to be a more normal pace. Wow!

Strange, because it should be the same result, but whatever, that performance hurdle has been cleared. *shrug*
This might be caused by modulus calculations. In some programming languages (Java, for instance), the result of a modulus operation is not always positive.
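A minimal illustration in Python (Python's own % already follows the mathematical convention, so math.fmod is used here to mimic the Java/C behaviour):

```python
import math

# Python's % takes the sign of the modulus; Java's % takes the sign of the
# dividend, so (-7) % 3 is 2 in Python but -1 in Java.
print(-7 % 3)            # 2
print(math.fmod(-7, 3))  # -1.0, i.e. what Java's (-7) % 3 would give

def mod_norm(x, p):
    """Normalize to a representative in [0, p); safe in either convention."""
    return ((x % p) + p) % p
```

Wrapping every modular reduction in something like `mod_norm` is one way to make the sign issue disappear regardless of language.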

Quote:
Originally Posted by tabidots View Post
I tried to translate the K-S multiplier algorithm from your code when I was working on SQUFOF, but I couldn't seem to get it working properly. Not only that, but I found that square-free multipliers were often not the fastest; in some cases, they didn't even find a factor, and actually the best multipliers were perfect squares! So I abandoned K-S. Maybe it's different for SIQS?
Indeed, there are different K-S (Knuth-Schroeppel) variants for different algorithms. I never tried one on SquFoF, but I know that CFrac needs a different implementation than QS does.

Quote:
Originally Posted by tabidots View Post
I've discovered the (a?) reason my SIQS currently struggles to find a factor is that it produces an abundance of duplicate relations. I'll have to rework the A-generating algorithm. Too bad—the Monte Carlo code was so clean.
That is another typical pitfall. The a-parameter generator should produce a-params that do not have too many q_i in common; otherwise you will get many duplicate relations, and those lead to many failures in the linear algebra step.
In my software I do get some duplicate relations and more final factor tests than theoretically required, but explicitly checking for duplicate relations would cost more than it saves.

Finally: it took me 3 months to implement the first working version. You should not give up just short of the goal line ;-)
Old 2019-05-19, 02:37   #28
tabidots
 
May 2019

23×3 Posts

Quote:
Originally Posted by Till View Post
Finally: it took me 3 months to implement the first working version. You should not give up just short of the goal line ;-)
Thanks for the encouragement :)

Quote:
Originally Posted by Till View Post
That is another typical pitfall. The a-parameter generator should produce a-params that do not have too many q_i in common; otherwise you will get many duplicate relations, and those lead to many failures in the linear algebra step.
So, I'm making progress (if you define progress as different results, not necessarily better ones). I changed the following things:
  • Use the full polynomial (ax)^2 + 2abx + b^2 - N without factoring out a and factoring it back in later—for some reason, factoring it out breaks the whole thing, and a factor is never found (even though it works in MPQS)
  • Remove one of the used qs from the candidate pool after each a is computed. This was the simplest way (i.e., involving minimal changes) I could conceive of, to guarantee that the factors of each a are distinct by at least one prime.
  • Decrease the sieve size (I was using 65536 for 40-50 digits before; now I am playing around with 10000~20000)

For the 40-digit number I mentioned, it is sieving and picking up relations much more quickly (due to the reduced sieve size). It is also finding a factor much more quickly (on the first try, in fact), which would seem to indicate that this new method of generating a is an effective way of reducing duplicate relations.

For this number, it's still not as fast as MPQS but significantly faster than before.

However, for the 50-digit number, the generation of a slows to a crawl (basically a halt) once enough q-candidates have been removed. I started out with 250 q-candidates, and at some point after 150-200 polynomials, when 50-100 candidates remain, it takes far too long to find a suitable set of qs.

Would a more sophisticated method to generate a maintain a constant time cost no matter which polynomial it is (i.e., the first or the kth)?

By the way, I also tried generating a using something like the Carrier-Wagstaff method (to the extent that I could understand it):
  • Split the q-candidates s.t. odd indexes are in one group ("odds") and even indexes are in the other ("evens"),
  • Split the odds into pairs, sort them into a hash-map by the logs of their product
  • Take s-2 random candidates from the evens (the "q-base")
  • Find the odd pair whose product log is nearest to the difference between the log of the product of the q-base and the log of the ideal a

This completely backfired and could not even find a single a, ever. I think the q-base product was always too high.
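For what it's worth, my reading of those steps would look roughly like this in Python (illustrative only: the pairing, the log matching, and the helper names are guesses at the described method, not Carrier-Wagstaff's actual algorithm):

```python
import math
import random

def cw_style_a(candidates, s, target_a, rng=random):
    """Sketch: pick s-2 'even-index' primes at random, then the 'odd-index'
    pair whose log-product best fills the gap up to log(target_a)."""
    evens = candidates[0::2]                 # even-index candidates
    odds = candidates[1::2]                  # odd-index candidates
    pairs = [(odds[i], odds[i + 1]) for i in range(0, len(odds) - 1, 2)]
    base = rng.sample(evens, s - 2)          # the random "q-base"
    gap = math.log(target_a) - sum(math.log(q) for q in base)
    best = min(pairs, key=lambda pr: abs(math.log(pr[0] * pr[1]) - gap))
    return base + list(best)
```

Note that if the q-base product already overshoots the ideal a, the remaining gap is smaller than any pair's log-product and every pair lands far off, which would match the "always too high" failure described.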

Last fiddled with by tabidots on 2019-05-19 at 02:40
Old 2019-05-19, 17:23   #29
Till
 
 
"Tilman Neumann"
Jan 2016
Germany

11×43 Posts

Quote:
Originally Posted by tabidots View Post
Would a more sophisticated method to generate a maintain a constant time cost no matter which polynomial it is (i.e., the first or the kth)?
Yes (as long as the N to factor is not too small).

Quote:
Originally Posted by tabidots View Post
By the way, I also tried generating a using something like the Carrier-Wagstaff method (to the extent that I could understand it):
  • Split the q-candidates s.t. odd indexes are in one group ("odds") and even indexes are in the other ("evens"),
  • Split the odds into pairs, sort them into a hash-map by the logs of their product
  • Take s-2 random candidates from the evens (the "q-base")
  • Find the odd pair whose product log is nearest to the difference between the log of the product of the q-base and the log of the ideal a

This completely backfired and could not even find a single a, ever. I think the q-base product was always too high.

You might try to port my AParamGenerator01 class. It does not need a particular range of primes reserved for the a-parameter computations. The drawback is that you have to filter the prime base after each a-parameter computation, but that step is pretty quick.

Last fiddled with by Till on 2019-05-19 at 17:25 Reason: about small N
Old 2019-05-20, 07:57   #30
tabidots
 
May 2019

1816 Posts

Quote:
Originally Posted by Till View Post
You might try to port my AParamGenerator01 class. It does not need a particular range of primes reserved for the a-parameter computations. The drawback is that you have to filter the prime base after each a-parameter computation, but that step is pretty quick.
What is the big picture of what you are doing there? This is as far as I could understand (I'm ignoring the multiplier here):
  1. Compute the ideal \hat a = \frac{\sqrt{2n}}{M}
  2. Compute the ideal \hat q = \sqrt[s]{\hat a}
  3. Pick all but the last q at random from the candidates based on a normal(?) probability distribution whose center is at \hat q, with an adjustment based on the size of n
  4. Do a binary search (-ish?) for the last q (using the actual value, not logarithms)

Are you removing all the used qs from the candidates after generating each a? When accumulating qs, how do you guarantee that you have not surpassed \hat a too early? If `randomOffset` happens to be positive on every iteration, then that would happen, right?

Last fiddled with by tabidots on 2019-05-20 at 07:57
Old 2019-05-20, 14:16   #31
Till
 
 
"Tilman Neumann"
Jan 2016
Germany

1110110012 Posts

Quote:
Originally Posted by tabidots View Post
What is the big picture of what you are doing there? This is as far as I could understand (I'm ignoring the multiplier here):
  1. Compute the ideal \hat a = \frac{\sqrt{2n}}{M}
  2. Compute the ideal \hat q = \sqrt[s]{\hat a}
  3. Pick all but the last q at random from the candidates based on a normal(?) probability distribution whose center is at \hat q, with an adjustment based on the size of n
  4. Do a binary search (-ish?) for the last q (using the actual value, not logarithms)

Are you removing all the used qs from the candidates after generating each a? When accumulating qs, how do you guarantee that you have not surpassed \hat a too early? If `randomOffset` happens to be positive on every iteration, then that would happen, right?

You analyzed 1-3 correctly.


About 4: It is not a binary search, but a step-by-step search around the theoretically best q-value for the first free q. To find the first free q, I keep a list of those q already in use (instead of removing used q's from an "all q-list").


This procedure helps to give enough randomness to the final result. The deterministic computation of the last q makes sure that the final a-value is very near the theoretically best a-value.
Old 2019-05-21, 00:22   #32
jasonp
Tribal Bullet
 
 
Oct 2004

3×1,181 Posts

Another effective strategy for not-too-large problems is to precompute a list of bitfields. The first bitfield has the bottom k bits set (k being the number of primes in A); successive values also have k bits set, but with low-order bits clear and higher-order bits set, and each bitfield is chosen to differ from all the others in at least two bits. For even j, a set bit j refers to a factor-base prime j/2 above the optimal q value; for odd j, similarly below it.

When k = 3, 4, 5, or 6, it doesn't take a large bitfield to generate a lot of different A values that are nearly guaranteed to avoid duplicate relations.
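A sketch of one reading of that scheme (illustrative Python; in particular, the exact bit-to-prime offset mapping is an assumption). Note that any two distinct bitfields with the same popcount k automatically differ in at least two bit positions:

```python
from itertools import combinations

def a_bitfields(k, nbits):
    """All bitfields with exactly k of nbits bits set; any two distinct
    fields differ in >= 2 bit positions since both have popcount k."""
    return [sum(1 << j for j in combo) for combo in combinations(range(nbits), k)]

def bits_to_primes(bits, fb, opt_idx):
    """Decode a bitfield: set bit j selects the (j//2 + 1)-th factor-base
    prime above the optimal-q index when j is even, below it when j is odd
    (assumed mapping)."""
    qs = []
    for j in range(bits.bit_length()):
        if bits >> j & 1:
            off = j // 2 + 1
            qs.append(fb[opt_idx + off] if j % 2 == 0 else fb[opt_idx - off])
    return qs
```

Because distinct equal-popcount bitfields always share at most k-1 primes, the resulting A values can overlap in at most k-1 of their q factors by construction.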
Old 2019-05-21, 06:12   #33
tabidots
 
May 2019

23×3 Posts

I found a library with a weighted-sample function that can take k random samples and an optional weighting function, so I experimented with using a normal distribution as the function, with \mu = \hat q = \sqrt[s]{n}, \sigma = 0.3\hat q, and using all factor-base primes > 2 as the candidate pool. (Removed 2 as it causes a NullPointerException after some number of polynomials, probably in computing the modular square roots or something.)

For the last q, it chooses the nearest prime in the factor base in a manner similar to yours.

(As is preferred in Clojure, I am trying very hard to do this without fiddling with array indexes, so that the code can be written in a declarative rather than imperative style.)

It generates a-values that pick up smooths, so I guess it "works"—but only up to a point. At some point, enough polynomials are generated that there are no more "free qs" and my q-collection consists of just one q, slowing the program to a crawl. What did I do wrong, that I run out of q candidates? (This didn't happen before since I was only removing 1 or 2 items from the candidate pool after each polynomial, not 5 or 6.)
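In Python terms, the sampling scheme described above might look roughly like this (a sketch under hypothetical naming; the real code is Clojure and uses a library weighted-sample function):

```python
import math
import random

def gaussian_weighted_qs(fb, n, s, rng=random):
    """Draw s-1 distinct primes from the factor base, weighted by a normal
    density centred on q_hat = n**(1/s) with sigma = 0.3 * q_hat."""
    q_hat = n ** (1.0 / s)
    sigma = 0.3 * q_hat
    pool = [q for q in fb if q > 2]          # 2 excluded, as in the post
    chosen = []
    while len(chosen) < s - 1:
        w = [math.exp(-((q - q_hat) ** 2) / (2 * sigma ** 2)) for q in pool]
        q = rng.choices(pool, weights=w)[0]  # weighted draw...
        pool.remove(q)                       # ...then remove, so picks stay distinct
        chosen.append(q)
    return chosen
```

The last q would then be chosen deterministically, nearest to the value that brings the product close to the ideal a.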

Quote:
Originally Posted by Till View Post
To find the first free q, I keep a list of those q already in use (instead of removing used q's from an "all q-list")
Effectively though, you are still trying to ensure that once a q is used, it is never used again, right? As indicated by the presence of

Code:
qIndexSet.add(qIndex);
in each iteration of the loop and after determining the last q.

Last fiddled with by tabidots on 2019-05-21 at 06:17