mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2019-05-16, 20:55   #12
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

2·32·7·47 Posts
Default

Quote:
Originally Posted by bsquared View Post
Is that time spent just building new A polys or does it also include setup time for the new poly (e.g., computing A^-1 mod p for all factor base p)? 10% seems high either way but if that's just for building new A polys then it seems *really* high. In contrast I usually spend < 3% total for building new A's and all of the setup for new A's.



Seems reasonable given your other parameters. I typically see anywhere from 50 to 150 A's at 50 digits, using 6 factors per A, but I use a far smaller factor base and sieve interval. It is all a tradeoff.
That is including all the set up. The two bits that take the time are calculating A^-1 mod p and then multiplying that with all the generators for each factor of A mod p. I am using the built in BigInteger in c# which is quite slow. I might experiment with a couple of alternatives libraries. In the past I have seen BigInteger have a terrible runtime order but have less overhead for smaller numbers. Looking at my modular inversion code I can see improvements.

I think you can see why I have a larger sieve interval.

I have just realized that I was still skipping half the Bs so calculating As is taking half the time it was. It is still very slow for smaller numbers especially.

Last fiddled with by henryzz on 2019-05-16 at 21:06
henryzz is online now   Reply With Quote
Old 2019-05-17, 02:32   #13
tabidots
 
May 2019

110002 Posts
Default

Quote:
Originally Posted by bsquared View Post
I would ask, how are you determining smoothness exactly? In other words, the sieve value above (or below) which you try to find a relation. If your polynomials and roots are good then maybe it is as simple as tweaking your smoothness bound so as to generate more potential relation reports.

The value I use is obtained from a bunch of empirically determined magic numbers, loosely related to Contini's suggestion along the lines of:
"compute the number of bits in M/2*sqrt(N/2), then subtract some slack". M is the size of your sieve interval.
Thanks for this suggestion. I haven't gotten around to properly debugging the polynomials as jasonp suggested, but I sure hope the fix is as simple as this!

I realized that my threshold was quite far off. For SIQS, I did not change the procedure I found in the Python implementation for MPQS, which is full of that person's magic numbers:

1. Calculate the maximum possible g(x) value: (M*sqrt(2n)) / 2

2. Define the base threshold: 0.735 * log10(max_gx)

3. Determine the minimum "significant" prime: floor(3 * base_threshold) (p below this won't be used in the sieve)

4. Calculate the fudge factor: 0.25 * sum(log10(p) for all p below min_sig_prime)

5. Calculate the final threshold: base_threshold - fudge_factor

One thing I noticed (perhaps because in MPQS, I have factored one "a" out of the polynomial) is that my max_gx value for SIQS is completely off.

When you say

Quote:
"compute the number of bits in M/2*sqrt(N/2), then subtract some slack"
the number whose bits you're counting is the max_gx value, right? Is that

M / (2 * sqrt(N/2)) or
(M/2) * sqrt(N/2) ?

I can't seem to get my max_gx value to approach anywhere near the maximum values actually returned by the polynomials.

I decided to try using Msieve's parameters for F & M for SIQS. So for the 50-digit number above (55720182748350551450373114705225729163236062899069), M is 65536.

One polynomial I just generated is
g(x) = 2723740665885560669400118386023188081 x^2 +
930016696893326215892951094451829278 x +
5687450887644869965258458870102862443574037010

g(65536) = 6010987144693123589465802296837959857061524974

which is not even on the same planet as:

current max_gx = (M*sqrt(2N)) / 2 = 3494811551456090780950364160
M / (2 * sqrt(N/2)) = 6.144776667872878E-19
(M/2) * sqrt(N/2) = 1747405775728045390475198464

I have probably made some stupid mistake somewhere, right?

Last fiddled with by tabidots on 2019-05-17 at 02:39
tabidots is offline   Reply With Quote
Old 2019-05-17, 13:25   #14
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

2×1,789 Posts
Default

Quote:
Originally Posted by tabidots View Post

Is that

M / (2 * sqrt(N/2)) or
(M/2) * sqrt(N/2) ?
Contini suggests taking "a" approximately equal to sqrt(2N)/M, so that gmax is about M*sqrt(N/2). In my code I treat M as the entire sieve interval (+ and - sides) which is why I divide M by 2. Sorry for the confusion.

With A \approx \sqrt{2N}/M, you should be generating values with size of about 70 bits. Here are a few that my code came up with for the 50 digit example:

Code:
A = 978059782600599905911
B = 1406266881656945627063
C = -2107894421575738271129468804544
A = 975874113426336640897
B = 1085138790265600306933
C = -2112615482004038794930218933912
A = 982775394573367733491
B = 1201046251131435906861
C = -2097780196401273381630906077752
So something seems to be amiss with how you generate polynomials as the coefficients (and the resulting values of g(x)) are about 50 bits too large. That would definitely explain the difficulty in finding smooths!
bsquared is offline   Reply With Quote
Old 2019-05-17, 13:52   #15
tabidots
 
May 2019

23·3 Posts
Default

Quote:
Originally Posted by bsquared View Post
So something seems to be amiss with how you generate polynomials as the coefficients (and the resulting values of g(x)) are about 50 bits too large. That would definitely explain the difficulty in finding smooths!
Ah! I seem to be misunderstanding something very fundamental here.

In MPQS, one side of the congruence was obtained from values of `ax + b`, the "unsquared" polynomial, and the other side was taken from `(ax + b)^2`, or `a(ax + 2b + c)`.

The coefficients I posted were not the raw "a, b, c" but rather `a^2, 2ab, b^2 - N`, since g_{a,b}(x) = (ax + b)^2 - N = (a^2)(x^2) + 2ab + b^2 - N. Since `a` is not a square in SIQS, I cannot factor out the `a`, and the polynomial must remain in its full form.

But I am still looking for a congruence between `ax + b` and `(a^2)(x^2) + 2ab + b^2 - N`, right? Or am I looking for a congruence between something else?

Last fiddled with by tabidots on 2019-05-17 at 13:53
tabidots is offline   Reply With Quote
Old 2019-05-17, 14:37   #16
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

1101111110102 Posts
Default

Quote:
Originally Posted by tabidots View Post
Ah! I seem to be misunderstanding something very fundamental here.

In MPQS, one side of the congruence was obtained from values of `ax + b`, the "unsquared" polynomial, and the other side was taken from `(ax + b)^2`, or `a(ax + 2b + c)`.

The coefficients I posted were not the raw "a, b, c" but rather `a^2, 2ab, b^2 - N`, since g_{a,b}(x) = (ax + b)^2 - N = (a^2)(x^2) + 2ab + b^2 - N. Since `a` is not a square in SIQS, I cannot factor out the `a`, and the polynomial must remain in its full form.

But I am still looking for a congruence between `ax + b` and `(a^2)(x^2) + 2ab + b^2 - N`, right? Or am I looking for a congruence between something else?
You can construct b such that b^2 - N is divisible by a. Then A factors out and you have g_{a,b}(x) = a(ax^2+2bx+c), i.e., all g(x) are divisible by A. For each smooth g(x)/a you find, add the factors of A to the relation. In mpqs that wasn't needed since A was chosen to be a square. Now they participate in the matrix step more directly.

Contini's thesis starting on page 9 describes how to choose B so that B^2-N is divisible by A, and so that enumerating all such possible B's can be done quickly using a Gray code.

So, looking at the poly you posted again, I get that:
Code:
A=sqrt(2723740665885560669400118386023188081)=1650375916537066009
B=930016696893326215892951094451829278/(2*a) = 281759048824692071
But (B^2 - N) is not divisible by A, so maybe look at how you are generating B (Contini, bottom of page 9).
bsquared is offline   Reply With Quote
Old 2019-05-17, 15:58   #17
tabidots
 
May 2019

23·3 Posts
Default

Weird... I don't think I changed anything, but all my polynomials are testing fine now for b^2 \equiv N (\textrm{mod } a), (though I don't have the original data that generated the polynomial I wrote earlier, so I can't be sure).

A and B are around ~70 bits and C is around ~100 bits.

I changed the RHS polynomial to ax + 2b + c, where c = \frac{(b^2 - N)}{a}. LHS is still ax + b. (RHS are the values that get converted to a matrix. LHS are the values that just get multiplied together.)

All my soln_1, soln_2 solutions for the polynomial are checking out, as well.

But it still takes a long time to get smooths. I played around with the threshold, and setting it very low made it grab almost 100 smooths per A. But the sieving took an eternity. It might have worked out to the same amount of time in total. Felt like at least 10 minutes, and then it seemed to hang when trying to find a factor (not sure what happened there).

My sieving code can't be the roadblock either, because my MPQS just factored this number in ~83s.

What else can I diagnose?
tabidots is offline   Reply With Quote
Old 2019-05-17, 16:49   #18
tabidots
 
May 2019

23·3 Posts
Default

Here is some sample test output (just a tiny bit, obviously, because the data generated by this program is oceanic). Don't know how helpful it will actually be, but just in case:

Code:
N = 55720182748350551450373114705225729163236062899069

M = 65536
Max g(x):  345916436897199940228470734848
Current smooths: 0/1200 (0 primes used)
-------
New A: 161193073467281037817
B: 251205167917836684051
C: -345673554618068592277051911604
Max g(x) from this poly:  346678350229398117877754253900

B: 237418390873394944367
C: -345673554659860363590871822140
B: 116211583075761509923
C: -345673554925767324491631150420
B: 129998360120203249607
C: -345673554904709089668306466860
B: 47900535511707982579
C: -345673554995315416711249773684
.....
I do notice that the max-gx value is still a bit screwy, in that for the very very first polynomial (first b of first a) where I printed the value of g(M), g(M) is actually higher than max-gx (M \sqrt{\frac{N}{2}}). But it's not that far off, so I imagine that's not the reason this is performing so poorly.

Last fiddled with by tabidots on 2019-05-17 at 16:50
tabidots is offline   Reply With Quote
Old 2019-05-17, 22:12   #19
Till
 
Till's Avatar
 
"Tilman Neumann"
Jan 2016
Germany

1D916 Posts
Default

Quote:
Originally Posted by jasonp View Post
This process is what led me to believe that Contini's thesis has a typo in the arithmetic for computing the next B given the previous one; instead of adding the next root you have to subtract it.
I'ld confirm that.

@tabidots:

The size of your A- and B-parameters looks good.

If I run your last example, then my program uses a bigger prime base (size=1732, pmax=32621) and a smaller sieve array (size=11520). The Knuth-Schroeppel multiplier used is 37. My A-parameters have 8 factors instead of yours having 6 of them. These differences must not mean much, but adjustment is a big issue in SIQS.

The biggest performance bug I had was not permitting primes p > M in the prime base.
Till is offline   Reply With Quote
Old 2019-05-18, 00:33   #20
tabidots
 
May 2019

23·3 Posts
Default

Quote:
Originally Posted by Till View Post
I'ld confirm that.
I found the typo that was being referred to in your superbly-documented code. I actually had been referring to your code previously as well (to the extent that I can read Java), but for some reason I did not think to hunt for the sign-switch typo in it.

So, here's something strange—in my code I have z, which is what you call -1^CeilTerm.

Previously, when I simply changed the line

Code:
b_new = b + 2 * z * Bs[nu - 1]
(paraphrasing from Clojure) to

Code:
b_new = b - 2 * z * Bs[nu - 1]
it didn't do anything, which is why I wasn't getting anywhere before.

After looking at your code, I tried changing z to be the negative of itself before setting b_new, and now it picks up smooths at what I would consider to be a more normal pace. Wow!

Strange, because it should be the same result, but whatever, that performance hurdle has been cleared. *shrug*

Other gotchas: I was recalculating c upon every new b, and the new c values were not divisible by a. So you have to re-use the same c (as well as the same a); only b changes.

Now I am actually dealing with the opposite problem—my program is too eager to look for a factor, and finds relations way too fast, haha (in that there is no way 40 or 50 relations will produce a congruence for a number fo this size). So I will have to edit that bit and see what happens when I gather more relations than the factor base, not just the used primes.

Quote:
Originally Posted by Till View Post
The Knuth-Schroeppel multiplier used is 37...
The biggest performance bug I had was not permitting primes p > M in the prime base.
I tried to translate the K-S multiplier algorithm from your code when I was working on SQUFOF, but I couldn't seem to get it working properly. Not only that, but I found that square-free multipliers were often not the fastest; in some cases, they didn't even find a factor, and actually the best multipliers were perfect squares! So I abandoned K-S. Maybe it's different for SIQS?

Last fiddled with by tabidots on 2019-05-18 at 00:45 Reason: clarity; emojis don't work
tabidots is offline   Reply With Quote
Old 2019-05-18, 01:00   #21
tabidots
 
May 2019

1816 Posts
Default

Okay, so my SIQS is still not finding a factor even after gathering a number of relations greater than the size of the factor base. Speed aside, that means sieving is okay; as I said, my MPQS works well, so linear algebra is okay as well.

So, I must be mistaken in how I'm looking for congruences. Just want to clarify the following:
  1. The RHS (main) polynomial values are ax + 2b + c.
  2. c does not change when b changes, only when a changes.
  3. From the main polynomial, one a has already been factored out and should be multiplied back in when finding factors (in my code I do this after adding the exponent vectors).
  4. The LHS (smaller) values, which just get multiplied together in the end, are ax + b whose b changes with every iteration, just like the main polynomial.

EDIT: I realized why changing the sign of z made an impact—it affected how the new set of solutions to the polynomial are calculated! So now my sieve picks up lots of smooths per polynomial, but every new "b" after the first (for each "a") seems to be wrong somehow. b^2 - N is not divisible by a, and the solutions calculated with the Bainv2 values are also not producing zeros. Hmm...

Last fiddled with by tabidots on 2019-05-18 at 01:51
tabidots is offline   Reply With Quote
Old 2019-05-18, 02:40   #22
tabidots
 
May 2019

23×3 Posts
Default

Ah, scratch everything I just said. I realized that instead of changing the sign of z, I was subtracting it from 1. Man, complicated algebra is really confusing in Lisps.

So no wonder nothing was working despite picking up so many smooths—they must have all been completely wrong values!

Anyway, I thought I was getting somewhere, but I am back where I started (or not even there—since I changed the polynomial, now I can't find factors at all). I clearly don't know what I'm doing and I think I'm just going to give up for now. Thanks all for your help and patience.
tabidots is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
SIQS on GPU cgy606 Factoring 1 2016-10-21 13:37
SIQS problems henryzz Factoring 2 2013-08-26 23:02
SIQS Problem patrickkonsor Factoring 3 2008-10-20 12:03
HELP! Online diagnostics lythanhnguyen Hardware 3 2007-09-11 06:19
Memory Diagnostics under XP? R.D. Silverman Hardware 3 2006-11-17 16:06

All times are UTC. The time now is 10:02.


Tue Oct 26 10:02:53 UTC 2021 up 95 days, 4:31, 0 users, load averages: 1.57, 1.67, 1.82

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.