mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Msieve

Old 2013-07-25, 20:36   #111
lorgix
 

Quote:
Originally Posted by firejuggler View Post
42, lorgix, 42.. (H2G2 ref)

That sweet spot is our grail, and... nobody has got it yet; that's why we experiment.
I wouldn't be asking all those questions about how one should go about finding the optimum if I thought it had already been found.

Were my questions faulty/not worth asking, or do you not have answers for them? Should I come back in 7.5 million years?
Old 2013-07-25, 20:45   #112
firejuggler
 

Unfortunately, I have no clear answer.
However, for numbers under 165 digits, searching leading coefficients below 1M often yields a good poly, at least.

I do hope, however, that in 7.5 million years we will have an answer for C150-C190, or, if we don't, that either humanity will have more urgent problems, or it will be a matter of milliseconds.

Old 2013-07-25, 22:55   #113
VBCurtis
 

Our E-score target for poly select (per Batalov's hand-coded range that appears in the log) is set one digit smaller when using GPU search. I believe that is equivalent to a 15% higher target, so we have moved the goalposts.

We should spend exactly as much time poly-searching as minimizes total project length; that is, until the expected time to find a better poly would exceed the sieve time that poly would save. We surely should not divide the old time guideline by the GPU speedup of ~50x, because the marginal cost of a better poly is tiny at that point. At the same time, Batalov/msieve's target range has been mostly dead-on in my experience, in that once one finds a poly in the upper half of the range, it could take a doubling or tripling of poly-select time to improve on it, so I stop immediately. Tom noted a digit-region where the target is too high, so merely getting into the range is reason enough to proceed to sieving; I am not sure whether the formula has been altered in a recent SVN.

The -nps step optimizes for size only, so setting a tight stage 2 norm is identical to proceeding with the top x% of hits from a looser norm. Since we're interested in minimizing project time, I run -npr on 300-500 hits a day when selecting 150-165 digit numbers, as the root step is fast at that size (say, 2 core-hrs per day). On the C176 I'm doing now, the -npr cost is higher, so I tighten the norm down to 200-300 hits/day; the discarded hits may have a 1-2% chance of producing the best poly, but cost 3-4 hrs at C176 size vs 30 min at C158.

My "divide by 20" gives a baseline to produce a reasonable number of hits/day; for larger runs it should be tightened further. I have probably run npr on 4500 candidates for the C176 over 10 days, while a 3-day C157 run might have 1500 root-optimized.

Some more precise data from the C176:
Default stage2norm is 1.14e25 = 114e23. Dividing by 20 is 5.7e23. I did a 5-day run while on holiday with norm set at 4.5e23, producing 2000 hits which took 25 hr to -npr.

I sorted these into buckets:
511 with norm better than 2.5e23, best polys 1.30, 1.29, 1.28, 1.26, 1.26e-13
597 between 2.5e23 and 3.5e23, best poly 1.24e-13
947 with norm between 3.5e23 and 4.5e23, best poly 1.20e-13.
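A bucket tally like this can be scripted; here's a minimal Python sketch, assuming (as with the sort commands later in the thread) that the stage-2 norm is the last whitespace-separated field of each .ms line:

```python
# Hypothetical helper for tallying hits into norm buckets as above.
# Assumption: the stage-2 norm is the last field of each .ms line;
# hits above the largest edge are ignored.
def bucket_hits(lines, edges=(2.5e23, 3.5e23, 4.5e23)):
    """Count hits whose norm falls under each successive bucket edge."""
    counts = {edge: 0 for edge in edges}
    for line in lines:
        norm = float(line.split()[-1])
        for edge in edges:
            if norm <= edge:
                counts[edge] += 1
                break
    return counts

# Made-up lines standing in for real .ms records:
hits = ["... 2.0e23", "... 3.0e23", "... 4.0e23", "... 4.2e23"]
print(bucket_hits(hits))
```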

That led me to tighten the bound to 3.5e23, halving the candidates for -npr. I'm still producing 250/day, but npr takes 3-4 hr instead of 7. Now that I've put this in writing, I should tighten to 3e23. I think firejuggler said he tightened all the way to 1e23?

I am certain the GPU code should set a stage2norm at least 10x tighter than present.

-Curtis
Old 2013-07-26, 00:30   #114
WraithX
 

Another thing to consider is that when running -npr you can get different "best polynomials" by changing the stage2_norm.

I'd pick your top 50 or 100*, and run through those with a stage2_norm just larger than the lowest "score" in the .ms file. (I'm not sure of the name of the last field in the .ms file, but that is what I am calling the "score" here.) Let's make up some numbers. Say the lowest "score" is 1.23e21; then I'd use -npr "stage2_norm=1.3e21", and once that's done, run -npr again with maybe 2e21, then 3e21, 5e21, 1e22, 5e22, 1e23, etc. I definitely saw different "best polynomials" when using this method. I think it'd be interesting to hear how this works for other people too.

*To pick the top 50 or 100, use the unix sort and head commands like so:
With degree-5 polynomials use:
sort -g -k 10 <file>.ms | head -100 > top100.ms
Or with degree-6 polynomials, use:
sort -g -k 11 <file>.ms | head -100 > top100.ms
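If unix sort isn't available, the same selection can be approximated in Python; a sketch, where the file names are placeholders and the field index is zero-based (9 for degree 5, 10 for degree 6):

```python
# Sketch equivalent of: sort -g -k N <file>.ms | head -100 > top100.ms
# The field index and file names are assumptions; adjust to your files.
def top_hits(ms_path, out_path, field, n=100):
    """Keep the n lines with the smallest value in the given 0-based field."""
    with open(ms_path) as f:
        lines = f.readlines()
    lines.sort(key=lambda ln: float(ln.split()[field]))
    with open(out_path, "w") as f:
        f.writelines(lines[:n])

# degree-5: top_hits("test.ms", "top100.ms", field=9)
# degree-6: top_hits("test.ms", "top100.ms", field=10)
```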

Also, if you know a good minimum e-value you are looking for, you can specify that to reduce the amount of output in the .p file by using the min_evalue=1.20e-13 (for example), in addition to all of your other options like stage2_norm, etc.
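The stage2_norm ladder described above could be driven by a short script; a dry-run sketch (the msieve path and flag spelling are assumptions mirroring the post, so adapt them to your own command line):

```python
# Hypothetical dry-run driver for the stage2_norm ladder; prints one
# -npr invocation per norm instead of actually running msieve.
def npr_commands(norms, msieve="./msieve"):
    """Build one -npr invocation per stage2_norm in the ladder."""
    return [[msieve, "-npr", "stage2_norm=" + n] for n in norms]

ladder = ["1.3e21", "2e21", "3e21", "5e21", "1e22", "5e22", "1e23"]
for cmd in npr_commands(ladder):
    print(" ".join(cmd))
```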
Old 2013-07-26, 02:00   #115
VBCurtis
 

Quote:
Originally Posted by WraithX View Post
Another thing to consider is that when running -npr you can get different "best polynomials" by changing the stage2_norm. ...<snip>...
If I understand the root optimization correctly, setting the stage2norm in this step restricts how much size we can give up while chasing alpha.
1. Does a looser bound make -npr take longer?
2. Why would tighter settings produce better polys than looser settings?
Old 2013-07-26, 02:41   #116
jasonp
Tribal Bullet
 

A looser bound does make the root optimization take a little longer, but the work it does is capped pretty aggressively. You'll never make it take, say, an hour for one polynomial.

A tighter size bound can sometimes find better polynomials than a looser bound because the root optimization will spoil a little of the size no matter what, and if the size starts off very good and the bound starts off tight, then a polynomial with very good size and mediocre alpha may trump a different polynomial with much better alpha but much worse size. When the bound is loose you're saying it's okay to completely kill the size of a polynomial to find a really good alpha, and often that gamble doesn't pay off.

When I first prototyped the root optimization, I was finding degree 6 polynomials with an alpha around -13, whose size was terrible, and which were easily beaten by other polynomials with an alpha of about -7.
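To first order, alpha acts as an additive correction to the log of the polynomial's size, so the tradeoff can be seen with made-up numbers (a sketch of the idea, not msieve's actual scoring):

```python
# Toy illustration only: alpha is (roughly) additive in log scale,
# and a lower combined log-norm is better. Numbers are made up.
def combined(log_size, alpha):
    return log_size + alpha

# Very good size with mediocre alpha ...
poly_a = combined(55.0, -4.0)
# ... beats much better alpha paired with much worse size:
poly_b = combined(62.0, -7.0)
assert poly_a < poly_b
```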

When there is a lot of headroom between the size of the polynomial and the bound, the code will actually run the root optimization several times with gradually loosening bounds, to try to catch the spectrum of polynomials that trade off good size and good alpha. I was concerned that having to manually adjust the bound would leave good polynomials on the table, but the process doesn't work as well as I'd like.
Old 2013-07-26, 05:44   #117
LaurV
Romulan Interpreter
 

Quote:
Originally Posted by jasonp View Post
Stage 1 is a fairly ...<snip>...
Day by day I learn these things better and better from you guys!
Old 2013-07-26, 07:03   #118
firejuggler
 

Quote:
Originally Posted by VBCurtis View Post
I think firejuggler said he tightened all the way to 1e23?
I did for a while, but didn't get any worthy result, so I put it back toward the 3e23 region too.
Old 2013-07-26, 14:58   #119
jasonp
Tribal Bullet
 

Quote:
Originally Posted by lorgix View Post
Let's say you are dead set on using a certain poly to factor a number using GNFS, how would you choose the number to be factored if you wanted it to be fast?
To a certain extent this is a definition game: if the polynomial drives the problem, then by definition it's not GNFS, it's SNFS. When you make up a polynomial by the 'base-m method', you're really expressing some number in a weird base (i.e. base m). The algebraic polynomial gets the coefficients of the base-m expansion, and (x-m) becomes the rational polynomial. When the algebraic coefficients are small, you are saying that the number is efficiently represented by that particular choice of m. For polynomial degree d, you can choose m to break the input number into d or d+1 pieces and use a rational polynomial of x-m, or you can pretend the base is a rational number and give the rational polynomial two coefficients, so that it's (p*x-m) for some p and m. The extra flexibility this provides for efficiently representing the input is why the latter method is favored.

So a breakthrough in polynomial selection would involve choosing the perfect rational number m/p to express a given input in base m/p.

Quote:
What, if anything, do we know about the distribution of the (quality-quantifying-)values produced by -nps and -npr?
-nps produces an alpha and a size score, if I'm not mistaken. Should we consider both, or only the size?
The size score printed out by -nps does incorporate the root score that -nps also prints. That root score is not the complete root score, but only the part that would not change if we ran -npr.

Basically the size score is the 'average size' of a value of the algebraic polynomial. More precisely, it's the log of the square root of the sum of the squares of all the polynomial values we'd expect to encounter during the sieving. As there are far too many of these to enumerate, the summation is actually a numerically computed integral. Every polynomial has regions of large values and other regions of small values, so we're interested in the aggregate total of those. The root score at this point is biased to be better than random, but a bad root score is not a fatal problem, and a polynomial is not rejected during the size optimization for having one. I don't know what the distribution of scores is expected to be; I would guess it's Poisson, in that a much better score is exponentially less likely but can still occur, which wouldn't happen for normally distributed scores.
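A toy version of that size score, purely to illustrate the idea: uniform sampling over a made-up interval stands in for the real numerically computed integral over the skewed sieving region.

```python
import math

# Toy illustration only: msieve integrates over the skewed sieving
# region; here we just sample uniformly on a chosen interval.
def size_score(coeffs, lo, hi, samples=10001):
    """log(sqrt(average of f(x)^2)) over [lo, hi], approximated by sampling."""
    total = 0.0
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        fx = sum(c * x**k for k, c in enumerate(coeffs))
        total += fx * fx
    return math.log(math.sqrt(total / samples))

# A poly with smaller values over the region gets a smaller (better) score:
assert size_score([1, 2, 1], -1000, 1000) < size_score([1000, 2000, 1000], -1000, 1000)
```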

Quote:
Can it be assumed that there isn't a sizable penalty to defining the portion of -nps-output that should be passed to -npr as the "top n entries" or "top x percent"? Or does it have to be more complex than that?
At the end of the day, it's just a heuristic that attempts to optimize the number of high ranking polynomials found per unit time.
Old 2013-08-26, 21:30   #120
frmky
 

I've just started the polynomial search for 3,766+. Any help you wish to provide would be appreciated.
Old 2013-08-26, 23:26   #121
firejuggler
 

There is a poly request thread; you should ask there.
Also, assuming you meant 3^766+1, a quick search got me
Code:
# norm 9.753274e-022 alpha -7.577806 e 1.809e-016 rroots 3
skew: 678318.70
c0: -14839974954786763594441670520652786777724284725
c1: -59207059990609276626072777478650700934610
c2: 112272803357459276186666662767487659
c3: 166834608504564159598745313926
c4: -791734147179218373240242
c5: 312400000019243136
Y0: -3982774696339649872762227894607454685254
Y1: 154887166637956279
While the expected score is much higher (expecting poly E from 5.37e-016 to > 6.18e-016), this is a start.
