mersenneforum.org > Factoring Projects > Cunningham Tables
2007-12-11, 05:03   #12   jasonp (Tribal Bullet)

Quote:
Originally Posted by fivemack View Post
I don't know what the asymptotics of over-sieving are like: I did one, accidental experiment (http://www.mersenneforum.org/showthread.php?t=9381) in which 26M relations gave no matrix, 27M relations gave a matrix with side 2746454 and cycle-entries 178617306, and 34M gave side 2261961, cycle-entries 147036684, so the weight per row stayed absolutely constant and the side is going down. But I don't know if the conclusion to draw is '7 million relations reduce the side by 20%' or '25% more relations than needed for a minimal matrix reduce the side by 20%', and I don't know whether going further would make the weight per row start to go up.
In general, the filtering produces a continuum of matrices, some larger and less dense, others smaller and more dense. The only real requirements are that the final number of matrix columns be a little larger than the number of rows, and that the number of columns be at least the number of factor base entries ignored by the filtering. Smaller matrices reduce the block Lanczos overhead but increase the matrix-multiply time; given feedback about how long each part takes, it's possible to pick a matrix out of all the possibilities that minimizes the solve time.
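As an illustration only (this is my own toy cost model, not msieve's actual heuristic), the two matrices quoted above can be compared under a simple block Lanczos estimate: roughly rows/64 iterations, each costing the number of nonzeros plus an assumed per-row vector-operation term c.

```shell
# Toy cost model (assumed, not msieve's code): time ~ (rows/64) *
# (nonzeros + c*rows), with c an assumed per-row vector-op cost.
awk 'BEGIN {
  c = 200
  t27 = (2746454 / 64) * (178617306 + c * 2746454)   # 27M-relation matrix
  t34 = (2261961 / 64) * (147036684 + c * 2261961)   # 34M-relation matrix
  printf "27M-relation matrix: %.3g ops, 34M: %.3g ops\n", t27, t34
}'
```

Under this crude model the smaller matrix from the oversieved run solves noticeably faster, which matches the motivation for picking the cheapest matrix out of the filtering continuum.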

Msieve takes a very basic approach: the literature has matrices that average 60-70 nonzeros per column, so the filtering first produces a collection of matrix columns that meets the minimum requirements for a usable matrix, then squeezes that down by continuing to merge ideals until the estimated average number of nonzeros in the lightest matrix columns found exceeds TARGET_DENSITY (=65.0). Just increasing this number will produce smaller matrices that are more dense; however, the few timing experiments I've done indicate that 65.0 is already a little too high, in that larger but sparser matrices solve slightly faster. So it makes sense that the density stays the same as you oversieve while the size goes down...it's that way by design :)

Having heard that the CWI filtering tools require some experience combined with trial-and-error in order to do a good job, I set out to make my filtering try to do a good job automatically. I think it does a good enough job now, but don't really know how much better it can be made to perform.

The NFSNET factorizations I've seen so far are pretty significantly oversieved, with enough excess so that clique processing gets rid of a large number (~40%) of the relations that survive the singleton removal.
2007-12-11, 10:14   #13   smh ("Sander")

Quote:
Originally Posted by fivemack View Post
To get the sieving done in reasonable time will require a few dozen cores, which means I will have to get several people to help
I can spare a couple of CPUs for this.
2007-12-11, 11:29   #14   fivemack: OK, let's hunt polynomials

I assume you have the executable pol51m0b available; it's part of the ggnfs package.

Create a file called '6.383.data' containing the single line

Code:
N 230380135646168002240144238096238189782429580465812519176892278271650463794969643225877877269156894108094881082195219664775471894182470295616143804362949333632033489
Pick a5 ranges among yourselves; a quarter-million a5 will take, I think, a CPU-day on a 2.5GHz machine to sieve and about a CPU-week to optimise, and produce around 100MB of output.

For example, if you've picked the range 1.75 to 2 million, do
Code:
ggnfs/bin/pol51m0b -p 8 -n 1e25 -b 6.383 -a 1750000 -A 2000000
If you have to stop the job half-way through, simply look at the last line in the file 6.383.51.m that it creates, which will be of the form
Code:
1802134060 6819433011508008067 31019844821582516560648299688392
and start again with -a 1802134, that is, the first number in the last line divided by 1000 (rounded down).
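A small shell helper can compute the restart point automatically; this is only a sketch, and assumes the 6.383.51.m line format quoted above (first field divided by 1000 gives the new -a value):

```shell
# Compute the pol51m0b restart point from the last line of 6.383.51.m:
# the first field, divided by 1000 and rounded down, is the new -a value.
last=$(tail -n 1 6.383.51.m | awk '{print int($1 / 1000)}')
echo "ggnfs/bin/pol51m0b -p 8 -n 1e25 -b 6.383 -a $last -A 2000000"
```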

After that job has completed, and created a file 6.383.51.m, run

Code:
ggnfs/bin/pol51opt -b 6.383 -n 1.5e23 -N 6e20 -e 4.5e-13
which will take rather longer to run, and will create a (hopefully fairly small) file 6.383.cand.

Find the line in that file of the form
Code:
BEGIN POLY #skewness 1018899.74 norm 4.05e+22 alpha -5.02 Murphy_E 5.03e-13
with the largest Murphy_E number at the end, and post here the section from that 'BEGIN POLY' line to the 'END POLY' line which follows.
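The extraction can also be scripted; this sketch assumes the Murphy_E score is the last field of each 'BEGIN POLY' line, in the format shown above:

```shell
# Find the highest Murphy_E score, then print the BEGIN POLY ... END POLY
# block that carries it (string match, so scores must appear verbatim).
best=$(grep '^BEGIN POLY' 6.383.cand | awk '{print $NF}' | sort -g | tail -n 1)
awk -v s="$best" '/^BEGIN POLY/ {p = index($0, "Murphy_E " s) > 0}
                  p {print}
                  /^END POLY/ {p = 0}' 6.383.cand
```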

Sorry if I'm teaching people to suck eggs here.

We probably want to search for a total of around half a GHz-year, i.e. up to about a5=6000000; the best polynomial I've found in a very small search would take about five GHz-years to sieve.

I'll run a=0 through 250000.
2007-12-11, 13:08   #15   R.D. Silverman

Quote:
Originally Posted by fivemack View Post
That's quite a long list, and I don't quite see the rationale behind its entries; there's one more-wanted, a few early holes, but I'm wondering why
What makes you think there is a rationale? I simply consulted the Tarot....
There is a bias toward base 2.
2007-12-11, 13:14   #16   fivemack

Quote:
Originally Posted by fivemack View Post
Pick a5 ranges among yourselves; a quarter-million a5 will take, I think, a CPU-day on a 2.5GHz machine to sieve and about a CPU-week to optimise, and produce around 100MB of output.
Those timings are, to use a technical term, balderdash; a better estimate is 1000 a5 sieved per CPU-minute, i.e. about 1.4 million a day. So we can aim to do a5 up to at least 25 million, and it would make more sense to reserve in blocks of 10^6. I'll take 0-1000000.

2007-12-11, 13:17   #17   akruppa ("Nancy")

Quote:
Originally Posted by fivemack View Post
If that's too far to stretch, there is clearly already interest in 3^527-1 C160/S252, though I'd not want to set up to shoot that fox if akruppa is already ready to hunt it.
I have no plans for this number.


Edit:
Quote:
Originally Posted by Paul Leyland
1) no large prime is > 1e9
I used 31-bit primes for 5,349-. Getting the filter code to work took a bit of hacking, but there was no real problem in the end. I'll see if I still have the hacked version somewhere; if I do, we could (and should, for a project this size) use 31-bit large primes.

Alex

2007-12-11, 14:12   #18   jasonp (Tribal Bullet)

Quote:
Originally Posted by fivemack View Post
Those timings are, to use a technical term, balderdash; a better estimate is 1000 a5 sieved per CPU-minute, say rather over a million a day. So we can aim to do a5 up to at least 25 million, and it would make more sense to reserve in blocks of 10^6. I'll take 0-1000000.
I'll take 24M-25M, running on two 1.86GHz cores.

Edit: the post below was in response to my complaint that a 6-digit a was low for an input this size, but I'd forgotten that the inputs given to the Kleinjung tools are scaled down.

2007-12-11, 14:22   #19   fivemack

'a5' is confusingly scaled by a factor of 10^3 in pol51m0b, as I'm sure you know; presumably so that the -a and -A parameters can be parsed as 32-bit integers.

RSA576 is nine digits larger than 6^383+1, so we'd be searching half as far for this C165 as they did for that C174; that sounds reasonable to me, if not verging on overkill. For a C156 I found an unusually good polynomial lurking at a5 ~ 6 billion.

Best of luck with the hunt!

2007-12-11, 14:24   #20   JHansen

Quote:
Originally Posted by fivemack View Post
Find the line in that file of the form
Code:
BEGIN POLY #skewness 1018899.74 norm 4.05e+22 alpha -5.02 Murphy_E 5.03e-13
with the largest Murphy_E number at the end, and post here the section from that 'BEGIN POLY' line to the 'END POLY' line which follows.
If you have a cat, grep, sort and tail available you can do

Code:
cat *.cand | grep e-13 | sort -k 10 | tail
to automagically retrieve the lines containing the ten highest scores.
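If scores with other exponents (e-12, e-14) might turn up, a possible variant is to let GNU sort compare the scores as numbers rather than text; this assumes the Murphy_E value is field 10, as in the line format above:

```shell
# General-numeric sort (GNU sort -g) on field 10, the Murphy_E value,
# so 9.9e-14 correctly sorts below 1.0e-13.
cat *.cand | grep 'Murphy_E' | sort -g -k 10 | tail
```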

--
Cheers,
Jes
2007-12-11, 22:13   #21   smh ("Sander")

I just started running 1M - 2M

Maybe create a new thread with the instructions and the reservations?
2007-12-12, 03:20   #22   bsquared ("Ben")

Taking 2M - 3M