mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Cunningham Tables (https://www.mersenneforum.org/forumdisplay.php?f=51)
-   -   6- table (https://www.mersenneforum.org/showthread.php?t=3577)

fivemack 2009-05-03 19:05

[QUOTE=frmky;172123]
Not much oversieving. I would have expected the matrix to be much bigger. Even our 2,908+ matrix is a bit bigger. And the i7 is fast! That matrix took only a month. The 2,908+ matrix should finish in a couple of weeks, and it will have taken about 3.5 months on a 2GHz Barcelona K10.
Greg[/QUOTE]

I'd say there was a fair amount of oversieving; initially Bruce sieved 10M-160M on both sides, getting 278146913 unique relations, and the matrix that arrived was noticeably bigger:

Tue Mar 24 21:52:20 2009 matrix is 22586885 x 22587133 (6499.2 MB) with weight 1573087910 (69.65/col)

with an ETA of about 1130 hours.
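
For scale, those log figures hang together: the density is 69.65 nonzeros per column, and the reported size works out to about 4.3 bytes per nonzero, consistent with 32-bit column indices plus some per-column overhead (that storage model is my assumption, not msieve's documented layout). A back-of-envelope sketch in Python:

[CODE]
# Figures taken from the msieve log line above
nrows, ncols = 22586885, 22587133
nnz = 1573087910            # total matrix weight (nonzero entries)
mb  = 6499.2                # reported size in MB

print(nnz / ncols)          # ~69.65, matching the "/col" figure in the log
print(mb * 2**20 / nnz)     # ~4.33 bytes per nonzero: 32-bit indices + overhead

# Block Lanczos time scales roughly as nrows * nnz, hence the 1130-hour
# ETA for this matrix against 821 hours for the smaller final one.
[/CODE]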

There seem to be advantages, in the linear algebra as well as in sieving yield, to having a fairly large small-prime bound; 2+908 had to deal with an enormous duplication rate to get its relations.

The i7 has a very good memory controller, and I think benefits significantly from being in a single-processor system so there's no requirement to check ownership of cache lines with a processor not on the same piece of silicon. I am surprised to have finished before 2+908 did.

bdodson 2009-05-03 21:34

[QUOTE=10metreh;172127]How much ECM was run? Was the P58 an ECM miss?[/QUOTE]

The number was C249, diff 270 when Tom found it, so only 2*t50. I added 7*t50, as 11020 curves with B1 = 260M (default B2). Also, Tom reports
[QUOTE]Taking out the P58 would have left a number probably slightly harder by GNFS than the SNFS was.[/QUOTE]
perhaps illustrating Bob's point that these large composites aren't very good candidates for ecm factoring. My recollection (from late Jan/early Feb) is that this was the last hard number before my adjusting to the p59/p60 factors found in SNFS jobs from Greg and Tom. I'm just finishing c. 14*t50 on Serge's 2,2068M, at c268 = diff 268. -Bruce
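
For readers not fluent in t-level accounting: a "tD" is roughly the amount of ECM work that finds a D-digit factor with probability 1 - 1/e, so after k such units the miss probability is about e^-k (the usual Poisson heuristic). A hedged sketch using only the effort figures above; the t50-to-t58 work ratio is my rough guess, not a measured constant:

[CODE]
import math

t50_done = 2 + 7                 # Tom's 2*t50 plus Bruce's added 7*t50
print(math.exp(-t50_done))       # ~1.2e-4: a p50 would almost surely be found

# But the factor was a p58, and the work per t-level grows steeply with
# digits.  Assume (my guess, order of magnitude) one t58 costs 20-30 t50s.
for ratio in (20, 30):
    print(ratio, math.exp(-t50_done / ratio))   # ~0.64-0.74 miss probability
[/CODE]

On those numbers, missing the p58 was more likely than not, which is the "gift, not a computational objective" point made later in the thread.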

fivemack 2009-05-04 10:07

Timing and duplication estimates
 
I re-ran 0.01% of the sieving (Q=k*10^7 .. k*10^7+10^3) on one CPU of the i7 machine and extrapolated up (using per-ideal measures) for the yield and timings.

So I would estimate that the A10-170 R10-260 produced 430 million R-side and 280 million A-side raw relations, for a duplicate rate of near-enough 50% (367M unique), and took about 350 million CPU-seconds: call it a hundred thousand CPU-hours. This is about 30% longer than the C180 GNFS took last year, and rather over twice as long as 109!+1 has taken to sieve.

A10-160 R10-160 would have been about 540 million raw relations (a duplicate rate still essentially 50%, since 278M were unique) in about 250 million CPU-seconds; so we spent about 10^8 extra CPU-seconds on the cluster to save (1130-821)*3600 ~ 10^6 seconds of real time on the linalg machine. I think the cluster's big enough that this was a saving in terms of total time.
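
The trade can be put in one place: the wider sieving range bought a smaller matrix at a steep CPU price. A small sketch re-deriving the numbers in this post (all inputs are the figures quoted above):

[CODE]
raw_wide  = 430e6 + 280e6        # R-side + A-side raw relations, A10-170 R10-260
uniq_wide = 367e6
print(1 - uniq_wide / raw_wide)  # ~0.48 duplicate rate

cpu_wide   = 350e6               # CPU-seconds, A10-170 R10-260
cpu_narrow = 250e6               # CPU-seconds, A10-160 R10-160
extra_cpu  = cpu_wide - cpu_narrow      # ~1e8 s of extra cluster work

saved_wall = (1130 - 821) * 3600        # ~1.1e6 s saved on the linalg box
print(extra_cpu / saved_wall)           # ~90 CPU-seconds per wall-second saved
[/CODE]

So the extra sieving pays off in calendar time as long as the cluster keeps roughly a hundred or more cores busy in parallel, which is the point being made.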

Batalov 2009-05-04 10:28

Congratulations! Very impressive all around, and a very fast job for such a huge matrix!

The 96-96 split is a nice entry for a modern Kunstkammer.
(Sadly, there exists a [URL="http://hpcgi2.nifty.com/m_kamada/f/c.cgi?q=60001_198"]97-97[/URL] split.) But anyway!

-S

jasonp 2009-05-04 14:50

Finding 14 dependencies in the presence of those zero-character messages is also a relief. The other possibility was that too many quadratic characters generated these messages, so that you would get dependencies from the linear algebra but the square root would fail on all of them (or perhaps just half of them, with complaints that Newton iteration did not converge).

There's a fairly simple workaround to minimize the chance of that happening in the future, and it will become especially important now that jobs with 32-bit large primes are becoming more common.

bdodson 2009-05-04 14:57

[QUOTE=10metreh;172127]How much ECM was run? Was the P58 an ECM miss?[/QUOTE]

OK, that was the long version. Here's the short version: if I had found the p58, it would have been the 2nd largest on the current top10, after four months of global ecm effort by everyone. Factors above p57 are a gift, not a computational objective.

On Xilman/Paul's point that ecm pretesting on hard sieving candidates, with small and medium-sized factors removed, is less likely to give a top10 factor: I now have three of these candidates with smallest factors p58, p59 and p60 (as well as a bunch with smallest factor p80+). I'm still puzzled why untested numbers ought to be any more likely to give up a p62+ than one of these near-term sieving candidates. -Bruce

fivemack 2009-05-04 22:51

One of the four dependencies did give me a 'Newton iteration did not converge' message, which presumably means that about half of them would have failed and that I was lucky.

I may well not understand this correctly, but I thought the quadratic characters were there to kill off the 2-part of the unit group of the underlying number field, and that there's no reason to believe that that 2-part will be terribly large: Aoki's factorisations which say 'we found 64 dependencies and reduced by quadratic characters to 61' presumably mean that the 2-part turned out to have precisely three generators. If the groups are normally that small, I wonder if Aoki's strategy of applying the characters afterwards might not be the right way to go.
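
Aoki's strategy, as described, amounts to postprocessing: evaluate each quadratic character on each dependency, then find GF(2) combinations of dependencies on which every character vanishes. A hedged sketch of that filtering step; the character values below are made-up inputs, since in a real run they come from Legendre symbols over the relations in each dependency:

[CODE]
def filter_dependencies(dep_chars):
    """dep_chars[i] = bitmask of character values (bit set = chi is -1) for
    dependency i.  Returns bitmask combinations of dependencies on which
    every character is trivial, via Gaussian elimination over GF(2)."""
    survivors = []
    basis = {}                          # pivot bit -> (char row, combination)
    for i, chars in enumerate(dep_chars):
        combo = 1 << i
        while chars:
            pivot = chars & -chars      # lowest set bit
            if pivot not in basis:
                basis[pivot] = (chars, combo)
                break
            bchars, bcombo = basis[pivot]
            chars ^= bchars
            combo ^= bcombo
        else:
            survivors.append(combo)     # all characters cancelled
    return survivors

# Aoki's "64 dependencies reduced to 61" corresponds to a character matrix
# of rank 3.  In this toy run the rank is 2, so 4 dependencies leave 2
# character-free combinations: deps 0^1^2, and dep 3 on its own.
print([bin(c) for c in filter_dependencies([0b101, 0b011, 0b110, 0b000])])
[/CODE]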

jasonp 2009-05-05 01:09

The groups typically are that small; most of the time allocating 5 quadratic characters is enough to guarantee that the square root will work correctly, and using more than the (unknown) minimum just uses up dense matrix rows. But that requires that each quadratic character doesn't divide any of the relations, and if you can't assure that then that character is useless for guarantee purposes.

The only reasons the quadratic characters are computed at the start of the LA instead of the end are 1) the Lanczos code already has to solve a small Gauss elimination problem and that code would have to be duplicated elsewhere, and 2) the relations are already in memory when the LA starts so they don't have to be read again.

Could you print the p and r values inside the for()-loop on line 210 of gnfs/gf2.c, then exit after the loop finishes? This requires restarting the LA from scratch, but only running long enough to read the relations from disk. The fact that you got a Newton failure at all, and that the number of dependencies is approximately (expected number minus number of quadratic characters), makes me suspect that only one or two quadratic characters are valid.
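
To make the validity condition concrete, here is a minimal model in Python: a character prime q is usable only if it divides no relation value, and its value on a relation is then just a Legendre symbol. The rational-side-only simplification and the helper names are illustrative assumptions, not msieve's actual API:

[CODE]
def legendre(a, q):
    """Legendre symbol (a/q) for odd prime q, via Euler's criterion."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

def usable_character(q, relations, m):
    """Validity test, rational side: q must divide no relation value a - b*m."""
    return all((a - b * m) % q != 0 for a, b in relations)

def character_row(q, relations, m):
    """GF(2) row for character chi_q: 1 where chi_q(a - b*m) = -1."""
    return [0 if legendre(a - b * m, q) == 1 else 1 for a, b in relations]

# In practice q is chosen above the large-prime bound so it cannot divide
# any relation value; with 32-bit large primes on both sides there is no
# room left below 2^32, which is the problem being discussed here.
[/CODE]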

fivemack 2009-05-05 10:41

What do you mean by a quadratic character dividing a relation? I suppose these are quadratic characters chi_p for some rational prime p, and the concern is that p shouldn't appear on either side in any relation, which would explain why it was hard to find one after sieving with 32-bit large primes on both sides.

In which case, allowing 64-bit p and working down from 2^64 feels as if it ought to be safe for quite a while ... even Dan Bernstein doesn't propose large primes of more than 40 bits! Or is it terribly slow to compute the values of the character for large p?

jasonp 2009-05-05 14:09

64-bit p would definitely solve the problem; I'm reluctant to go that route because msieve only has optimized code to find roots of polynomials with coefficients modulo 32-bit p. The time needed to compute the characters is not a big concern.
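
Root-finding modulo p is the sticking point here; as a sketch of the underlying operation, below is a slow, pure-Python count of the roots of f mod p via gcd(x^p - x, f). Python's bignums make a 61-bit p no harder than a 32-bit one, though nothing about this reflects msieve's optimized implementation, and the helper names are mine:

[CODE]
def _trim(f, p):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def _polymod(a, m, p):
    """Remainder of a modulo m over GF(p); coefficient lists, low degree first."""
    a, m = _trim(a, p), _trim(m, p)
    inv = pow(m[-1], -1, p)
    while len(a) >= len(m):
        c, off = a[-1] * inv % p, len(a) - len(m)
        for i in range(len(m)):
            a[off + i] = (a[off + i] - c * m[i]) % p
        while a and a[-1] == 0:
            a.pop()
    return a

def _mulmod(a, b, m, p):
    r = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return _polymod(r, m, p)

def count_roots(f, p):
    """Number of distinct roots of f mod p, i.e. deg gcd(x^p - x, f)."""
    xp, base, e = [1], _polymod([0, 1], f, p), p
    while e:                             # square-and-multiply: x^p mod f
        if e & 1:
            xp = _mulmod(xp, base, f, p)
        base = _mulmod(base, base, f, p)
        e >>= 1
    xp = xp + [0] * max(0, 2 - len(xp))
    xp[1] = (xp[1] - 1) % p              # x^p - x
    a, b = _trim(xp, p), f
    while b:                             # polynomial gcd over GF(p)
        a, b = b, _polymod(a, b, p)
    return max(len(a) - 1, 0)

p = 2**61 - 1                            # a 61-bit (Mersenne) prime
print(count_roots([-2, 0, 1], p))        # x^2 - 2: prints 2, since p = 7 mod 8
[/CODE]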

R.D. Silverman 2009-06-01 12:59

6,335-
 
6,335- c170 = p83.p88

37844794094580139581697623770911579688837081742561513466850889366516267662341180891
1327309015857828899623999948822386264843491918815374735541893912578511688338311537475701
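
A quick sanity check in Python on the announced split: the factors as posted have 83 and 88 digits and their product has 170, matching the c170.

[CODE]
p83 = 37844794094580139581697623770911579688837081742561513466850889366516267662341180891
p88 = 1327309015857828899623999948822386264843491918815374735541893912578511688338311537475701
print(len(str(p83)), len(str(p88)), len(str(p83 * p88)))   # 83 88 170
[/CODE]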

