[QUOTE=frmky;172123]
Not much oversieving. I would have expected the matrix to be much bigger. Even our 2,908+ matrix is a bit bigger. And the i7 is fast! That matrix took only a month. The 2,908+ matrix should finish in a couple of weeks, and it will have taken about 3.5 months on a 2GHz Barcelona K10. Greg[/QUOTE] I'd say there was a fair amount of oversieving; initially Bruce sieved 10M-160M on both sides, getting 278146913 unique relations, and the matrix that arrived was noticeably bigger:

Tue Mar 24 21:52:20 2009  matrix is 22586885 x 22587133 (6499.2 MB) with weight 1573087910 (69.65/col)

with an ETA of about 1130 hours. There seem to be advantages in the linear algebra, as well as in sieving yield, to having a fairly large small-prime bound; 2,908+ had to deal with an enormous duplication rate to get its relations. The i7 has a very good memory controller, and I think it benefits significantly from being in a single-processor system, so there's no requirement to check ownership of cache lines with a processor not on the same piece of silicon. I am surprised to have finished before 2,908+ did.
[QUOTE=10metreh;172127]How much ECM was run? Was the P58 an ECM miss?[/QUOTE]
The number was a C249, diff 270, when Tom found it, so only 2*t50 had been done. I added 7*t50, as 11020 curves with B1 = 260M (default B2). Also, Tom reports [QUOTE]Taking out the P58 would have left a number probably slightly harder by GNFS than the SNFS was.[/QUOTE] perhaps illustrating Bob's point that these large composites aren't very good candidates for ECM factoring. My recollection (from late Jan/early Feb) is that this was the last hard number before my adjusting to the p59/p60 factors found in SNFS jobs from Greg and Tom. I'm just finishing c. 14*t50 on Serge's 2,2068M, at c268 = diff 268. Bruce
Timing and duplication estimates
I reran 0.01% of the sieving (Q = k*10^7 .. k*10^7 + 10^3) on one CPU of the i7 machine and extrapolated up (using per-ideal measures) for the yield and timings.
So I would estimate that the A 10-170, R 10-260 job produced 430 million R-side and 280 million A-side raw relations, for a duplicate rate of near enough 50% (367M unique), and took about 350 million CPU-seconds: call it a hundred thousand CPU-hours. This is about 30% longer than the C180 GNFS took last year, and rather over twice as long as 109!+1 has taken to sieve. A 10-160, R 10-160 would have been about 540 million raw relations (so a duplicate rate still essentially 50%, since 278M were unique) in about 250 million CPU-seconds, so we used about 10^8 extra CPU-seconds on the cluster to save (1130-821)*3600 ~ 10^6 seconds of real time on the linalg machine. I think the cluster's big enough that this was a saving in terms of total time.
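The arithmetic above can be checked in a few lines. This is only a back-of-envelope sketch using the figures quoted in the post (all variable names are mine, not Bruce's):

```python
# Figures quoted in the post, in raw units.
raw_full = 430e6 + 280e6            # raw relations, A 10-170 + R 10-260
unique_full = 367e6                 # unique relations after de-duplication
raw_small = 540e6                   # estimated raw relations for A/R 10-160
unique_small = 278e6                # unique relations actually obtained for 10-160

dup_rate_full = 1 - unique_full / raw_full
dup_rate_small = 1 - unique_small / raw_small

extra_sieving_s = 350e6 - 250e6     # extra CPU-seconds spent on the cluster
linalg_saved_s = (1130 - 821) * 3600  # real-time hours saved on the matrix, in seconds

print(f"duplicate rate, full job:  {dup_rate_full:.0%}")   # near enough 50%
print(f"duplicate rate, 10-160:    {dup_rate_small:.0%}")  # also near 50%
print(f"extra CPU-seconds: {extra_sieving_s:.0e}")
print(f"real seconds saved in linalg: {linalg_saved_s:.2e}")
```

So roughly 10^8 extra CPU-seconds traded for roughly 10^6 real seconds, which is the trade-off described.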
Congratulations! Very impressive all around, and a very fast job for such a huge matrix!
The 96-96 split is a nice entry for a modern Kunstkammer. (Sadly, there exists a [URL="http://hpcgi2.nifty.com/m_kamada/f/c.cgi?q=60001_198"]97-97[/URL] split.) But anyway! S
Finding 14 dependencies in the presence of those zero-character messages is also a relief. The other possibility was that too many quadratic characters generated these messages, so that you would get dependencies from the linear algebra but the square root would fail on all of them (or perhaps just half of them, with complaints that Newton iteration did not converge).
There's a fairly simple workaround to minimize the chance of that happening in the future, and it will become especially important now that jobs with 32-bit large primes are becoming more common.
[QUOTE=10metreh;172127]How much ECM was run? Was the P58 an ECM miss?[/QUOTE]
OK, that was the long version. Here's the short version: if I had found the p58, it would have been the 2nd largest on the current top-10, after four months of global ECM effort by everyone. Factors above p57 are a gift, not a computational objective. On Xilman/Paul's point that ECM pretesting on hard sieving candidates, with small- and medium-sized factors removed, is less likely to give a top-10 factor: I now have three of these candidates with smallest factors p58, p59 and p60. (As well as a bunch with smallest factor p80+.) I'm still puzzled why untested numbers ought to be any more likely to give up a p62+ than one of these near-term sieving candidates. Bruce
One of the four dependencies did give me a 'Newton iteration did not converge' message, which presumably means that half of them would have failed but that I was lucky.
I may well not understand this correctly, but I thought the quadratic characters were there to kill off the 2-part of the unit group of the underlying number field, and that there's no reason to believe that that 2-part will be terribly large: Aoki's factorisations which say 'we found 64 dependencies and reduced by quadratic characters to 61' presumably mean that the 2-part turned out to have precisely three generators. If the groups are normally that small, I wonder if Aoki's strategy of applying the characters afterwards might not be the right way to go.
The groups typically are that small; most of the time allocating 5 quadratic characters is enough to guarantee that the square root will work correctly, and using more than the (unknown) minimum just uses up dense matrix rows. But that requires that each quadratic character not divide any of the relations, and if you can't assure that then that character is useless for guarantee purposes.
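The reason a character that divides a relation is useless can be seen directly from the definition. A sketch (not msieve's code; the helper name `legendre` is mine): the quadratic character chi_p(a) is the Legendre symbol, computed by Euler's criterion as a^((p-1)/2) mod p. It takes the values ±1, which is the one bit of sign information the square root needs, only when p does not divide a; when p | a the value is 0 and the character says nothing about that relation.

```python
def legendre(a: int, p: int) -> int:
    """Quadratic character chi_p(a) via Euler's criterion, for odd prime p.
    Returns 1 (quadratic residue), -1 (non-residue), or 0 (p divides a)."""
    r = pow(a, (p - 1) // 2, p)  # r is 0, 1, or p-1
    return -1 if r == p - 1 else r

p = 101
print(legendre(4, p))      # 1: 4 is a square, the character gives a usable bit
print(legendre(7 * p, p))  # 0: p divides the value, so chi_p carries no information
```

This is why msieve must pick character primes that appear in no relation: any relation on which the character evaluates to 0 breaks the guarantee.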
The only reasons the quadratic characters are computed at the start of the LA instead of the end are 1) the Lanczos code already has to solve a small Gaussian elimination problem, and that code would have to be duplicated elsewhere, and 2) the relations are already in memory when the LA starts, so they don't have to be read again. Could you print the p and r values inside the for() loop on line 210 of gnfs/gf2.c, then exit after the loop finishes? This requires restarting the LA from scratch, but only running long enough to read the relations from disk. The fact that you got a Newton failure at all, and a number of dependencies approximately equal to (expected number minus number of quadratic characters), makes me suspect that only one or two quadratic characters are valid.
What do you mean by a quadratic character dividing a relation? I suppose these are quadratic characters chi_p for some rational prime p, and the concern is that p shouldn't appear on either side in any relation, which would explain why it was hard to find one having sieved with 32-bit primes on both sides.
In which case, allowing 64bit p and working down from 2^64 feels as if it ought to be safe for quite a while ... even Dan Bernstein doesn't propose large primes of more than 40 bits! Or is it terribly slow to compute the values of the character for large p? 
64-bit p would definitely solve the problem; I'm reluctant to go that route because msieve only has optimized code to find roots of polynomials with coefficients modulo 32-bit p. The time needed to compute the characters is not a big concern.
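The "work down from 2^64" idea is simple to sketch. This is not msieve's code; `is_prime` and `character_prime` are hypothetical helpers, and the point is only that since every prime appearing in a relation is well below 2^34, any prime near 2^64 is automatically safe, so the search terminates almost immediately:

```python
def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin for n < 2^64 using a known-sufficient base set."""
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in small:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def character_prime(relation_values, start=2**64 - 1):
    """Work downward from an odd start for a prime p dividing no relation value,
    so that chi_p is nonzero (hence usable) on every relation."""
    p = start
    while True:
        if is_prime(p) and all(v % p for v in relation_values):
            return p
        p -= 2
```

With 32-bit (or even Bernstein's hypothetical 40-bit) large primes, the `all(...)` check never fails for p near 2^64, which is the safety margin being discussed.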

6,335
6,335 c170 = p83.p88
p83 = 37844794094580139581697623770911579688837081742561513466850889366516267662341180891
p88 = 1327309015857828899623999948822386264843491918815374735541893912578511688338311537475701