#1 | fivemack | Feb 2006, Cambridge, England
For the 143-digit GNFS that finished last week, I'd estimated that I needed forty million relations, and left a computer running while I was on vacation for two weeks. The 34.87 million relations I had when I got back turned out to be enough and to spare, so I have an example of the progress of substantial oversieving for which I'd never have deliberately wasted the CPU-weeks. I wrote a script to take the first N million relations and call msieve, for N = 20..34; results are below, and I've attached the log.gz file. In the table, imax is the value of N in 'filtering ideals with norm less than N', and m2 is the two-way merge. Code:
Start   Unique  Ideals  imax      Post-singleton    Wants/mx
20M     18.502  19.536  15614051  3545677/2895023   8330430
21M     19.375  20.017  15633659  4318113/3497620   7833588
22M     20.245  20.481  15653645  5122371/4105037   7255320
23M     21.110  20.929  15673242  5930338/4691768   6603966
24M     21.971  21.362  15691904  6749413/5265956   5880948
25M     22.828  21.779  15709384  7562202/5814714   5099214
26M     23.669  22.177  15719416  8366709/6339373   4266156
27M     24.490  22.553  15719863  9148543/6829981   [too much excess] imax->14147876 9140854/7011861 [m2] 5653246/3524254 [full merge] 3013765 cycles need 2746454, weight 178617306
28M     25.306  22.917  15721138  9928856/7304823   [too much excess x2] imax->12576910 9315044/7144296 [m2] 5979328/3808580 [full merge] 3131154 cycles need 2618780, weight 170250303
29M     26.118  23.270  15723042  10703805/7762032  [too much excess x2] imax->12578433 8604979/6413104 [m2] 5762959/3571084 [full merge] 3069587 cycles need 2529284, weight 164716409
30M     26.925  23.612  15725444  11473718/8203160  [too much excess x3] imax->11007810 8751460/6589631 [m2] 5991309/3829480 [full merge] 3144340 cycles need 2441680, weight 158959924
31M     27.728  23.944  15728554  11228760/7777498  [too much excess x3] imax->11009987 8379192/6186606 [m2] 5896958/3704372 [full merge] 3120073 cycles need 2384572, weight 155323303
32M     28.526  24.267  15732212  10539043/7087062  [too much excess x3] imax->11012548 8031318/5838275 [m2] 5753304/3560261 [full merge] 3072732 cycles need 2336461, weight 151973890
33M     29.322  24.581  15736155  10038466/6585693  [too much excess x3] imax->11015308 7748092/5554555 [m2] 5649210/3455673 [full merge] 3032556 cycles need 2295873, weight 149497600
34M     30.113  24.856  15741063  9671890/6218078   [too much excess x3] imax->11018744 7509170/5314972 [m2] 5579959/3385761 [full merge] 2999124 cycles need 2261961, weight 147036684
34.87M  30.797  25.143  15745305  9434087/5979445   [too much excess x3] imax->11021713 7330944/5136209 [m2] 5524394/3329659 [full merge] 2971342 cycles need 2233859, weight 145527382
matrix is 2231774 x 2233859 with weight 211557876 (avg 94.71/col); filter, save48
matrix is 2209607 x 2209854 with weight 161820670 (avg 73.23/col); runs in 12.5 hrs on 2 cores (Core 2 @ 2400 MHz)
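As a quick sanity check, the "avg N/col" figures in the log lines above are just total matrix weight divided by column count; a minimal sketch, using only figures quoted in the log itself:

```python
# Sanity-check the "avg X/col" figures quoted in the log above:
# average nonzeros per column = total matrix weight / number of columns.

def avg_weight_per_col(total_weight, ncols):
    """Average number of nonzero entries per matrix column."""
    return total_weight / ncols

# figures from the two "matrix is ..." log lines above
print(f"{avg_weight_per_col(211557876, 2233859):.2f}")  # 94.71
print(f"{avg_weight_per_col(161820670, 2209854):.2f}")  # 73.23
```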
I would be interested to know what the relationship is between the 'weight of N cycles' line and the weight of the final matrix; obviously I didn't want to run nine redundant huge linear-algebra jobs, and -nc1 doesn't build the matrix whilst -nc2 builds it and runs it. N is the size of the matrix and it's roughly square ... would it be possible to add a -nc1.5 which builds the matrix, tells you how big it is, and stops?

I was getting about a million relations per CPU-day on this project, so I wonder whether it's actually a large enough job that sieving beyond the first point at which a matrix could be built would be sensible. Probably not: a 2.7M matrix would, all else being equal, take 1.5 times as long to run as a 2.2M one, so I've saved six CPU-hours by spending a CPU-week.

Can anyone think of anything else useful I can do with this dataset? It's about 2.3G gzipped, and I'm used to posting DVDs by now.

Last fiddled with by fivemack on 2007-09-28 at 16:57 Reason: explanation
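The back-of-envelope trade-off above can be written down explicitly. This is only a sketch, assuming (as the estimate does) that linear-algebra time scales roughly with the square of the matrix dimension; all input figures come from the post itself:

```python
# Rough sieving-vs-linear-algebra trade-off, assuming LA time grows
# roughly as the square of the matrix dimension (the approximation
# used in the post; real block-Lanczos cost also depends on weight).

def la_time_ratio(n_small, n_large):
    """Approximate ratio of LA runtimes for two matrix dimensions."""
    return (n_large / n_small) ** 2

base_hours = 12.5                      # 2.2M matrix on 2 cores, per the log
ratio = la_time_ratio(2.2e6, 2.7e6)    # ~1.51x
saved = base_hours * (ratio - 1)       # ~6.3 wall-clock hours

print(f"a 2.7M matrix takes about {ratio:.2f}x the time of a 2.2M one")
print(f"oversieving saved roughly {saved:.1f} hours of linear algebra")
```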
#2 | jasonp (Tribal Bullet) | Oct 2004
I'll have to look more carefully at your numbers. To build but not solve the matrix, you can just add a call to exit(-1) around line 592 of gnfs/gf2.c and recompile.
The progression of percentage excess is the same for many different runs because clique processing reduces excess above 70% down to exactly 70%, and beyond that point different factorizations have the same filtering behavior. Right now, if msieve is doing the sieving it pushes for 70% excess or more, whereas if you're just completing a factorization it wants at least 8% excess.

When the filtering completes, the cycle weight reported is a (fairly accurate) approximation; the figure does not account for ideals below 100, or for incidental cancellation among the ideals that were not tracked. Experience shows that when the matrix is built, the average number of nonzeros per column is about 6 higher than this figure. Counts of nonzeros reported in the linear algebra are exact.

I suppose oversieving is useful if the matrix threatens not to fit in memory, but for big jobs it's pretty clear you should just go ahead with the postprocessing at the first opportunity. No need to mail a DVD this time; I've already taken too much of your beer money.

Last fiddled with by jasonp on 2007-09-28 at 18:35
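To make the excess bookkeeping concrete, here is a minimal sketch, assuming "excess" means relations minus ideals as a fraction of ideals (my reading of the discussion, not msieve's source). The sample counts are the post-singleton pair from the 34.87M row of the table in post #1:

```python
# Hedged illustration of "percentage excess", assuming
#   excess = (relations - ideals) / ideals.
# Thresholds quoted above: clique processing trims excess above 70%
# back to 70%; plain filtering only insists on at least 8%.

def excess_pct(relations, ideals):
    """Relation excess over ideals, as a percentage."""
    return 100.0 * (relations - ideals) / ideals

# post-singleton counts from the 34.87M row of the table in post #1
rels, ideals = 9434087, 5979445
print(f"{excess_pct(rels, ideals):.1f}% excess")  # 57.8% excess
```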
Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post
"lanczos error: only trivial dependencies found" with massive oversieving | eigma | Msieve | 21 | 2015-05-28 03:27
Oversieving | chris2be8 | Msieve | 7 | 2010-03-13 21:51
Oversieving + estimate of relations needed | RedGolpe | Msieve | 10 | 2009-06-15 15:32
v1.40 patch for massive NFS oversieving | jasonp | Msieve | 18 | 2009-04-09 03:20
Extreme oversieving and msieve | jbristow | Msieve | 8 | 2008-01-02 21:43