mersenneforum.org > Other Stuff > Archived Projects > NFSNET Discussion
2005-09-15, 23:44   #1
R.D. Silverman

Filtering Phenomenon

I just finished 2,993+ and started the filtering for 2,1322M.

With the same size factor bases, I had slightly more relations for
2,1322M than 2,993+ and thought I was done.

However...

My first filter pass was done with mergelevel 2 on the raw data.
It produced 5166697 equations on 1805419 ideals with filtmin set
to 18.8 million. 18.8 million is the factor base bound. This corresponds to
1.2 million primes in each factor base. Theoretically, therefore, I require
at least 1805419 + 2400000 = 4205419 equations to build the matrix. I have about
961K more equations than I need. This should allow me to squeeze down
the matrix.
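The arithmetic can be checked directly (figures taken from the post; a sketch, not actual filter output):

```python
# Figures reported for the first filter pass (mergelevel 2, filtmin = FBB).
equations = 5_166_697      # relation-sets produced
ideals    = 1_805_419      # ideals remaining above filtmin
fb_primes = 2 * 1_200_000  # ~1.2M primes in each of the two factor bases

# To build a matrix we need at least one equation per ideal that can
# still appear: the surviving ideals plus both full factor bases.
needed = ideals + fb_primes
excess = equations - needed
print(needed)  # 4205419
print(excess)  # 961278
```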

I then ran another filter pass to remove any duplicates, and it removed about
78,000. So far, so good.

I then fully factored all relations and ran filter again. This time with
filtmin set to 250,000. The result was 2.554 million equations on 6.486
million ideals.

HUH???

The first filter pass *seemed* to have excess relations. Yet when
I reduced filtmin from the factor base bound to 250K, the number of ideals
increased by 4.68 million!! Yet there are only 2.4 million ideals in the two
factor bases combined, so the number of ideals should go up by AT MOST 2.4 million.

It is clear that I don't have enough relations and need to do some more
sieving. But how can a surplus of 900K relations turn into a deficit of
almost 4 million? Especially when the final filter step only discarded 75K
heavy relations???

Something doesn't add up.
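The puzzling second-pass numbers, laid out the same way (again just checking the post's figures):

```python
# Second pass, fully factored relations, filtmin = 250,000.
equations = 2_554_000
ideals    = 6_486_000

# The ideal count jumped far beyond what the factor bases alone could add,
increase = ideals - 1_805_419   # vs. the first pass: 4,680,581 (~4.68M)
# and the earlier surplus became a deficit of nearly 4 million.
deficit  = ideals - equations   # 3,932,000
print(increase, deficit)
```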
2005-09-16, 01:36   #2
dleclair

When mergelevel is 0 or 1, the -regroup flag is automatically assumed.

So when you ran the pass to remove duplicates (-mergelevel 0) it destroyed all of your relationsets that were made in the previous pass with mergelevel 2.

Each relationset was turned into a set of distinct relations again and I believe this is what left you with the massive increase in unbalanced ideals.
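A toy illustration (not the actual filter code) of why regrouping inflates the ideal count: when relations sharing a large ideal are merged into one relationset, that ideal cancels modulo 2; splitting the set back into distinct relations makes it count again. The relation and ideal names here are invented.

```python
from collections import Counter

def active_ideals(relation_sets):
    """Ideals appearing an odd number of times in some relation-set;
    only these survive into the matrix."""
    active = set()
    for rs in relation_sets:
        counts = Counter(ideal for rel in rs for ideal in rel)
        active |= {i for i, c in counts.items() if c % 2 == 1}
    return active

r1 = ["p_small", "q_large"]   # two hypothetical relations sharing
r2 = ["p_other", "q_large"]   # the large ideal q_large

merged    = [[r1, r2]]        # after a mergelevel-2 pass
regrouped = [[r1], [r2]]      # after -regroup splits the set apart

print(len(active_ideals(merged)))     # 2: q_large has cancelled
print(len(active_ideals(regrouped)))  # 3: q_large reappears
```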

Start over and eliminate duplicates before you do anything else.

-Don Leclair
2005-09-16, 04:00   #3
xilman

Quote:
Originally Posted by dleclair
When mergelevel is 0 or 1, the -regroup flag is automatically assumed.

So when you ran the pass to remove duplicates (-mergelevel 0) it destroyed all of your relationsets that were made in the previous pass with mergelevel 2.

Each relationset was turned into a set of distinct relations again and I believe this is what left you with the massive increase in unbalanced ideals.

Start over and eliminate duplicates before you do anything else.

-Don Leclair
I doubt this is the problem. True duplicates are removed at all mergelevels, I believe. The dups you removed at the second stage are those for which the same relation appears in two or more relationsets. You are, of course, correct that -regroup is set implicitly if mergelevel is 0 or 1 and that relationsets are destroyed. Experience shows that rebuilding them can be quite tricky, and this may be what Bob is seeing.

Here is my recipe for filtering. I don't claim it is optimal, but it has served me well since the M811 debacle. FBB is the factor base bound, 18.8M in the present case.

Step 0: mergelevel=0, filtmin=FBB (only really needs to be done if there is a significant number of duplicates in the raw data, most often from lattice sievers)

Step 1: mergelevel=1, filtmin=FBB, -keep=k*pi(FBB). (This does clique removal. k is a parameter in the range 2.5 through 3.0 and controls how many heavy-weight relationsets may be discarded later. I experiment with this, but values around 2.8 seem to work well.)

Step 2: Refactor completely and build a matrix. The matrix itself is discarded but the logging output gives a precise value for the true excess; call this value XS.

Step 3: mergelevel=10 (or so; I experiment a bit with values 8-12), -maxpass=40, -maxdiscard=XS-2500, -maxrels=10 (or so. Lower it if too few relationsets are discarded; raise it if maxdiscard is reached quite early while there are still plenty of postponed merges. A value between 7 and 12 is almost always about right.)

Step 4: build the production matrix.

The only occasion when I ever use mergelevel=2 these days is when there are so many relations that the duplicate and singleton hashtables won't fit into memory with mergelevel=1 and my "quick and dirty singleton remover" program doesn't do a good enough job. In this case a -mergelevel=2, -maxpass=2, -maxrels=50 (not a typo, a genuinely high value) pass strips out singletons and a few of the very largest cliques. I then continue from step 1 above with the output of this step.
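The recipe above, sketched as concrete command lines. The `filter` binary name and exact flag syntax are assumptions; the flag names and values follow the steps in the post. pi(FBB) is estimated here with the crude x/(ln x − 1) approximation, and XS is a placeholder for the true excess read from the Step 2 log.

```python
from math import log

def pi_estimate(x):
    """Crude x/(ln x - 1) approximation to the prime-counting function."""
    return int(x / (log(x) - 1))

FBB = 18_800_000   # factor base bound from the post
k   = 2.8          # clique-removal parameter, 2.5..3.0
XS  = 961_278      # placeholder: true excess taken from the Step 2 log

# pi_estimate(FBB) comes out near 1.2M, matching the post's
# "1.2 million primes in each factor base".
keep = int(k * pi_estimate(FBB))

steps = [
    f"filter -mergelevel 0 -filtmin {FBB}",               # Step 0: duplicates
    f"filter -mergelevel 1 -filtmin {FBB} -keep {keep}",  # Step 1: cliques
    "refactor, build a throwaway matrix, read XS from the log",  # Step 2
    f"filter -mergelevel 10 -maxpass 40 -maxdiscard {XS - 2500} -maxrels 10",  # Step 3
    "build the production matrix",                        # Step 4
]
for s in steps:
    print(s)
```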

Hope this helps. If anyone has alternative recipes, please let me know.


Paul


P.S. Bystanders will have discovered by now that getting good matrices from the filtering stage of NFS is more of an art, or perhaps a craft, than a rigorously defined algorithm.
 
