mersenneforum.org Filtering Phenomenon

2005-09-15, 23:44   #1
R.D. Silverman
Nov 2003

Filtering Phenomenon

I just finished 2,993+ and started the filtering for 2,1322M. With the same size factor bases, I had slightly more relations for 2,1322M than for 2,993+ and thought I was done. However...

My first filter pass was done with mergelevel 2 on the raw data. It produced 5166697 equations on 1805419 ideals with filtmin set to 18.8 million. 18.8 million is the factor base bound, which corresponds to 1.2 million primes in each factor base. Theoretically, therefore, I require at least 1805419 + 2400000 equations to build the matrix. I have about 916K more equations than I need. This should allow me to squeeze down the matrix.

I then ran another filter pass to remove any duplicates, and it removed about 78,000. So far, so good. I then fully factored all relations and ran filter again, this time with filtmin set to 250,000. The result was 2.554 million equations on 6.486 million ideals. HUH???

The first filter pass *seemed* to have excess relations. Yet when I reduced filtmin from the factor base bound to 250K, the number of ideals increased by 4.68 million!! There are only 2.4 million ideals in the factor base, so the number of ideals should go up by AT MOST 2.4 million.

It is clear that I don't have enough relations and need to do some more sieving. But how can a surplus of 900K relations turn into a deficit of almost 4 million? Especially when the final filter step only discarded 75K heavy relations??? Something doesn't add up.
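[For bystanders following along, the bookkeeping in the post can be checked with a few lines of arithmetic. This is only a sketch of the counting rule implied above — every ideal below filtmin is hidden from the reported count but can still occur in relations, so the matrix needs at least (reported ideals + primes below filtmin) equations. All figures are the ones quoted in the post.]

```python
# Figures reported for 2,1322M after the mergelevel-2 pass:
equations       = 5_166_697      # relation-sets
ideals_reported = 1_805_419      # ideals above filtmin = 18.8M (the FB bound)
fb_primes       = 2 * 1_200_000  # ~1.2M primes in each of the two factor bases

# Ideals below filtmin are uncounted but still present, hence:
needed  = ideals_reported + fb_primes
surplus = equations - needed
print(needed, surplus)           # 4205419 961278  -- the "about 916K" surplus

# After fully refactoring and dropping filtmin to 250K:
equations2 = 2_554_000
ideals2    = 6_486_000
print(ideals2 - ideals_reported) # 4680581 -- the 4.68M jump in ideals
print(ideals2 - equations2)      # 3932000 -- the "almost 4 million" deficit
```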
2005-09-16, 01:36   #2
dleclair
Mar 2003

When mergelevel is 0 or 1, the -regroup flag is automatically assumed. So when you ran the pass to remove duplicates (-mergelevel 0), it destroyed all of your relationsets that were made in the previous pass with mergelevel 2. Each relationset was turned into a set of distinct relations again, and I believe this is what left you with the massive increase in unbalanced ideals.

Start over and eliminate duplicates before you do anything else.

-Don Leclair
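[A toy illustration of the effect Don describes — not the real filter code. Relations are modelled as sets of ideal labels, and a merge as the mod-2 combination (symmetric difference) that cancels an ideal shared by exactly two relations; the labels are made up.]

```python
# Two relations sharing a large ideal p. A mergelevel-2 merge combines
# them, and p (appearing an even number of times) cancels out:
r1 = {"p", "q1"}
r2 = {"p", "q2"}

merged = r1 ^ r2                 # the relationset: p has vanished
print(sorted(merged))            # ['q1', 'q2'] -- 2 ideals counted

# -regroup (implied by mergelevel 0 or 1) breaks the relationset back
# into its constituent relations, so p reappears in the ideal count:
regrouped_ideals = r1 | r2
print(sorted(regrouped_ideals))  # ['p', 'q1', 'q2'] -- 3 ideals again
```

Scaled up to millions of merges, undoing the relationsets resurrects every ideal the merges had cancelled, which is one way a reported surplus can flip to a large deficit.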
2005-09-16, 04:00   #3
xilman
Bamboozled!

May 2003
Down not across
Quote:
 Originally Posted by dleclair
 When mergelevel is 0 or 1, the -regroup flag is automatically assumed. [...] Start over and eliminate duplicates before you do anything else.
I doubt this is the problem. True duplicates are removed at all mergelevels, I believe. The dups you removed at the second stage are those for which the same relation appears in two or more relationsets. You are, of course, correct that -regroup is set implicitly if mergelevel is 0 or 1 and that relationsets are destroyed. Experience shows that rebuilding them again can be quite tricky and this may be what Bob is seeing.

Here is my recipe for filtering. I don't claim it is optimal, but it has served me well since the M811 debacle. FBB is the factor base bound, 18.8M in the present case.

Step 0: mergelevel=0, filtmin=FBB (only really needs to be done if there is a significant number of duplicates in the raw data, most often from lattice sievers)

Step 1: mergelevel=1, filtmin=FBB, -keep=k*pi(FBB) (This does clique removal. k is a parameter in the range 2.5 through 3.0 and controls how many heavy-weight relationsets may be discarded later. I experiment with this, but values around 2.8 seem to work well.)

Step 2: Refactor completely and build a matrix. The matrix itself is discarded but the logging output gives a precise value for the true excess; call this value XS.

Step 3: mergelevel=10 (or so; I experiment a bit with values 8-12), -maxpass=40, -maxdiscard=XS-2500, -maxrels=10 (or so. Lower it if too few relationsets are discarded; raise it if maxdiscard is reached quite early while there are still plenty of postponed merges. A value between 7 and 12 is almost always about right.)

Step 4: build the production matrix.

The only occasion when I ever use mergelevel=2 these days is when there are so many relations that the duplicate and singleton hashtables won't fit into memory with mergelevel=1 and my "quick and dirty singleton remover" program doesn't do a good enough job. In this case a -mergelevel=2, -maxpass=2, -maxrels=50 (not a typo, a genuinely high value) strips out singletons and a few of the very largest cliques. I then continue from step 1 above with the output of this step.
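[The recipe above, written out as the sequence of flag sets it implies. A sketch only: the program name `filter` and the exact flag spellings beyond those mentioned in the recipe are assumptions, pi(FBB) is approximated by x/ln x rather than computed exactly, and XS is a placeholder — in practice it is read off the step-2 log.]

```python
import math

FBB = 18_800_000                      # factor base bound from the post
k = 2.8                               # clique-removal parameter, range 2.5-3.0
pi_fbb = int(FBB / math.log(FBB))     # rough prime count below FBB (~1.1M)
XS = 961_278                          # placeholder for the true excess from step 2

steps = [
    # Step 0: remove duplicates in the raw data
    ["-mergelevel", "0", "-filtmin", str(FBB)],
    # Step 1: singleton/clique removal, keeping k*pi(FBB) relationsets
    ["-mergelevel", "1", "-filtmin", str(FBB), "-keep", str(int(k * pi_fbb))],
    # Step 2 (not a filter run): refactor fully, build a throwaway matrix,
    # and read the true excess XS from the logging output.
    # Step 3: the real merge pass
    ["-mergelevel", "10", "-maxpass", "40",
     "-maxdiscard", str(XS - 2500), "-maxrels", "10"],
]
for flags in steps:
    print("filter", " ".join(flags))
```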

Hope this helps. If anyone has alternative recipes, please let me know.

Paul

P.S. Bystanders will have discovered by now that getting good matrices from the filtering stage of NFS is more of an art, or perhaps a craft, than a rigorously defined algorithm.

