mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Msieve

Reply
 
Thread Tools
Old 2007-04-26, 20:00   #23
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3,541 Posts
Default

Quote:
Originally Posted by fivemack View Post
filtering wants 24671685 more relations

When it says 'filtering wants N more relations', presumably that's N more relations added to msieve.dat (in which case, since I get about 120k relations per CPU-hour, that's both CPUs of kolmogorov running until Tuesday) rather than N more relations after filtering; it seems a very precise request!
You're correct; the code wants the amount of excess to be at least 2.2x the excess needed to construct a matrix at all. If it doesn't get that much, it will still try to create a matrix unless the amount of excess is less than 1.2x the minimum, which your dataset doesn't appear to have. The estimate of relations needed is just a fixed multiple of the number needed to get the 2.2x figure. I agree it's a bit lame.

Based on the number of singleton pruning passes, you could benefit from an extra day spent sieving. The rule is that the number of pruning passes shoots up just before an explosion in the number of cycles occurs, then drops down to a smaller number of passes afterwards. It looks like your dataset is in the shooting-up phase, I suspect that once it goes down to maybe 10 filtering passes you'll have a good chance of building a decent matrix.

jasonp
jasonp is offline   Reply With Quote
Old 2007-04-27, 15:46   #24
Andi_HB
 
Andi_HB's Avatar
 
Mar 2007
Germany

23·3·11 Posts
Default

Hi @ all (and sorry for my bad English)
I tested msieve 1.19 with the 100 digit RSA Number and now have a question about the relations.
I started first msieve and get 8 relations for a b.
I started second time msieve with the same b and get for the same b 10 relations!
Is it normal that i get different relation?
The tested Number was the same again and the b was same again too!
It wasnt the last b !
Why the first run dont get 10 relations like the second run?
The relations are different relations and not double relations!
Greats Andi_HB
Andi_HB is offline   Reply With Quote
Old 2007-04-27, 19:33   #25
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

641910 Posts
Default

I am still a little confused about the 'wants more relations' figures.

I have been running msieve incrementally on a load of relations for a C123 that I'm simultaneously running in ggnfs;
Code:
800kQ : 6139265 relations, wants 6317355 more
900kQ:  6915691 relations, wants 6218106 more
1000kQ: 7684796 relations, wants 6099267 more
so the total number of relations that it wants seems to be going up very quickly as I give it more relations.

For a C108 that ran to completion,
Code:
3M relations -> wants 2144766 more
4M relations -> wants 1748337 more
4.5M relations -> wants 1390539 more
4.9M relations -> happily splits the number
fivemack is offline   Reply With Quote
Old 2007-04-27, 21:00   #26
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3,541 Posts
Default

Quote:
Originally Posted by fivemack View Post
I am still a little confused about the 'wants more relations' figures.

so the total number of relations that it wants seems to be going up very quickly as I give it more relations.
It should depend to some extent on the size of large primes. For a C122 contributed by Greg, 13.5 million relations were plenty. See this post for a table with the number of cycles of a QS run using triple large primes. Unfortunately, you need to get something like 80-90% of the total relations before a cycle explosion is supposed to occur, and so for the majority of the run it looks like you're never going to finish.

The 'wants more relations' bit would be accurate if there was no explosion, so the expectation is that you do the filtering once when it has no hope of succeeding, then much later to provide a final correction, then a little bit later to complete the run. It works for 100-digit inputs, but I'd never tried it for larger stuff. Of course all of this can be tuned.

One other thing: in gnfs/filter/filter.c, there's a variable 'max_relset_size' that you should probably increase, once the filtering actually has enough excess to proceed. Just make it 32 for now; the next version will tune it a bit better.

Last fiddled with by jasonp on 2007-04-27 at 21:29
jasonp is offline   Reply With Quote
Old 2007-05-01, 15:37   #27
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3,541 Posts
Default

Quote:
Originally Posted by Andi_HB View Post
Hi @ all (and sorry for my bad English)
I tested msieve 1.19 with the 100 digit RSA Number and now have a question about the relations.
I started first msieve and get 8 relations for a b.
I started second time msieve with the same b and get for the same b 10 relations!
Is it normal that i get different relation?
The tested Number was the same again and the b was same again too!
It wasnt the last b !
Why the first run dont get 10 relations like the second run?
The relations are different relations and not double relations!
Greats Andi_HB
Sorry for the delay, I just saw your question. It does sound a little strange; msieve versions prior to 1.19 had a bug that could have prevented some relations from being found. Also, if you sieved a range of b values and restarted from the last b in the range, the initializing of logarithms is different for that b value and msieve is slightly more sensitive. The goal is never to find all the relations that are there, but to collect a total of X relations in minimum time. If by making the sieving more sensitive you slow it down too much, then it's actually better to give up finding some relations.
jasonp is offline   Reply With Quote
Old 2007-05-03, 22:54   #28
smh
 
smh's Avatar
 
"Sander"
Oct 2002
52.345322,5.52471

29·41 Posts
Default

Hi Jason,

I gave 1.20 a try on a small (diff. ~110) number sieved with GGNFS.

After sieving GGNFS reports:

Code:
There are 144488 relations with 0 large primes.
And after reducing the weight of relations sets:
Code:
rels:1829859, initialFF:0, finalFF:131544
This is plenty, only 121498 needed (including 7% extraFF as set in factlat.pl). After solving the matrix and a couple of sqrt runs, it completes the factorization.

I moved the relations to msieve, which tells me:

Code:
D:\Factor\msieve\msieve 1.20>msieve -nc -v


Msieve v. 1.20
Fri May 04 00:37:47 2007
random seeds: 11bcb300 7d8f5da1
factoring 484789765067427158969221495354826110066146756804809774358679657732931097184707015431198995
0539483602079419687 (109 digits)
commencing number field sieve (109-digit input)
R0:  61705153508638067785728
R1: -1
A0:  21
A1:  0
A2:  0
A3:  0
A4:  0
A5:  8
size score = 5.767537e-008, Murphy alpha = 0.976670, combined = 4.164884e-008
restarting with 1878981 relations
generating factor base
factor base: found 40001 rational and 40468 algebraic entries
factor base complete:
49098 rational roots (max prime = 599999)
64388 algebraic roots (max prime = 799999)
added 3280 free relations

commencing relation filtering
commencing duplicate removal, pass 1
found 55340 hash collisions in 1882261 relations
commencing duplicate removal, pass 2
found 49122 duplicates and 1833139 unique relations
memory use: 36.9 MB
ignoring smallest 96368 rational and 96826 algebraic ideals
filtering rational ideals above 1248657
filtering algebraic ideals above 1248657
need 425026 more relations than ideals
commencing singleton removal, pass 1
relations with 0 large ideals: 72943
relations with 1 large ideals: 416795
relations with 2 large ideals: 780591
relations with 3 large ideals: 478448
relations with 4 large ideals: 84362
relations with 5 large ideals: 0
relations with 6 large ideals: 0
relations with 7+ large ideals: 0
1833139 relations and about 1795702 large ideals
commencing singleton removal, pass 2
found 938684 singletons
current dataset: 894455 relations and about 670312 large ideals
commencing singleton removal, pass 3
found 175532 singletons
current dataset: 718923 relations and about 481475 large ideals
commencing singleton removal, pass 4
found 49964 singletons
current dataset: 668959 relations and about 430171 large ideals
commencing singleton removal, final pass
memory use: 25.8 MB
commencing in-memory singleton removal
begin with 668959 relations and 442264 unique ideals
reduce to 612390 relations and 384879 ideals in 11 passes
max relations containing the same ideal: 14
not enough excess, attempting to create matrix anyway
filtering wants 592545 more relations
c109 factor: 484789765067427158969221495354826110066146756804809774358679657732931097184707015431198
9950539483602079419687
elapsed time 00:01:19
Why does msieve need so many more relations?

Last fiddled with by smh on 2007-05-03 at 22:55
smh is offline   Reply With Quote
Old 2007-05-03, 23:25   #29
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

144238 Posts
Default

Thanks for your work!

It has not exploded when processing 66.9 million relations on an SNFS200 (most special-Q from 5M to 10.5M and a few from 10.5M to 11M); took about ten minutes for duplicate removal to 62.9M unique relations, then about an hour to get to a matrix.

Pruning proceeded surprisingly fast, then
Code:
Thu May  3 20:40:02 2007  commencing linear algebra
Thu May  3 20:40:04 2007  factor base loaded:
Thu May  3 20:40:05 2007  1857859 rational ideals (max prime = 29999999)
Thu May  3 20:40:05 2007  1858933 algebraic ideals (max prime = 29999999)
Thu May  3 20:42:30 2007  read 5184274 cycles
Thu May  3 20:42:54 2007  cycles contain 11534332 unique relations
Thu May  3 20:45:18 2007  read 11534332 relations
Thu May  3 20:45:38 2007  using 32 quadratic characters above 1073721890
Thu May  3 20:51:41 2007  matrix is 5059837 x 5184274 with weight 323299158 (avg 62.36/col)
Thu May  3 20:52:54 2007  filtering completed in 5 passes
Thu May  3 20:52:56 2007  matrix is 4669844 x 4669924 with weight 279049524 (avg 59.75/col)
Thu May  3 20:53:39 2007  saving the first 48 matrix rows for later
Thu May  3 20:53:43 2007  matrix is 4669796 x 4669924 with weight 181181856 (avg 38.80/col)
Thu May  3 20:53:44 2007  matrix includes 64 packed rows
Thu May  3 20:53:44 2007  using block size 65536 for processor cache size 4096 kB
The linalg has been running for just over three hours and has done 180,000 out of 4669924 dimensions, so that suggests it should complete in about four days (say 1k dimensions per minute): does it do checkpoints? I suspect I've already passed the point where going away and doing two CPU-days more sieving will save even one day on the linalg. Rsize is 1.2GB, Vsize 2.6GB; this is a big job, but something kolmogorov can clearly cope with. Is it likely at any future stage to want more than 4GB working-set or 16GB total?

Do you see parallelised linear algebra (multiple threads on a single machine) as a possibility in the medium future?
fivemack is offline   Reply With Quote
Old 2007-05-03, 23:27   #30
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

72·131 Posts
Default

Quote:
Originally Posted by smh View Post
Why does msieve need so many more relations?
It doesn't need as many more relations as it says it does; I believe it's estimating the number of relations it needs as if there were no great explosion of relations as the number of relations starts to exceed the number of unique primes.

See my #83 in this thread for a case where it says it needs 1.4 million relations and manages total success with 0.4 million.
fivemack is offline   Reply With Quote
Old 2007-05-03, 23:47   #31
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3,541 Posts
Default

Quote:
Originally Posted by smh View Post
Why does msieve need so many more relations?
As Tom mentioned, this is an estimate of the number of additional relations you would need to manage a comfortable amount of matrix excess, so that the filtering can produce a really good matrix. You can certainly create a matrix with the relations that you have, but it will be uncomfortably dense and the linear algebra will take longer than it has to. The current code will try to build a matrix if you have more than 20% excess relations, so with only 7% extra it gives up trying. Maybe 20% is excessive (it prefers 120% extra to make the filtering more efficient), I have noticed that the filtering creates many more cycles than it has to.

Tom, great to hear that things seemed to go okay. Could you PM me the logfile? Do you know what kind of peak memory use the process had? I think you oversieved by quite a bit, judging by how sparse the initial matrix was and how many empty rows got removed. The code tries for ~70 entries per column, so the matrix could have ended up smaller than it did. 4 days sounds about right for the solve time, there's no checkpoint facility so you'll have to cross your fingers. I'm definitely planning to multithread the linear algebra in the near future, given that people are lining up to run big jobs :)

Paul mentioned in other threads that NFSNET matrices for jobs somewhat bigger than this need 2GB or so, and it looks like that's a good upper limit on the size of jobs you'd want to solve on one machine. Hopefully the matrix sizes and density are comparable to what the CWI suite produces, and I haven't tuned things at all so perhaps efficiency will improve with experience.

jasonp

Last fiddled with by jasonp on 2007-05-04 at 00:00
jasonp is offline   Reply With Quote
Old 2007-05-04, 00:33   #32
frmky
 
frmky's Avatar
 
Jul 2003
So Cal

210610 Posts
Default

Quote:
Originally Posted by smh View Post
Why does msieve need so many more relations?
If you want to force it to complete, you can lower the limit by lowering the 1.20 on line 206 (or near there) of gnfs/filter/filter.c. I lowered it to 1.15 before compiling to allow it to continue with 15% excess, but beyond that it's probably better to just do a little more sieving.

I also added the line
Code:
 printf("excess/fb_size = %g\n",(double)excess/fb_size);
just before that if block so if there's not yet enough relations, I can tell at a glance how close it's getting to being able to continue. Jason, would you mind adding that (using logprintf() of course!) to the official source?

Quote:
Originally Posted by jasonp View Post
I'm definitely planning to multithread the linear algebra in the near future, given that people are lining up to run big jobs :)
Really??

There is a lot of pent-up demand, as before this there hasn't been any way to reliably complete these large factorizations. Now, with Franke's poly selection & lattice sieve from GGNFS and msieve's postprocessing, a complete, reliable, fast, open-source end-to-end tool is available.

Thanks!
Greg
frmky is offline   Reply With Quote
Old 2007-05-04, 08:15   #33
smh
 
smh's Avatar
 
"Sander"
Oct 2002
52.345322,5.52471

29×41 Posts
Default

Quote:
Originally Posted by frmky View Post
If you want to force it to complete, you can lower the limit by lowering the 1.20 on line 206 (or near there) of gnfs/filter/filter.c. I lowered it to 1.15 before compiling to allow it to continue with 15% excess, but beyond that it's probably better to just do a little more sieving.
I figured that 16% more FF after reducing the weight would be sufficient.

I sieved another 5K specialQ (160 seconds) and msieve completed the factorization.

I guess just bad luck that i didn't have enough relations the first time i tried.

I'll keep the 20% in mind.
smh is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
error when running msieve 1.53 with cuda aein Msieve 9 2019-02-25 14:09
Help need to running Msieve appleseed Msieve 12 2016-04-10 02:31
Problem in running msieve with CUDA mohamed Msieve 20 2013-08-01 08:27
CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES when running msieve 1.5.0 with CUDA ryanp Msieve 3 2012-06-12 03:27
Trouble Running Msieve Sab Msieve 4 2009-07-07 06:19

All times are UTC. The time now is 01:32.


Sat Jul 17 01:32:25 UTC 2021 up 49 days, 23:19, 1 user, load averages: 1.55, 1.40, 1.28

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.