mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2011-04-16, 07:20   #551
jrk
 
jrk's Avatar
 
May 2008

3×5×73 Posts
Default

Quote:
Originally Posted by jrk View Post
The pol51opt poly you posted is really strange, and is probably a bug.
Quite a rare bug! Turns out that the a[4] coefficient being zero in your polynomial caused the skewness to be infinite, and this eventually breaks pol51opt. Thanks for your report.
jrk is offline   Reply With Quote
Old 2011-04-16, 14:30   #552
chris2be8
 
chris2be8's Avatar
 
Sep 2009

24·131 Posts
Default

Quote:
Originally Posted by biwema View Post
When changing the configuration of the perl file to run GGNFS on more than one core, only the lattice sieving part has more threads. While poly search, relation filtering, singleton removal and post processing only one thread is running. Is it here possible to do automatically split up poly search intro more threads? Is it possible that the remaining threads are continuing lattice sieving while one thread is doing relation filtering, duplicate removal and singleton removal to find out if there are enough relations? Maybe that part is unwieldy to implement.
If you use the script from http://mersenneforum.org/showthread.php?t=14425&page=3 it will use multiple threads for polynomial selection, as well as sieving and linear algebra. And it uses msieve for polynomial selection which generates better polynomials that pol51opt.

The filtering, singleton removal and square root stages are still single threaded. I could make the square root stages multi-threaded, but that's quite a lot of work for saving less that 1% runtime.

I'll think about having the remaining threads continuing lattice sieving while one thread is doing relation filtering, duplicate removal and singleton removal to find out if there are enough relations. But it needs some free time to think in. Thanks for the idea though.

Chris K
chris2be8 is offline   Reply With Quote
Old 2011-04-16, 19:59   #553
chris2be8
 
chris2be8's Avatar
 
Sep 2009

24×131 Posts
Default

1 thought, filtering can use quite a lot of memory. Running several sieves in parallel with filtering could cause the box to start paging badly. So it would have to be an option that could be switched off.

How does msieve decide how much memory to use for filtering?

Another point is that I would need to stop any sieves still running after filtering before I start linear algebra so that can use all available CPUs. I can kill them under Linux but do signals work under Windows?

Chris K
chris2be8 is offline   Reply With Quote
Old 2011-04-16, 21:42   #554
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3×1,181 Posts
Default

If the size of the intermediate files generated is more than half of the memory in the machine (or the amount of memory specified), filtering uses different techniques that rely on disk files more. But beyond a certain point the filtering will not try to use less memory, even if you tell it to.
jasonp is offline   Reply With Quote
Old 2011-04-17, 19:35   #555
biwema
 
biwema's Avatar
 
Mar 2004

38110 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
I'll think about having the remaining threads continuing lattice sieving while one thread is doing relation filtering, duplicate removal and singleton removal to find out if there are enough relations. But it needs some free time to think in. Thanks for the idea though.
I use the script which I got from the aliquot tutorial (how to set up aliquiet, yafu, msieve etc.). [ I am running windows xp, so I use cygwin for running GGNFS. Until now I did not manage to automatically connect it to aliquiet. It just generate a directory with the name ggnfs_nnnnnnnnn(the candidate number) and puts a file test.n in it (containing the candidate number). After unsuccessfully starting GGNFS, it is running auto increasing ECM curves. It this point I take the candidate and manually start GGNFS. (just a small information on the side)]

I guess, these script which I use is not optimal. If the number is 108 less, the initial estimate is about 1430000 relations. If the number is between 109 and 132 digits, the estimate is about 2860000 relations. If 133 digits and more, the initial estimate is about 5680000 relations.

In reality, the necessary number of relations is much higher:

C115: Relations needed: 7.85M / 7.76M, Relations found per 100k range: 713000 / 646000.
After 5 ranges, the first relations filtering starts. After that another 6 / 7 sieve ranges and relation filterings are necessary until final factorisation.

C120: Relations needed: 8.36M / 8.33M, Relations found per 60k range: 245000 / 231000.
After 12 / 13 ranges, the first relations filtering starts. After that another 22 / 23 sieve ranges and relation filterings are necessary until final factorisation.

C125: Relations needed: 9.1M / 10.19M, Relations found per 100k range: 246000 / 255000.
After 11 / 12 ranges, the first relations filtering starts. After that another 26 / 28 sieve ranges and relation filterings are necessary until final factorisation.

C130: Relations needed: 8.71M, Relations found per 100k range: 174200.
After 17 ranges, the first relations filtering starts. After that another 33 sieve ranges and relation filterings are necessary until final factorisation.

C137: Relations needed: 14.97M, Relations found per 100k range: 217000.
After 26 ranges, the first relations filtering starts. After that another 43 sieve ranges and relation filterings are necessary until final factorisation.

The last C137 was sieving until 5.7 M until the first relation filtering started (the other examples started filtering at 2.86M)

The C137 was sieves with 3 threads. So one range of 100k took about 1.9 hours instead of 5.7 hours. At the beginning the intermediate relation filtering took a few minutes, at the end more than 1 hour. That means if using one thread, the intermediate filtering takes about 15% of the time; if using 3 threads, it is 33%. 2 Threads have to wait 1 hour until they can continue sieving another 2 hours.

Here I agree it makes sense to continue sieving while one thread is filtering.
But of course it makes much more sense to reduce the number of intermediate filtering. The best way to do that is to improve the initial guess of sieving. In most of the cases between 110-137 digits, an estimate, that is 2 to 2.5 times as much, is much better. In case of oversieving I think the final steps (linear algebra) will benefit a bit.

When having a look at the filtering step there is a message “filtering wants nnnnnn more relations.” At the beginning, that number is quite random and it is nt possible to conclude the amount of sieving necessary. At the last few filtering steps, that estimate is better.
During that intermediate filtering there are 3 output lines, which show the memory use. If that number is small (compared to the size of the number), for example “memory use: 1M” to 5M, the sieving is at the beginning. During the last few sieving steps these values increase quite a lot (“memory use: 50” to 200M). At this point it is possible to plan the correct remaining amount of sieving according to the average output of sieving and the number of relation which the filtering wants.


I kept the logfiles of all about 185 GGNFS factorisations. That is the .n .poly .log and the final result file. If someone is interested, I can send these files (to do for example some statistics).
biwema is offline   Reply With Quote
Old 2011-04-18, 09:06   #556
chris2be8
 
chris2be8's Avatar
 
Sep 2009

40608 Posts
Default

Later versions of factMsieve.pl use the "filtering wants nnnnnnn more relations" message to estimate how many more relations too get before trying filtering again. But later versions of msieve seem to always ask for 1000000 more relations so it doesn't work very well.

Improving the original estimate is one of the FIXMEs in the script. I've got logs from lots of factorizations, but not enough free time to analyze them. A table of relations needed vs size of number and LPBA and LPBR would be useful, if it covers the full range from 90 to 150+ digits (I'm short of data at both ends). And I don't know if it differs for SNFS.

Chris K
chris2be8 is offline   Reply With Quote
Old 2011-04-18, 10:49   #557
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3·1,181 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
But later versions of msieve seem to always ask for 1000000 more relations so it doesn't work very well.
That's unavoidable; you can't come up with a decent estimate of the remaining work to do, when singleton removal destroys the entire dataset. Any choice of answer at this stage will work, but the danger is that you will be working on a small problem and asking for much more than 1M relations will cause too much work to be done. Agreed that for a large job asking for 1M more relations means you'll be running the filtering once per day, uselessly for the first two weeks :)
jasonp is offline   Reply With Quote
Old 2011-04-18, 13:12   #558
Andi47
 
Andi47's Avatar
 
Oct 2004
Austria

1001101100102 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
Later versions of factMsieve.pl use the "filtering wants nnnnnnn more relations" message to estimate how many more relations too get before trying filtering again. But later versions of msieve seem to always ask for 1000000 more relations so it doesn't work very well.

Improving the original estimate is one of the FIXMEs in the script. I've got logs from lots of factorizations, but not enough free time to analyze them. A table of relations needed vs size of number and LPBA and LPBR would be useful, if it covers the full range from 90 to 150+ digits (I'm short of data at both ends). And I don't know if it differs for SNFS.

Chris K
I have attached an excel-table which gives estimations for 82 up to 232 digits for GNFS, bases on factorizations I have done (up to ~150 digits) and factorizations that have been posted by other mersenneforum members.

Further it gives estimates for SNFS for ~100 to 313 digits, again based on factorizations I have done up to difficulty a bit above 200 and factorizations posted by other mersenneforum members.

Note: These tables are based on only a few (or even one) factorization(s) for each size, so these are just estimations and sometimes guesstimations.
Attached Files
File Type: zip _GGNFS_Parameters_a.zip (18.5 KB, 134 views)
Andi47 is offline   Reply With Quote
Old 2011-04-18, 13:52   #559
xilman
Bamboozled!
 
xilman's Avatar
 
"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

2×5,393 Posts
Default

Quote:
Originally Posted by jasonp View Post
That's unavoidable; you can't come up with a decent estimate of the remaining work to do, when singleton removal destroys the entire dataset. Any choice of answer at this stage will work, but the danger is that you will be working on a small problem and asking for much more than 1M relations will cause too much work to be done. Agreed that for a large job asking for 1M more relations means you'll be running the filtering once per day, uselessly for the first two weeks :)
Work around I discovered is to rename the foo.fb and foo.ini files out of the way. The filtering code then fails to run and the sieving continues.

When you think you really have enough relations, rename them back again.

Paul
xilman is offline   Reply With Quote
Old 2011-04-18, 15:17   #560
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

28·3·5 Posts
Default

Quote:
Originally Posted by biwema View Post
I use the script which I got from the aliquot tutorial (how to set up aliquiet, yafu, msieve etc.). [ I am running windows xp, so I use cygwin for running GGNFS. Until now I did not manage to automatically connect it to aliquiet. It just generate a directory with the name ggnfs_nnnnnnnnn(the candidate number) and puts a file test.n in it (containing the candidate number). After unsuccessfully starting GGNFS, it is running auto increasing ECM curves. It this point I take the candidate and manually start GGNFS. (just a small information on the side)]
...
In case it is of any value, I have a page explaining how to set up AliWin (my program) and all the ggnfs, msieve, yafu, etc. needed to run Aliqueit on a Win XP machine. I do not use cygwin. You might find some of the steps helpful:
AliWin2 - A GUI by Edwin C. Hall for Aliqueit by Mikael Klasson

There is discussion about AliWin in this thread.
EdH is offline   Reply With Quote
Old 2011-04-18, 16:25   #561
bchaffin
 
Sep 2010
Portland, OR

22×3×31 Posts
Default

I was under the impression that the python version of factMsieve was more widely used than the perl version, but maybe that's not the case. Is one of them considered more up-to-date than the other?

Anyway the python version seems to have a pretty good estimate of minimum relations; for most factorizations up to 130 digits I find that it usually runs filtering only once or twice. In fact that was my main reason for switching to it.
bchaffin is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Installation of GGNFS LegionMammal978 Msieve 17 2017-01-20 19:49
Running other programs while running Prime95. Neimanator PrimeNet 14 2013-08-10 20:15
Error running GGNFS+msieve+factmsieve.py D. B. Staple Factoring 6 2011-06-12 22:23
GGNFS or something better? Zeta-Flux Factoring 1 2007-08-07 22:40
ggnfs ATH Factoring 3 2006-08-12 22:50

All times are UTC. The time now is 15:48.


Fri Aug 6 15:48:27 UTC 2021 up 14 days, 10:17, 1 user, load averages: 2.04, 2.25, 2.46

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.