#551
May 2008
3×5×73 Posts
#552
Sep 2009
2⁴·131 Posts
Quote:
The filtering, singleton removal and square root stages are still single-threaded. I could make the square root stage multi-threaded, but that's quite a lot of work for saving less than 1% of the runtime. I'll think about having the remaining threads continue lattice sieving while one thread does relation filtering, duplicate removal and singleton removal to find out if there are enough relations. But it needs some free time to think it over. Thanks for the idea though.

Chris K
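The idea under discussion can be sketched as follows. This is a hypothetical illustration, not GGNFS code: `sieve_range` and `run_filtering` are stand-in callables, and the real drivers launch external siever processes rather than Python threads.

```python
# Sketch: while one thread runs the filtering pass, the remaining
# worker threads keep pulling sieve ranges off a shared queue.
import threading
import queue

def overlap_filter_with_sieving(ranges, n_threads, sieve_range, run_filtering):
    """Filter on the calling thread while n_threads-1 workers keep sieving."""
    work = queue.Queue()
    for r in ranges:
        work.put(r)
    results = []

    def sieve_worker():
        while True:
            try:
                r = work.get_nowait()
            except queue.Empty:
                return                      # no more ranges to sieve
            results.append(sieve_range(r))  # stand-in for a siever run

    workers = [threading.Thread(target=sieve_worker)
               for _ in range(n_threads - 1)]
    for w in workers:
        w.start()
    enough = run_filtering()  # one thread filters while the others sieve
    for w in workers:
        w.join()
    return enough, results
```

The trade-off the posters raise below (filtering's memory use, and stopping leftover sievers before linear algebra) is exactly why this would need to be switchable.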
#553
Sep 2009
2⁴×131 Posts
One thought: filtering can use quite a lot of memory. Running several sieves in parallel with filtering could cause the box to start paging badly, so it would have to be an option that could be switched off.

How does msieve decide how much memory to use for filtering? Another point is that I would need to stop any sieves still running after filtering before I start linear algebra, so that it can use all available CPUs. I can kill them under Linux, but do signals work under Windows?

Chris K
#554
Tribal Bullet
Oct 2004
3×1,181 Posts
If the size of the intermediate files generated is more than half of the memory in the machine (or the amount of memory specified), filtering uses different techniques that rely more on disk files. But beyond a certain point the filtering will not try to use less memory, even if you tell it to.
#555
Mar 2004
381₁₀ Posts
Quote:
I guess the script I use is not optimal. If the number is 108 digits or less, the initial estimate is about 1,430,000 relations. If the number is between 109 and 132 digits, the estimate is about 2,860,000 relations. If it is 133 digits or more, the initial estimate is about 5,680,000 relations. In reality, the necessary number of relations is much higher:

C115: Relations needed: 7.85M / 7.76M. Relations found per 100k range: 713,000 / 646,000. After 5 ranges, the first relation filtering starts. After that, another 6 / 7 sieve ranges and relation filterings are necessary until final factorisation.

C120: Relations needed: 8.36M / 8.33M. Relations found per 60k range: 245,000 / 231,000. After 12 / 13 ranges, the first relation filtering starts. After that, another 22 / 23 sieve ranges and relation filterings are necessary until final factorisation.

C125: Relations needed: 9.1M / 10.19M. Relations found per 100k range: 246,000 / 255,000. After 11 / 12 ranges, the first relation filtering starts. After that, another 26 / 28 sieve ranges and relation filterings are necessary until final factorisation.

C130: Relations needed: 8.71M. Relations found per 100k range: 174,200. After 17 ranges, the first relation filtering starts. After that, another 33 sieve ranges and relation filterings are necessary until final factorisation.

C137: Relations needed: 14.97M. Relations found per 100k range: 217,000. After 26 ranges, the first relation filtering starts. After that, another 43 sieve ranges and relation filterings are necessary until final factorisation.

The last C137 was sieving until 5.7M before the first relation filtering started (the other examples started filtering at 2.86M). The C137 was sieved with 3 threads, so one range of 100k took about 1.9 hours instead of 5.7 hours. At the beginning the intermediate relation filtering took a few minutes, at the end more than 1 hour. That means if using one thread, the intermediate filtering takes about 15% of the time; if using 3 threads, it is 33%.

Two threads have to wait 1 hour until they can continue sieving for another 2 hours. Here I agree it makes sense to continue sieving while one thread is filtering. But of course it makes much more sense to reduce the number of intermediate filtering runs. The best way to do that is to improve the initial sieving estimate. In most cases between 110 and 137 digits, an estimate that is 2 to 2.5 times as high is much better. In case of oversieving, I think the final steps (linear algebra) will benefit a bit.

When looking at the filtering step there is a message "filtering wants nnnnnn more relations." At the beginning, that number is quite random and it is not possible to conclude the necessary amount of sieving from it. At the last few filtering steps, that estimate is better. During the intermediate filtering there are 3 output lines which show the memory use. If that number is small (compared to the size of the number), for example "memory use: 1M" to 5M, the sieving is at the beginning. During the last few sieving steps these values increase quite a lot ("memory use: 50" to 200M). At this point it is possible to plan the correct remaining amount of sieving according to the average output of sieving and the number of relations which the filtering wants.

I kept the logfiles of all (about 185) GGNFS factorisations, that is the .n, .poly, .log and final result files. If someone is interested, I can send these files (to do some statistics, for example).
#556
Sep 2009
4060₈ Posts
Later versions of factMsieve.pl use the "filtering wants nnnnnnn more relations" message to estimate how many more relations to get before trying filtering again. But later versions of msieve seem to always ask for 1,000,000 more relations, so it doesn't work very well.

Improving the original estimate is one of the FIXMEs in the script. I've got logs from lots of factorizations, but not enough free time to analyze them. A table of relations needed vs. size of number, LPBA and LPBR would be useful, if it covers the full range from 90 to 150+ digits (I'm short of data at both ends). And I don't know if it differs for SNFS.

Chris K
#557
Tribal Bullet
Oct 2004
3·1,181 Posts
That's unavoidable; you can't come up with a decent estimate of the remaining work when singleton removal destroys the entire dataset. Any choice of answer at this stage will work, but the danger is that you will be working on a small problem, and asking for much more than 1M relations will cause too much work to be done. Agreed that for a large job, asking for 1M more relations means you'll be running the filtering once per day, uselessly, for the first two weeks :)
#558
Oct 2004
Austria
100110110010₂ Posts
Quote:
Further, it gives estimates for SNFS from ~100 to 313 digits, again based on factorizations I have done up to difficulty a bit above 200 and on factorizations posted by other mersenneforum members. Note: these tables are based on only a few (or even one) factorization(s) for each size, so these are just estimations and sometimes guesstimations.
#559
Bamboozled!
"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across
2×5,393 Posts
Quote:
When you think you really have enough relations, rename them back again.

Paul
#560
"Ed Hall"
Dec 2009
Adirondack Mtns
2⁸·3·5 Posts
Quote:
AliWin2 - A GUI by Edwin C. Hall for Aliqueit by Mikael Klasson

There is discussion about AliWin in this thread.
#561
Sep 2010
Portland, OR
2²×3×31 Posts
I was under the impression that the Python version of factMsieve was more widely used than the Perl version, but maybe that's not the case. Is one of them considered more up-to-date than the other?

Anyway, the Python version seems to have a pretty good estimate of the minimum relations; for most factorizations up to 130 digits I find that it usually runs filtering only once or twice. In fact, that was my main reason for switching to it.
Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
|---|---|---|---|---|
| Installation of GGNFS | LegionMammal978 | Msieve | 17 | 2017-01-20 19:49 |
| Running other programs while running Prime95. | Neimanator | PrimeNet | 14 | 2013-08-10 20:15 |
| Error running GGNFS+msieve+factmsieve.py | D. B. Staple | Factoring | 6 | 2011-06-12 22:23 |
| GGNFS or something better? | Zeta-Flux | Factoring | 1 | 2007-08-07 22:40 |
| ggnfs | ATH | Factoring | 3 | 2006-08-12 22:50 |