[QUOTE=jrk;258657]The pol51opt poly you posted is really strange, and is probably a bug.[/QUOTE]
Quite a rare bug! It turns out that the a[4] coefficient being zero in your polynomial made the skewness infinite, which eventually breaks pol51opt. Thanks for your report.
[QUOTE=biwema;258652]When changing the configuration of the perl file to run GGNFS on more than one core, only the lattice sieving part uses more threads. During poly search, relation filtering, singleton removal and post-processing, only one thread is running. Would it be possible to automatically split the poly search into more threads? Could the remaining threads continue lattice sieving while one thread does relation filtering, duplicate removal and singleton removal, to find out whether there are enough relations? Maybe that part is unwieldy to implement.[/QUOTE]
If you use the script from [url]http://mersenneforum.org/showthread.php?t=14425&page=3[/url] it will use multiple threads for polynomial selection, as well as for sieving and linear algebra. And it uses msieve for polynomial selection, which generates better polynomials than pol51opt. The filtering, singleton removal and square root stages are still single-threaded. I could make the square root stage multi-threaded, but that's quite a lot of work for saving less than 1% of the runtime. I'll think about having the remaining threads continue lattice sieving while one thread does relation filtering, duplicate removal and singleton removal to find out whether there are enough relations. But that needs some free time to think it through. Thanks for the idea though. Chris K
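For reference, a multithreaded driver script of this kind typically runs several siever processes by giving each thread its own slice of the special-q range. The sketch below shows just that splitting step; the function name and interface are illustrative, not taken from the actual script.

```python
def split_q_range(q_start, q_length, n_threads):
    """Divide the special-q interval [q_start, q_start + q_length)
    into one contiguous sub-range per sieving thread, distributing
    any remainder over the first few threads."""
    base, extra = divmod(q_length, n_threads)
    ranges = []
    q = q_start
    for i in range(n_threads):
        size = base + (1 if i < extra else 0)
        ranges.append((q, q + size))
        q += size
    return ranges
```

Each thread would then launch its siever on its own sub-range, and the driver gathers the relation files afterwards.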
One thought: filtering can use quite a lot of memory. Running several sievers in parallel with filtering could cause the box to start paging badly, so it would have to be an option that could be switched off.
How does msieve decide how much memory to use for filtering? Another point is that I would need to stop any sievers still running after filtering before I start linear algebra, so that it can use all available CPUs. I can kill them under Linux, but do signals work under Windows? Chris K
If the size of the intermediate files generated is more than half of the memory in the machine (or the amount of memory specified), filtering uses different techniques that rely on disk files more. But beyond a certain point the filtering will not try to use less memory, even if you tell it to.
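The rule of thumb described above could be expressed roughly as follows. This is only an illustration of the stated heuristic, not msieve's actual code; the function and mode names are made up.

```python
def choose_filtering_mode(intermediate_bytes, mem_limit_bytes):
    """Illustrative version of the heuristic described above: once the
    intermediate files exceed half the available (or specified) memory,
    fall back to filtering techniques that rely more on disk files."""
    if intermediate_bytes > mem_limit_bytes // 2:
        return "disk-backed"
    return "in-memory"
```

Note the caveat from the post still applies: beyond a certain point the filtering has a floor on its memory use regardless of the limit you set.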
[QUOTE=chris2be8;258688]I'll think about having the remaining threads continue lattice sieving while one thread does relation filtering, duplicate removal and singleton removal to find out whether there are enough relations. But that needs some free time to think it through. Thanks for the idea though.[/QUOTE]
I use the script which I got from the aliquot tutorial (how to set up Aliqueit, yafu, msieve etc.). [I am running Windows XP, so I use Cygwin to run GGNFS. So far I have not managed to connect it to Aliqueit automatically. It just generates a directory named ggnfs_nnnnnnnnn (the candidate number) and puts a file test.n in it (containing the candidate number). After failing to start GGNFS, it runs auto-increasing ECM curves. At this point I take the candidate and start GGNFS manually. (Just a small aside.)]

I guess the script I use is not optimal. If the number is 108 digits or less, the initial estimate is about 1430000 relations. If the number is between 109 and 132 digits, the estimate is about 2860000 relations. At 133 digits and more, the initial estimate is about 5680000 relations. In reality, the necessary number of relations is much higher:

C115: Relations needed: 7.85M / 7.76M, relations found per 100k range: 713000 / 646000. After 5 ranges, the first relation filtering starts. After that, another 6 / 7 sieve ranges and relation filterings are needed until the final factorisation.
C120: Relations needed: 8.36M / 8.33M, relations found per 60k range: 245000 / 231000. After 12 / 13 ranges, the first relation filtering starts. After that, another 22 / 23 sieve ranges and relation filterings are needed.
C125: Relations needed: 9.1M / 10.19M, relations found per 100k range: 246000 / 255000. After 11 / 12 ranges, the first relation filtering starts. After that, another 26 / 28 sieve ranges and relation filterings are needed.
C130: Relations needed: 8.71M, relations found per 100k range: 174200. After 17 ranges, the first relation filtering starts. After that, another 33 sieve ranges and relation filterings are needed.
C137: Relations needed: 14.97M, relations found per 100k range: 217000. After 26 ranges, the first relation filtering starts. After that, another 43 sieve ranges and relation filterings are needed.

The last C137 sieved up to 5.7M relations before the first filtering started (the other examples started filtering at 2.86M). The C137 was sieved with 3 threads, so one range of 100k took about 1.9 hours instead of 5.7 hours. At the beginning the intermediate relation filtering took a few minutes, at the end more than 1 hour. That means with one thread the intermediate filtering takes about 15% of the time; with 3 threads it is 33%. Two threads have to wait 1 hour until they can continue sieving for another 2 hours. Here I agree it makes sense to continue sieving while one thread is filtering.

But of course it makes much more sense to reduce the number of intermediate filterings. The best way to do that is to improve the initial sieving estimate. In most cases between 110 and 137 digits, an estimate 2 to 2.5 times as high is much better. In case of oversieving I think the final steps (linear algebra) will benefit a bit.

Looking at the filtering step, there is a message "filtering wants nnnnnn more relations." At the beginning that number is quite random and it is not possible to conclude from it how much sieving is still needed. At the last few filtering steps the estimate is better. During the intermediate filtering there are 3 output lines showing the memory use. If that number is small (compared to the size of the number), for example "memory use: 1M" to 5M, the sieving is at the beginning. During the last few sieving steps these values increase quite a lot ("memory use: 50" to 200M). At this point it is possible to plan the correct remaining amount of sieving from the average sieving output and the number of relations the filtering wants.

I have kept the logfiles of all my roughly 185 GGNFS factorisations, i.e. the .n, .poly and .log files and the final result file.
If someone is interested, I can send these files (to do some statistics, for example).
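The suggestion above, i.e. scaling the script's initial guess by 2 to 2.5x, can be sketched as follows. The tier boundaries and base figures are the ones quoted in the post; the function names and the default factor of 2.25 are my own choices for illustration.

```python
def initial_relation_estimate(digits):
    """The script's initial estimates as reported in the post above."""
    if digits <= 108:
        return 1_430_000
    elif digits <= 132:
        return 2_860_000
    else:
        return 5_680_000

def adjusted_estimate(digits, factor=2.25):
    """Scale the initial guess by roughly 2-2.5x, per the observation
    that the real relation counts come in that much higher."""
    return int(initial_relation_estimate(digits) * factor)
```

For a C120, for example, this turns the 2.86M initial guess into about 6.4M, much closer to the ~8.3M actually reported.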
Later versions of factMsieve.pl use the "filtering wants nnnnnnn more relations" message to estimate how many more relations to get before trying filtering again. But later versions of msieve seem to always ask for 1000000 more relations, so it doesn't work very well.
Improving the original estimate is one of the FIXMEs in the script. I've got logs from lots of factorizations, but not enough free time to analyze them. A table of relations needed vs size of number and LPBA and LPBR would be useful, if it covers the full range from 90 to 150+ digits (I'm short of data at both ends). And I don't know if it differs for SNFS. Chris K
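Extracting that figure from an msieve log can be sketched as below. The regular expression matches the message text quoted in this thread; the function name is made up.

```python
import re

def wanted_more_relations(log_text):
    """Return the most recent 'filtering wants N more relations'
    count from an msieve log, or None if the message is absent."""
    hits = re.findall(r"filtering wants (\d+) more relations", log_text)
    return int(hits[-1]) if hits else None
```

A driver script would divide this number by the observed relation yield per special-q range to decide how many more ranges to sieve before re-running filtering.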
[QUOTE=chris2be8;258866]But later versions of msieve seem to always ask for 1000000 more relations so it doesn't work very well.
[/QUOTE] That's unavoidable; you can't come up with a decent estimate of the remaining work when singleton removal destroys the entire dataset. Some answer has to be chosen at this stage, but the danger is that on a small problem, asking for much more than 1M relations will cause too much work to be done. Agreed that for a large job asking for 1M more relations means you'll be running the filtering once per day, uselessly for the first two weeks :)
1 Attachment(s)
[QUOTE=chris2be8;258866]Later versions of factMsieve.pl use the "filtering wants nnnnnnn more relations" message to estimate how many more relations to get before trying filtering again. But later versions of msieve seem to always ask for 1000000 more relations so it doesn't work very well.
Improving the original estimate is one of the FIXMEs in the script. I've got logs from lots of factorizations, but not enough free time to analyze them. A table of relations needed vs size of number and LPBA and LPBR would be useful, if it covers the full range from 90 to 150+ digits (I'm short of data at both ends). And I don't know if it differs for SNFS. Chris K[/QUOTE] I have attached an Excel table giving estimates for GNFS from 82 up to 232 digits, based on factorizations I have done (up to ~150 digits) and factorizations posted by other mersenneforum members. It also gives estimates for SNFS from ~100 to 313 digits, again based on factorizations I have done (up to difficulty a bit above 200) and factorizations posted by other mersenneforum members. Note: these tables are based on only a few (or even one) factorization(s) per size, so they are just estimates and sometimes guesstimations.
[QUOTE=jasonp;258870]That's unavoidable; you can't come up with a decent estimate of the remaining work to do, when singleton removal destroys the entire dataset. Any choice of answer at this stage will work, but the danger is that you will be working on a small problem and asking for much more than 1M relations will cause too much work to be done. Agreed that for a large job asking for 1M more relations means you'll be running the filtering once per day, uselessly for the first two weeks :)[/QUOTE]A workaround I discovered is to rename the foo.fb and foo.ini files out of the way. The filtering code then fails to run and the sieving continues.
When you think you really do have enough relations, rename them back again. Paul
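Paul's rename trick could be scripted; a minimal sketch, assuming the working-file names used above (foo.fb, foo.ini). The function name and the ".disabled" suffix are arbitrary choices of this example.

```python
import os

def toggle_filtering(workdir, name, disable):
    """Rename <name>.fb and <name>.ini out of the way so the driver's
    filtering step fails and sieving continues (disable=True), or
    rename them back to re-enable filtering (disable=False)."""
    for ext in (".fb", ".ini"):
        src = os.path.join(workdir, name + ext)
        dst = src + ".disabled"
        if disable and os.path.exists(src):
            os.rename(src, dst)
        elif not disable and os.path.exists(dst):
            os.rename(dst, src)
```

Call `toggle_filtering(".", "foo", True)` to keep sieving past a scheduled filtering run, then `toggle_filtering(".", "foo", False)` once you believe enough relations are in hand.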
[QUOTE=biwema;258803]I use the script which I got from the aliquot tutorial (how to set up Aliqueit, yafu, msieve etc.). [I am running Windows XP, so I use Cygwin to run GGNFS. So far I have not managed to connect it to Aliqueit automatically. It just generates a directory named ggnfs_nnnnnnnnn (the candidate number) and puts a file test.n in it (containing the candidate number). After failing to start GGNFS, it runs auto-increasing ECM curves. At this point I take the candidate and start GGNFS manually. (Just a small aside.)]
... [/QUOTE] In case it is of any value, I have a page explaining how to set up AliWin (my program) and all the ggnfs, msieve, yafu, etc. pieces needed to run Aliqueit on a Windows XP machine. I do not use Cygwin. You might find some of the steps helpful: [B][SIZE=2][URL="http://www.starreloaders.com/edhall/AliWin/AliWin.html"]AliWin2 - A GUI by Edwin C. Hall for Aliqueit by Mikael Klasson[/URL][/SIZE][/B] There is discussion about AliWin in [URL="http://www.mersenneforum.org/showthread.php?t=13365"]this thread[/URL].
I was under the impression that the Python version of factMsieve was more widely used than the Perl version, but maybe that's not the case. Is one of them considered more up-to-date than the other?
Anyway, the Python version seems to have a pretty good estimate of the minimum relations; for most factorizations up to 130 digits I find that it runs filtering only once or twice. In fact, that was my main reason for switching to it.