Poly Search vs Sieving times
Sorry for the neophyte question, but for smaller composites ~150 digits across multiple machines, is there a real gain in overall sieving if a poly with a greater than expected E score is found early in the search?
Specifics:
[code]
Msieve v. 1.52 (SVN 886)
random seeds: 55df12b7 e5d3c816
factoring 218876969782929216599190531580538390484044829345681891460187089862281796240529496476113446316632558113449224677358919720113328517497603198421529 (144 digits)
searching for 15-digit factors
commencing number field sieve (144-digit input)
commencing number field sieve polynomial selection
polynomial degree: 5
max stage 1 norm: 8.34e+21
max stage 2 norm: 1.39e+20
min E-value: 1.03e-11
poly select deadline: 311905
time limit set to 86.64 CPU-hours
expecting poly E from 1.12e-11 to > 1.29e-11
[/code]
(The msieve machine has four cores and I hard-set the search time to 12 hours.) The following poly was found within the first few minutes:
[code]
# norm 8.158791e-14 alpha -8.748871 e 1.309e-11 rroots 5
skew: 52981938.39
c0: 5925697648913135366967245004294844179177
c1: 468536815244189902014756820336233
c2: 26076232373209850831435621
c3: 542019174557361769
c4: 10239243052
c5: 48
Y0: 21467961164039091417408999106
Y1: 1118175346469539
[/code]
I'm running several machines and expect the sieving and LA to take approximately four days. (Of course, I may be way off.) Would a better poly outweigh the 11 hours of search time left for a 48-60 hour project? Thanks...
If the "expected score" is accurate at this size, you should not search further. You may well find a poly that's 5% faster by completing the search, but it's pretty clear that 5% of the sieve time is less than the remaining polysearch time.
However, the Expected Score code is simply guesswork; without previous experience of its accuracy, you don't know whether you might improve by 10% or more. I'd compromise and search a couple more hours; if nothing comes close in that time, I'd assume I got lucky at the outset and proceed to sieving. A5 coefficients below 10000 should not be searched; msieve takes longer at very low A5 values to do the same work. Also note the skew is 50M, very high for this size of number; this occasionally causes hiccups in later stages (I have not run into such a hiccup yet, but I follow Frmky's advice to use large A5, say 550M for this size number). Large A5 coefficients have lower skews than very small values.
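The trade-off described above can be written down as a quick break-even check. This is only a rough sketch: the function names are made up for illustration, and it assumes the sieve time and search time are measured in the same units (both wall-clock hours here) and that a poly's speedup applies to the whole sieving time.

```python
def sieve_hours_saved(sieve_hours, expected_speedup):
    """Hours of sieving saved if a poly is `expected_speedup` (e.g. 0.05) faster."""
    return sieve_hours * expected_speedup


def extra_search_pays_off(sieve_hours, expected_speedup, extra_search_hours):
    """True if the sieving time saved exceeds the extra poly search time."""
    return sieve_hours_saved(sieve_hours, expected_speedup) > extra_search_hours


def break_even_speedup(sieve_hours, extra_search_hours):
    """Fractional speedup a new poly must deliver just to repay the search."""
    return extra_search_hours / sieve_hours


if __name__ == "__main__":
    # Figures from the question: roughly four days (96 h) of sieving,
    # 11 h of search time left, and a hoped-for 5% improvement.
    print(extra_search_pays_off(96, 0.05, 11))
    print(break_even_speedup(96, 11))
```

With those figures a 5% gain saves under five hours, so the remaining 11 hours of search would need a poly more than about 11% faster to break even.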
Thanks!
My impatience got me to break in at about 2.5 hours of searching, and that was the poly chosen; no others were close. My first set of relations came in at 2.2% of the estimated minimum (32523338) in about 1.2 hours. This calculates to (very) roughly 55 hours for sieving, although a couple of the machines will be off at night. I guess I'll see how it turns out...
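For reference, the rough 55-hour figure is a straight linear extrapolation. A minimal sketch, assuming a constant relation rate and that every machine has already reported its relations (the function name is just for illustration):

```python
def estimate_total_sieve_hours(fraction_done, hours_elapsed):
    """Linear extrapolation: total hours = elapsed hours / fraction of relations found."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return hours_elapsed / fraction_done


if __name__ == "__main__":
    # 2.2% of the estimated minimum after about 1.2 hours
    print(estimate_total_sieve_hours(0.022, 1.2))
```

The estimate is only as good as the first sample: if some machines have not yet delivered relations, the fraction is understated and the projected time comes out too long.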
Wow! I have a new estimate for sieving time of ~26 hours. I guess the first estimate hadn't received relations from several of the machines. I'll have to see how it works out in reality.
The "Wow!" part is that I remember taking over two weeks to factor c100s... 
There I was!
closing in on a matrix, when the power company lost control, and my scripts weren't robust enough to carry through a restart, so I had to resort to semi-manually* completing the job. So much for an accurate timing for this composite factorization... Oh, well, if anything good has come of it, it got me off my a** to place the main factoring machine on a UPS that was already sitting there, waiting to be put into use... *semi-manually meaning a manual restart of factmsieve.py and manually concatenating all the relations from those machines that are still running...
Well, that didn't work.:sad:
Now I'm lost! factmsieve.py wouldn't restart; it kept giving an error 255. Renaming number.dat.cyc cleared the factmsieve.py error, but number.dat was trashed. I rebuilt that, but factmsieve.py wouldn't go anywhere, so I have moved to running msieve directly. Now all I get are -11 errors for the entire 4G number.dat file. Currently, I'm retrieving the rels from the spairs.save.gz file to see if I can get anywhere from there.
One thing to try is restoring the relations found using the spairs.gz file like you're doing, deleting all the .cyc and such, and then rerunning using the "-nc" command. That should take care of it, if I understand your error correctly. I've had to do that before when my .dat file disappeared.
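A minimal sketch of that restore step, for anyone who wants to script it. Everything here is an assumption rather than what factmsieve.py itself does: the helper name is made up, the first line of the .dat file is assumed to be "N" followed by the number being factored, and relations are assumed to be in the usual GGNFS "a,b:hex,...:hex,..." layout. Lines that don't fit that pattern are dropped, and a .gz file truncated by the power loss is read as far as it goes.

```python
import gzip
import re

# Assumed relation layout: "a,b:rational primes (hex):algebraic primes (hex)"
RELATION_RE = re.compile(r"^-?\d+,\d+:[0-9a-fA-F,]*:[0-9a-fA-F,]*$")


def rebuild_dat(n_line, gz_paths, out_path):
    """Write n_line, then every well-formed relation found in gz_paths.

    Returns (kept, skipped) line counts.
    """
    kept = skipped = 0
    with open(out_path, "w") as out:
        out.write(n_line.rstrip("\n") + "\n")
        for path in gz_paths:
            try:
                with gzip.open(path, "rt", errors="replace") as src:
                    for line in src:
                        line = line.strip()
                        if RELATION_RE.match(line):
                            out.write(line + "\n")
                            kept += 1
                        elif line:
                            skipped += 1
            except (EOFError, OSError):
                # Truncated or damaged archive: keep what was read so far.
                continue
    return kept, skipped
```

After rebuilding the file this way, the .cyc and related files still need to be deleted before rerunning msieve with -nc, as described above.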

[QUOTE=wombatman;356146]One thing to try is restoring the relations found using the spairs.gz file like you're doing, deleting all the .cyc and such, and then rerunning using the "-nc" command. That should take care of it, if I understand your error correctly. I've had to do that before when my .dat file disappeared.[/QUOTE]
Thanks! That's what I've done and it is working through the msieve steps, but I have encountered another stumbling block:
[code]
matrix needs more columns than rows; try adding 2-3% more relations
[/code]
This keeps coming up even when I add more relations. This might be due to the poly. I suspect this may be what VBCurtis posted about: [QUOTE=VBCurtis] A5 coefficients below 10000 should not be searched; msieve takes longer at very low A5 values to do the same work. Also note skew is 50M, very high for this size of number; this occasionally causes hiccups in later stages (I have not run into such a hiccup yet, but I follow Frmky's advice to use large A5, say 550M for this size number). Large A5 coefficients have lower skews than very small values.[/QUOTE]I will keep adding for now, but at some point I may need to start over from scratch...
The matrix finally built properly for a solve...

I have that issue sometimes as well. I've gotten to 120% of the estimated relations before the matrix builds, even when using a large interval step. Maybe somebody better versed in this can suggest a cause?

The estimates for # of relations in the factmsieve script are not very accurate; it uses an exponential with size of number as input, while the actual situation is more like a step function with the steps located where lpbr jumps bits. Serge (Batalov) posted a patch to fix this in some thread; I'll see if I can find and link it.
Actual rels needed is roughly 21M for 28-bit projects, 40M for 29-bit projects, and (I am told) the low 80s for 30-bit projects. Some polys produce more or fewer duplicate relations, and sometimes a matrix builds with fewer-than-typical relations, so there is a fair amount of variety from project to project.

My vague grasp of the "hiccup" mentioned above is that skew alters the area sieved for each special-Q, with higher skews associated with smaller areas. Very high skews may require more special-Q to be searched, and that requirement can lead to using special-Q higher than a poly's efficient sieve range. So it's not that reaching 120% of the script's expected rels is a hiccup; it's that the last few rels might be found at a rate half (or worse?) of the sec/rel you achieved during most of the sieving. If you read through threads of large forum-team factorizations, you'll see discussions amounting to "we've run out of good Q, now what?". High-skew polys run out of good Q more often than low-skew polys.

At single-user-size projects, we can preempt this problem by choosing the next-higher siever version for a project we're even slightly nervous about. We might choose to use 14e instead of 13e for projects a few digits lower than the script's cutoff; in fact, I edited my factmsieve code to shift the cutoff a bit lower. As Mr Womack points out, this makes factorizations in the 135 to 150 digit level more fire-and-forget at the expense of a few hours of sieving. I have not yet attempted a 30-bit project myself to know how much of this extends upward.

Curtis
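The step-function view of relation targets can be sketched as a simple lookup keyed on the large prime bound. This is only an illustration of the rough figures quoted above, not Serge's patch or factmsieve's actual code; the names are hypothetical, and the 30-bit entry is a guess at the low end of "the low 80s".

```python
# Rough relation targets by large prime bits (lpbr), taken from the post above.
# The 30-bit value is second-hand and approximate.
TYPICAL_RELS_NEEDED = {
    28: 21_000_000,
    29: 40_000_000,
    30: 80_000_000,
}


def rels_target(lpbr_bits):
    """Typical number of relations needed for a given lpbr setting."""
    if lpbr_bits not in TYPICAL_RELS_NEEDED:
        raise KeyError(f"no rule-of-thumb target for lpbr={lpbr_bits}")
    return TYPICAL_RELS_NEEDED[lpbr_bits]


def progress(rels_found, lpbr_bits):
    """Fraction of the rule-of-thumb target reached so far."""
    return rels_found / rels_target(lpbr_bits)


if __name__ == "__main__":
    print(rels_target(29))
    print(progress(30_000_000, 29))
```

Since duplicate rates vary from poly to poly, a fraction of 1.0 here is only a hint that a matrix build is worth attempting, not a guarantee that it will succeed.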