mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Msieve

Reply
 
Thread Tools
Old 2011-05-16, 21:42   #23
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

72·131 Posts
Default ... time passes ...

5.688e-15 from

Code:
3.03459e+27 7731900 1577978512242172175599 95423605140653278897241546920041651315
(7M-8M, np1 pass done by dodson)

Code:
# norm 1.793708e-19 alpha -8.666055 e 5.688e-15 rroots 5
skew: 485356332.39
c0:  33880793824959521299080011222940458495986210202160
c1:  591092394815535524182704508614360372600396
c2: -3203243723016873436571358474351524
c3: -22038823039813441973676341
c4:  15292135913181844
c5:  7731900
Y0: -95423604518010111820719069727473958907
Y1:  1577978512242172175599
(no point anyone sieving this one, the polynomial selection continues)

Last fiddled with by fivemack on 2011-05-16 at 21:42
fivemack is offline   Reply With Quote
Old 2011-05-19, 09:54   #24
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

11001000100112 Posts
Default Out of contact; keep sieving if you want

Two small logistic problems:
  • The FTP server didn't come back up when I rebooted my internet-facing machine, and I can't figure out why
  • I'm off on a two-week business trip starting Saturday early-morning and probably won't have access either to my home computers or to mersenneforum

Last fiddled with by fivemack on 2011-05-19 at 09:54
fivemack is offline   Reply With Quote
Old 2011-05-20, 20:09   #25
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

191316 Posts
Default Logistic issue resolved

The Ubuntu upgrade helpfully replaced vsftpd with a new version which, when the ownership or content of the /etc/vsftpd.conf file was not to its liking, printed a message out to somewhere that got entirely lost if you started it from /etc/init.d and stopped. If you ran 'sudo /usr/sbin/vsftpd' from the command line then the message was not auto-lost and it was possible to figure out what was going on.

Feel free to run -np1 on whichever areas you want (upload is the traditional ftp ssh.fivemack.org in directory 7+374.polsel) ; I will throw the whole 48-core machine at -np2 on them when I come back June 4th. It would be best if we aim to stop -np1 by June 4, we really ought to have enough data by then.

And I will be carrying a laptop around New England and so able to ssh back home and check that all things are mostly OK; if the house server crashes, which is probability about 30% in a two-week period, that will stop being the case.

(I have just spent half an hour carefully arranging that the C170 sieving job on the 48-core machine is running on local disc and will complete even if the house server crashes; what was the point in buying 64G of memory if I don't occasionally run 96 jobs each wanting half a gigabyte?)

Last fiddled with by fivemack on 2011-05-20 at 20:13
fivemack is offline   Reply With Quote
Old 2011-06-11, 10:44   #26
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

72·131 Posts
Default Results from very substantial G197 searching

So, Bruce devoted a few dozen CPUs to running -np1, and I've just finished running -np2 over the results.

This was enough to get over the 6e-15 barrier; the lucky -np1 hit was

Code:
124620 4977338520063549233737 217869038441461724768935117235124202672
with a not-terribly-exciting np1 score of 1.717e+27 - 49 of the 2866472 -np1 hits with C5 < 1e6 had that low or lower an np1 score - but which managed to produce 29 np2 hits with E greater than 5e-15. The absolute best np1 score in that range was 3.449e+26 from

Code:
505164 3475066243837708248653 164674588686335112717624531349233074163
which gave no stage-2 hits better than 3.877e-15.

The best stage-2 hit from the 124620 line is

Code:
# norm 1.985335e-19 alpha -8.783959 e 6.129e-15 rroots 5
skew: 2577393182.85
c0:  458715885739365187351527378471335055263545887245640
c1:  7381941582106070427594162337636746070796328
c2: -1142745234737742233800853906069268
c3: -8354934962617700027770087
c4:  202809156165122
c5:  124620
Y0: -217869036821239798757879347382504435503
Y1:  4977338520063549233737
n: 61173781879800813987062254208969082152381029438415262556799619943895079615422740343994343770534689647933527791161943080608113058309486560791550732809340767016424578028793088881479019117986214217881
Running gnfs-lasieve4I16e with
Code:
lpbr: 32
lpba: 33
mfbr: 64
mfba: 96
rlambda: 2.6
alambda: 3.6
alim: 200000000
rlim: 200000000
gets a runtime of 1.9 ± 0.05 seconds per relation (when sieving on 48 cores of the large machine, so I suspect the runtime will be better on almost any more conventional hardware) and a yield around 1.8 ± 0.2 relations per Q. I am doing the comparison with the previous-best polynomial; I don't know whether sieving 2000Q ranges at 5M intervals will be enough to show a difference, but in a couple of hours I will.

If jrk or jasonp has any ideas as to whether it's possible to squeeze the 124620 hit further, I would be very interested to hear them.

Hits responsible for other very good np2 include

Code:
308880 4992294099072239261749 181700075733825213335950478154509524198
1924536 2640146002363428816427 126022432502933249670602712953751916291
3131100 3419168484366831651509 114333547388787768308700764775729943273
The largest alpha I saw was for
Code:
135408 3622028521468726992407 214281264898459728886579726096480878475

# norm 9.157441e-20 alpha -10.444364 e 3.681e-15 rroots 3
skew: 14285983896.58
c0: -17894629948875926379165553599552107740491923099531081040
c1: -14999594802058782437261103180906020086616581836
c2:  262965154295269926440176467490577492
c3:  139555033471835254948242467
c4: -1651800663282328
c5:  135408
Y0: -214281273735735328129840104080426528827
Y1:  3622028521468726992407

Last fiddled with by fivemack on 2011-06-11 at 10:51
fivemack is offline   Reply With Quote
Old 2011-06-11, 13:27   #27
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

72·131 Posts
Default Hmm, that's odd

The E=5.688 polynomial that I posted three weeks ago appears absolutely to outclass the E=6.129 one: with the same parameters, the yield is higher (2.2 ± 0.2) and the time per relation lower (1.52 ± 0.06). Blue diamonds are 6.129 and brown squares 5.688; top graph is yield, bottom is runtime.

I've tried changing size_score.c to use rfb_limit = afb_limit = 2e8 (though the sieve area is about right) and see if that makes any difference. The E values come out higher, size and alpha come out exactly the same but the precise polynomial is different. The np2 completes in 36 seconds when I'd have been happy to let it take 360000. And the 7731900 line which gave the E=5.688 polynomial now gives something with E=26.47, the 124620 line gives E=28.10; I'm sieving those two polynomials now.

Would someone who actually understands polynomial selection mind taking a look at this?
Attached Thumbnails
Click image for larger version

Name:	Screen shot 2011-06-11 at 14.23.00.png
Views:	147
Size:	47.8 KB
ID:	6719  

Last fiddled with by fivemack on 2011-06-11 at 13:28
fivemack is offline   Reply With Quote
Old 2011-06-11, 14:48   #28
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

1101110101012 Posts
Default

You probably already know this, but the E score integrates Dickman's function over the sieving interval; the choice of factor base bounds controls the size of the inputs to Dickman's function, so I would expect the E score to be larger with larger bounds (but *not* comparable to other E scores that involve different parameters). The size and alpha optimization are identical when stage 2 is run with different E score parameters, it's only the last stage that changes a little.

Regarding the reported scores, what the output calls a 'size score' is actually Bernstein's superelliptic integral, which is independent of the choice of factor base bounds, and is also independent of the translation and skew applied to the polynomial or the sieving region. The code calculates Bernstein's score and if it is high enough then it moves to a phase which changes the polynomial translation and the skew of the sieving region to optimize the E value with your precise choice of parameters.

Regarding the differences you see in actual sieving performance, I wish I knew what was going on. One nontrivial possibility is that the E score calculation code has a bug; if you look at the polynomial Shi Bai found for RSA190, the E value reported by the CADO tools is ~10% different from that reported by Msieve. Of course the CADO tools use the midpoint rule with a fixed number of points to compute the integral, and Msieve uses a fancy adaptive numerical integrator, plus for smaller problems the two toolkits agree to several decimal places. Another theoretical possibility is that the alpha score can drastically affect the E-value but affects actual sieving performance to a lesser degree.

(My understanding of polynomial selection basically boils down to a few rules of thumb backed up by experiment, so make of it what you will)
jasonp is offline   Reply With Quote
Old 2011-06-12, 04:30   #29
jrk
 
jrk's Avatar
 
May 2008

3·5·73 Posts
Default

Quote:
Originally Posted by fivemack View Post
The np2 completes in 36 seconds when I'd have been happy to let it take 360000.
The core root sieve in msieve is very fast, but the final polynomial optimization is slow, so only the top several hits identified by the root sieve are optimized. (Older versions of msieve optimized every hit, which sometimes took hours or even days). The number of hits that will be optimized is set by the define ROOT_HEAP_SIZE in gnfs/poly/stage2/stage2.h, which you could increase to something larger if you want more polynomials to be saved.

I just tried ROOT_HEAP_SIZE of 2000 on your 124620 hit but did not find a better poly than yours.

Rather than spend much more time on that one hit, though, I'd suggest that running more stage1 would be a better use of the time.

Last fiddled with by jrk on 2011-06-12 at 04:49
jrk is offline   Reply With Quote
Old 2011-06-12, 13:02   #30
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

67258 Posts
Default

Actually it's more complex than that; the root sieve does not try every rotation polynomial but only sieves over lattices that start off with a high alpha value. The number of such lattices determines how much time that -np2 takes. For degree 5, the number of lattices is determined by XLINE_HEAP_SIZE in root_sieve_deg45_x.c, plus there's a second pruning step which throws away lattices whose alpha value has dropped too much compared to the best lattice in the list.

Older msieve versions chose the best few lattices for each line in the root sieve, and didn't do a good job mixing together polynomials that had large differences in size with large differences in alpha. The combination of those two factors is what made -np2 take so long. Based on the few large tests I've run, the current code finds about as many good polynomials as the old code did, but the polynomials found are all different.
jasonp is offline   Reply With Quote
Old 2011-06-15, 02:36   #31
Batalov
 
Batalov's Avatar
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

100101000001012 Posts
Default

I have a spontaneous test case for a small c137:
Code:
n: 55113527302612334015041762286858109409418146419252683854367586075650292349447160708066082925907118458964816183993473338785573120046340939
type: gnfs
# norm 3.292896e-13 alpha -6.833476 e 3.241e-11 rroots 3
skew: 3696081.25
c0: -22019602909562003969310405023527185
c1: -201072884330895337125213165258
c2:  63857116377641212890151
c3:  23507631339659246
c4: -1209656558
c5:  420
Y0: -666195921926953147097928626
Y1:  496735677297533
 
n: 55113527302612334015041762286858109409418146419252683854367586075650292349447160708066082925907118458964816183993473338785573120046340939
type: gnfs
# norm 3.456451e-13 alpha -7.622327 e 3.243e-11 rroots 5
skew: 4102759.24
c0: -336738958941593502484536048912975117
c1:  1088112692255552010771427070134
c2:  561853043161021849277690
c3: -72676970404496174
c4: -30262489353
c5:  3420
Y0: -437974271697030703638562058
Y1:  1241617689900461
 
n: 55113527302612334015041762286858109409418146419252683854367586075650292349447160708066082925907118458964816183993473338785573120046340939
type: gnfs
# norm 3.381103e-13 alpha -7.961442 e 3.234e-11 rroots 5
skew: 2250081.72
c0:  10281403550470299300691706560892400
c1:  370715241135066511698623968756
c2: -364315592487140042932968
c3: -333075734391924531
c4:  60370062818
c5:  10200
Y0: -351990171410644704144295321
Y1:  732752881106657
These three polys have the same (and a fairly good) E, but the one with the smallest c5 sieves the best. Despite having every component worse than the second one. The third one is slower still.

Is there are c5 dependency that is not reflected in the combined score? Could it be tracked and hacked in? ;-)
_______

Fish 3: Jr ner abg ovt ba fpvrapr, jr whfg jnag vg snfg ...naq abj!
Batalov is offline   Reply With Quote
Old 2011-06-15, 02:38   #32
bai
 
May 2011

23 Posts
Default

Quote:
Originally Posted by jasonp View Post
One nontrivial possibility is that the E score calculation code has a bug; if you look at the polynomial Shi Bai found for RSA190, the E value reported by the CADO tools is ~10% different from that reported by Msieve. Of course the CADO tools use the midpoint rule with a fixed number of points to compute the integral, and Msieve uses a fancy adaptive numerical integrator, plus for smaller problems the two toolkits agree to several decimal places
The E scores for the two polynomials correlate with those computed in CADO.
Poly 1 (c5: 7731900) gives
Quote:
# skew: 259981312.000000
# lognorm 65.28, alpha -8.67, 5 rroots
# Murphy's E(Bf=10000000,Bg=5000000,area=1.00e+16)=5.00e-15
Poly 2 (c5: 124620) gives
Quote:
# skew: 1344798720.000000
# lognorm 65.00, alpha -8.78, 5 rroots
# Murphy's E(Bf=10000000,Bg=5000000,area=1.00e+16)=5.38e-15
Could you please try to use I=15 for the sieving test (would be better than 16 for this size I guess?). In my test, gnfs-lasieve4I15e on the poly 2 (with better E) outperforms the poly 1. poly 2 vs poly 1 --> 2.11306 sec/rel vs. 2.69200 sec/rel). Parameters are,

Quote:
lpbr: 32
lpba: 33
mfbr: 64
mfba: 96
rlambda: 2.2
alambda: 3.2
rlim: 150000000
alim: 300000000
type: gnfs
Quote:
Originally Posted by jasonp View Post
Another theoretical possibility is that the alpha score can drastically affect the E-value but affects actual sieving performance to a lesser degree.
I guess this could happen sometimes. In the alpha score model, we assume that a random integer being a root for the univariate polynomial f(x) in uniform chance. In practice, the homogenized polynomial F(x, y) where x coprime with y does not give r (in x/y = r (mod p)) in an uniform sense. However, in practice, i never found this is significant unless the alpha score are very very close.
bai is offline   Reply With Quote
Old 2011-06-15, 03:00   #33
jrk
 
jrk's Avatar
 
May 2008

3·5·73 Posts
Default

Quote:
Originally Posted by bai View Post
In my test, gnfs-lasieve4I15e on the poly 2 (with better E) outperforms the poly 1. poly 2 vs poly 1 --> 2.11306 sec/rel vs. 2.69200 sec/rel). Parameters are,

Code:
lpbr: 32
lpba: 33
mfbr: 64
mfba: 96
rlambda: 2.2
alambda: 3.2
rlim: 150000000
alim: 300000000
type: gnfs
Note that the lambda parameters differ between ggnfs lasieve4 and CADO. In ggnfs, the base for lambda is the factor base bound while in CADO it is the large prime bound. So you should use larger lambda with ggnfs, probably about 2.6 for rlambda and 3.6 for alambda in this case. Else you will lose a lot of relations.

I don't know how much difference that will make for your trials, though.
jrk is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
c176 polynomial search bdodson Msieve 45 2010-10-29 19:39
109!+1 polynomial search fivemack Factoring 122 2009-02-24 07:03
5^421-1 polynomial search fivemack Factoring 61 2008-07-21 11:16
6^383+1 by GNFS (polynomial search; now complete) fivemack Factoring 20 2007-12-26 10:36
GNFS polynomial search tools JHansen Factoring 0 2004-11-07 12:15

All times are UTC. The time now is 00:59.


Sat Jul 17 00:59:54 UTC 2021 up 49 days, 22:47, 1 user, load averages: 1.84, 1.43, 1.37

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.