mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2021-10-04, 20:58   #1
Plutie
 
"Evan"
Dec 2020
Montreal

23×32 Posts
Default Optimization and selection of the top scoring polynomial from a larger set

I'm working on a bash script to facilitate optimization and selection of the top scoring polynomial from a larger set (i.e. the msieve.dat.ms file from running msieve -np1 -nps). I have the actual optimization process done, but I'm still trying to figure out what the optimal choices for sopteffort and ropteffort based on the composite's size. I think that VBCurtis' optimized CADO parameters would be useful as a first reference, but those only have ropteffort values. Does anyone have an idea of what the scaling should be for sopteffort, as well as for ropteffort?

I'm attaching my current script to this post. The only requirement right now is that you have an input file that will work with CADO's sopt binary.

My current plans are:
- figure out a way to convert msieve's nps output into CADO-compatible polynomials
- split the sopt effort over multiple threads (currently only ropt has a threads parameter)
- implementing duplicate polynomial removal before every step

Feedback/advice is appreciated, thank you!

P.S. I'm not sure if this belongs here or if it should be moved to the programming sub-forum instead.
Attached Files
File Type: zip Scripts.zip (1.2 KB, 20 views)

Last fiddled with by EdH on 2021-10-06 at 12:08 Reason: forgot something!
Plutie is offline   Reply With Quote
Old 2021-10-04, 21:23   #2
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3×372 Posts
Default

Quote:
Originally Posted by Plutie View Post
. . .
Feedback/advice is appreciated, thank you!

P.S. I'm not sure if this belongs here or if it should be moved to the programming sub-forum instead.
In my spin scripts, based on Max0526's spin routine, I use sopt 0, 1, 10, 100 and ropt 0, 0.01, 1, 2, 10, 100. The initial discussion on the script is here. My current work is much more complicated, adding in some Msieve ropt. But, I'm not getting any better "spins." I look forward to seeing your script to see if I can find something to move mine forward. I'm also waiting to hear from Max on a couple notes I sent.

I think this thread might be better in the "Factoring" sub-forum, only because it involves more than just CADO-NFS, but let's leave it here for now and if the time comes to move it, I can do so.
EdH is offline   Reply With Quote
Old 2021-10-04, 22:20   #3
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

37·137 Posts
Default

Below 145 digits, I find that using sopt in a CADO poly select is slower than using that time to search a wider ad-range.

I've just recently begun trying to determine when sopt=1 provides a benefit greater than its time cost. By C180+, a high number for sopt like 8 or 10 seems to be worthwhile and perhaps even higher, so there ought to be some kind of scaling-up in between.
VBCurtis is offline   Reply With Quote
Old 2021-10-06, 03:19   #4
Plutie
 
"Evan"
Dec 2020
Montreal

1108 Posts
Default

Got the script into a working state, should be ready for use for anyone!

Unzip the scripts zip into your CADO build/[user]/polyselect directory. All you need in order to run it is a file from either msieve's size-optimization output (msieve -np1 -nps) or already in CADO's poly format. Run the script with ./polyFinder.sh (might need to run chmod +x polyFinder.sh first.) It should just run through all the steps on its own. The default sopteffort and ropteffort are higher, but I'll be fixing them up eventually. They shouldn't necessarily go any higher than what they're currently set to though.

Enjoy!

P.S. @mods: could the thread be renamed to something more appropriate? (and moved if necessary.)

Forgot to actually attach the zip, here it is. Could this be merged into the post above? (Done -EdH)
Attached Files
File Type: zip polyopt-1.0.0.zip (2.0 KB, 24 views)

Last fiddled with by EdH on 2021-10-06 at 12:12 Reason: OP Request for merge
Plutie is offline   Reply With Quote
Old 2021-10-06, 13:29   #5
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3×372 Posts
Default

I d/l'd your latest and am having some difficulties getting it set up.

First, I had to install pv and gawk. They were not defaults in my Ubuntu, but were readily available. I am totally unfamiliar with pv, but it looks like a pipe monitoring tool. Is this supplying the "0:00:00 [76.9k/s] [>..." line?

I then, first tried an Msieve polynomial:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
R0: -633156115026647924966242353927160849
R1: 1838892199331450047
A0: -4055358059413560189793755056163937883873351483520
A1: 16581144338788276414460714496465421079736
A2: -636936169159140542086060941338
A3: -29838939405307828973869
A4: -3553046772992
A5: 3408
skew 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
Then I tried manually converting it to CADO-NFS fomat:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: -633156115026647924966242353927160849
Y1: 1838892199331450047
c0: -4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: -636936169159140542086060941338
c3: -29838939405307828973869
c4: -3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
For both, I got:
Code:
$ bash polyFinder.sh
Enter the number being factored: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
What file are your polynomials stored in? polys
How many threads will you be using? 24
What degree polynomial are you optimizing? (4-6)5
How many polynomials would you like to run through root optimization? 12
Invalid input, attempting to reformat.
Parse error: parameter for key c0 is not an mpz: 
0:00:00 [76.9k/s] [>                              ]  1%             ETA 09:09:35
Size-optimization completed with (estimated) 0 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
0:00:00 [1.15k/s] [ <=>                                                       ] 
Best polynomial has been saved to the file: bestpoly
# (8ab2eea72) ./polyselect_ropt -inputpolys toppolys -ropteffort 100 -t 24
# Compiled with gcc 9.3.0
# Compilation flags (C) -std=c99 -g -W -Wall -O2  -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Compilation flags (C++) -export-dynamic -std=c++11 -Wno-c++11-compat -g -W -Wall -O2  -Wno-literal-suffix -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Info: Will use 24 threads
# Info: ropteffort = 100
# Info: L1_cachesize = 32768, size_tune_sievearray = 16384
# Reading polynomials from toppolys
# 0 polynomial(s) read.
# Info: Using OpenMP with 24 thread(s)


# Stat: total phase took 0.19s
# Stat: rootsieve took 0.00s
# Stat:  (stage 1 took 0.00s)
# Stat:  (tuning took 0.00s)
# Stat:  (stage 2 (sieving) took 0.00s)
# WARNING: No polynomials were found in the input file toppolys
Should I format the polynomial differently?
EdH is offline   Reply With Quote
Old 2021-10-06, 14:26   #6
Plutie
 
"Evan"
Dec 2020
Montreal

4816 Posts
Default

Quote:
Originally Posted by EdH View Post
I d/l'd your latest and am having some difficulties getting it set up.

First, I had to install pv and gawk. They were not defaults in my Ubuntu, but were readily available. I am totally unfamiliar with pv, but it looks like a pipe monitoring tool. Is this supplying the "0:00:00 [76.9k/s] [>..." line?

I then, first tried an Msieve polynomial:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
R0: -633156115026647924966242353927160849
R1: 1838892199331450047
A0: -4055358059413560189793755056163937883873351483520
A1: 16581144338788276414460714496465421079736
A2: -636936169159140542086060941338
A3: -29838939405307828973869
A4: -3553046772992
A5: 3408
skew 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
Then I tried manually converting it to CADO-NFS fomat:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: -633156115026647924966242353927160849
Y1: 1838892199331450047
c0: -4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: -636936169159140542086060941338
c3: -29838939405307828973869
c4: -3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
For both, I got:
Code:
$ bash polyFinder.sh
Enter the number being factored: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
What file are your polynomials stored in? polys
How many threads will you be using? 24
What degree polynomial are you optimizing? (4-6)5
How many polynomials would you like to run through root optimization? 12
Invalid input, attempting to reformat.
Parse error: parameter for key c0 is not an mpz: 
0:00:00 [76.9k/s] [>                              ]  1%             ETA 09:09:35
Size-optimization completed with (estimated) 0 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
0:00:00 [1.15k/s] [ <=>                                                       ] 
Best polynomial has been saved to the file: bestpoly
# (8ab2eea72) ./polyselect_ropt -inputpolys toppolys -ropteffort 100 -t 24
# Compiled with gcc 9.3.0
# Compilation flags (C) -std=c99 -g -W -Wall -O2  -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Compilation flags (C++) -export-dynamic -std=c++11 -Wno-c++11-compat -g -W -Wall -O2  -Wno-literal-suffix -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Info: Will use 24 threads
# Info: ropteffort = 100
# Info: L1_cachesize = 32768, size_tune_sievearray = 16384
# Reading polynomials from toppolys
# 0 polynomial(s) read.
# Info: Using OpenMP with 24 thread(s)


# Stat: total phase took 0.19s
# Stat: rootsieve took 0.00s
# Stat:  (stage 1 took 0.00s)
# Stat:  (tuning took 0.00s)
# Stat:  (stage 2 (sieving) took 0.00s)
# WARNING: No polynomials were found in the input file toppolys
Should I format the polynomial differently?
https://transfer.sh/KWRkr9/polyFinder.sh Replace the script with this one, I did some quick (not-so-clean) fixes. It works with your initial poly now.
Plutie is offline   Reply With Quote
Old 2021-10-06, 15:16   #7
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3·372 Posts
Default

Quote:
Originally Posted by Plutie View Post
https://transfer.sh/KWRkr9/polyFinder.sh Replace the script with this one, I did some quick (not-so-clean) fixes. It works with your initial poly now.
That got it working and it's underway ATM:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: -633156115026647924966242353927160849
Y1: 1838892199331450047
c0: -4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: -636936169159140542086060941338
c3: -29838939405307828973869
c4: -3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5

File is valid sopt input.
0:00:01 [36.9 /s] [==============================] 133%             ETA 11:03:16
Size-optimization completed with (estimated) 1 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
08:19 [46.1m/s] [=====================>         ] 74% ETA 0:02:53 ETA 11:14:29
What are the "undocumented" lines referring to?

The last line seems a bit misleading. It got to 74% and then just started clocking the time values upward.
EdH is offline   Reply With Quote
Old 2021-10-06, 15:41   #8
Plutie
 
"Evan"
Dec 2020
Montreal

23×32 Posts
Default

Quote:
Originally Posted by EdH View Post
That got it working and it's underway ATM:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: -633156115026647924966242353927160849
Y1: 1838892199331450047
c0: -4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: -636936169159140542086060941338
c3: -29838939405307828973869
c4: -3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5

File is valid sopt input.
0:00:01 [36.9 /s] [==============================] 133%             ETA 11:03:16
Size-optimization completed with (estimated) 1 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
08:19 [46.1m/s] [=====================>         ] 74% ETA 0:02:53 ETA 11:14:29
What are the "undocumented" lines referring to?

The last line seems a bit misleading. It got to 74% and then just started clocking the time values upward.
Both of the progress lines are (very) rough estimates of the time remaining. It's inherently flawed because it bases the progress on the lines output from the actual program. Since the ropt program takes a while to give any output once it starts working on a polynomial, it's going to be inaccurate. I initially didn't have any ETA functionality, but I gave it a try. The script is meant more for large lists, of possibly thousands of polys, in which case it should end up being a bit more accurate. I know the sopt progress bar is mostly correct, since there's a pretty consistent output from that binary.

The undocumented lines are output from CADO's ropt binary, they output to stderr. You can replace the line that calls ropt with this to hide them.
./polyselect_ropt -inputpolys toppolys -ropteffort 100 -t "$THREADS" 2>/dev/null | pv -pltIaes $(($(wc -l < toppolys) * 21 / 8)) > ropt.out


If you want a list to test the script on, here's one. https://transfer.sh/KagkW6/msieve.dat.ms. deg 6, head -1 msieve.dat.ms for composite.

Last fiddled with by Plutie on 2021-10-06 at 16:20 Reason: added example file
Plutie is offline   Reply With Quote
Old 2021-10-06, 16:38   #9
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3·372 Posts
Default

Success!

It completed with a better poly than supplied:
Code:
Best polynomial has been saved to the file: bestpoly
# Stat: rootsieve took 1625.22s
# Stat:  (stage 1 took 266.42s)
# Stat:  (tuning took 1317.08s)
# Stat:  (stage 2 (sieving) took 41.04s)
# Best polynomial found (revision 8ab2eea72):
# n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
# Y0: -633156115434329678267970664676644305
# Y1: 1838892199331450047
# c0: -942416842833037178008794982721825660284661427968
# c1: 12659775140765359789715754874236115108152
# c2: 17789742230088331244638103557030
# c3: -25013045517601501098285
# c4: -7330808774912
# c5: 3408
# skew: 1038278114.355
# # lognorm 59.22, E 50.34, alpha -8.88 (proj -1.40), 5 real roots
# # MurphyE(Bf=1.000e+07,Bg=5.000e+06,area=1.000e+16)=5.688e-14
# Average exp_E: 51.00, average E: 50.34
Your script is much more clean and professional than mine. Mine has a lot more going on. I'll do some comparisons later. I'm interested in seeing if yours will find the same Best poly with subsequent runs and what my script might turn up.

For info, the poly I chose for input was from the Polynomial Request Thread, and has some spun versions later in the thread.
EdH is offline   Reply With Quote
Old 2021-10-07, 03:39   #10
Plutie
 
"Evan"
Dec 2020
Montreal

23·32 Posts
Default

"Release" 1.0.1


- Added some more conversion logic, should be compatible with the standard poly format used on this forum now (just make sure it has the n value added manually, I'll figure that out eventually.)
Attached Files
File Type: zip polyopt-1.0.1.zip (2.4 KB, 32 views)
Plutie is offline   Reply With Quote
Old 2021-10-07, 12:42   #11
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3×372 Posts
Default

My script did come up with a better scoring polynomial, but it came up with the same one as yours initially:
Code:
# MurphyE(Bf=1.000e+07,Bg=5.000e+06,area=1.000e+16)=5.882e-14
Best poly cownoise values: 1431009352.14992      5.96454364e-14
The better one was after going back to Msieve and then through CADO a second time. I'll eventually post in the other thread about the changes to that script, but let's keep this thread focused on building your script.
EdH is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
we all make choices MattcAnderson MattcAnderson 0 2021-08-16 00:39
Choices for Manual Assignments Rodrigo Information & Answers 67 2019-09-20 06:33
Please respect my options and choices about email notifications retina Forum Feedback 9 2012-07-13 02:03
Parameter Underestimation R.D. Silverman Cunningham Tables 14 2010-09-29 19:56
ECM Work and Parameter Choices R.D. Silverman Cunningham Tables 11 2006-03-06 18:46

All times are UTC. The time now is 22:35.


Wed Dec 1 22:35:01 UTC 2021 up 131 days, 17:04, 1 user, load averages: 1.34, 1.34, 1.41

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.