 2021-08-26, 12:18 #12 swellman     Jun 2012 3^2×19^2 Posts What was command line for msieve-GPU?
 2021-08-26, 16:34 #13 VBCurtis     "Curtis" Feb 2005 Riverside, CA 13D0₁₆ Posts CPU search on msieve is hopeless: all our norm suggestions were intended to manage the firehose of output that a GPU generates. A CPU core finds about 2% of the raw polys of a ~2017 GPU (say, a 980 or 1070 level card), so your overnight run was maybe 90 minutes' worth of GPU searching. msieve never defaults to degree 6. Also, the norm guidance doesn't work so well for deg 6. I thought you were trying to automate msieve poly select for small numbers, not run a search for the 4788 C220. Size-opt for degree 6 is quite slow, what with an extra term to cycle through.
2021-08-27, 01:57   #14
EdH

"Ed Hall"
Dec 2009

3^2×457 Posts

Quote:
 Originally Posted by swellman What was command line for msieve-GPU?
If you mean the build command for my Msieve trials, I think it was the basic:
Code:
make all ECM=1 CUDA=1 (NO_ZLIB=1 - tried with and without)
Msieve returned without errors, but there were no .so files, so after it started, it would say it couldn't find them. I have a thread somewhere on my failures.

 2021-08-27, 05:14 #15 VBCurtis     "Curtis" Feb 2005 Riverside, CA 2^4·317 Posts No, he meant the invocation you used to call msieve. We've all had empty output files unexpectedly, and oftentimes there's a typo to be found in the chain-of-flags on the command line that explains the output. Then again, sometimes it's just too-tight norm settings!
2021-08-27, 12:17   #16
EdH

"Ed Hall"
Dec 2009

3^2·457 Posts

Quote:
 Originally Posted by VBCurtis No, he meant the invocation you used to call msieve. We've all had empty output files unexpectedly, and oftentimes there's a typo to be found in the chain-of-flags on the command line that explains the output. Then again, sometimes it's just too-tight norm settings!
I haven't gotten the GPU version to run. I'm trying to script something to use all the threads of my farm with Msieve as an experiment. The command is within a script. I relaxed stage2_norm and that gave me some results.

 2021-08-27, 18:30 #17 VBCurtis     "Curtis" Feb 2005 Riverside, CA 2^4·317 Posts Well, the reason we control either stage1 or stage2 norms from the command line is because msieve's defaults are from the era before GPUs, while GPUs produce 50-100x more data from stage1. So, if you're running CPU only, I'd relax stage1 norm to half of default, or maybe even default. That should yield enough raw polys for size-opt to have useful work to do.
2021-08-27, 19:07   #18
EdH

"Ed Hall"
Dec 2009

1011₁₆ Posts

Quote:
 Originally Posted by VBCurtis Well, the reason we control either stage1 or stage2 norms from the command line is because msieve's defaults are from the era before GPUs, while GPUs produce 50-100x more data from stage1. So, if you're running CPU only, I'd relax stage1 norm to half of default, or maybe even default. That should yield enough raw polys for size-opt to have useful work to do.
But, my problem was that I was piling up too many hits from stage 1 - several hundred thousand in five minutes. Maybe I really needed to restrict stage 1 via its norm, but leave stage 2 default in place.

Another query: I'm engaging all the threads of my machines via a script and providing the ranges via the script. I find that many ranges quit almost immediately with no results and those that continue always have the same coefficients. I thought that Msieve chooses random coefficients within the given range. Is the randomness only in the multiple of the small primes? And, does this mean if the ranges are too small, there will be no appropriate coefficients within some? I've written my script to take these skipped instances into account when assigning threads.

Additionally, once Msieve has chosen a coefficient within a range, it never releases that coefficient to move to another.* Should I choose ranges based on expecting each to hold only one appropriate coefficient, perhaps 120120 (2^3*3*5*7*11*13), which has been the first coefficient for every run?

I know Msieve README.nfs says each range can be run on multiple machines and the randomness will minimize collisions. Maybe my understanding of randomness was incorrect: two machines (threads) will work the same coefficient, and the randomness is in the work within that coefficient. Am I learning or losing it?

* I'm guessing that's because of the "deadline: 8640000 CPU-seconds per coefficient" and it really would move on after a few months.

 2021-08-27, 20:09 #19 VBCurtis     "Curtis" Feb 2005 Riverside, CA 1001111010000₂ Posts Far as I know (it has been years since I paid any attention to msieve poly select, so some of this is as vague as my memory): Leading coeff is not randomized. Multiples of 12 at the very smallest searches, 60 often, sometimes larger. I forget if it's the larger the input size or the larger the coeff itself that determines what the msieve version of "incr" is. It also may be the case that msieve samples some property other than divisibility by small primes to filter which leading coeffs to search deeply on. On big poly select jobs, the space of second and third coeffs within a leading coeff is so large that a slice of the space of a leading coeff is searched. For C190+, there can be 30 or more slices, so many machines can work on the same coefficient with little work duplicated.
2021-08-27, 20:31   #20
charybdis

Apr 2020

547 Posts

Quote:
 Originally Posted by VBCurtis Leading coeff is not randomized. Multiples of 12 at the very smallest searches, 60 often, sometimes larger. I forget if it's the larger the input size or the larger the coeff itself that determines what the msieve version of "incr" is.
From a cursory glance of the code, it's 420 for degree 4 searches, and then it's 12 up to 120 digits, 60 for 121-200 digits, and 120120 for >200 digits.
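
The rule above can be written down as a small Python sketch. The function name and signature are made up for illustration, and the thresholds are only the ones quoted in this post; they have not been checked against the msieve source.

```python
def leading_coeff_increment(digits: int, degree: int) -> int:
    """Step between candidate leading coefficients, per the rule quoted above.

    Hypothetical helper: mirrors the post's description, not msieve's code.
    """
    if degree == 4:
        return 420          # degree 4 searches
    if digits <= 120:
        return 12           # up to 120 digits
    if digits <= 200:
        return 60           # 121-200 digits
    return 120120           # more than 200 digits


# 120120 is the product of small primes mentioned earlier in the thread.
assert 120120 == 2**3 * 3 * 5 * 7 * 11 * 13

if __name__ == "__main__":
    for digits, degree in [(100, 5), (150, 5), (220, 6), (110, 4)]:
        print(digits, degree, leading_coeff_increment(digits, degree))
```

Under this reading, a C220 degree 6 search would only ever see leading coefficients that are multiples of 120120, which would match the first coefficient reported in every run.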

2021-08-27, 20:41   #21
EdH

"Ed Hall"
Dec 2009

3^2·457 Posts

Quote:
 Originally Posted by VBCurtis . . . On big poly select jobs, the space of second and third coeffs within a leading coeff is so large that a slice of the space of a leading coeff is searched. For C190+, there can be 30 or more slices, so many machines can work on the same coefficient with little work duplicated.
Ah Ha! This gives me a definite adjustment for my experiments. I'll try to work all threads of a single machine against a single coefficient.

Quote:
 Originally Posted by charybdis From a cursory glance of the code, it's 420 for degree 4 searches, and then it's 12 up to 120 digits, 60 for 121-200 digits, and 120120 for >200 digits.
I had noticed all the coefficients were multiples of 120120, but hesitated stating so. If I base my ranges such that each encloses a multiple, I should be able to set my scripts such that I can run lots of threads against those multiples and maybe even have some of my 8-thread machines working on the same coefficients.

Thanks!
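
A rough Python sketch of the range idea: build one narrow range around each multiple of 120120 so that every range encloses exactly one candidate coefficient. The helper name, the `margin` parameter, and the assumption that range endpoints are treated as inclusive are mine, not anything taken from msieve.

```python
from typing import Iterator, Tuple


def ranges_around_multiples(
    first_multiple: int, count: int, step: int = 120120, margin: int = 1
) -> Iterator[Tuple[int, int]]:
    """Yield (low, high) pairs that each enclose exactly one multiple of step.

    Hypothetical helper for a wrapper script; endpoints assumed inclusive.
    """
    if first_multiple <= 0 or first_multiple % step != 0:
        raise ValueError("first_multiple must be a positive multiple of step")
    if margin < 0 or 2 * margin >= step:
        # Keeps neighbouring ranges disjoint and one multiple per range.
        raise ValueError("margin must satisfy 0 <= 2 * margin < step")
    for k in range(count):
        m = first_multiple + k * step
        yield (m - margin, m + margin)


if __name__ == "__main__":
    # One range per thread; several threads could also share a range.
    for low, high in ranges_around_multiples(120120, 8):
        print(low, high)
```

Each pair could then be handed to one thread, or the same pair to several threads on one machine if the work within a coefficient really is split into slices.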

 2021-08-27, 22:20 #22 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 3^2×457 Posts Hmm. . . I guess the random threads didn't work:
Code:
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
120120 442190148297076679 593919502199669884690565060176108763
for my 24 threads. They must have all seeded the same or they aren't random. . .
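
A check like the following Python sketch would make this kind of duplication obvious when collecting results from many threads. It assumes each hit is one whitespace-separated line, as in the listing above; the function name is made up for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def summarize_hits(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total hits, distinct hits), ignoring blank lines and spacing."""
    hits = [" ".join(line.split()) for line in lines if line.strip()]
    return len(hits), len(Counter(hits))


if __name__ == "__main__":
    hit = "120120 442190148297076679 593919502199669884690565060176108763"
    total, distinct = summarize_hits([hit] * 24)
    print(f"{total} hits, {distinct} distinct")
```

If the distinct count stays at 1 while the total grows with the thread count, the threads are duplicating each other's work rather than covering different slices.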

