 2017-10-01, 01:03 #430 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 3·1,229 Posts Follow-up: /trunk is still 1.34.5, but /wip is now 1.35-beta and all is well...
2017-10-09, 18:04 #431 jcrombie     "Jonathan" Jul 2010 In a tangled web... 2·107 Posts
Ben,
Found a problem and a fix. If your snfs number has >2 prime factors, yafu may switch over to siqs after finding a single factor. This eventually caused stability issues on one of my machines. It's a one-liner code fix for me.
Code:
svn diff nfs.c
Index: nfs.c
===================================================================
--- nfs.c (revision 366)
+++ nfs.c (working copy)
@@ -237,6 +237,7 @@
     // convert input to msieve bigint notation and initialize a list of factors
     gmp2mp_t(fobj->nfs_obj.gmp_n, &mpN);
     factor_list_init(&factor_list);
+    factor_list_add( obj, &factor_list, &mpN );
     if (fobj->nfs_obj.rangeq > 0)
     {
Cheers.
2017-10-10, 03:43   #432
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

Quote:
 Originally Posted by jcrombie This eventually caused stability issues on one of my machines.
?!?

2017-10-10, 04:18   #433
bsquared

"Ben"
Feb 2007

2²×23×37 Posts

Quote:
 Originally Posted by jcrombie Ben, Found a problem and a fix. If your snfs number has >2 prime factors, yafu may switch over to siqs after finding a single factor. This eventually caused stability issues on one of my machines. It's a one-liner code fix for me. Cheers.
Thanks for the testing and the fix!

2017-10-10, 04:56   #434
jcrombie

"Jonathan"
Jul 2010
In a tangled web...

2×107 Posts

Quote:
 Originally Posted by Dubslow ?!?
Ooops, guess that wasn't clear. Yafu crashed after about 20 more batch entries. Only crashes on one machine so far.

2017-11-08, 18:43 #435 Googulator   Dec 2015 2²×5 Posts
Poor behavior on many-core machines

Today I had a chance to test yafu v1.34 on a 12-core, 24-thread system. During NFS sieving on a 100-digit semiprime, I got the attached utilization graph in Task Manager. Very clearly visible is an alternating high-low-high-low pattern, roughly in sync on all cores. Based on the graph, I'd estimate that about 25-30% of CPU time is being wasted.

I believe the cause is that yafu waits for all 24 running work units to complete before dispatching another set of 24, rather than continuously dispatching new work units from a queue to any CPU core that becomes free. As a result, "lucky" cores that get assigned easier work units end up waiting for the other, "unlucky" cores. This, coupled with excessively small work units (10-20s on average per work unit, sometimes even shorter), with each new work unit incurring an additional penalty from initialization and teardown of a new ggnfs-lasieve process instance, causes the severe losses seen in the graph.

With higher core counts, this effect is only expected to get worse. With Supermicro now selling a 448-core server for compute-heavy applications, this is far from theoretical. Systems with heterogeneous multiprocessing, like ARM-based servers with big.LITTLE, are expected to be hit especially hard, as the faster "big" cores need to wait for the slower "little" cores.

The fix is twofold:
1. A workqueue-based system needs to be implemented, so that a core that completes its current work unit can immediately receive a new one, rather than waiting for all other cores to finish.
2. Work unit size needs to be configurable, or perhaps auto-detected based on tune data and the overall expected relation count for each job. This way, process setup/teardown losses can be minimized.
EDIT: Here is my nfs.job file that generated these graphs:
Code:
n: 8963380089744885777558797928146885007880782063175717257264780599471852039683421259806891474176679107
skew: 1098693.69
c0: 2120154191134731112499436882
c1: -1193060185506873214867
c2: -6467608951126288
c3: 427938998
c4: 3780
Y0: -1240923247985560434952675
Y1: 8317858792493
rlim: 1800000
alim: 1800000
lpbr: 26
lpba: 26
mfbr: 52
mfba: 52
rlambda: 2.5
alambda: 2.5
Also, attached a 2nd picture showing the overall utilization across all 24 cores combined.
Attached Thumbnails
Last fiddled with by Googulator on 2017-11-08 at 18:55
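The difference between the two dispatch strategies described in #435 can be illustrated with a small scheduling model. This is only a sketch under stated assumptions: it is not yafu code, the function names `batch_makespan` and `queue_makespan` are hypothetical, and it ignores the per-process setup/teardown cost of ggnfs-lasieve. It computes the wall-clock time needed to finish a list of work units when (a) a full batch must finish before the next one starts, and (b) each unit goes to whichever worker is free first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Batch dispatch (hypothetical model): start up to `workers` units at once,
 * wait for the slowest one, then start the next batch. */
double batch_makespan(const double *dur, size_t n, size_t workers)
{
    assert(workers > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i += workers) {
        double slowest = 0.0;
        for (size_t j = i; j < n && j < i + workers; j++)
            if (dur[j] > slowest)
                slowest = dur[j];
        total += slowest; /* every worker idles until the slowest is done */
    }
    return total;
}

/* Queue dispatch (hypothetical model): each unit, in order, is handed to the
 * worker that becomes free earliest. Returns -1.0 if allocation fails. */
double queue_makespan(const double *dur, size_t n, size_t workers)
{
    assert(workers > 0);
    double *free_at = calloc(workers, sizeof *free_at);
    if (free_at == NULL)
        return -1.0;

    double finish = 0.0;
    for (size_t i = 0; i < n; i++) {
        size_t w = 0; /* index of the earliest-free worker */
        for (size_t k = 1; k < workers; k++)
            if (free_at[k] < free_at[w])
                w = k;
        free_at[w] += dur[i];
        if (free_at[w] > finish)
            finish = free_at[w];
    }
    free(free_at);
    return finish;
}
```

With made-up unit durations of 20, 10, 10, 20, 10, 10 seconds on 2 workers, the batch model needs 50 seconds while the queue model needs 40, which is the kind of idle time the utilization graph suggests.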
 2017-11-08, 20:01 #436 bsquared     "Ben" Feb 2007 3404₁₀ Posts I agree with everything you said, and have seen the issue myself. There is even a workqueue based threading system in place in other routines within yafu (ECM, siqs). But I haven't had a chance to apply it to the NFS routines. Nor to tackle item 2) of your proposed fix. And that will probably remain the case for the foreseeable future.
2017-11-27, 23:33 #437 danaj   "Dana Jacobsen" Feb 2011 Bangkok, TH 2²×227 Posts
I'm not sure if this is just a known limitation, but sieverange gives incorrect output with a depth greater than 2^31 (it's not just that it stops sieving past there -- there is some sort of aliasing going on so values that should be output are not).
Code:
yafu "sieverange(10^50-10^7,10^50,2^32,0)" -pfile -v -v
For example, is missing the prime 99999999999999999999999999999999999999999990000293 and does not have the composite with 32-bit factor 99999999999999999999999999999999999999999990004049
Last fiddled with by danaj on 2017-11-27 at 23:35
2017-11-28, 14:45   #438
bsquared

"Ben"
Feb 2007

2²·23·37 Posts

Quote:
 Originally Posted by danaj I'm not sure if this is just a known limitation, but sieverange gives incorrect output with a depth greater than 2^31 (it's not just that it stops sieving past there -- there is some sort of aliasing going on so values that should be output are not). Code: yafu "sieverange(10^50-10^7,10^50,2^32,0)" -pfile -v -v For example, is missing the prime 99999999999999999999999999999999999999999990000293 and does not have the composite with 32-bit factor 99999999999999999999999999999999999999999990004049
The normal sieve routine (primes()) refuses to sieve above 4*10^18, so I guess I was aware of this back when I designed it. Must have forgot to impose the same limitation for the sieverange() function. Thanks for letting me know.
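For reference, sieving a range up to 4*10^18 needs primes up to sqrt(4*10^18) = 2*10^9, which is just below 2^31 = 2147483648, so the two limits in this exchange may well be related. The sketch below shows the kind of guard being discussed, plus one way aliasing can appear when a 64-bit depth is silently narrowed to 32 bits. It is an illustration only: the names `MAX_SIEVE_DEPTH`, `sieve_depth_ok` and `narrowed_depth` are hypothetical, the limit value is an assumption, and nothing here is confirmed to be the actual cause inside yafu.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for illustration: the largest depth a 32-bit sieving
 * code path could handle safely. Not taken from the yafu source. */
#define MAX_SIEVE_DEPTH ((uint64_t)1 << 31)

/* Guard of the kind mentioned above: reject depths the sieve cannot handle
 * instead of producing incomplete output. */
bool sieve_depth_ok(uint64_t depth)
{
    return depth <= MAX_SIEVE_DEPTH;
}

/* What a silent narrowing to 32 bits does to a requested depth:
 * the value wraps modulo 2^32, so 2^32 becomes 0. */
uint32_t narrowed_depth(uint64_t depth)
{
    return (uint32_t)depth;
}
```

Under these assumptions, a call such as `sieve_depth_ok((uint64_t)1 << 32)` returns false, so a request like the 2^32 depth in #437 would be refused up front rather than sieved with a wrapped value.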

2018-02-03, 00:16   #439
Happy5214

"Alexander"
Nov 2008
The Alamo City

2·7·41 Posts
Tune info ignored

Apparently, trunk is ignoring my tuning data.

Version 1.34.5:
Quote:
02/02/18 18:10:35 v1.34.5 @ happy5214-desktop, System/Build Info:
Using GMP-ECM 6.4.4, Powered by GMP 5.1.1
detected Intel(R) Core(TM)2 Quad CPU Q8300 @ 2.50GHz
detected L1 = 32768 bytes, L2 = 2097152 bytes, CL = 64 bytes
measured cpu frequency ~= 2493.699670
using 20 random witnesses for Rabin-Miller PRP checks
===============================================================
======= Welcome to YAFU (Yet Another Factoring Utility) =======
======= bbuhrow@gmail.com =======
======= Type help at any time, or quit to quit =======
===============================================================
cached 78498 primes. pmax = 999983
>> factor(111111111111111111111)
fac: factoring 111111111111111111111
fac: using pretesting plan: normal
fac: using tune info for qs/gnfs crossover
div: primes less than 10000
Total factoring time = 0.0021 seconds
***factors found***
P1 = 3
P2 = 37
P2 = 43
P3 = 239
P4 = 1933
P4 = 4649
P8 = 10838689
ans = 1
trunk:
Quote:
02/02/18 18:05:02 v1.34.5 @ happy5214-desktop, System/Build Info:
Using GMP-ECM 7.0.4, Powered by GMP 6.1.2
detected Intel(R) Core(TM)2 Quad CPU Q8300 @ 2.50GHz
detected L1 = 32768 bytes, L2 = 2097152 bytes, CL = 64 bytes
measured cpu frequency ~= 2493.686100
using 1 random witnesses for Rabin-Miller PRP checks
===============================================================
======= Welcome to YAFU (Yet Another Factoring Utility) =======
======= bbuhrow@gmail.com =======
======= Type help at any time, or quit to quit =======
===============================================================
cached 78498 primes. pmax = 999983
>> factor(111111111111111111111)
fac: factoring 111111111111111111111
fac: using pretesting plan: normal
fac: no tune info: using qs/gnfs crossover of 95 digits
div: primes less than 10000
Total factoring time = 0.0140 seconds
***factors found***
P1 = 3
P2 = 37
P2 = 43
P3 = 239
P4 = 1933
P4 = 4649
P8 = 10838689
1
I tested trunk with both tune info from the old version and freshly generated tune info.

 2018-03-03, 18:53 #440 Dubslow Basketry That Evening!     "Bunslow the Bold" Jun 2011 40

