mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Msieve with GNFS support (https://www.mersenneforum.org/showthread.php?t=5413)

jbristow 2008-03-23 22:31

I tried a 124-digit GNFS using 64-bit MSVC and the square root failed. If I remember correctly, this worked in version 1.32 but not 1.33.

[code]
Sun Mar 23 14:16:08 2008
Sun Mar 23 14:16:08 2008
Sun Mar 23 14:16:08 2008 Msieve v. 1.34
Sun Mar 23 14:16:08 2008 random seeds: 2202b480 546fc432
Sun Mar 23 14:16:08 2008 factoring 2176886341730533741193704509392964187277206738611830167557220084710219673049248272600669861234269717975084518784604080685007 (124 digits)
Sun Mar 23 14:16:09 2008 no P-1/P+1/ECM available, skipping
Sun Mar 23 14:16:09 2008 commencing number field sieve (124-digit input)
Sun Mar 23 14:16:09 2008 R0: -514326065570184095717202
Sun Mar 23 14:16:09 2008 R1: 32319196968661
Sun Mar 23 14:16:09 2008 A0: -780149716838940394182442925
Sun Mar 23 14:16:09 2008 A1: 6676057545981059752964370
Sun Mar 23 14:16:09 2008 A2: -308583217397268277577
Sun Mar 23 14:16:09 2008 A3: 345282264568872
Sun Mar 23 14:16:09 2008 A4: 69367667284
Sun Mar 23 14:16:09 2008 A5: 60480
Sun Mar 23 14:16:09 2008 size score = 8.714383e-013, Murphy alpha = -5.924028, combined = 6.278089e-012
Sun Mar 23 14:16:28 2008 restarting with 16975404 relations
Sun Mar 23 14:16:31 2008 added 2965 free relations
Sun Mar 23 14:16:31 2008
Sun Mar 23 14:16:31 2008 commencing relation filtering
Sun Mar 23 14:16:31 2008 commencing duplicate removal, pass 1
Sun Mar 23 14:18:37 2008 found 2406146 hash collisions in 16978369 relations
Sun Mar 23 14:18:37 2008 commencing duplicate removal, pass 2
Sun Mar 23 14:18:52 2008 found 2285444 duplicates and 14692925 unique relations
Sun Mar 23 14:18:52 2008 memory use: 94.6 MB
Sun Mar 23 14:18:57 2008 ignoring smallest 731186 rational and 730206 algebraic ideals
Sun Mar 23 14:18:57 2008 filtering rational ideals above 11075584
Sun Mar 23 14:18:57 2008 filtering algebraic ideals above 11075584
Sun Mar 23 14:18:57 2008 need 2192088 more relations than ideals
Sun Mar 23 14:18:57 2008 commencing singleton removal, pass 1
Sun Mar 23 14:20:54 2008 relations with 0 large ideals: 647489
Sun Mar 23 14:20:54 2008 relations with 1 large ideals: 3045713
Sun Mar 23 14:20:54 2008 relations with 2 large ideals: 5461523
Sun Mar 23 14:20:54 2008 relations with 3 large ideals: 4288563
Sun Mar 23 14:20:54 2008 relations with 4 large ideals: 1244934
Sun Mar 23 14:20:54 2008 relations with 5 large ideals: 4703
Sun Mar 23 14:20:54 2008 relations with 6 large ideals: 0
Sun Mar 23 14:20:54 2008 relations with 7+ large ideals: 0
Sun Mar 23 14:20:54 2008 14692925 relations and about 10694122 large ideals
Sun Mar 23 14:20:54 2008 commencing singleton removal, pass 2
Sun Mar 23 14:22:52 2008 found 3403163 singletons
Sun Mar 23 14:22:52 2008 current dataset: 11289762 relations and about 6958937 large ideals
Sun Mar 23 14:22:52 2008 commencing singleton removal, pass 3
Sun Mar 23 14:24:24 2008 found 629940 singletons
Sun Mar 23 14:24:24 2008 current dataset: 10659822 relations and about 6314297 large ideals
Sun Mar 23 14:24:24 2008 commencing singleton removal, pass 4
Sun Mar 23 14:25:51 2008 found 114351 singletons
Sun Mar 23 14:25:51 2008 current dataset: 10545471 relations and about 6199448 large ideals
Sun Mar 23 14:25:52 2008 commencing singleton removal, final pass
Sun Mar 23 14:27:25 2008 memory use: 161.2 MB
Sun Mar 23 14:27:25 2008 commencing in-memory singleton removal
Sun Mar 23 14:27:26 2008 begin with 10545471 relations and 6574940 unique ideals
Sun Mar 23 14:27:34 2008 reduce to 10063396 relations and 6087727 ideals in 10 passes
Sun Mar 23 14:27:34 2008 max relations containing the same ideal: 50
Sun Mar 23 14:27:39 2008 removing 1592337 relations and 1192337 ideals in 400000 cliques
Sun Mar 23 14:27:40 2008 commencing in-memory singleton removal
Sun Mar 23 14:27:41 2008 begin with 8471059 relations and 6087727 unique ideals
Sun Mar 23 14:27:45 2008 reduce to 8347914 relations and 4766903 ideals in 6 passes
Sun Mar 23 14:27:45 2008 max relations containing the same ideal: 46
Sun Mar 23 14:27:48 2008 removing 1230712 relations and 830712 ideals in 400000 cliques
Sun Mar 23 14:27:48 2008 commencing in-memory singleton removal
Sun Mar 23 14:27:49 2008 begin with 7117202 relations and 4766903 unique ideals
Sun Mar 23 14:27:53 2008 reduce to 7027271 relations and 3842561 ideals in 7 passes
Sun Mar 23 14:27:53 2008 max relations containing the same ideal: 41
Sun Mar 23 14:27:55 2008 removing 1094378 relations and 694378 ideals in 400000 cliques
Sun Mar 23 14:27:56 2008 commencing in-memory singleton removal
Sun Mar 23 14:27:56 2008 begin with 5932893 relations and 3842561 unique ideals
Sun Mar 23 14:27:59 2008 reduce to 5838254 relations and 3048757 ideals in 6 passes
Sun Mar 23 14:27:59 2008 max relations containing the same ideal: 33
Sun Mar 23 14:28:01 2008 removing 1010348 relations and 610348 ideals in 400000 cliques
Sun Mar 23 14:28:01 2008 commencing in-memory singleton removal
Sun Mar 23 14:28:01 2008 begin with 4827906 relations and 3048757 unique ideals
Sun Mar 23 14:28:03 2008 reduce to 4726070 relations and 2329802 ideals in 6 passes
Sun Mar 23 14:28:03 2008 max relations containing the same ideal: 30
Sun Mar 23 14:28:05 2008 removing 572047 relations and 367868 ideals in 204179 cliques
Sun Mar 23 14:28:05 2008 commencing in-memory singleton removal
Sun Mar 23 14:28:05 2008 begin with 4154023 relations and 2329802 unique ideals
Sun Mar 23 14:28:06 2008 reduce to 4120906 relations and 1927400 ideals in 5 passes
Sun Mar 23 14:28:06 2008 max relations containing the same ideal: 28
Sun Mar 23 14:28:08 2008 filtering rational ideals above 750000
Sun Mar 23 14:28:08 2008 filtering algebraic ideals above 750000
Sun Mar 23 14:28:08 2008 need 119855 more relations than ideals
Sun Mar 23 14:28:08 2008 commencing singleton removal, final pass
Sun Mar 23 14:30:03 2008 keeping 6617269 ideals with weight <= 20, new excess is 931445
Sun Mar 23 14:30:14 2008 memory use: 248.9 MB
Sun Mar 23 14:30:14 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:15 2008 begin with 10063396 relations and 6617269 unique ideals
Sun Mar 23 14:30:21 2008 reduce to 10059164 relations and 6613037 ideals in 6 passes
Sun Mar 23 14:30:21 2008 max relations containing the same ideal: 20
Sun Mar 23 14:30:27 2008 removing 1597049 relations and 1197049 ideals in 400000 cliques
Sun Mar 23 14:30:27 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:28 2008 begin with 8462115 relations and 6613037 unique ideals
Sun Mar 23 14:30:33 2008 reduce to 8336735 relations and 5285211 ideals in 6 passes
Sun Mar 23 14:30:33 2008 max relations containing the same ideal: 20
Sun Mar 23 14:30:37 2008 removing 1237783 relations and 837783 ideals in 400000 cliques
Sun Mar 23 14:30:37 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:38 2008 begin with 7098952 relations and 5285211 unique ideals
Sun Mar 23 14:30:42 2008 reduce to 7004686 relations and 4349318 ideals in 6 passes
Sun Mar 23 14:30:42 2008 max relations containing the same ideal: 20
Sun Mar 23 14:30:45 2008 removing 1105343 relations and 705343 ideals in 400000 cliques
Sun Mar 23 14:30:45 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:46 2008 begin with 5899343 relations and 4349318 unique ideals
Sun Mar 23 14:30:49 2008 reduce to 5799382 relations and 3539064 ideals in 6 passes
Sun Mar 23 14:30:49 2008 max relations containing the same ideal: 20
Sun Mar 23 14:30:52 2008 removing 1035086 relations and 635086 ideals in 400000 cliques
Sun Mar 23 14:30:52 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:52 2008 begin with 4764296 relations and 3539064 unique ideals
Sun Mar 23 14:30:55 2008 reduce to 4649458 relations and 2781625 ideals in 7 passes
Sun Mar 23 14:30:55 2008 max relations containing the same ideal: 19
Sun Mar 23 14:30:57 2008 removing 992139 relations and 592139 ideals in 400000 cliques
Sun Mar 23 14:30:57 2008 commencing in-memory singleton removal
Sun Mar 23 14:30:58 2008 begin with 3657319 relations and 2781625 unique ideals
Sun Mar 23 14:31:00 2008 reduce to 3534086 relations and 2056127 ideals in 7 passes
Sun Mar 23 14:31:00 2008 max relations containing the same ideal: 18
Sun Mar 23 14:31:01 2008 removing 959466 relations and 561984 ideals in 397482 cliques
Sun Mar 23 14:31:01 2008 commencing in-memory singleton removal
Sun Mar 23 14:31:01 2008 begin with 2574620 relations and 2056127 unique ideals
Sun Mar 23 14:31:02 2008 reduce to 2441071 relations and 1346306 ideals in 6 passes
Sun Mar 23 14:31:02 2008 max relations containing the same ideal: 15
Sun Mar 23 14:31:03 2008 removing 66154 relations and 51866 ideals in 14288 cliques
Sun Mar 23 14:31:03 2008 commencing in-memory singleton removal
Sun Mar 23 14:31:03 2008 begin with 2374917 relations and 1346306 unique ideals
Sun Mar 23 14:31:04 2008 reduce to 2373946 relations and 1293458 ideals in 4 passes
Sun Mar 23 14:31:04 2008 max relations containing the same ideal: 15
Sun Mar 23 14:31:04 2008 relations with 0 large ideals: 265136
Sun Mar 23 14:31:04 2008 relations with 1 large ideals: 846532
Sun Mar 23 14:31:04 2008 relations with 2 large ideals: 844581
Sun Mar 23 14:31:04 2008 relations with 3 large ideals: 345838
Sun Mar 23 14:31:04 2008 relations with 4 large ideals: 66535
Sun Mar 23 14:31:04 2008 relations with 5 large ideals: 5142
Sun Mar 23 14:31:04 2008 relations with 6 large ideals: 180
Sun Mar 23 14:31:04 2008 relations with 7+ large ideals: 2
Sun Mar 23 14:31:04 2008 commencing 2-way merge
Sun Mar 23 14:31:05 2008 reduce to 1749373 relation sets and 668885 unique ideals
Sun Mar 23 14:31:05 2008 commencing full merge
Sun Mar 23 14:31:10 2008 memory use: 65.5 MB
Sun Mar 23 14:31:10 2008 found 1080298 cycles, need 931645
Sun Mar 23 14:31:10 2008 weight of 931645 cycles is about 39176921 (42.05/cycle)
Sun Mar 23 14:31:10 2008 distribution of cycle lengths:
Sun Mar 23 14:31:10 2008 1 relations: 265136
Sun Mar 23 14:31:10 2008 2 relations: 159017
Sun Mar 23 14:31:10 2008 3 relations: 132021
Sun Mar 23 14:31:10 2008 4 relations: 108703
Sun Mar 23 14:31:10 2008 5 relations: 90288
Sun Mar 23 14:31:10 2008 6 relations: 71572
Sun Mar 23 14:31:10 2008 7 relations: 54848
Sun Mar 23 14:31:10 2008 8 relations: 37598
Sun Mar 23 14:31:10 2008 9 relations: 12384
Sun Mar 23 14:31:10 2008 10+ relations: 78
Sun Mar 23 14:31:10 2008 heaviest cycle: 10 relations
Sun Mar 23 14:31:10 2008 matrix not dense enough, retrying
Sun Mar 23 14:31:10 2008 filtering rational ideals above 750000
Sun Mar 23 14:31:10 2008 filtering algebraic ideals above 750000
Sun Mar 23 14:31:10 2008 need 119855 more relations than ideals
Sun Mar 23 14:31:10 2008 commencing singleton removal, final pass
Sun Mar 23 14:33:06 2008 keeping 6732772 ideals with weight <= 25, new excess is 811710
Sun Mar 23 14:33:16 2008 memory use: 248.9 MB
Sun Mar 23 14:33:16 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:17 2008 begin with 10059164 relations and 6732772 unique ideals
Sun Mar 23 14:33:18 2008 reduce to 10059164 relations and 6732772 ideals in 1 passes
Sun Mar 23 14:33:18 2008 max relations containing the same ideal: 25
Sun Mar 23 14:33:24 2008 removing 1597032 relations and 1197032 ideals in 400000 cliques
Sun Mar 23 14:33:24 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:25 2008 begin with 8462132 relations and 6732772 unique ideals
Sun Mar 23 14:33:31 2008 reduce to 8336782 relations and 5404985 ideals in 6 passes
Sun Mar 23 14:33:31 2008 max relations containing the same ideal: 25
Sun Mar 23 14:33:35 2008 removing 1237797 relations and 837797 ideals in 400000 cliques
Sun Mar 23 14:33:35 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:36 2008 begin with 7098985 relations and 5404985 unique ideals
Sun Mar 23 14:33:40 2008 reduce to 7004692 relations and 4469052 ideals in 6 passes
Sun Mar 23 14:33:40 2008 max relations containing the same ideal: 25
Sun Mar 23 14:33:44 2008 removing 1105418 relations and 705418 ideals in 400000 cliques
Sun Mar 23 14:33:44 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:44 2008 begin with 5899274 relations and 4469052 unique ideals
Sun Mar 23 14:33:48 2008 reduce to 5799305 relations and 3658734 ideals in 6 passes
Sun Mar 23 14:33:48 2008 max relations containing the same ideal: 24
Sun Mar 23 14:33:51 2008 removing 1034877 relations and 634877 ideals in 400000 cliques
Sun Mar 23 14:33:51 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:51 2008 begin with 4764428 relations and 3658734 unique ideals
Sun Mar 23 14:33:55 2008 reduce to 4649533 relations and 2901419 ideals in 7 passes
Sun Mar 23 14:33:55 2008 max relations containing the same ideal: 23
Sun Mar 23 14:33:57 2008 removing 992120 relations and 592120 ideals in 400000 cliques
Sun Mar 23 14:33:57 2008 commencing in-memory singleton removal
Sun Mar 23 14:33:57 2008 begin with 3657413 relations and 2901419 unique ideals
Sun Mar 23 14:33:59 2008 reduce to 3534104 relations and 2175930 ideals in 7 passes
Sun Mar 23 14:33:59 2008 max relations containing the same ideal: 19
Sun Mar 23 14:34:01 2008 removing 964837 relations and 564837 ideals in 400000 cliques
Sun Mar 23 14:34:01 2008 commencing in-memory singleton removal
Sun Mar 23 14:34:01 2008 begin with 2569267 relations and 2175930 unique ideals
Sun Mar 23 14:34:03 2008 reduce to 2433178 relations and 1460464 ideals in 7 passes
Sun Mar 23 14:34:03 2008 max relations containing the same ideal: 17
Sun Mar 23 14:34:04 2008 removing 126166 relations and 95036 ideals in 31130 cliques
Sun Mar 23 14:34:04 2008 commencing in-memory singleton removal
Sun Mar 23 14:34:04 2008 begin with 2307012 relations and 1460464 unique ideals
Sun Mar 23 14:34:04 2008 reduce to 2303533 relations and 1361900 ideals in 4 passes
Sun Mar 23 14:34:04 2008 max relations containing the same ideal: 16
Sun Mar 23 14:34:05 2008 relations with 0 large ideals: 174180
Sun Mar 23 14:34:05 2008 relations with 1 large ideals: 654094
Sun Mar 23 14:34:05 2008 relations with 2 large ideals: 831446
Sun Mar 23 14:34:05 2008 relations with 3 large ideals: 477509
Sun Mar 23 14:34:05 2008 relations with 4 large ideals: 141853
Sun Mar 23 14:34:05 2008 relations with 5 large ideals: 22462
Sun Mar 23 14:34:05 2008 relations with 6 large ideals: 1905
Sun Mar 23 14:34:05 2008 relations with 7+ large ideals: 84
Sun Mar 23 14:34:05 2008 commencing 2-way merge
Sun Mar 23 14:34:06 2008 reduce to 1697526 relation sets and 755893 unique ideals
Sun Mar 23 14:34:06 2008 commencing full merge
Sun Mar 23 14:34:14 2008 memory use: 73.4 MB
Sun Mar 23 14:34:14 2008 found 932998 cycles, need 811910
Sun Mar 23 14:34:14 2008 weight of 811910 cycles is about 53394998 (65.76/cycle)
Sun Mar 23 14:34:14 2008 distribution of cycle lengths:
Sun Mar 23 14:34:14 2008 1 relations: 174180
Sun Mar 23 14:34:14 2008 2 relations: 97230
Sun Mar 23 14:34:14 2008 3 relations: 79525
Sun Mar 23 14:34:14 2008 4 relations: 69407
Sun Mar 23 14:34:14 2008 5 relations: 63234
Sun Mar 23 14:34:14 2008 6 relations: 56810
Sun Mar 23 14:34:14 2008 7 relations: 51727
Sun Mar 23 14:34:14 2008 8 relations: 46598
Sun Mar 23 14:34:14 2008 9 relations: 41272
Sun Mar 23 14:34:14 2008 10+ relations: 131927
Sun Mar 23 14:34:14 2008 heaviest cycle: 17 relations
Sun Mar 23 14:34:14 2008 commencing cycle optimization
Sun Mar 23 14:34:15 2008 start with 4188597 relations
Sun Mar 23 14:34:23 2008 pruned 287180 relations
Sun Mar 23 14:34:23 2008 memory use: 121.2 MB
Sun Mar 23 14:34:23 2008 distribution of cycle lengths:
Sun Mar 23 14:34:23 2008 1 relations: 174180
Sun Mar 23 14:34:23 2008 2 relations: 100243
Sun Mar 23 14:34:23 2008 3 relations: 83869
Sun Mar 23 14:34:23 2008 4 relations: 74474
Sun Mar 23 14:34:23 2008 5 relations: 69928
Sun Mar 23 14:34:23 2008 6 relations: 63729
Sun Mar 23 14:34:23 2008 7 relations: 58420
Sun Mar 23 14:34:23 2008 8 relations: 51067
Sun Mar 23 14:34:23 2008 9 relations: 43735
Sun Mar 23 14:34:23 2008 10+ relations: 92265
Sun Mar 23 14:34:23 2008 heaviest cycle: 16 relations
Sun Mar 23 14:34:24 2008
Sun Mar 23 14:34:24 2008 commencing linear algebra
Sun Mar 23 14:34:24 2008 read 811910 cycles
Sun Mar 23 14:34:25 2008 cycles contain 1961722 unique relations
Sun Mar 23 14:34:45 2008 read 1961722 relations
Sun Mar 23 14:34:48 2008 using 32 quadratic characters above 134216262
Sun Mar 23 14:35:05 2008 building initial matrix
Sun Mar 23 14:35:28 2008 memory use: 265.9 MB
Sun Mar 23 14:35:29 2008 read 811910 cycles
Sun Mar 23 14:35:30 2008 matrix is 811586 x 811910 (217.3 MB) with weight 71944603 (88.61/col)
Sun Mar 23 14:35:30 2008 sparse part has weight 47233217 (58.18/col)
Sun Mar 23 14:35:39 2008 filtering completed in 3 passes
Sun Mar 23 14:35:40 2008 matrix is 807698 x 807898 (216.8 MB) with weight 71728035 (88.78/col)
Sun Mar 23 14:35:40 2008 sparse part has weight 47126790 (58.33/col)
Sun Mar 23 14:35:45 2008 read 807898 cycles
Sun Mar 23 14:35:46 2008 matrix is 807698 x 807898 (216.8 MB) with weight 71728035 (88.78/col)
Sun Mar 23 14:35:46 2008 sparse part has weight 47126790 (58.33/col)
Sun Mar 23 14:35:46 2008 saving the first 48 matrix rows for later
Sun Mar 23 14:35:46 2008 matrix is 807650 x 807898 (208.0 MB) with weight 54691277 (67.70/col)
Sun Mar 23 14:35:46 2008 sparse part has weight 46433959 (57.48/col)
Sun Mar 23 14:35:46 2008 matrix includes 64 packed rows
Sun Mar 23 14:35:46 2008 using block size 65536 for processor cache size 4096 kB
Sun Mar 23 14:35:50 2008 commencing Lanczos iteration (4 threads)
Sun Mar 23 14:35:50 2008 memory use: 215.8 MB
Sun Mar 23 15:14:21 2008 lanczos halted after 12774 iterations (dim = 807650)
Sun Mar 23 15:14:22 2008 recovered 43 nontrivial dependencies
Sun Mar 23 15:14:22 2008
Sun Mar 23 15:14:22 2008 commencing square root phase
Sun Mar 23 15:14:22 2008 reading relations for dependency 1
Sun Mar 23 15:14:23 2008 read 403464 cycles
Sun Mar 23 15:14:23 2008 cycles contain 1214570 unique relations
Sun Mar 23 15:14:37 2008 read 1214570 relations
Sun Mar 23 15:14:44 2008 multiplying 1947120 relations
Sun Mar 23 15:19:32 2008 multiply complete, coefficients have about 95.26 million bits
Sun Mar 23 15:19:33 2008 error: relation product is incorrect
Sun Mar 23 15:19:33 2008 algebraic square root failed
Sun Mar 23 15:19:33 2008 reading relations for dependency 2
Sun Mar 23 15:19:33 2008 read 403355 cycles
Sun Mar 23 15:19:34 2008 cycles contain 1213255 unique relations
Sun Mar 23 15:19:47 2008 read 1213255 relations
Sun Mar 23 15:19:54 2008 multiplying 1945522 relations
Sun Mar 23 15:24:44 2008 multiply complete, coefficients have about 95.19 million bits
Sun Mar 23 15:24:45 2008 error: relation product is incorrect
Sun Mar 23 15:24:45 2008 algebraic square root failed
Sun Mar 23 15:24:45 2008 reading relations for dependency 3
Sun Mar 23 15:24:45 2008 read 403937 cycles
Sun Mar 23 15:24:46 2008 cycles contain 1214985 unique relations
[/code]

R.D. Silverman 2008-04-14 12:21

[QUOTE=jasonp;131489]Now available. Highlights include

- A lot of work on NFS polynomial selection (still a major work in progress)

- A bunch of fixes for Visual Studio. This should fix porting problems in the last version, upgrade the build project to use MSVC 9.0, and should also fix a bug in the NFS square root that showed up in MSVC but also occurs with gcc (thanks to the MSVC users for helping out with this one).

Future work will continue the overhaul of the polynomial selection, and if time allows I'll experiment with using 128-bit vectors to speed up the linear algebra.

Please post followups in one of the other threads in this subforum. Happy factoring,

jasonp[/QUOTE]

Based on comments that have been made, it seems that the msieve
LA code is quite a bit faster than the CWI code. Does anyone know
why? One of the things on my "to do" list, is to improve the CWI LA
code.

jasonp 2008-04-14 13:28

[QUOTE=R.D. Silverman;131514]Based on comments that have been made, it seems that the msieve
LA code is quite a bit faster than the CWI code. Does anyone know
why? One of the things on my "to do" list, is to improve the CWI LA
code.[/QUOTE]
I can only speculate, not having seen the CWI version. Does the CWI code find 32 dependencies or 64? The latter is worth a noticeable speedup, and GGNFS switched to that many eventually. Otherwise, there are three reasons I can think of:

- msieve's linear algebra uses a block decomposition for the matrix multiply, but within a single block all of the coordinates to XOR are packed contiguously into a single long array. The much longer inner loops this generates should not make a difference on a modern out-of-order processor, but this alone makes a matrix multiply twice as fast as the one in GGNFS. Possibly that's because it eliminates all the branch mispredictions. Ironically, this trick was suggested to me by Peter Montgomery back in 1997 :)

- The x86 assembly code uses two tricks to reduce the number of memory operations. First, loads and stores of vectors use an MMX register to handle 64 bits at a time. Second, the coordinates to use are packed into a pair of 16-bit unsigned integers, and both coordinates are pulled into registers using a single 32-bit load and then separated via shift-and-mask. These 16-bit numbers are treated as offsets from the upper left corner of each block in the matrix, meaning blocks can be a maximum of 64k x 64k in size. The reduced number of memory operations lets the processor buffer many more of them, tolerating memory latency much better. Just using a single MMX register doubled the matrix multiply speed on a Core2 processor, with the addressing trick adding another ~6%

- Before the matrix solve begins, the matrix columns are sorted in order of increasing density, and then the heaviest columns are interleaved with the lightest columns. This is a static partitioning of the data so that all regions of the matrix have about the same number of nonzero elements, and helps balance the load when using multiple threads. There is a huge literature on doing this for ordinary floating point matrices, and the most effective techniques use graph partitioning, but this is extremely memory intensive and using state of the art graph partitioning packages to do this is infeasible for even modest-size NFS matrices unless the partitioning itself happens in parallel, spread across many machines


The other possibilities for improving performance include:

- finding more than 64 dependencies. This is what the all the major academic groups are doing with their code, and it has the nice advantage of increasing the amount of work that one thread has to do before synchronizing with other threads

- in a similar vein, but IMHO totally untried, it's possible to rearrange the matrix columns to contain an unusually large number of 1x2, 2x1 and 2x2 (or even k x k) matrices of nonzero elements by approximately solving a sparse traveling salesman problem, or by running standard column reordering algorithms and then a simple greedy procedure on the result. If there are enough of these dense submatrices, then the memory footprint needed to represent them is reduced, and you can use hardwired dense multiply code that has much lower overhead than the ordinary sparse multiply code. Again, there's a sizable literture on using this technique for floating point sparse matrix problems, and I've been meaning to test this idea but haven't had the time. It makes a difference with highly regular sparse matrices but probably won't pan out for NFS matrices

R.D. Silverman 2008-04-14 15:31

[QUOTE=jasonp;131522]I can only speculate, not having seen the CWI version. Does the CWI code find 32 dependencies or 64? The latter is worth a noticeable speedup, and GGNFS switched to that many eventually. Otherwise, there are three reasons I can think of:


<snip>
[/QUOTE]


The CWI code has no assembler at all and does not use the MMX registers....

bdodson 2008-04-14 21:40

[QUOTE=Wacky;126211] ...
I am comparing Greg's just completed run with the one that I have in progress.

He ran 292.69 hours to process "matrix is 9862941 x 9863188 with weight 664531270 (avg 67.37/col)"

I am estimating 543 hours to process "matrix is 13558127 x 13558374 (3675.6 MB) with weight 906838153 (66.88/col)"

The hardware is similar. Dual dual-core Xeon processors each running 4 threads.
....
Since we have only a limited number of machines in the class necessary to handle a matrix of this size, and since the execution time is non-trivial, are we reaching the optimum filtering point. For problems of this class, have we analyzed the tradeoff if instead using a smaller, but more dense matrix? (Assuming that we still have sufficient memory to avoid swapping)

What about using 128 or 256 bit blocks? ...[/QUOTE]

I seem to have started off in the deep end of the pool; and am wondering
whether I ought to be paddling towards the ladder out! Almost all of the
sieving for 3,536+ C252 was done in 10 days, up to 300 cores (and then
several more days for the last few stragglers, with jobs that kept getting
reset by other users on the cluster). I used Greg's duplicate remover to
take out 34.38M duplicates, then added 750K stragglers, after which
msieve removed another 280K duplicates (due the the stragglers), leaving

> 173806695 [unique] relations and about 92202326 large ideals,

reasonably above the 23.5M extra needed (?). The matrix started,
4 threads on one quadcore core (1st question: was I suppose to be able
to use more processors, 4 threads each as in Richard's post?). As I'm
reading over the available logs, I gather that I won't hear anything
until the matrix finishes (if I'm lucky), and it looks like that will be some
extended time. Perhaps I ought to have over-sieved or over-sieved more,
or even stop now, and pick up more relns?

The matrix build took "5016.4 MB", after which msieve reports

[code]
Mon Apr 14 10:53:00 2008 matrix is 14279232 x 14280170 (4031.9 MB)
with weight 1268445843 (88.83/col)
Mon Apr 14 10:53:00 2008 sparse part has weight 899865640 (63.02/col)

Mon Apr 14 11:51:36 2008 filtering completed in 3 passes
Mon Apr 14 11:52:19 2008 matrix is 14229076 x 14229276 (4025.4 MB)
with weight 1266010783 (88.97/col)
Mon Apr 14 11:52:19 2008 sparse part has weight 898702280 (63.16/col)
Mon Apr 14 12:21:50 2008 read 14229276 cycles
Mon Apr 14 12:26:13 2008 matrix is 14229076 x 14229276 (4025.4 MB)
with weight 1266010783 (88.97/col)
Mon Apr 14 12:26:13 2008 sparse part has weight 898702280 (63.16/col)
Mon Apr 14 12:26:15 2008 saving the first 48 matrix rows for later
Mon Apr 14 12:26:54 2008 matrix is 14229028 x 14229276 (3878.5 MB)
with weight 960343917 (67.49/col)
Mon Apr 14 12:26:55 2008 sparse part has weight 874437208 (61.45/col)
Mon Apr 14 12:26:55 2008 matrix includes 64 packed rows
Mon Apr 14 12:26:55 2008 using block size 65536 for processor cache size 6144 kB
Mon Apr 14 12:35:20 2008 commencing Lanczos iteration (4 threads)
Mon Apr 14 12:35:20 2008 memory use: 4162.4 MB
[/code]

Looks larger, harder than the above. From the date, late Feb, the above
is 2,841- not 12,241- so difficulty 253, not sure how that compares with
3,536+ c252 (which is off my table, up to c250!); but in the same range
(and under the number sieving now, 10,257- C241 at diff 257)? -Bruce

jasonp 2008-04-14 22:10

[QUOTE=bdodson;131538]
> 173806695 [unique] relations and about 92202326 large ideals,

reasonably above the 23.5M extra needed (?).
[/QUOTE]
The exact number of ideals used in the filtering is logged in the singleton removal pass that says 'final pass', the figure above is only approximate
[QUOTE]
The matrix started,
4 threads on one quadcore core (1st question: was I suppose to be able
to use more processors, 4 threads each as in Richard's post?). As I'm
reading over the available logs, I gather that I won't hear anything
until the matrix finishes (if I'm lucky), and it looks like that will be some
extended time. Perhaps I ought to have over-sieved or over-sieved more,
or even stop now, and pick up more relns?
[/QUOTE]
I don't think anyone has tried runs with more than 4 threads, and I don't have any intuition about what to expect when trying a large number of threads packed into a small number of processors. The library has a hard limit of 32 threads, but this is a rather large example to experiment with. Note that interrupting the run will write a savefile, and the restart can use a different number of threads, if you want to run short experiments.

The matrix size and density looks to be approximately in line with what people have been reporting for 250+ digit jobs, at least to me, though there aren't too many data points in this range. Has it run through enough iterations that you can estimate how long the matrix should take?

PS: you're correct that the 13.5M matrix Richard described is for 2,841-

Wacky 2008-04-14 22:18

You should be somewhat better off than I am on 12,241-. Mine has
[code]matrix is 18720804 x 18721052 (5170.3 MB) with weight 1285924286 (68.69/col)
sparse part has weight 1168145840 (62.40/col)
matrix includes 64 packed rows
using block size 65536 for processor cache size 4096 kB
commencing Lanczos iteration (4 threads)
memory use: 5547.0 MB[/code]
rkw 62500 386.1 17.1 3338192 1433224 s003 R+ 21Mar08 117669:59.58 build/Release/Msieve -s 12m241.dat -l 12m241.log -i 12m241.worktodo.ini -v -t 4 -nc -nf 12m241.msieve.fb

It is currently at 47.4% and will need most of an additional month to finish.

You should be able to make a reasonable projection if you can see the intermediate progress report on the "console".

R.D. Silverman 2008-04-15 01:26

[QUOTE=R.D. Silverman;131523]The CWI code has no assembler at all and does not use the MMX registers....[/QUOTE]


It is also just single threaded......

frmky 2008-04-15 03:27

The matrix for 12,227+ was
[CODE]
Thu Mar 27 15:31:25 2008 matrix is 11775851 x 11776145 (3359.8 MB) with weight 1054487285 (89.54/col)
Thu Mar 27 15:31:25 2008 sparse part has weight 751222879 (63.79/col)
Thu Mar 27 15:35:01 2008 filtering completed in 3 passes
Thu Mar 27 15:35:04 2008 matrix is 11746333 x 11746533 (3356.9 MB) with weight 1053290577 (89.67/col)
Thu Mar 27 15:35:04 2008 sparse part has weight 750768272 (63.91/col)
Thu Mar 27 15:36:50 2008 read 11746533 cycles
Thu Mar 27 15:36:56 2008 matrix is 11746333 x 11746533 (3356.9 MB) with weight 1053290577 (89.67/col)
Thu Mar 27 15:36:56 2008 sparse part has weight 750768272 (63.91/col)
Thu Mar 27 15:36:56 2008 saving the first 48 matrix rows for later
Thu Mar 27 15:37:01 2008 matrix is 11746285 x 11746533 (3221.9 MB) with weight 803452505 (68.40/col)
Thu Mar 27 15:37:01 2008 sparse part has weight 727145147 (61.90/col)
Thu Mar 27 15:37:01 2008 matrix includes 64 packed rows
Thu Mar 27 15:37:01 2008 using block size 65536 for processor cache size 4096 kB
Thu Mar 27 15:37:43 2008 commencing Lanczos iteration (4 threads)
Thu Mar 27 15:37:43 2008 memory use: 3445.7 MB
[/CODE]

and took 18 days on a quad-core Core 2 to run. Unfortunately, it failed but I strongly suspect that was due to a hardware error. I'll rerun it on a different computer in a week or so.

Greg

xilman 2008-04-15 07:52

[QUOTE=jasonp;131522]I can only speculate, not having seen the CWI version. Does the CWI code find 32 dependencies or 64?[/QUOTE]It's a compile-time constant. I've experimented with block sizes up to 256 and found that, for my environment, 128 is about right. 256 is neither markedly worse or markedly better. 64 is definitely inferior.


Paul

bdodson 2008-04-15 11:11

[QUOTE=jasonp;131540]...Note that interrupting the run will write a savefile, and the restart can use a different number of threads, if you want to run short experiments.

The matrix size and density looks to be approximately in line with what people have been reporting for 250+ digit jobs, at least to me, though there aren't too many data points in this range. Has it run through enough iterations that you can estimate how long the matrix should take?

PS: you're correct that the 13.5M matrix Richard described is for 2,841-[/QUOTE]

Thanks. The new sgi compute server was rebooted yesterday evening
before msieve wrote the first checkpoint; so -ncr didn't work, of
course, but -nc2 picked up from the last cycle write, and went
directly to the two 14.2M^2 matrices (with and without the first
48 rows; ... uhm, that's "read the cycles; read the relns; re-computed
the quad characters" then went directly to the matrix). These quads
don't open officially until c. June1 (shortly before the old, leased,
sgi altrix Itanium is boxed-up and shipped back), with several 100s
more software packages to be installed --- and then there'll be
more/other users.

Does "interrupt" mean something like kill -TERM pid, or kill -TSP pid?
I'm not seeing anything on my "console", as I'm logging in remotely,
running

(./msieve -v -t 4 -nc2 ) > ms2.mat3 &

and logging off. The stdout (stderr?) has only dumped once so far, when
the buffer for msieve.log hit 3200-or-so bits during filtering. After the
reboot, when msieve hit the first checkpoint (!), but msieve.log hasn't said
anything yet ... as I was saying, some of the logfiles posted here say

"starting lanczos, memory xxx",

then nothing for 10days, two weeks, to the lanczos finish. I'll try
an interrupt and restarting on an open term on my office pc, and see
whether that does better at collecting timings --- as I read the docs,
they say that -v just writes a 2nd copy of the msieve.log, with it's
week-or-more of "no comment". Did I miss the part of the instructions
(on the 512-thread; thanks) about enabling timing info?

Anyway, I'm less inclined to consider extra sieving; seems this matrix
is within/near the range I ought to have expected (I don't think I
did any with the cwi suite of sgi/irix binaries --- for which the lanczos
in "gauss-64" certainly included mips assembler --- above 3M^2, although
I did ship larger ones off to Xilman at msr, and to Peter at cwi). The
checkpoint's a good sign. Also, I liked the top/monitor report:

[code]

%cpu
5885 bad0 25 0 4771m 4.6g 484 R 399 4.2 1071:13 msieve
6073 bad0 25 0 521m 223m 560 R 101 0.2 396:16.77 gnfs-lasieve4I
---

5885 bad0 17 0 4771m 4.6g 484 R 343 4.2 1068:42 msieve
[/code]

which clearly indicates that the threads are threading (the "bad0"
being my Lehigh id ...). -Bruce


All times are UTC. The time now is 22:43.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.