mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Msieve

Reply
 
Thread Tools
Old 2012-12-03, 18:55   #34
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

Quote:
Originally Posted by jasonp View Post
(At this point I'm proceeding as if whatever is going on is either a hitherto-unknown problem in the algorithm that only manifests itself sporadically or with low amounts of oversieving, or a compiler issue)
Hmm... but these issues have only started recently, around the time you mention came that optimization.

I can't test it ATM (I'm booted to Windows) but I'll try and test it tonight. If that doesn't work I'll try undoing the 6-weeks-ago-changes.
Dubslow is offline   Reply With Quote
Old 2012-12-03, 19:39   #35
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

23×3×5×72 Posts
Default

Quote:
Originally Posted by Dubslow View Post
Hmm... but these issues have only started recently, around the time you mention came that optimization.

I can't test it ATM (I'm booted to Windows) but I'll try and test it tonight. If that doesn't work I'll try undoing the 6-weeks-ago-changes.
Have you upgraded gcc or any of your compile chain recently which could have caused this?
henryzz is offline   Reply With Quote
Old 2012-12-05, 20:59   #36
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Quote:
Originally Posted by henryzz View Post
Have you upgraded gcc or any of your compile chain recently which could have caused this?
Nope, not since I installed the OS in April 2011. I'd also note that xilman's run into this issue as well, I'm just the more verbose about reporting such problems. (And I guess I do a lot of small jobs, where he does fewer larger ones.)

@jasonp: I commented out the "filter_postproc_rels(obj, &merge);" call around 8 lines from the bottom of gnfs/filter/filter.c and tried again; no dice.
Code:
commencing relation filtering 
estimated available RAM is 11949.2 MB 
commencing duplicate removal, pass 1 
read 10M relations 
found 894746 hash collisions in 10945954 relations
 commencing duplicate removal, pass 2 
found 561259 duplicates and 10384695 unique relations 
memory use: 49.3 MB 
reading ideals above 720000 
commencing singleton removal, initial pass 
memory use: 344.5 MB 
reading all ideals from disk 
memory use: 289.7 MB 
keeping 10907617 ideals with weight <= 200, target excess is 116031 commencing in-memory singleton removal 
begin with 10384695 relations and 10907617 unique ideals 
reduce to 4317179 relations and 3799573 ideals in 17 passes 
max relations containing the same ideal: 99 removing 
1117344 relations and 925839 ideals in 191505 cliques 
commencing in-memory singleton removal 
begin with 3199835 relations and 3799573 unique ideals 
reduce to 3013618 relations and 2677504 ideals in 10 passes 
max relations containing the same ideal: 74 
removing 824656 relations and 633151 ideals in 191505 cliques 
commencing in-memory singleton removal 
begin with 2188962 relations and 2677504 unique ideals 
reduce to 2025471 relations and 1870220 ideals in 12 passes 
max relations containing the same ideal: 56 
removing 148353 relations and 127698 ideals in 20655 cliques 
commencing in-memory singleton removal 
begin with 1877118 relations and 1870220 unique ideals 
reduce to 1869550 relations and 1734877 ideals in 7 passes 
max relations containing the same ideal: 51 
relations with 0 large ideals: 449 
relations with 1 large ideals: 4666 
relations with 2 large ideals: 50034 
relations with 3 large ideals: 215958
relations with 4 large ideals: 468714 
relations with 5 large ideals: 561836 
relations with 6 large ideals: 382484 
relations with 7+ large ideals: 185409 
commencing 2-way merge 
reduce to 1118199 relation sets and 983526 unique ideals 
commencing full merge 
memory use: 105.1 MB 
found 549498 cycles, need 531726 
weight of 531726 cycles is about 37495284 (70.52/cycle) 
distribution of cycle lengths: 
1 relations: 57799 
2 relations: 55946 
3 relations: 58003 
4 relations: 55103 
5 relations: 51026 
6 relations: 45680 
7 relations: 39833 
8 relations: 34759 
9 relations: 29487 
10+ relations: 104090 
heaviest cycle: 19 
relations RelProcTime: 300 
nfs: commencing msieve linear algebra 

commencing linear algebra 
read 531726 cycles 
cycles contain 1766484 unique relations 
read 1766484 relations using 20 quadratic characters above 134216630 building initial matrix 
memory use: 224.2 MB 
read 531726 cycles 
matrix is 550499 x 531726 (164.5 MB) with weight 51855411 (97.52/col) 
sparse part has weight 36750644 (69.12/col) filtering completed in 2 passes 
matrix is 534302 x 524247 (162.2 MB) with weight 51091073 (97.46/col) 
sparse part has weight 36217721 (69.09/col) 
matrix starts at (0, 0) 
matrix is 534302 x 524247 (162.2 MB) with weight 51091073 (97.46/col) 
sparse part has weight 36217721 (69.09/col) 
matrix needs more columns than rows; try adding 2-3% more relations
I'll try undoing those recent changes (with the call to filter_postproc_rels readded), but that'll need to be edited in here in ~half an hour.

Actually... it seems that whatever the most recent Msieve that YAFU will compile with (being the one that I'm using) is older than those filtering changes due to C. Bouvier; so, it's already like the "old" code.

I'm not quite sure what else to test, then. Here's the dataset. (It's uncompressed, to avoid corruption, since it seems a few hundred rels make the difference. It's a bit under 1GB. Compression on request.)

Last fiddled with by Dubslow on 2012-12-05 at 21:53
Dubslow is offline   Reply With Quote
Old 2012-12-06, 07:33   #37
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

11100001101012 Posts
Default

Gah! It's happened again! I'll look more closely at the revision logs tomorrow... I'll even try the old suggested fix on this new dataset...
Dubslow is offline   Reply With Quote
Old 2013-01-03, 16:28   #38
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Jason, a couple of requests for Msieve, both of which stem from bringing YAFU up to date with the new api.

1) Could you please modify analyze_one_poly() to return whatever values it calculates (size, root, and combined score, and possibly rroots; I'm not sure which ones Ben might use for YAFU's SNFS stuff)?

2) Could you put un/load_dynamic_lib() inside #ifdef HAVE_CUDA statements? They're only called by CUDA code, but are compiled even without CUDA, which causes the following warning when I compile YAFU:
Code:
/home/bill/msievecuda/libmsieve.a(util.o): In function `load_dynamic_lib':
util.c:(.text+0x8d7): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
I get the same warning if I link demo.c statically as well, but static is the default for YAFU.

-Bill

Last fiddled with by Dubslow on 2013-01-03 at 16:28
Dubslow is offline   Reply With Quote
Old 2013-01-04, 11:22   #39
debrouxl
 
debrouxl's Avatar
 
Sep 2009

11110100012 Posts
Default

I get a reproducable segfault when GPU-based polsel stage 1 ends. For instance:
Code:
[asus2:12834] *** Process received signal ***
[asus2:12834] Signal: Segmentation fault (11)
[asus2:12834] Signal code: Address not mapped (1)
[asus2:12834] Failing at address: 0x500
[asus2:12834] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7f04f5daa030]
[asus2:12834] [ 1] ./msieve(free_prime_sieve+0x4) [0x414394]
[asus2:12834] [ 2] ./msieve(sieve_fb_free+0x10) [0x454500]
[asus2:12834] [ 3] ./msieve() [0x438fc1]
[asus2:12834] [ 4] ./msieve() [0x453747]
[asus2:12834] [ 5] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f04f5da1b50]
[asus2:12834] [ 6] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f04f4f79a7d]
(usually, the failing address is different, and I don't get a backtrace)


More details:
* Debian unstable amd64, with a 3.7.1 kernel from Aptosid and nvidia 310.19 drivers from Debian experimental (which, unlike the 304.x drivers from unstable, support 3.7 kernels - that way, I regained the ability to perform GPU-based polsel );

* the executable which segfaults was produced by a
Code:
CUDA=1 ECM=1 MPI=1 make all
build of SVN r838; on this computer, my updated version of poily's MPI polsel patch is not applied.

* sample command-line invocation:
Code:
LD_LIBRARY_PATH=. ./msieve -np1 'stage1_norm=2e33 23456700000000,23456750000000' -v

Last fiddled with by debrouxl on 2013-01-04 at 11:24
debrouxl is offline   Reply With Quote
Old 2013-02-17, 22:08   #40
LangerJan
 
LangerJan's Avatar
 
Feb 2013

3·5 Posts
Default

Yay, it's released \o/

I will provide a patch about error-handling for v1.51 instead of 1.50 then.
LangerJan is offline   Reply With Quote
Old 2013-02-18, 01:00   #41
LangerJan
 
LangerJan's Avatar
 
Feb 2013

3·5 Posts
Default

(Sorry, I don't know how to edit my previous post)

I made a patch for my aforementioned error-handling additions to msieve. If anybody is interested, you can download it here. So far, it catches the following errors:
  • Error converting input
  • Tiny factoring failed
  • matrix needs more columns than rows; try adding 2-3% more relations
The demo-application ist patched too. One can try catching "Tiny factoring failed" with:
Code:
./msieve 194688114416199995117
LangerJan is offline   Reply With Quote
Old 2013-02-19, 20:31   #42
LangerJan
 
LangerJan's Avatar
 
Feb 2013

3×5 Posts
Default

I got a nasty segmentation fault here:

Code:
./msieve 1201344281581552984739248655827
sieving in progress (press Ctrl-C to pause)
Speicherzugriffsfehler (Speicherabzug geschrieben)ial), need 393
Contents of msieve.dat after the crash:

Code:
6a2444e41dd0a7d3a47cf1a0c5
Contents of msieve.log:

Code:
Tue Feb 19 21:11:19 2013  Msieve v. 1.51 (SVN Nicht versioniertes Verzeichnis)
Tue Feb 19 21:11:19 2013  random seeds: d091bb5c 22ae9ef6
Tue Feb 19 21:11:19 2013  factoring 1201344281581552984739248655827 (31 digits)
Tue Feb 19 21:11:19 2013  no P-1/P+1/ECM available, skipping
Tue Feb 19 21:11:20 2013  commencing quadratic sieve (31-digit input)
Tue Feb 19 21:11:20 2013  using multiplier of 7
Tue Feb 19 21:11:20 2013  using generic 32kb sieve core
Tue Feb 19 21:11:20 2013  sieve interval: 4 blocks of size 32768
Tue Feb 19 21:11:20 2013  processing polynomials in batches of 51
Tue Feb 19 21:11:20 2013  using a sieve bound of 4217 (297 primes)
Tue Feb 19 21:11:20 2013  using large prime bound of 168680 (17 bits)
Tue Feb 19 21:11:20 2013  polynomial 'A' values have 3 factors
This error is only repeatable, if msieve.dat is empty/non-existent and, so far, only with this seeds.

I compiled with debug-flags and was able to track it down to mpqs/sieve.c: check_sieve_val():line 949

Code:
        if (j == root1 || j == root2) {
            do {
                fb_offsets[num_factors++] = i; // <- crash
                mp_divrem_1(&res, prime, &res);
                j = mp_mod_1(&res, prime);
            } while (j == 0);
        }
On my machine, the array fb_offsets can carry 256 elements. The loop never ends, because res becomes 0 and therefore, j stays 0 indefinitely. Writing in fb_offsets[num_factors > 255] eventually ends in a segmentation fault.

This is a very nasty bug since its so hard to find, yet it's very unlikely to happen. An additional loop-condition might help, but I'm not familiar with the algorithm to tell how to handle this situation any further.
LangerJan is offline   Reply With Quote
Old 2013-02-20, 11:54   #43
LangerJan
 
LangerJan's Avatar
 
Feb 2013

3×5 Posts
Default

There is a "double free()" issue in the factor_mpqs() part of msieve:

Code:
./msieve 158391981172732964211556699471
sieving complete, commencing postprocessing
*** glibc detected *** ./msieve: double free or corruption (top): 0x083b3c60 ***
======= Backtrace: =========
<snip>
luckily, msieve.log helps a lot tracking it down in the code:

Code:
Wed Feb 20 12:44:54 2013  Msieve v. 1.51 (SVN Nicht versioniertes Verzeichnis)
Wed Feb 20 12:44:54 2013  random seeds: 12bbf0b7 7b61a7f5
Wed Feb 20 12:44:54 2013  factoring 158391981172732964211556699471 (30 digits)
Wed Feb 20 12:44:54 2013  no P-1/P+1/ECM available, skipping
Wed Feb 20 12:44:54 2013  commencing quadratic sieve (30-digit input)
Wed Feb 20 12:44:54 2013  using multiplier of 31
Wed Feb 20 12:44:54 2013  using generic 32kb sieve core
Wed Feb 20 12:44:54 2013  sieve interval: 4 blocks of size 32768
Wed Feb 20 12:44:54 2013  processing polynomials in batches of 51
Wed Feb 20 12:44:54 2013  using a sieve bound of 3923 (280 primes)
Wed Feb 20 12:44:54 2013  using large prime bound of 156920 (17 bits)
Wed Feb 20 12:44:54 2013  polynomial 'A' values have 3 factors
Wed Feb 20 12:44:54 2013  restarting with 260 full and 1506 partial relations
Wed Feb 20 12:44:54 2013  1073 relations (260 full + 813 combined from 1506 partial), need 376
Wed Feb 20 12:44:54 2013  begin with 1766 relations
Wed Feb 20 12:44:54 2013  reduce to 1766 relations in 1 passes
Wed Feb 20 12:44:54 2013  attempting to read 1766 relations
Wed Feb 20 12:44:54 2013  recovered 1766 relations
Wed Feb 20 12:44:54 2013  recovered 8 polynomials
Wed Feb 20 12:44:54 2013  freed 883 duplicate relations
Wed Feb 20 12:44:54 2013  attempting to build 190 cycles
Wed Feb 20 12:44:54 2013  found 190 cycles in 1 passes
Wed Feb 20 12:44:54 2013  distribution of cycle lengths:
Wed Feb 20 12:44:54 2013     length 1 : 130
Wed Feb 20 12:44:54 2013     length 2 : 60
Wed Feb 20 12:44:54 2013  largest cycle: 2 relations
Wed Feb 20 12:44:54 2013  matrix is 280 x 190 (0.0 MB) with weight 2268 (11.94/col)
Wed Feb 20 12:44:54 2013  sparse part has weight 2268 (11.94/col)
Wed Feb 20 12:44:54 2013  matrix is corrupt; skipping linear algebra
The first free() happens at mpqs/gf2.c:50 solve_linear_system()

Code:
    if (ncols == 0) {
        logprintf(obj, "matrix is corrupt; skipping linear algebra\n");
        free(cols);
        *num_cycles = 0;
        return;
    }
This function is being called from factor_mpqs(), which does a free_cycle_list() shortly after it.

Solution:
You can safely delete the line "free(cols);" above.
LangerJan is offline   Reply With Quote
Old 2013-02-21, 02:39   #44
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3,541 Posts
Default

Thanks for the first fix to the QS code in 6 years :)

I committed the change to avoid the double-free. The first bug you found is a lot more subtle; you get a crash because the values of root1 and/or root2 are not correct, and the prime in the loop does not actually divide the sieve value. Unfortunately any checks for something wrong in the sieve would take time, so I don't yet know why it's wrong, and it could be literally anything.
jasonp is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
Msieve 1.53 feedback xilman Msieve 149 2018-11-12 06:37
Msieve 1.50 feedback firejuggler Msieve 99 2013-02-17 11:53
Msieve 1.43 feedback Jeff Gilchrist Msieve 47 2009-11-24 15:53
Msieve 1.42 feedback Andi47 Msieve 167 2009-10-18 19:37
Msieve 1.41 Feedback Batalov Msieve 130 2009-06-09 16:01

All times are UTC. The time now is 00:50.


Sat Jul 17 00:51:00 UTC 2021 up 49 days, 22:38, 1 user, load averages: 1.71, 1.56, 1.43

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.