mersenneforum.org  


2013-08-30, 12:31   #12
JP12
 
Aug 2013

112 Posts

Quote:
Originally Posted by jasonp
No; this is a structure that was built with multithreading in mind, but is needed for single-thread runs as well, so it always has to get cleaned up. Plus the code you quote only runs when shutting down, which would not explain why the linear algebra itself fails early on.

My current guess is that there is a buffer overflow early on, which happens to stomp on memory that is used by the multithreading, because the current SVN allocates multithreading memory in a different place now. But I don't have the time to diagnose it.

Hi Jason,
You are probably right...
The attached stack trace may give you clues.
Regards,

Jose
Attached Files
File Type: txt stacktrace.txt (1.2 KB, 109 views)
2013-08-30, 12:40   #13
debrouxl
 
Sep 2009

977 Posts

The C100 is small enough that I re-sieved it on my own with yafu 1.34, compiled msieve under GCC 4.8.1 with -fsanitize=address (hence the "M" suffix on the SVN revision in the trace below; I also used -march=native on a Sandy Bridge Core i7), and ran the post-processing in msieve.
The instrumented msieve binary runs a bit slower than the non-instrumented one (but still much faster than the non-instrumented binary emulated by valgrind!), and it does indeed point out a memory fault, though it is a heap use-after-free rather than a buffer overflow:
Code:
./msieve -i nfs.ini -s nfs.dat -nf nfs.fb -nc -v -t 4


Msieve v. 1.52 (SVN 944M)
Fri Aug 30 14:21:57 2013
random seeds: 2c313887 70fb081b
factoring 2881039827457895971881627053137530734638790825166127496066674320241571446494762386620442953820735453 (100 digits)
no P-1/P+1/ECM available, skipping
commencing number field sieve (100-digit input)
R0: -656817823909567619825292
R1: 1672296237551
A0: -45812042361281393527729219
A1: 6922179506956865775324
A2: 6810874320961339
A3: -19778945124
A4: 15480
skew 1.00, size 8.256e-14, alpha -5.148, combined = 1.863e-10 rroots = 2

commencing relation filtering
estimated available RAM is 3863.8 MB
commencing duplicate removal, pass 1
found 323140 hash collisions in 4781979 relations
added 155672 free relations
commencing duplicate removal, pass 2
found 187044 duplicates and 4750607 unique relations
memory use: 16.3 MB
reading ideals above 100000
commencing singleton removal, initial pass
memory use: 172.2 MB
reading all ideals from disk
memory use: 147.2 MB
keeping 5359137 ideals with weight <= 200, target excess is 43315
commencing in-memory singleton removal
begin with 4750607 relations and 5359137 unique ideals
reduce to 1523144 relations and 1418541 ideals in 19 passes
max relations containing the same ideal: 96
removing 252245 relations and 225066 ideals in 27179 cliques
commencing in-memory singleton removal
begin with 1270899 relations and 1418541 unique ideals
reduce to 1233402 relations and 1154998 ideals in 10 passes
max relations containing the same ideal: 83
removing 185708 relations and 158529 ideals in 27179 cliques
commencing in-memory singleton removal
begin with 1047694 relations and 1154998 unique ideals
reduce to 1021807 relations and 969939 ideals in 9 passes
max relations containing the same ideal: 72
relations with 0 large ideals: 623
relations with 1 large ideals: 5622
relations with 2 large ideals: 34671
relations with 3 large ideals: 131782
relations with 4 large ideals: 271144
relations with 5 large ideals: 317204
relations with 6 large ideals: 188935
relations with 7+ large ideals: 71826
commencing 2-way merge
reduce to 565039 relation sets and 513171 unique ideals
commencing full merge
memory use: 51.7 MB
found 263841 cycles, need 257371
weight of 257371 cycles is about 18121353 (70.41/cycle)
distribution of cycle lengths:
1 relations: 26761
2 relations: 25851
3 relations: 25979
4 relations: 24150
5 relations: 22061
6 relations: 19402
7 relations: 16941
8 relations: 15319
9 relations: 13564
10+ relations: 67343
heaviest cycle: 24 relations
commencing cycle optimization
start with 1751281 relations
pruned 39571 relations
memory use: 57.4 MB
distribution of cycle lengths:
1 relations: 26761
2 relations: 26380
3 relations: 26835
4 relations: 24711
5 relations: 22566
6 relations: 19628
7 relations: 17211
8 relations: 15384
9 relations: 13623
10+ relations: 64272
heaviest cycle: 24 relations
RelProcTime: 122

commencing linear algebra
read 257371 cycles
cycles contain 955164 unique relations
read 955164 relations
using 20 quadratic characters above 67106484
building initial matrix
memory use: 114.0 MB
read 257371 cycles
matrix is 257184 x 257371 (78.2 MB) with weight 24648824 (95.77/col)
sparse part has weight 17402494 (67.62/col)
filtering completed in 2 passes
matrix is 256111 x 256297 (78.0 MB) with weight 24594745 (95.96/col)
sparse part has weight 17380545 (67.81/col)
matrix starts at (0, 0)
matrix is 256111 x 256297 (78.0 MB) with weight 24594745 (95.96/col)
sparse part has weight 17380545 (67.81/col)
saving the first 48 matrix rows for later
matrix includes 64 packed rows
matrix is 256063 x 256297 (74.7 MB) with weight 19511081 (76.13/col)
sparse part has weight 17006737 (66.36/col)
using block size 8192 and superblock size 589824 for processor cache size 6144 kB
commencing Lanczos iteration (4 threads)
memory use: 56.6 MB
=================================================================
==13479== ERROR: AddressSanitizer: heap-use-after-free on address 0x603401b3e7bf at pc 0x495a62 bp 0x7fff585feba0 sp 0x7fff585feb98
WRITE of size 1 at 0x603401b3e7bf thread T0
    #0 0x495a61 in mul_packed_small_core ??:?
    #1 0x490d2b in mul_packed lanczos_matmul0.c:?
    #2 0x493933 in mul_sym_NxN_Nx64 ??:?
    #3 0x48d7ae in block_lanczos ??:?
    #4 0x474fe1 in nfs_solve_linear_system ??:?
    #5 0x434c45 in factor_gnfs ??:?
    #6 0x408fd0 in msieve_run ??:?
    #7 0x405d84 in factor_integer ??:?
    #8 0x404b0a in main ??:?
    #9 0x7f799c07e994 in __libc_start_main /build/eglibc-MUWt1e/eglibc-2.17/csu/libc-start.c:260
    #10 0x405784 in _start ??:?
0x603401b3e7bf is located 319 bytes inside of 512-byte region [0x603401b3e680,0x603401b3e880)
freed by thread T0 here:
    #0 0x7f799cfd530a in __interceptor_free ??:?
    #1 0x4920b5 in packed_matrix_init ??:?
    #2 0x489de9 in block_lanczos ??:?
    #3 0x474fe1 in nfs_solve_linear_system ??:?
    #4 0x434c45 in factor_gnfs ??:?
    #5 0x408fd0 in msieve_run ??:?
    #6 0x405d84 in factor_integer ??:?
    #7 0xa
previously allocated by thread T0 here:
    #0 0x7f799cfd552f in __interceptor_realloc ??:?
    #1 0x48aae5 in block_lanczos ??:?
    #2 0x474fe1 in nfs_solve_linear_system ??:?
    #3 0x434c45 in factor_gnfs ??:?
    #4 0x408fd0 in msieve_run ??:?
    #5 0x405d84 in factor_integer ??:?
    #6 0xa
Shadow bytes around the buggy address:
  0x0c070035fca0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fcb0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fcc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c070035fcd0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c070035fcf0: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
  0x0c070035fd00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fd10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c070035fd20: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fd30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c070035fd40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==13479== ABORTING
(In the backtrace, I replaced the raw AddressSanitizer output with the readable output from the asan_symbolize.py script from the Clang repo.)
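For anyone who wants to double-check the numbers in the report, the arithmetic can be redone by hand. The sketch below assumes the default AddressSanitizer shadow mapping on x86-64 Linux, shadow = (addr >> 3) + 0x7fff8000. The function names are made up for illustration; they are not part of msieve or of the ASan runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default ASan mapping on x86-64 Linux.
   One shadow byte describes 8 application bytes (scale 3). */
#define ASAN_SHADOW_SCALE  3
#define ASAN_SHADOW_OFFSET UINT64_C(0x7fff8000)

/* Hypothetical helper: address of the shadow byte that covers addr. */
uint64_t asan_shadow_address(uint64_t addr)
{
    return (addr >> ASAN_SHADOW_SCALE) + ASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: byte offset of addr inside a heap region
   that starts at region_start. */
uint64_t offset_in_region(uint64_t addr, uint64_t region_start)
{
    assert(addr >= region_start);
    return addr - region_start;
}
```

With the faulting address 0x603401b3e7bf, asan_shadow_address() gives 0x0c070035fcf7, which is the bracketed [fd] byte in the shadow row starting at 0x0c070035fcf0. offset_in_region() with the region start 0x603401b3e680 gives 319, matching "319 bytes inside of 512-byte region".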
I hope this helps.
AddressSanitizer, which originated in LLVM/Clang, was added in GCC 4.8 several months ago, and it can be a fantastic tool.
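To show the kind of bug this report describes, here is a minimal sketch of the same pattern: a buffer obtained from realloc, freed by a later setup step, then written through a stale pointer. This is not msieve's code and it says nothing about where the real bug is; solver_t and the function names are hypothetical. The buggy behaviour is only enabled when DEMO_USE_AFTER_FREE is defined, so the default build is safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical state, standing in for a solver that owns a scratch buffer. */
typedef struct {
    unsigned char *scratch;
    size_t scratch_len;
} solver_t;

/* Allocate (or grow) the scratch buffer with realloc, as in the
   "previously allocated by" part of the report. Returns 0 on success. */
int solver_alloc(solver_t *s, size_t len)
{
    assert(s != NULL);
    unsigned char *p = realloc(s->scratch, len);
    if (p == NULL)
        return -1;
    memset(p, 0, len);
    s->scratch = p;
    s->scratch_len = len;
    return 0;
}

/* A later setup step releases the buffer ("freed by" in the report).
   The safe variant clears the pointer; the demo variant leaves it stale. */
void solver_release_scratch(solver_t *s)
{
    assert(s != NULL);
    free(s->scratch);
#ifndef DEMO_USE_AFTER_FREE
    s->scratch = NULL;
    s->scratch_len = 0;
#endif
}

/* Write one byte into the scratch buffer. Returns -1 if the buffer is
   gone or the offset is out of range. With DEMO_USE_AFTER_FREE the stale
   pointer passes the check and the store below is the 1-byte write that
   ASan flags as heap-use-after-free. */
int solver_write_byte(solver_t *s, size_t off, unsigned char v)
{
    assert(s != NULL);
    if (s->scratch == NULL || off >= s->scratch_len)
        return -1;
    s->scratch[off] = v;
    return 0;
}
```

Calling solver_alloc(&s, 512), then solver_release_scratch(&s), then solver_write_byte(&s, 319, 1) from a small main reproduces the shape of the report when built with something like gcc -g -fsanitize=address -DDEMO_USE_AFTER_FREE. Without the define, the last call simply returns -1.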

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.