mersenneforum.org  

Old 2009-12-19, 09:54   #1
Raman
GGNFS 64-bit sieve crash

While sieving 7,320+ with the following job file:
Code:
n: 154758218728434566356200475118402575823277636002771024732774611297531505856368662705022790238791621423554625142974233023501037614707598549788891266955273526560342732155180961801351291833302132131894401
m: 1219760487635835700138573862562971820755615294131238401
c4: 1
c3: -1
c2: 1
c1: -1
c0: 1
skew: 1
type: snfs
rlim: 125000000
alim: 25000000
lpbr: 31
lpba: 29
mfbr: 62
mfba: 58
rlambda: 2.6
alambda: 2.6
On the rational side, at q = 28677949, I get a segmentation fault in GGNFS. Why?
The job file above is included for reference.

By the way, I am also sieving another 10-million range of special-q on the algebraic side for 2,1778L.

Will the matrix be smaller for a given number if 30-bit large primes are used instead of 31-bit, or 29-bit instead of 30-bit? I ask because the matrix would have to fit within 2 GB of RAM if the 4 GB system is not ready soon.

Last fiddled with by Raman on 2009-12-19 at 10:22
Old 2009-12-20, 21:03   #2
Batalov
 

Quote:
Originally Posted by Raman View Post
gnfs-lasieve4I15e for 7,320+. 64 bit Linux binary, Core 2 Duo.
Hmm. It still doesn't crash here. (This is like downforeveryoneorjustme.com: if a crash cannot be reproduced, it cannot be debugged, even though I am interested in debugging it.)

Following-up:
Can you reproduce it or was it just then?
What was the rlim for that particular chunk* (if you were using the perl script)?
In other words, what was the particular command line?

It would be most helpful if you could produce the exact command line and probably the pointer to the exact binary.

____
*If you don't know it, have a look at any consecutive command lines that are running now or ran before; using the qintsize value, it is possible to step back to the defective region. rlim is adjusted/lowered to the start_q once and then doesn't grow during the run of the program.
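
The step-back idea in the footnote is plain arithmetic and can be sketched as below. This is only a minimal sketch, assuming a driver that sieves back-to-back chunks of `qintsize` special-q values from some first start value; the function name `chunk_containing` and the example numbers in the usage line are hypothetical and not part of GGNFS.

```python
def chunk_containing(first_q0: int, qintsize: int, bad_q: int) -> tuple[int, int]:
    """Return (start, end) of the chunk [start, end) that contains bad_q.

    Assumes chunks are laid out back to back from first_q0, each
    qintsize special-q values wide (an assumption about the driver).
    """
    if qintsize <= 0:
        raise ValueError("qintsize must be positive")
    if bad_q < first_q0:
        raise ValueError("bad_q lies before the first chunk")
    index = (bad_q - first_q0) // qintsize   # how many whole chunks precede bad_q
    start = first_q0 + index * qintsize
    return start, start + qintsize


# Hypothetical run: chunks of 100000 special-q starting at 28000000.
start, end = chunk_containing(28_000_000, 100_000, 28_677_949)
print(start, end)
```

The returned `start` is the value one would pass as the starting special-q when trying to reproduce the failing chunk in isolation.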
Old 2009-12-21, 05:00   #3
Raman

Quote:
Originally Posted by Batalov View Post
Following-up:
Can you reproduce it or was it just then?
What was the rlim for that particular chunk* (if you were using the perl script)?
In other words, what was the particular command line?

It would be most helpful if you could produce the exact command line and probably the pointer to the exact binary.
rlim was the same value as in the job file: 125000000; alim was 25000000. I simply added
q0: 28677949
qintsize: 322051
#q1: 29000000
to the job file.
Yes, it is true that the gnfs binary lowers the factor base bound (rlim), setting it to 28677948, one below the starting special-q.
Why does it do that? Does this cut out some of the relations that would otherwise have been produced?

It is a 64-bit Linux binary. I downloaded the source code from Jeff Gilchrist's website and compiled it myself.

Command line is as follows:
nohup ~/64bit/gnfs-lasieve4I15e -k -o spairs2.out -v -n0 -r 7_320P.job
I did not use the perl script at all; I simply sieve a different range on each computer with this same command line.
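
For readers doing the same thing by hand, here is a rough sketch of how a special-q range could be cut into per-machine `q0` / `qintsize` pairs like the ones quoted above. The helper `split_range` and the 20M to 30M example range are hypothetical; only the `q0:` and `qintsize:` job-file keys come from this thread.

```python
def split_range(q_start: int, q_end: int, parts: int) -> list[tuple[int, int]]:
    """Split [q_start, q_end) into contiguous (q0, qintsize) pairs."""
    if parts <= 0 or q_end <= q_start:
        raise ValueError("need parts > 0 and q_end > q_start")
    base, extra = divmod(q_end - q_start, parts)
    pieces = []
    q0 = q_start
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first pieces
        if size == 0:
            break                              # more parts than special-q values
        pieces.append((q0, size))
        q0 += size
    return pieces


# The job lines quoted above are consistent: q0 + qintsize equals the commented q1.
print(28677949 + 322051 == 29000000)

# Hypothetical split of a 10-million range over three machines.
for q0, qintsize in split_range(20_000_000, 30_000_000, 3):
    print(f"q0: {q0}\nqintsize: {qintsize}\n")
```

Each printed pair would go into that machine's copy of the job file, so no special-q is sieved twice and none is left out.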

In any case, it is no problem that I skipped this special-q value.

HAPPY WINTER SOLSTICE DAY!
(NORTHERN HEMISPHERE - WINTER SOLSTICE)
(SOUTHERN HEMISPHERE - SUMMER SOLSTICE)
Remember, today is December 21 once again.

Last fiddled with by Raman on 2009-12-21 at 05:05
Old 2009-12-21, 05:13   #4
Batalov
 

You did the right thing by skipping.
Then, there's nothing left to do for now, because it is not reproducible.

Happy Solstice Day!
(I do know quite a few people who celebrate it.)
Old 2009-12-21, 05:36   #5
jrk
 

Quote:
Originally Posted by Batalov View Post
Then, there's nothing left to do for now, because it is not reproducible.
FYI: I can reproduce it.

Code:
$ ~/ggnfs/bin/gnfs-lasieve4I15e -k -o spairs2.out -v -n0 -r 7_320P.job
Warning:  lowering FB_bound to 28677948.
FBsize 1565317+0 (deg 4), 1780990+0 (deg 1)
Segmentation fault
This is the experimental 64bit (athlon64) lasieve4I15e from SVN 353.
Old 2009-12-21, 05:41   #6
Batalov
 

Ok, can you do it under gdb and do "bt" when it crashes?
Thanks.

I have this:
Code:
$ ~/KF/gnfs_lasieve_source/gnfs-lasieve4I15e -r t.poly -f 28677949 -c 20
Warning:  lowering FB_bound to 28677948.
total yield: 56, q=28677989 (0.35679 sec/rel)
Aha. Found some old binary -- there's something; that's the unlikely old "mpqs failed" thing:
Code:
 
$ ../old_lasieve4_64/gnfs-lasieve4I15e -r t.poly -f 28677949 -c 22
Warning:  lowering FB_bound to 28677948.
mpqs failed for 2926569634690573369(a,b): 10245291 49019062
total yield: 55, q=28677989 (0.35600 sec/rel)
It still doesn't crash, but
I'll have a look at this one single different relation -- it has a prime square:
2926569634690573369 = 1710721963^2
Something is incompatible with your GMP library. Hmm. Will think.
Maybe it crashes in printf, actually.

Thanks for the case.
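
The "prime square" observation is easy to double-check. The snippet below is a small sketch that only verifies the perfect-square claim and reports bit sizes; it does not test primality, and it says nothing by itself about why mpqs fails on this input.

```python
import math

cofactor = 2926569634690573369      # value from the "mpqs failed" line above
root = math.isqrt(cofactor)         # exact integer square root

# A perfect square reproduces itself when the integer root is squared.
print(root, root * root == cofactor)

# Bit sizes, for comparison with mfbr: 62 and lpbr: 31 in the job file.
print(cofactor.bit_length(), root.bit_length())
```

So the cofactor is a 62-bit number whose square root is a 31-bit number, which sits right at the mfbr and lpbr limits of the job file.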

Last fiddled with by Batalov on 2009-12-21 at 05:55
Old 2009-12-21, 05:44   #7
jrk
 

Quote:
Originally Posted by jrk View Post
This is the experimental 64bit (athlon64) lasieve4I15e from SVN 353.
Also I should probably mention, it is dynamically linked with mpir 1.2.1 instead of gmp.
Old 2009-12-21, 05:46   #8
jrk
 

Quote:
Originally Posted by Batalov View Post
Ok, can you do it under gdb and do "bt" when it crashes?
Thanks.
That is one of the next things I was preparing to do. First I wanted to try the latest SVN (I needed an excuse to update anyway).

If that still fails, I'll get a trace for you.
Old 2009-12-21, 06:48   #9
jrk
 

Quote:
Originally Posted by jrk View Post
That is one of the next things I was preparing to do. First I wanted to try the latest SVN (I needed an excuse to update anyway).

If that still fails, I'll get a trace for you.
Yes, the latest SVN still crashes. Here is a backtrace from gdb.

Code:
#0  0x0000000000418a9a in mpqs_decompose () at mpqs.c:1334
#1  0x000000000041ac20 in mpqs_factor0 (N=0x721130, max_bits=31, 
    factors=0x7fff26c45b18, retry=1) at mpqs.c:1911
#2  0x000000000041ad3a in mpqs_factor (N=0x721130, max_bits=31, 
    factors=0x7fff26c45b18) at mpqs.c:1958
#3  0x000000000040eda0 in output_tdsurvivor (fbp_buf0=0x28d2a28, 
    fbp_buf0_ub=0x28d2a4c, fbp_buf1=0x28d2a4c, fbp_buf1_ub=0x28d2a58, 
    lf0=0x28ba4e0, lf1=0x28ba4f0) at gnfs-lasieve4e.c:3944
#4  0x000000000040eb8b in output_all_tdsurvivors () at gnfs-lasieve4e.c:3895
#5  0x000000000040acdb in main (argc=8, argv=0x7fff26c46d38)
    at gnfs-lasieve4e.c:2686
It only crashes if it is dynamically linked (default is to build static). Also it did not crash from within gdb so I had to load a core file into gdb instead.
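
For anyone repeating the core-file route: the sketch below only assembles a gdb command line that prints a backtrace from a core dump in batch mode. It assumes core dumps are enabled (for example with `ulimit -c unlimited` in the shell that runs the siever) and that the dump is written to a file named `core`; both are assumptions about the local setup, and the helper name `gdb_backtrace_cmd` is hypothetical.

```python
import shlex


def gdb_backtrace_cmd(binary: str, core: str) -> list[str]:
    """Build argv for: gdb --batch -ex bt <binary> <core>."""
    # --batch makes gdb exit after running the commands given with -ex.
    return ["gdb", "--batch", "-ex", "bt", binary, core]


# The core file name is an assumption; it depends on the system's core_pattern.
cmd = gdb_backtrace_cmd("./gnfs-lasieve4I15e", "core")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run` on a machine where gdb and the core file are present.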

I will post the valgrind output.
Old 2009-12-21, 06:50   #10
jrk
 

Code:
$ valgrind ./gnfs-lasieve4I15e -k -o spairs2.out -v -n0 -r 7_320P.job
==15362== Memcheck, a memory error detector.
==15362== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==15362== Using LibVEX rev 1804, a library for dynamic binary translation.
==15362== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==15362== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==15362== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==15362== For more details, rerun with: -v
==15362== 
Warning:  lowering FB_bound to 28677948.
FBsize 1565317+0 (deg 4), 1780990+0 (deg 1)
==15362== Warning: set address range perms: large range 139230208 (undefined)
==15362== Conditional jump or move depends on uninitialised value(s)
==15362==    at 0x40A5DB: main (gnfs-lasieve4e.c:2405)
==15362== 
==15362== Invalid read of size 8
==15362==    at 0x41D531: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x40AC4F: main (gnfs-lasieve4e.c:2651)
==15362==  Address 0x4eec5e3 is not stack'd, malloc'd or (recently) free'd
==15362== 
==15362== Invalid read of size 1
==15362==    at 0x41D7EF: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x40AC4F: main (gnfs-lasieve4e.c:2651)
==15362==  Address 0x4eec66c is not stack'd, malloc'd or (recently) free'd
==15362== 
==15362== Conditional jump or move depends on uninitialised value(s)
==15362==    at 0x41D7F3: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x40AC4F: main (gnfs-lasieve4e.c:2651)
==15362== 
==15362== Conditional jump or move depends on uninitialised value(s)
==15362==    at 0x41D7F7: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x40AC4F: main (gnfs-lasieve4e.c:2651)
==15362== 
==15362== Invalid read of size 8
==15362==    at 0x41D4FE: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x40AC4F: main (gnfs-lasieve4e.c:2651)
==15362==  Address 0x4eebff9 is 65,529 bytes inside a block of size 65,536 alloc'd
==15362==    at 0x4A04FC0: memalign (vg_replace_malloc.c:460)
==15362==    by 0x40F3CF: xvalloc (if.w:103)
==15362==    by 0x404EA8: main (gnfs-lasieve4e.c:999)
==15362== 
==15362== Invalid read of size 4
==15362==    at 0x41C2AA: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0xA90A423: ???
==15362==    by 0x7FEFFF43F: ???
==15362==  Address 0xa90a428 is 0 bytes after a block of size 7,123,960 alloc'd
==15362==    at 0x4A0739E: malloc (vg_replace_malloc.c:207)
==15362==    by 0x40F381: xmalloc (if.w:93)
==15362==    by 0x405326: main (gnfs-lasieve4e.c:1080)
==15362== 
==15362== Invalid read of size 4
==15362==    at 0x41C2AA: (within /home/prime/tmp/gnfs-lasieve4I15e)
==15362==    by 0x964CA3F: ???
==15362==    by 0x7FEFFF43F: ???
==15362==  Address 0x964ca44 is 0 bytes after a block of size 6,261,268 alloc'd
==15362==    at 0x4A0739E: malloc (vg_replace_malloc.c:207)
==15362==    by 0x40F381: xmalloc (if.w:93)
==15362==    by 0x405326: main (gnfs-lasieve4e.c:1080)
==15362== 
==15362== Use of uninitialised value of size 8
==15362==    at 0x418A9A: mpqs_decompose (mpqs.c:1334)
==15362==    by 0x41AC1F: mpqs_factor0 (mpqs.c:1911)
==15362==    by 0x41AD39: mpqs_factor (mpqs.c:1958)
==15362==    by 0x40ED9F: output_tdsurvivor (gnfs-lasieve4e.c:3944)
==15362==    by 0x40EB8A: output_all_tdsurvivors (gnfs-lasieve4e.c:3895)
==15362==    by 0x40ACDA: main (gnfs-lasieve4e.c:2686)
==15362== 
==15362== Invalid read of size 2
==15362==    at 0x418A9A: mpqs_decompose (mpqs.c:1334)
==15362==    by 0x41AC1F: mpqs_factor0 (mpqs.c:1911)
==15362==    by 0x41AD39: mpqs_factor (mpqs.c:1958)
==15362==    by 0x40ED9F: output_tdsurvivor (gnfs-lasieve4e.c:3944)
==15362==    by 0x40EB8A: output_all_tdsurvivors (gnfs-lasieve4e.c:3895)
==15362==    by 0x40ACDA: main (gnfs-lasieve4e.c:2686)
==15362==  Address 0x7348b8 is not stack'd, malloc'd or (recently) free'd
==15362== 
==15362== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==15362==  Access not within mapped region at address 0x7348B8
==15362==    at 0x418A9A: mpqs_decompose (mpqs.c:1334)
==15362==    by 0x41AC1F: mpqs_factor0 (mpqs.c:1911)
==15362==    by 0x41AD39: mpqs_factor (mpqs.c:1958)
==15362==    by 0x40ED9F: output_tdsurvivor (gnfs-lasieve4e.c:3944)
==15362==    by 0x40EB8A: output_all_tdsurvivors (gnfs-lasieve4e.c:3895)
==15362==    by 0x40ACDA: main (gnfs-lasieve4e.c:2686)
==15362== 
==15362== ERROR SUMMARY: 421663 errors from 10 contexts (suppressed: 4 from 1)
==15362== malloc/free: in use at exit: 215,797,575 bytes in 68,357 blocks.
==15362== malloc/free: 68,809 allocs, 452 frees, 310,122,029 bytes allocated.
==15362== For counts of detected errors, rerun with: -v
==15362== searching for pointers to 68,357 not-freed blocks.
==15362== checked 94,228,568 bytes.
==15362== 
==15362== LEAK SUMMARY:
==15362==    definitely lost: 53,264 bytes in 12 blocks.
==15362==      possibly lost: 0 bytes in 0 blocks.
==15362==    still reachable: 215,744,311 bytes in 68,345 blocks.
==15362==         suppressed: 0 bytes in 0 blocks.
==15362== Rerun with --leak-check=full to see details of leaked memory.
Segmentation fault
Old 2009-12-21, 06:54   #11
Batalov
 

I'll need to find a Core2, then. Static linking will help against linking to a wrong lib.

Earlier I wrote:
I've looked up the version 353 - that's fairly old (almost the original source), surely it did have problems. Crashed on me too (and it had zero yield in some ranges).

The new version however will need care at compilation time.
Re-read the INSTALL file --
========
on Core2 replace in athlon64/ls-defs.asm
- define(l1_bits,16)dnl
+ define(l1_bits,15)dnl
========
The C source part will use L1_BITS 15, but the asm part is not under control of any scripts, so simply edit that manually for Intel CPUs, all of them. If it is miscompiled, it will be easy to see though - it will complain all over the place (at run-time).

Athlons, Phenoms: will get a boost and nothing will need to be changed for them. (They will use L1_BITS 16 and define(l1_bits,16)dnl.)
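
The manual edit quoted from the INSTALL file could also be scripted. What follows is a minimal sketch that rewrites the `define(l1_bits,N)dnl` line in a block of text; the function name `set_l1_bits` is hypothetical, and the sample string merely stands in for the real `athlon64/ls-defs.asm`, whose other contents are not shown in this thread.

```python
import re


def set_l1_bits(asm_text: str, bits: int) -> str:
    """Rewrite the m4 line define(l1_bits,N)dnl to use the given value."""
    pattern = re.compile(r"define\(l1_bits,\s*\d+\)dnl")
    new_text, count = pattern.subn(f"define(l1_bits,{bits})dnl", asm_text)
    if count != 1:
        # Refuse to guess if the file does not look the way we expect.
        raise ValueError(f"expected exactly one l1_bits define, found {count}")
    return new_text


# Stand-in text; on a real tree one would read and write athlon64/ls-defs.asm.
sample = "dnl stand-in for ls-defs.asm\ndefine(l1_bits,16)dnl\n"
print(set_l1_bits(sample, 15))
```

Failing loudly when the define is missing or duplicated seems safer here than silently producing a binary whose C and asm parts disagree on the L1 size.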

Good luck. --S

Last fiddled with by Batalov on 2009-12-21 at 06:58 Reason: (two more postings while writing; will have to address later)