mersenneforum.org  

Old 2021-01-18, 22:49   #892
charybdis
 
Apr 2020

229₁₀ Posts

How's CPU usage looking? Is msieve actually doing anything or is it just hanging? And while we're at it, how's memory usage?
Old 2021-01-19, 05:43   #893
frmky
 
 
Jul 2003
So Cal

2×3×347 Posts

Quote:
Originally Posted by pinhodecarlos View Post
Maybe get in touch with Greg from NFS@Home to see if he can give any support or advice?!
This is well outside of anything I've run. My largest has been about 1 billion relations. However, if the relations are available online to download I'll be happy to play with it.
Old 2021-01-19, 10:13   #894
wreck
 
 
"Bo Chen"
Oct 2005
Wuhan,China

10100111₂ Posts

If possible, could you give it another try with fewer than 1600M unique relations?
As a comparison, VBCurtis did 2,2330L (gnfs 207) with 162M unique relations.
Also, as I remember, there was a time when fivemack finished an NFS job successfully with 720M relations, while 800M failed (using lpb 33).
A rough guess is that some time ago there was a barrier near 800M, and it has now jumped to 1600M for some reason.
Old 2021-01-19, 15:18   #895
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

2²×7×13² Posts

No, I didn't use 162M uniques. Did you miss a zero? This job is tougher than the GNFS-207 by quite a lot, and uses bounds that are expected to require more relations (36/33 should require more than 35/34). 2e9 relations may not be enough, but they are quite surely not too many.

Citing relation counts for 33-lp jobs is totally irrelevant to this job, which is using much larger bounds. The number of relations left heading into merge shows rather clearly that this is not oversieved.

There is no reason to think the old msieve large-dataset bug is the culprit here. However, Charybdis' idea to cull all 36-bit-large-prime relations from the dataset and try to filter as a 33/35 job has merit.
Old 2021-01-19, 15:25   #896
ryanp
 
 
Jun 2012
Boulder, CO

114₁₆ Posts

Quote:
Originally Posted by VBCurtis View Post
There is no reason to think the old msieve large-dataset bug is the culprit here. However, Charybdis' idea to cull all 36-bit-large-prime relations from the dataset and try to filter as a 33/35 job has merit.
I'm willing to try culling the 36-bit large prime relations. Would you be able to construct the "grep" command? I don't quite know the msieve relation format well enough.
Old 2021-01-19, 15:47   #897
charybdis
 
Apr 2020

229 Posts

Quote:
Originally Posted by ryanp View Post
I'm willing to try culling the 36-bit large prime relations. Would you be able to construct the "grep" command? I don't quite know the msieve relation format well enough.
grep -v ",[8-9a-f]........$" should remove all lines ending with a 36-bit prime, which is what we need.
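
In case it helps, here is a rough C equivalent of what that command matches -- purely an illustrative sketch, not msieve code. Like the grep above, it assumes the large primes are the comma-separated hex values at the end of each relation line, and it drops any line whose final hex field is at least 2^35, i.e. a prime that needs a 36th bit. It reads stdin and writes the surviving lines to stdout.

Code:
/* Illustrative sketch only: a C equivalent of
 *   grep -v ",[8-9a-f]........$"
 * Drop every relation whose last comma-separated hex value is >= 2^35. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <sys/types.h>

int main(void) {
    char *line = NULL;
    size_t cap = 0;
    ssize_t len;

    while ((len = getline(&line, &cap, stdin)) > 0) {
        char *last_comma = strrchr(line, ',');
        int drop = 0;
        if (last_comma != NULL) {
            /* parse the final hex field; strtoull stops at the newline */
            uint64_t p = strtoull(last_comma + 1, NULL, 16);
            if (p >= (1ULL << 35))
                drop = 1;              /* 36-bit large prime: cull it */
        }
        if (!drop)
            fputs(line, stdout);
    }
    free(line);
    return 0;
}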
Old 2021-01-23, 22:13   #898
frmky
 
 
Jul 2003
So Cal

2×3×347 Posts

There's definitely an msieve filtering bug. Good to have a data set that triggers it; unfortunately msieve needs to run for 15 hours to do so. Let's see what gdb says...

Code:
commencing singleton removal, initial pass
memory use: 41024.0 MB
reading all ideals from disk
memory use: 39309.4 MB
commencing in-memory singleton removal
begin with 2074342591 relations and 1985137022 unique ideals
reduce to 992888838 relations and 765115141 ideals in 20 passes
max relations containing the same ideal: 35
reading ideals above 720000
commencing singleton removal, initial pass
memory use: 21024.0 MB
reading all ideals from disk
memory use: 46989.5 MB
keeping 913886427 ideals with weight <= 200, target excess is 5352837
commencing in-memory singleton removal
begin with 992888838 relations and 913886427 unique ideals
reduce to 992241034 relations and 913238552 ideals in 15 passes
max relations containing the same ideal: 200
removing 8630643 relations and 8331224 ideals in 2000000 cliques
commencing in-memory singleton removal
[kepler-0-0:29616] *** Process received signal ***
[kepler-0-0:29616] Signal: Segmentation fault (11)
[kepler-0-0:29616] Signal code: Address not mapped (1)
[kepler-0-0:29616] Failing at address: 0x7f001013550c
[kepler-0-0:29616] [ 0] /lib64/libpthread.so.0(+0xf5e0)[0x7eff125965e0]
[kepler-0-0:29616] [ 1] ./msieve993_new[0x43ffd0]
[kepler-0-0:29616] [ 2] ./msieve993_new[0x463ae7]
[kepler-0-0:29616] [ 3] ./msieve993_new[0x43c2fb]
[kepler-0-0:29616] [ 4] ./msieve993_new[0x4288dd]
[kepler-0-0:29616] [ 5] ./msieve993_new[0x415bc4]
[kepler-0-0:29616] [ 6] ./msieve993_new[0x405b1b]
[kepler-0-0:29616] [ 7] ./msieve993_new[0x404987]
[kepler-0-0:29616] [ 8] ./msieve993_new[0x40454c]
[kepler-0-0:29616] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7eff11f03c05]
[kepler-0-0:29616] [10] ./msieve993_new[0x4045f2]
[kepler-0-0:29616] *** End of error message ***
Old 2021-01-25, 01:21   #899
wreck
 
 
"Bo Chen"
Oct 2005
Wuhan,China

167 Posts

Could you compile a debug version to see which line it's crashing on?
Old 2021-01-27, 02:42   #900
frmky
 
 
Jul 2003
So Cal

2·3·347 Posts

We’re in filter_purge_singletons_core() at common/filter/singleton.c:441, looping through the relations counting the number of times that each ideal occurs. But the last relation is broken! We’re at the last relation since i = num_relations-1. The ideal_list for this last relation contains entries greater than num_ideals. Don’t know why though… Probably an overflow somewhere. But where? Trying to track it down but might take a while since I don't have a lot of time to devote to this and individual tests take a day.

Code:
read 2050M relations
read 2060M relations
read 2070M relations
found 578077506 hash collisions in 2074342600 relations
commencing duplicate removal, pass 2
found 9 duplicates and 2074342591 unique relations
memory use: 16280.0 MB
reading ideals above 1549860864
commencing singleton removal, initial pass
memory use: 41024.0 MB
reading all ideals from disk
memory use: 39309.4 MB
commencing in-memory singleton removal
begin with 2074342591 relations and 1985137022 unique ideals
reduce to 992888838 relations and 765115141 ideals in 20 passes
max relations containing the same ideal: 35
reading ideals above 720000
commencing singleton removal, initial pass
memory use: 21024.0 MB
reading all ideals from disk
memory use: 46989.5 MB
keeping 913886427 ideals with weight <= 200, target excess is 5352837
commencing in-memory singleton removal
begin with 992888838 relations and 913886427 unique ideals
reduce to 992241034 relations and 913238552 ideals in 15 passes
max relations containing the same ideal: 200
removing 8630643 relations and 8331224 ideals in 2000000 cliques
commencing in-memory singleton removal

Program received signal SIGSEGV, Segmentation fault.
0x000000000044a178 in filter_purge_singletons_core (obj=0x6de250, filter=0x7fffffffc710) at common/filter/singleton.c:441
441        freqtable[ideal]++;
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 gmp-6.0.0-15.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) backtrace
#0  0x000000000044a178 in filter_purge_singletons_core (obj=0x6de250, filter=0x7fffffffc710) at common/filter/singleton.c:441
#1  0x0000000000475e26 in filter_purge_cliques (obj=0x6de250, filter=0x7fffffffc710) at common/filter/clique.c:646
#2  0x0000000000443cf6 in filter_make_relsets (obj=0x6de250, filter=0x7fffffffc710, merge=0x7fffffffc6e0, min_cycles=5352837)
    at common/filter/filter.c:65
#3  0x000000000042f0fb in do_merge (obj=0x6de250, filter=0x7fffffffc710, merge=0x7fffffffc6e0, target_density=130)
    at gnfs/filter/filter.c:187
#4  0x000000000042fad0 in nfs_filter_relations (obj=0x6de250, n=0x7fffffffc960) at gnfs/filter/filter.c:411
#5  0x00000000004172ac in factor_gnfs (obj=0x6de250, input_n=0x7fffffffcb40, factor_list=0x7fffffffcbd0) at gnfs/gnfs.c:153
#6  0x0000000000404dcd in msieve_run_core (obj=0x6de250, n=0x7fffffffcb40, factor_list=0x7fffffffcbd0) at common/driver.c:158
#7  0x00000000004051b4 in msieve_run (obj=0x6de250) at common/driver.c:268
#8  0x00000000004038a4 in factor_integer (
    buf=0x7fffffffd650 "38315657995194363034877423503084547947166751578940985843521212522635100246118059073205923746544331860205171086654671434719340358393954962433533212457600196112076644876654207767427267797808629935905445"..., flags=1027, savefile_name=0x0, 
    logfile_name=0x0, nfs_fbfile_name=0x0, seed1=0x7fffffffd64c, seed2=0x7fffffffd648, max_relations=0, cpu=cpu_core, 
    cache_size1=32768, cache_size2=20971520, num_threads=0, which_gpu=0, nfs_args=0x7fffffffdcee "target_density=130") at demo.c:235
#9  0x00000000004046bd in main (argc=4, argv=0x7fffffffd988) at demo.c:601
(gdb) info frame
Stack level 0, frame at 0x7fffffffc340:
rip = 0x44a178 in filter_purge_singletons_core (common/filter/singleton.c:441); saved rip 0x475e26
called by frame at 0x7fffffffc370
source language c.
Arglist at 0x7fffffffc2b8, args: obj=0x6de250, filter=0x7fffffffc710
Locals at 0x7fffffffc2b8, Previous frame's sp is 0x7fffffffc340
Saved registers:
  rip at 0x7fffffffc338
(gdb) info locals
ideal = 2057043263
i = 983610390
j = 5
freqtable = 0x7fff1d2ad010
relation_array = 0x7ff47e0f1010
curr_relation = 0x7ffc79bde3a0
old_relation = 0x7f1fd8001e8480
orig_num_ideals = 913238552
num_passes = 32767
num_relations = 983610391
num_ideals = 913238552
new_num_relations = 8630643
(gdb) print *curr_relation
$2 = {rel_index = 15834702, ideal_count = 36 '$', gf2_factors = 69 'E', connected = 156 '\234', ideal_list = {885450581, 598542783, 
    158747510, 638930804, 786848709, 2057043263, 3845, 186587920, 18476918, 67526419, 598542783, 872055544, 2057043265, 2046824196, 
    3942562, 102078889, 58908383, 865042570, 2057043267, 872418055, 9125741, 85351335, 11880544, 43981132, 865042570, 873512089, 
    893921179, 2057043271, 2567, 93072473, 26460704, 33365801, 865042570, 517341201, 275602560, 862343378, 2057043273, 83889159, 
    66167424, 46818875, 59842776, 59333874, 194384291, 865042570, 172206968, 2057043276, 50334725, 905653709, 628443801, 865042570, 
    801305779, 869019178, 2057043277, 2046821898, 20184373, 101514515, 16353075, 87715774, 36505563, 58989284, 865042570, 598565998, 
    334060622, 469101029, 2057043280, 83889158, 73623668, 106612925, 359795440, 9473259, 157931537, 772472752, 2057043282, 218106376, 
    140592574, 157045250, 477152215, 866943502, 6146950, 41607604, 44380953, 772472752, 2057043284, 150998022, 105306193, 842728936, 
    7879065, 444703037, 772472752, 403730401, 2057043289, 83889414, 320662844, 329981033, 248067990, 772472752, 23316642, 631501233, 
    2057043290, 822087174}}
(gdb)
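
To make the failure mode concrete, here is a stripped-down sketch of the kind of frequency-count loop at singleton.c:441. The names and sizes below are hypothetical (this is not msieve's actual code), but the point is the same: freqtable has exactly num_ideals entries, so a relation whose ideal_list contains an index at or above num_ideals writes past the end of the allocation.

Code:
/* Stripped-down sketch, not msieve's code: why an out-of-range entry in a
 * relation's ideal_list makes the counting loop write out of bounds. */
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t ideal_count;
    uint32_t ideal_list[8];        /* indices into freqtable */
} toy_relation_t;

static void count_ideals(const toy_relation_t *rels, uint32_t num_relations,
                         uint32_t num_ideals) {
    uint32_t *freqtable = calloc(num_ideals, sizeof(uint32_t));
    for (uint32_t i = 0; i < num_relations; i++) {
        for (uint32_t j = 0; j < rels[i].ideal_count; j++) {
            uint32_t ideal = rels[i].ideal_list[j];
            freqtable[ideal]++;    /* out-of-bounds write if ideal >= num_ideals */
        }
    }
    free(freqtable);
}

int main(void) {
    /* the second index is far beyond num_ideals, like the corrupted relation
       in the gdb dump above -- undefined behaviour, typically a segfault */
    toy_relation_t bad = { .ideal_count = 2, .ideal_list = { 3, 1000000u } };
    count_ideals(&bad, 1, 10);
    return 0;
}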
Old 2021-01-31, 05:49   #901
wreck
 
 
"Bo Chen"
Oct 2005
Wuhan,China

167 Posts

After reading the code (msieve r1030) for about eight hours
(1767 lines of code read, in folder common/filter, files singleton.c, clique.c, etc.),
this filter problem does not look easy to solve.
But here are some thoughts.

1. From common/filter/filter_priv.h, the definition of ideal_map_t is
Code:
typedef struct {
	uint32 payload : 30;    /* offset in list of ideal_relation_t
	                           structures where the linked list of
	                           ideal_relation_t's for this ideal starts */
	uint32 clique : 1;      /* nonzero if this ideal can participate in
	                           a clique */
	uint32 connected : 1;   /* nonzero if this ideal has already been
	                           added to a clique under construction */
} ideal_map_t;
payload can only hold values below 2^30, which is about 1073M.
If that value is exceeded in purge_cliques_core(), it is possible that the filter will not work properly (a small standalone demo of this wrap-around follows after this list).
However, on entry to purge_cliques_core() the relation count is 992888838, which is less than 2^30, so this 30-bit limit should not be the reason for the crash here.

2. 2057043265 = 0x7A9BFD41, and clearing bit 30 gives 0x3A9BFD41 = 983301441.
That number is close to num_relations (983610391).
It is possible that the ideal_map_t.clique bit is not being cleared properly in purge_cliques_core().
But this is also only a guess.

3. In filter_purge_singletons_core(), curr_relation->ideal_count is 36,
yet three of the values in curr_relation->ideal_list are identical (865042570):
curr_relation->ideal_list[17]
curr_relation->ideal_list[24]
curr_relation->ideal_list[32]
That is a little strange.

4. In purge_cliques_core(), line 370 reads
ideal_map[ideal].payload = num_reverse++;
and the variable num_reverse could possibly exceed 2^32, while its type is uint32.

5. A question: has ryanp tried using fewer than 1600M unique relations?
If so, what was the result?
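
As mentioned in point 1, here is a tiny standalone demo of the wrap-around (the type name is hypothetical and this is not msieve code): a value of 2^30 or more stored into an unsigned 30-bit bitfield is silently reduced modulo 2^30; nothing fails at run time, the value is just truncated.

Code:
/* Standalone demo, not msieve code: storing a value >= 2^30 into a 30-bit
 * unsigned bitfield silently keeps only the low 30 bits. */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t payload   : 30;
    uint32_t clique    : 1;
    uint32_t connected : 1;
} ideal_map_demo_t;

int main(void) {
    ideal_map_demo_t m = { 0 };
    uint32_t big = (1u << 30) + 12345;   /* just past what 30 bits can hold */
    m.payload = big;
    /* prints: stored 1073754169, read back 12345 */
    printf("stored %u, read back %u\n", big, (uint32_t)m.payload);
    return 0;
}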
Old 2021-01-31, 07:53   #902
Happy5214
 
 
"Alexander"
Nov 2008
The Alamo City

3·191 Posts

The length of freqtable is num_ideals (line 430), and ideal (the index) is greater than that, so the array reference is out-of-bounds and thus we get the segfault. The real question is why there are so many entries in ideal_list that are above num_ideals.
