Msieve v1.46 feedback (https://www.mersenneforum.org/showthread.php?t=13676)

jasonp 2010-08-19 21:22

If you want to enforce a hard deadline, you can add '-d X' to the command line and msieve will interrupt itself after X wall-clock minutes.

axn 2010-08-19 23:24

[QUOTE=jasonp;226129]"run at full speed when the CPU is idle, but stop after a given time, and make the time longer when the CPU is busy, but ignore all of that when CPU is in standby". That should be perfect :)[/QUOTE]

LOL. But why do a variable search in the first place? Can't the search be parameterised so that it does a fixed amount of work (modulo the use of randomness in the search space)? Ideally speaking, the amount of work (not time) to be expended is a function of the amount of work to be expended on the other phases (sieving, LA, etc.) -- why should having a faster CPU or GPU cause me to do a deeper search?

jasonp 2010-08-20 01:51

[QUOTE=axn;226305]But why do a variable search in the first place? Can't the search be parameterised so that it does a fixed amount of work (modulo the use of randomness in the search space)? Ideally speaking, the amount of work (not time) to be expended is a function of the amount of work to be expended on the other phases (sieving, LA, etc.) -- why should having a faster CPU or GPU cause me to do a deeper search?[/QUOTE]

I wish I knew how to predict how long the search would take, so that searching a single leading algebraic coefficient would take a reasonable time. pol5 doesn't have this problem because its search space is really quite small; Kleinjung's improved algorithm has a drastically larger limit, but the larger limit is what finds good hits, so you don't want to severely constrain it just to something you can exhaustively search. Under those circumstances it makes more sense to me to just search as far as you can in a fixed fraction of the total sieving time.
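The "search as far as you can in a fixed fraction of the total sieving time" idea can be sketched as a plain wall-clock budget loop. This is only an illustration in Python, not msieve's actual code: `budgeted_search`, the `search_one` callback, and the 4% fraction are all hypothetical names and values chosen for the example.

```python
import time

def budgeted_search(coefficients, search_one, budget_seconds, clock=time.monotonic):
    """Try leading coefficients in order until the wall-clock budget runs out.

    search_one(coeff) is a hypothetical callback that returns a list of
    (score, polynomial) hits for one leading coefficient.
    """
    deadline = clock() + budget_seconds
    best = None
    tried = 0
    for coeff in coefficients:
        if clock() >= deadline:
            break  # budget exhausted: keep whatever was found so far
        for hit in search_one(coeff):
            if best is None or hit[0] > best[0]:
                best = hit
        tried += 1
    return best, tried

def poly_budget_seconds(estimated_sieving_hours, fraction=0.04):
    """Budget as a fixed fraction of the estimated sieving time (fraction is assumed)."""
    return estimated_sieving_hours * 3600.0 * fraction

# Toy run: the "search" just scores each coefficient by its reciprocal.
best, tried = budgeted_search(range(60, 600, 60), lambda c: [(1.0 / c, c)], budget_seconds=1.0)
```

With this shape a faster machine simply gets through more coefficients before the deadline, which is the behaviour axn was asking about.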

EdH 2010-08-20 03:23

[quote=jasonp;226275]If you want to enforce a hard deadline, you can add '-d X' to the command line and msieve will interrupt itself after X wall-clock minutes.[/quote]
Unfortunately, that would entail modifying Aliqueit and then trying to figure out the X for each sample. And, it's not a hard deadline I'm concerned about. I'm wondering why the polynomial selection doesn't finish on this machine.

This is the only machine that is displaying this. I'm wondering why, since this CPU is a whole GHz faster than the one it replaced. At this point I'm actually wondering if it will ever quit. What is it really trying to reach and why should I bother to run ggnfs on this machine if the polynomial selection alone takes longer than SIQS? It's now been running polynomials for over 12 hours for a c99. My 1.8GHz machine never took this long and this is a 2.8GHz Pentium 4 with a 1M cache?

If this is still running polynomials when I shut it down tonight, I will run Aliqueit with "gnfs_cutoff = 100" tomorrow and see how long it takes to factor via SIQS.

Perhaps someone can help me understand CPU hours:

I'm assuming in this case, msieve is calculating the 0.33 hour relative to my CPU and therefore hasn't anything to do with the CPU speed, but simply percentage of use and time in use.

If msieve is using ~84% of the CPU when I check it with the system monitor, which itself is using ~12%, and the CPU is running at 100% when observed, I should be able to mathematically estimate completion of 0.33 CPU hours:

Roughly, if I shut down the monitor, I should gain ~12%CPU. I expect this goes to msieve, since it is running normal priority and all by itself. This means a conservative estimate for msieve is >90% of a 100% tasked CPU.

Therefore 1 clock hour * .9 = .9 CPU hour.

Shouldn't msieve have reached 0.33 CPU hour long ago? It's been 12.5+ clock hours. Is that not 11.25 CPU hours for msieve?
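The estimate above can be written out as a quick calculation. This minimal sketch only assumes, as the post does, that msieve gets a constant 90% share of one CPU; the function names are made up for illustration.

```python
def cpu_hours(wall_hours, cpu_share):
    """CPU time accumulated by a process that gets cpu_share of one core."""
    return wall_hours * cpu_share

def wall_hours_needed(cpu_hours_target, cpu_share):
    """Wall-clock time needed to accumulate a given amount of CPU time."""
    return cpu_hours_target / cpu_share

used = cpu_hours(12.5, 0.9)              # CPU hours after 12.5 clock hours
expected = wall_hours_needed(0.33, 0.9)  # clock hours needed for 0.33 CPU hours
print(f"{used:.2f} CPU hours used; 0.33 CPU hours should take about {expected * 60:.0f} minutes")
```

Under these assumptions a 0.33 CPU-hour limit would be reached after roughly 22 minutes of wall-clock time, so a run of 12.5 hours suggests the time accounting itself is off, not that the search is slow.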

Sorry I'm so confused on this. . . Let me know how ignorant I am and maybe I'll go away, but probably not. I just won't run msieve on this machine, if I can't find a solution.

Thank you for any help. All comments welcome - even harsh ones. . .

jrk 2010-08-20 04:25

[QUOTE=EdH;226326]Shouldn't msieve have reached 0.33 CPU hour long ago? It's been 12.5+ clock hours. Is that not 11.25 CPU hours for msieve?[/QUOTE]

What does [FONT="Courier New"]top[/FONT] say?

10metreh 2010-08-20 07:05

I'll give that number a go with the CPU version on my i5. (just the poly selection)

Edit: well, I would if I had the number (forgot that there were two c99s)

jasonp 2010-08-20 11:49

Is this a unix machine? It's possible that getrusage is not accurately computing the actual CPU time the program uses; IIRC there are some bizarre caveats about which unix flavor implements what part of the return data from getrusage...
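One way to check whether getrusage reports plausible numbers on a given system is to compare it against wall-clock time for a busy loop. The sketch below uses Python's standard `resource` module (Unix only), which wraps the same system call. It is a diagnostic illustration and not part of msieve; the helper names are made up.

```python
import resource
import time

def cpu_seconds():
    """User + system CPU time of this process, as reported by getrusage()."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime

def measure_busy_loop(wall_seconds=0.5):
    """Spin for wall_seconds and return (wall elapsed, CPU elapsed)."""
    wall_start = time.monotonic()
    cpu_start = cpu_seconds()
    while time.monotonic() - wall_start < wall_seconds:
        pass  # burn CPU
    return time.monotonic() - wall_start, cpu_seconds() - cpu_start

wall, cpu = measure_busy_loop()
print(f"wall {wall:.2f}s, cpu {cpu:.2f}s, ratio {cpu / wall:.2f}")
```

On an otherwise idle machine the ratio should be close to 1. A ratio near 0 would hint that CPU-time accounting is not behaving as expected on that system.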

EdH 2010-08-20 14:03

This is my first implementation of Fedora 13 - possibly the issue. The 1.8GHz machine was running Fedora 11. I upgraded both machine and OS when that OS crashed.

[code]
[user@comp ~]$ uname -a
Linux comp.id 2.6.33.6-147.2.4.fc13.i686 #1 SMP Fri Jul 23 17:27:40 UTC 2010 i686 i686 i386 GNU/Linux
[/code]
[code]
[user@comp ~]$ top
top - 09:24:54 up 18 min, 3 users, load average: 0.74, 0.93, 0.72
Tasks: 143 total, 2 running, 141 sleeping, 0 stopped, 0 zombie
Cpu(s): 98.7%us, 1.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 508264k total, 416324k used, 91940k free, 21708k buffers
Swap: 1048568k total, 0k used, 1048568k free, 233672k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2080 user 20 0 25456 7912 1156 R 98.0 1.6 0:54.02 msieve
1240 root 20 0 31124 12m 7592 S 0.7 2.5 0:13.94 Xorg
1623 user 20 0 146m 19m 9320 S 0.7 3.9 0:02.81 gmixer
2046 user 20 0 95640 12m 9380 S 0.3 2.4 0:00.62 gnome-terminal
2095 user 20 0 2696 1120 864 R 0.3 0.2 0:00.03 top
1 root 20 0 2828 1384 1172 S 0.0 0.3 0:01.42 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
6 root 20 0 0 0 0 S 0.0 0.0 0:00.01 events/0
7 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuset
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 sync_supers
[/code]
The c99 is:
[code]
618155374139563156953657470966251220509792800441172795436567726667286308678511882481861288669912867
[/code]
It is from iteration 2039 of aliquot sequence [URL="http://www.factordb.com/search.php?se=1&aq=55888&action=last20&fr=&to="]55888[/URL]. I'm going to interrupt my 24/7 machine here to see what it does. That one is running a c107 (via AliWin) under WinXP on the other sequence I have reserved.

As a side, but not distant, subject. If I were to set up a machine to primarily do factoring, what would be the better OS? I seem to remember something about Xubuntu being math (or is it scientifically?) oriented. The current machine supports a lot of things when I'm using it and will probably be Fedora based, unless Fedora has become too flawed.

Thanks for all. . .

henryzz 2010-08-20 14:14

Most people on the forum use Ubuntu I think.

xilman 2010-08-20 15:19

[quote=henryzz;226376]Most people on the forum use Ubuntu I think.[/quote]
Perhaps, though I don't use Ubuntu. I use RedHat (9 and EL5), Fedora, Snow Leopard, Win7, Vista, WinXP, FreeBSD and Suse. There may be one or two I've forgotten ...


Paul

xilman 2010-08-20 15:22

[quote=EdH;226374]This is my first implementation of Fedora 13 - possibly the issue.

...

As a side, but not distant, subject. If I were to set up a machine to primarily do factoring, what would be the better OS? I seem to remember something about Xubuntu being math (or is it scientifically?) oriented. The current machine supports a lot of things when I'm using it and will probably be Fedora based, unless Fedora has become too flawed.
[/quote]
What makes you think Fedora may be flawed?

In my experience, it doesn't matter what underlying operating system is used for a factoring machine. My experience goes back over 20 years and, probably, about 20 operating systems.

Paul

10metreh 2010-08-20 17:13

[QUOTE=EdH;226374]The c99 is:
[code]
618155374139563156953657470966251220509792800441172795436567726667286308678511882481861288669912867
[/code][/QUOTE]

OK, I'll try to coax some polys out of that.

EdH 2010-08-20 17:22

[quote=xilman;226379]What makes you think Fedora may be flawed?. . .
Paul[/quote]
Just a thought of perhaps, actually. Fedora 11 ran Aliqueit, Msieve, etc. rather well. F13 has annoyed me already with several changes that were not for "my" benefit. Now I have to study even more about another OS version to make my computer work properly.:down:

I ran the c99 on my WinXP system and it started sieving after about .5 hour. I must admit, though, that SIQS is not looking very good on this machine either. The c91 only took a couple hours. This c99 is looking like >2 days. I do expect it to be factored by tonight on the WinXP machine.

My background with linux is very limited, even though I've been at it for several years. The reason I'm using Fedora is that it was the only one at the time that actually installed and found all my hardware on the laptop I was trying to use. It also worked on the desktop. SUSE and Ubuntu never ran. I did have Debian installed on the desktop for a while and that ran fine, but wouldn't work with the laptop.

Fedora appears to have almost all the programs I look for, represented within its repositories and they are all fairly current. This was not the case for some of the others I tried.

EdH 2010-08-20 17:25

[quote=10metreh;226393]OK, I'll try to coax some polys out of that.[/quote]
I've already gotten past the polys on the WinXp machine. It took about .5 hour. So unless you're interested "just for fun" you don't really need to run them, but thank you for the offer.

xilman 2010-08-20 17:47

[quote=EdH;226394]Just a thought of perhaps, actually. Fedora 11 ran Aliqueit, Msieve, etc. rather well. F13 has annoyed me already with several changes that were not for "my" benefit. Now I have to study even more about another OS version to make my computer work properly.:down:[/quote]
If F11 isn't broken, why did you fix it?

That is, you could consider downgrading to F11.


Paul

10metreh 2010-08-20 17:47

OK, just for narrowing down purposes, msieve stops after the correct amount of time (0.33 hours) on my i5 (running Win7). So it looks like a Fedora issue.

henryzz 2010-08-20 20:18

I would have nothing against running Fedora or any other flavor of Linux, except that because lots of people on the forum use Ubuntu, we share solutions to problems when they arise. These problems often have different solutions and/or symptoms on different operating systems. The same sharing of solutions works with Windows for the same reason.

frmky 2010-08-20 20:31

There's more variety here than you might be thinking. I have four SSH terminals open on my desktop right now connected to different computers in the same department running CAOS, CentOS, Debian, and Ubuntu.

henryzz 2010-08-20 20:35

[quote=frmky;226412]There's more variety here than you might be thinking. I have four SSH terminals open on my desktop right now connected to different computers in the same department running CAOS, CentOS, Debian, and Ubuntu.[/quote]
It is the prime-searching circles on mersenneforum that are almost exclusively Ubuntu or Windows.

Batalov 2010-08-20 21:38

[quote=frmky;226412]There's more variety here than you might be thinking. I have four SSH terminals open on my desktop right now connected to different computers in the same department running CAOS, CentOS, Debian, and Ubuntu.[/quote]
There may be less variety, depending on the place and on the strictness of company regulations. I, too, can open 15 different terminals here, and all of them will be SLES 10 and maybe some adventurous ...11s :razz:

em99010pepe 2010-08-21 08:38

[quote=Jeff Gilchrist;226182]With my latest RSALS job [[B]599_83_minus1 - SNFS(233)[/B]], I tried using LARGEBLOCKS and TARGET_DENSITY=80 to see how things changed.
In both target density cases, going from normal block size to large block size increased RAM usage a significant amount but decreased the ETA by 16% of the original time.

In both block size cases, increasing the target density from 70 to 80 increased RAM usage slightly but only decreased ETA by 2%.

[/quote]

Jeff,

Could you compile a new msieve windows binary 64-bit core2 version with LARGEBLOCKS and TARGET_DENSITY=80 and post it on your webpage? Thank you in advance.

EdH 2010-08-22 02:11

In case someone else cares, I happen to have acquired another 2.8GHz machine that happens to have 1.5G of RAM. This particular machine won't run Fedora 13 - it refuses! Based on the mention of Ubuntu, I am trying that OS (10.04.1) and the machine appears to like it so far. Depending on my testing, perhaps I will be running an Ubuntu machine here. . .

Of course, one of the first tests will be Msieve with the same number as on the Fedora machine.

Any words of wisdom, or thoughts on additional math programs I should include, besides the obvious Aliqueit, Msieve, GGNFS, YAFU, PARI, GMP-ECM? I may have forgotten to mention some, but please feel free to duplicate something I'm already including but didn't list. I will probably still be running aliquot sequences, but I might at least check out some other areas.

Sorry this is bordering on off-topic. Please move it if need be.

Thanks. . .

EdH 2010-08-24 02:49

That machine appears to be a no-go, although it did finish poly selection at the estimated time for a few tests. Unfortunately, aliqueit, ggnfs and other programs would arbitrarily shut down and occasionally the computer would shut off, all for no apparent reason. Maybe that's why Fedora 13 wouldn't even complete loading. . .

jasonp 2010-08-24 13:01

I noticed at work that the GUI of fedora 12 would freeze up at random on a 2GHz P4 I installed it on; FC11 and older had no trouble.

tgrdy 2010-08-25 02:59

maybe v1.47 svn bug?
 
[COLOR=Red]weight xxx/col print format error?[/COLOR]
Both on Intel (march=nocona) and AMD (march=k8), Linux x86 32-bit version.

Here is the log, thanks:
[code]
Tue Aug 24 17:33:40 2010
Tue Aug 24 17:33:40 2010
Tue Aug 24 17:33:40 2010 Msieve v. 1.47
Tue Aug 24 17:33:40 2010 random seeds: 9ae80f89 77f63583
Tue Aug 24 17:33:40 2010 factoring ...(157 digits)
...
Tue Aug 24 17:33:42 2010 skew 468676.75, size 2.585e-15, alpha -7.130, combined = 1.913e-12 rroots = 3
Tue Aug 24 17:33:42 2010
Tue Aug 24 17:33:42 2010 commencing linear algebra
Tue Aug 24 17:38:38 2010 matrix starts at (0, 0)
Tue Aug 24 17:38:40 2010 matrix is 4895033 x 4895210 (1408.6 MB) with weight 470679799 (96.15/col)
[COLOR=Red]Tue Aug 24 17:38:40 2010 sparse part has weight 330087858 (-1388557958659491449960930129592021408789550482319091262508786343129246955111700424862787405372458865398879990433798520910008535446241616403655082672363712028928001949282183195684168225812590430303018571883167524212052086660772268101647676122274677135852798941046701168373832920230896419954870822644796620800.00/col)[/COLOR]
Tue Aug 24 17:38:40 2010 saving the first 48 matrix rows for later
Tue Aug 24 17:38:42 2010 matrix includes 64 packed rows
Tue Aug 24 17:38:43 2010 matrix is 4894985 x 4895210 (1363.0 MB) with weight 375386110 (76.68/col)
[COLOR=Red]Tue Aug 24 17:38:43 2010 sparse part has weight 327941077 (-488438086314922565818526936846739985716284058195083095903320746364690187325138788817242597438608443259924512768.00/col)
[/COLOR]Tue Aug 24 17:38:43 2010 using block size 65536 for processor cache size 6144 kB
Tue Aug 24 17:39:12 2010 commencing Lanczos iteration (7 threads)
Tue Aug 24 17:39:12 2010 memory use: 1374.4 MB
Tue Aug 24 17:39:52 2010 restarting at iteration 46496 (dim = 2940041)
Tue Aug 24 17:40:34 2010 linear algebra at 60.1%, ETA 14h38m
Tue Aug 24 17:40:47 2010 checkpointing every 133356 dimensions
[COLOR=Red]Tue Aug 24 18:02:54 2010 error: corrupt state, please restart from checkpoint[/COLOR]

Tue Aug 24 18:12:01 2010
Tue Aug 24 18:12:01 2010
Tue Aug 24 18:12:01 2010 Msieve v. 1.47
Tue Aug 24 18:12:01 2010 random seeds: 710bb90c c434c400
Tue Aug 24 18:12:01 2010 factoring ...... (157 digits)
Tue Aug 24 18:12:02 2010 ......
Tue Aug 24 18:12:02 2010 commencing linear algebra
Tue Aug 24 18:12:23 2010 matrix starts at (0, 0)
Tue Aug 24 18:12:25 2010 matrix is 4895033 x 4895210 (1408.6 MB) with weight 470679799 (96.15/col)
[COLOR=Red]Tue Aug 24 18:12:25 2010 sparse part has weight 330087858 (-1388557958659491449960930129592021408789550482319091262508786343129246955111700424862787405372458865398879990433798520910008535446241616403655082672363712028928001949282183195684168225812590430303018571883167524212052086660772268101647676122274677135852798941046701168373832920230896419954870822644796620800.00/col)
[/COLOR]Tue Aug 24 18:12:25 2010 saving the first 48 matrix rows for later
Tue Aug 24 18:12:27 2010 matrix includes 64 packed rows
Tue Aug 24 18:12:29 2010 matrix is 4894985 x 4895210 (1363.0 MB) with weight 375386110 (76.68/col)
[COLOR=Red]Tue Aug 24 18:12:29 2010 sparse part has weight 327941077 (-488438086314922565818526936846739985716284058195083095903320746364690187325138788817242597438608443259924512768.00/col)
[/COLOR]Tue Aug 24 18:12:29 2010 using block size 43690 for processor cache size 1024 kB
Tue Aug 24 18:13:01 2010 commencing Lanczos iteration (8 threads)
Tue Aug 24 18:13:01 2010 memory use: 1412.4 MB
Tue Aug 24 18:13:03 2010 restarting at iteration 46656 (dim = 2950159)
[/code]
but v1.45 windows x86 32 bit version works well:
[code]
Mon Aug 23 15:48:06 2010
Mon Aug 23 15:48:06 2010
Mon Aug 23 15:48:06 2010 Msieve v. 1.45
Mon Aug 23 15:48:06 2010 random seeds: d9eb38b8 d2609e66
Mon Aug 23 15:48:06 2010 factoring ...... (157 digits)
Mon Aug 23 15:48:08 2010 searching for 15-digit factors
Mon Aug 23 15:48:09 2010 commencing number field sieve (157-digit input)
Mon Aug 23 15:48:09 2010 R0: -718286775230264074412218462160
Mon Aug 23 15:48:09 2010 R1: 330882384079102889
Mon Aug 23 15:48:09 2010 A0: -574785673446953991103337093662087263
Mon Aug 23 15:48:09 2010 A1: -2849845913901309456445653108969
Mon Aug 23 15:48:09 2010 A2: 19403845861276275944787296
Mon Aug 23 15:48:09 2010 A3: 40609767955771924744
Mon Aug 23 15:48:09 2010 A4: 185982982727102
Mon Aug 23 15:48:09 2010 A5: 33078600
Mon Aug 23 15:48:09 2010 skew 468676.75, size 2.585210e-015, alpha -7.129592, combined = 1.913363e-012
Mon Aug 23 15:48:09 2010
Mon Aug 23 15:48:09 2010 commencing linear algebra
Mon Aug 23 15:48:12 2010 read 4895210 cycles
Mon Aug 23 15:48:20 2010 matrix is 4895033 x 4895210 (1408.6 MB) with weight 470679799 (96.15/col)
Mon Aug 23 15:48:20 2010 sparse part has weight 330087858 (67.43/col)
Mon Aug 23 15:48:21 2010 saving the first 48 matrix rows for later
Mon Aug 23 15:48:24 2010 matrix is 4894985 x 4895210 (1363.0 MB) with weight 375386110 (76.68/col)
Mon Aug 23 15:48:24 2010 sparse part has weight 327941077 (66.99/col)
Mon Aug 23 15:48:24 2010 matrix includes 64 packed rows
Mon Aug 23 15:48:24 2010 using block size 65536 for processor cache size 8192 kB
Mon Aug 23 15:49:09 2010 commencing Lanczos iteration (8 threads)
Mon Aug 23 15:49:09 2010 memory use: 1677.9 MB
Mon Aug 23 15:49:11 2010 restarting at iteration 169 (dim = 10691)
Mon Aug 23 15:49:37 2010 linear algebra at 0.2%, ETA 46h35m
Wed Aug 25 08:09:27 2010 lanczos halted after 77412 iterations (dim = 4894985)
Wed Aug 25 08:09:50 2010 recovered 31 nontrivial dependencies
Wed Aug 25 08:10:04 2010 BLanczosTime: 145315
Wed Aug 25 08:10:04 2010 elapsed time 40:21:58
Wed Aug 25 08:10:05 2010
Wed Aug 25 08:10:05 2010
Wed Aug 25 08:10:05 2010 Msieve v. 1.45
Wed Aug 25 08:10:05 2010 random seeds: f3029690 d5850aba
Wed Aug 25 08:10:05 2010 factoring ...... (157 digits)
Wed Aug 25 08:10:07 2010 searching for 15-digit factors
Wed Aug 25 08:10:08 2010 commencing number field sieve (157-digit input)
Wed Aug 25 08:10:08 2010 ......
Wed Aug 25 08:10:08 2010 commencing square root phase
Wed Aug 25 08:10:08 2010 reading relations for dependency 10
Wed Aug 25 08:10:10 2010 read 2446011 cycles
Wed Aug 25 08:10:17 2010 cycles contain 6625862 unique relations
Wed Aug 25 08:12:30 2010 read 6625862 relations
Wed Aug 25 08:13:36 2010 multiplying 6625862 relations
Wed Aug 25 09:03:44 2010 multiply complete, coefficients have about 405.83 million bits
Wed Aug 25 09:03:57 2010 initial square root is modulo 19138993
......
[/code]

jasonp 2010-08-25 11:30

Does it work any better if you change '%5.2f' to '%5.2lf' , i.e. add an L to the floating point format code, in common/lanczos/lanczos_pre.c lines 75 and 80?

tgrdy 2010-08-25 13:32

I'll test it this Friday.
Sorry, but I'll be busy on Thursday.
Thanks,

[QUOTE=jasonp;227008]Does it work any better if you change '%5.2f' to '%5.2lf' , i.e. add an L to the floating point format code, in common/lanczos/lanczos_pre.c lines 75 and 80?[/QUOTE]

tgrdy 2010-08-27 07:11

Filtering C163: million lines of error -11
 
I am doing relation filtering on a C163, but there are too many error -11 lines in the log file: the log file is more than 100,000,000 bytes (nearly 100MB). My last C157 finished and created only about a 30KB log file.
Why are there so many (error -11)?

All the relations from 12143120 to 13778734 are printed error -11. That is 1635615 lines.

Does it matter if there are 1 million error -11 lines in 100 million relations?
I have checked the C163.dat; it has no messy data or non-ASCII data.
Thanks,

[QUOTE]
The error -11 comes from:
/gnfs/relation.c
line 198, func: int32 nfs_read_relation

if (!mp_is_one(&polyval.num))
    return -11;
[/QUOTE]

[QUOTE]
08/18/2010 20:48 165 c163.ini
08/27/2010 14:11 485 step345.bat
08/27/2010 13:50 490 c163.fb
08/20/2010 16:13 902,144 msieve.exe
08/27/2010 14:48 6,587,248 c163.dat.br
08/27/2010 14:48 78,260,920 c163.dat.hc
08/27/2010 14:48 103,747,209 c163_step345.log
08/27/2010 13:22 17,960,780,775 c163.dat
[/QUOTE]

Fri Aug 27 14:11:27 2010 commencing number field sieve (163-digit input)
......
Fri Aug 27 14:11:27 2010 A0: 360487238294144721913628110556205285
Fri Aug 27 14:11:27 2010 A1: 1132654025229352805189159662989
Fri Aug 27 14:11:27 2010 A2: -214331411232140294532642735
Fri Aug 27 14:11:27 2010 A3: 1188242218322082959447
Fri Aug 27 14:11:27 2010 A4: 3601468549877046
Fri Aug 27 14:11:27 2010 A5: 152680320
Fri Aug 27 14:11:27 2010 skew 244861.34, size 6.185862e-016, alpha -7.003328, combined = 8.120503e-013
Fri Aug 27 14:11:27 2010
Fri Aug 27 14:11:27 2010 commencing relation filtering
Fri Aug 27 14:11:27 2010 estimated available RAM is 4096.0 MB
Fri Aug 27 14:11:27 2010 commencing duplicate removal, pass 1
Fri Aug 27 14:11:32 2010 error -15 reading relation 317645
Fri Aug 27 14:11:36 2010 error -15 reading relation 630571
Fri Aug 27 14:11:41 2010 error -15 reading relation 940806
Fri Aug 27 14:11:46 2010 error -1 reading relation 1248485
Fri Aug 27 14:14:31 2010 error -15 reading relation 12143118
[COLOR="Red"]Fri Aug 27 14:14:31 2010 error -11 reading relation 12143119
Fri Aug 27 14:14:31 2010 error -11 reading relation 12143120
...... 1635613 lines error -11
Fri Aug 27 14:18:50 2010 error -11 reading relation 13778734[/COLOR]

Fri Aug 27 14:18:51 2010 error -1 reading relation 13832363
......
Fri Aug 27 14:38:53 2010 error -1 reading relation 91168233
Fri Aug 27 14:38:53 2010 error -11 reading relation 91168564
Fri Aug 27 14:39:26 2010 error -5 reading relation 93245957
Fri Aug 27 14:46:01 2010 error -15 reading relation 120741558
Fri Aug 27 14:46:22 2010 error -15 reading relation 122241478
Fri Aug 27 14:48:46 2010 found 19565230 hash collisions in 130489264 relations
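With a 100MB log it is easier to summarize the read errors than to scroll through them. The following is a small sketch, assuming only the "error N reading relation M" line format visible in the log above; `summarize_errors` is a made-up helper, not an msieve tool.

```python
import re
from collections import Counter

# Matches lines such as "... error -11 reading relation 12143119"
ERROR_LINE = re.compile(r"error (-\d+) reading relation (\d+)")

def summarize_errors(lines):
    """Count read errors per error code and record the first/last relation number seen."""
    counts = Counter()
    spans = {}
    for line in lines:
        match = ERROR_LINE.search(line)
        if not match:
            continue
        code, relation = int(match.group(1)), int(match.group(2))
        counts[code] += 1
        first, last = spans.get(code, (relation, relation))
        spans[code] = (min(first, relation), max(last, relation))
    return counts, spans

sample = [
    "Fri Aug 27 14:14:31 2010 error -15 reading relation 12143118",
    "Fri Aug 27 14:14:31 2010 error -11 reading relation 12143119",
    "Fri Aug 27 14:14:31 2010 error -11 reading relation 12143120",
    "Fri Aug 27 14:18:50 2010 error -11 reading relation 13778734",
]
counts, spans = summarize_errors(sample)
print(counts[-11], spans[-11])
```

For a real log, passing the open file object (for example `open("c163_step345.log")`) instead of `sample` should work the same way, since the function only iterates over lines.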

Batalov 2010-08-27 07:21

These relations are from another project (or another polynomial for the same number).

frmky 2010-08-27 07:36

[QUOTE=tgrdy;227241]
Does it matter if there are 1 million error -11 lines in 100 million relations?
[/QUOTE]
And no, it does not matter. You can safely ignore these errors.

Andi47 2010-08-27 08:29

[QUOTE=frmky;227246]And no, it does not matter. You can safely ignore these errors.[/QUOTE]

The only problems are an unnecessarily big logfile and a lot of time wasted printing thousands of error -11 lines to the screen and logfile.

If the logfile (100+ MB :shock: ) doesn't eat up all of your disk space, your factorization should safely continue.

jrk 2010-08-27 08:44

[QUOTE=tgrdy;227241]All the relations from 12143120 to 13778734 are printed error -11. That is 1635615 lines.

Does it matter if there are 1 million error -11 lines in 100 million relations?
I have checked the C163.dat; it has no messy data or non-ASCII data.
Thanks,[/QUOTE]
I agree with Batalov that those relations are probably from the wrong polynomial.

Maybe it would be nicer to limit the maximum number of read error reports, and afterward print something like "too many read errors, further errors will not be reported." If there are more than a million such errors, something is likely to be systematically wrong.
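A capped reporter along those lines could look roughly like this. It is a Python sketch of the suggestion only, not a patch to msieve (which is written in C); the class name, the default limit, and the `emit` callback are assumptions made for the example.

```python
class CappedErrorLog:
    """Report at most max_reports read errors, then one final notice."""

    def __init__(self, max_reports=1000, emit=print):
        self.max_reports = max_reports
        self.emit = emit
        self.total = 0  # every error is still counted, even when not printed

    def report(self, code, relation):
        self.total += 1
        if self.total <= self.max_reports:
            self.emit(f"error {code} reading relation {relation}")
        elif self.total == self.max_reports + 1:
            self.emit("too many read errors, further errors will not be reported")

messages = []
log = CappedErrorLog(max_reports=2, emit=messages.append)
for relation in range(12143119, 12143124):
    log.report(-11, relation)
```

Keeping the running total means a final summary line can still say how many relations were rejected, without writing a million lines to the log.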

tgrdy 2010-08-27 13:36

Thanks,

I am sieving the C163 on 20+ Linux PCs.
More than 1M relations are wrong because one PC used the wrong poly, one meant for another number.
I'll delete the bad part and sieve some more.

[QUOTE=Batalov;227243]These relations are from another project (or another polynomial for the same number).[/QUOTE]

jasonp 2010-09-06 21:29

[QUOTE=tgrdy;226959]
Tue Aug 24 18:12:25 2010 sparse part has weight 330087858 (-13885579586594914499609301295920214087895504823190912...44796620800.00/col)
[/QUOTE]
This should be fixed in SVN now.

frmky 2010-09-16 18:00

Large filtering job
 
1 Attachment(s)
This is a test run and not the final matrix that I'll be using for 5,409-, but I thought you might enjoy seeing msieve filter nearly 720M unique relations down to a 28M matrix.

em99010pepe 2010-09-16 18:46

Got this:

[code]Thu Sep 16 07:38:14 2010
Thu Sep 16 07:38:14 2010
Thu Sep 16 07:38:14 2010 Msieve v. 1.47
Thu Sep 16 08:19:01 2010 building initial matrix
Thu Sep 16 08:22:27 2010 memory use: 2400.0 MB
Thu Sep 16 08:22:35 2010 read 6507711 cycles
Thu Sep 16 08:22:36 2010 matrix is 6507534 x 6507711 (2172.8 MB) with weight 630534883 (96.89/col)
Thu Sep 16 08:22:36 2010 sparse part has weight 498001196 (76.52/col)
Thu Sep 16 08:23:34 2010 filtering completed in 2 passes
Thu Sep 16 08:23:35 2010 matrix is 6507000 x 6507177 (2172.8 MB) with weight 630519782 (96.90/col)
Thu Sep 16 08:23:35 2010 sparse part has weight 497996534 (76.53/col)
Thu Sep 16 08:24:24 2010 matrix starts at (0, 0)
Thu Sep 16 08:24:26 2010 matrix is 6507000 x 6507177 (2172.8 MB) with weight 630519782 (96.90/col)
Thu Sep 16 08:24:26 2010 sparse part has weight 497996534 (76.53/col)
Thu Sep 16 08:24:26 2010 saving the first 48 matrix rows for later
Thu Sep 16 08:24:28 2010 matrix includes 64 packed rows
Thu Sep 16 08:24:29 2010 matrix is 6506952 x 6507177 (2068.7 MB) with weight 516887531 (79.43/col)
Thu Sep 16 08:24:29 2010 sparse part has weight 477218263 (73.34/col)
Thu Sep 16 08:24:29 2010 using block size 262144 for processor cache size 8192 kB
Thu Sep 16 08:24:40 2010 commencing Lanczos iteration (4 threads)
Thu Sep 16 08:24:40 2010 memory use: 2528.5 MB
Thu Sep 16 08:25:30 2010 linear algebra at 0.0%, ETA 57h26m
Thu Sep 16 08:25:46 2010 checkpointing every 120000 dimensions
Thu Sep 16 09:54:31 2010 [B]error: corrupt state, please restart from checkpoint[/B][/code]

em99010pepe 2010-09-17 06:28

[quote=jasonp;225368]Carlos: interesting! v1.46 has more multithreading in the LA and so will probably push the memory bus harder than v1.45 did.[/quote]

I suppose the issue I posted earlier was due to the fact that I had overclocked the processor even more. I'm fairly sure v1.47 pushes the processor even harder than v1.45, but still not as much as Prime95.

kar_bon 2010-09-17 08:28

[QUOTE=em99010pepe;230059]I suppose the issue I posted earlier was due to the fact that I had overclocked the processor even more.[/QUOTE]

So that's the same thing causing errors with LLRnet2010.

Overclocking and expecting programs to work correctly is foolish.

em99010pepe 2010-09-17 09:39

No it is not. With your arrogance you were assuming I was running the cores in the same conditions for LLR, msieve or srsieve. As usual you judge people without knowing the background. To run LLR I underclock all my processors by 15%. They run stable, I find primes and confirmed primes.

kar_bon 2010-09-17 09:50

[QUOTE=em99010pepe;230072]No it is not.[/QUOTE]

Sure it was overclocking! As a reminder, see [url=http://www.mersenneforum.org/showpost.php?p=209967&postcount=43]here[/url].

Point.

em99010pepe 2010-09-17 11:21

I don't have issues with prpclient or LLR standard client, only with llrnet client at the same conditions. As I said before I am not going to waste my time with you.

Best Regards,
Carlos

jasonp 2010-09-17 12:58

[QUOTE=em99010pepe;230059]I suppose the issue I posted earlier was due to the fact that I had overclocked the processor even more. I'm fairly sure v1.47 pushes the processor even harder than v1.45, but still not as much as Prime95.[/QUOTE]
Do you get stable operation with less overclocking? And you get computation errors with Prime95 earlier than with msieve? The latter somewhat surprises me, because I'd heard reports of the opposite.

Back in 2000 when all the overclockers posted on usenet newsgroups there were endless arguments about how much of it was a good idea. None of it convinced anyone of anything.

em99010pepe 2010-09-17 13:03

[quote=jasonp;230093]Do you get stable operation with less overclocking? And you get computation errors with Prime95 earlier than with msieve? The latter somewhat surprises me, because I'd heard reports of the opposite.
[/quote]

1st question: yes. 2nd question: yes.

em99010pepe 2010-09-17 20:41

[quote=em99010pepe;230095]1st question: yes. 2nd question: yes.[/quote]

[B]For the same ambient conditions (summer), running two machines in the same room[/B]

On my Core i5, msieve v1.47 is stable up to 177 x 21 = 3717 MHz, but with v1.45 I can go up to 180 x 21. Temps are under 60 °C while doing LA on 4 cores using 6GB of memory.
177 x 21 is also stable for running srsieve, NFS@home and ecm, but these last two programs put the CPU at ~68 °C.

Prime95 (or LLR/cLLR prpclient) is only stable up to ~165 x 21.

CPU vcore is increased a little, but right now I don't know by how much, probably +6.25 mV or +12.50 mV. Other settings are also changed to make the CPU overclock as stable as possible.
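For comparison, the stable limits quoted above can be turned into core clocks with base clock times multiplier. This is a trivial sketch using only the numbers from the post; `core_mhz` is a made-up helper name.

```python
MULTIPLIER = 21  # CPU multiplier quoted in the post

def core_mhz(base_clock_mhz, multiplier=MULTIPLIER):
    """Core clock in MHz for a given base clock and multiplier."""
    return base_clock_mhz * multiplier

for label, bclk in [("msieve v1.47", 177), ("msieve v1.45", 180), ("Prime95/LLR", 165)]:
    print(f"{label}: {bclk} x {MULTIPLIER} = {core_mhz(bclk)} MHz")
```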

Merfighters 2010-09-23 12:16

When factoring relatively small numbers...

[CODE]
Msieve v. 1.46
Thu Sep 23 21:07:13 2010
random seeds: 1ca18188 9c4fe088
factoring 1000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000031 (94 digits)
searching for 15-digit factors
commencing number field sieve (94-digit input)
R0: -16666666666666666666665
R1: 1
A0: 100031
A1: 240000
A2: 216000
A3: 86400
A4: 12960
skew 1.67, size 4.691e-010, alpha -0.608, combined = 4.730e-007 rroots = 0
commencing linear algebra
read 45827 cycles
cycles contain 154995 unique relations
read 154995 relations
using 20 quadratic characters above 33362292
building initial matrix
memory use: 15.6 MB
read 45827 cycles
matrix is 45644 x 45827 (13.2 MB) with weight 4156628 (90.70/col)
sparse part has weight 3127744 (68.25/col)
filtering completed in 1 passes
matrix is 45644 x 45827 (13.2 MB) with weight 4156628 (90.70/col)
sparse part has weight 3127744 (68.25/col)
matrix starts at (0, 0)
matrix is 45644 x 45827 (13.2 MB) with weight 4156628 (90.70/col)
sparse part has weight 3127744 (68.25/col)
saving the first 48 matrix rows for later
matrix includes 64 packed rows
matrix is 45596 x 45827 (12.2 MB) with weight 3290658 (71.81/col)
sparse part has weight 2925713 (63.84/col)
using block size 18238 for processor cache size 3072 kB
commencing Lanczos iteration
memory use: 9.3 MB
<Msieve crashed at this point>
Return value 65280. Terminating...
[/CODE]

What happened?

Mini-Geek 2010-09-23 12:21

[QUOTE=Merfighters;231091]When factoring relatively small numbers...
...
What happened?[/QUOTE]

I don't know. Here's some advice while waiting for a better answer: sieve a few more relations and try again.

jasonp 2010-09-23 14:51

Did you compile this using MSVC or use one of Jeff Gilchrist's precompiled Windows binaries? If so, there's a bug in the linear algebra that Brian Gladman fixed a few days ago, and as a stopgap you can use the v1.47 Windows binary from the SourceForge page (which doesn't have the bug because it's compiled with gcc).

Andi47 2010-10-06 05:19

c145: 86 cpu-hours of -np and no poly found
 
I am currently GNFSing a c145 (a cofactor of the latest iteration of aliquot sequence 10212) and now I am ~86 cpu-hours (i7 @ 2.8 GHz) into poly search using Msieve 1.46 (CPU-version): So far I have not found any polynomial.

The screen output looks like this:

[CODE]random seeds: 36047390 94219475
factoring 3732013142391051119910921824210118145697203242752440325707923031128358903935348233134692108821450288
107857802781290883472699630382457381898449683 (145 digits)
searching for 15-digit factors
commencing number field sieve (145-digit input)
commencing number field sieve polynomial selection
time limit set to 97.75 hours
searching leading coefficients from 1 to 564255
deadline: 400 seconds per coefficient
coeff 60-600 64433833 83763983 83763984 108893179 lattice 8388832
p 64433833 83763983 83763984 108893179 lattice 8388832
batch 5000 78318761
batch 2042 83763991
p 49564486 64433832 108893183 141561139 lattice 4963806
batch 5000 64180301
batch 80 64433857
p 38126527 49564485 141561141 184029484 lattice 2937163
batch 3638 49564499
deadline: 400 seconds per coefficient
coeff 660-1200 71074014 92396218 92396219 120115084 lattice 10944462
p 71074014 92396218 92396219 120115084 lattice 10944462
batch 5000 84281851
batch 3189 92396303
p 54672318 71074013 120115088 156149613 lattice 6476013
batch 5000 68482951
deadline: 400 seconds per coefficient
coeff 1260-1800 74609853 96992808 96992809 126090651 lattice 13014213
p 74609853 96992808 96992809 126090651 lattice 13014213
batch 5000 88614661
batch 3102 96992813
p 57392194 74609852 126090653 163917849 lattice 7700718
batch 5000 72343651
deadline: 400 seconds per coefficient
coeff 1860-2400 77077762 100201090 100201091 130261418 lattice 14772180
p 77077762 100201090 100201091 130261418 lattice 14772180
batch 5000 90092381
batch 3996 100201403
p 59290586 77077761 130261421 169339846 lattice 8740934
batch 5000 72667501
deadline: 400 seconds per coefficient
coeff 2460-5400 81894218 106462483 106462484 138401229 lattice 22469046
p 81894218 106462483 106462484 138401229 lattice 22469046
batch 5000 87016351
batch 5000 92079131
batch 5000 97056511
batch 5000 101986741
batch 4529 106462487
deadline: 400 seconds per coefficient
coeff 5460-8400 86645227 112638795 112638796 146430434 lattice 26947936
p 86645227 112638795 112638796 146430434 lattice 26947936
batch 5000 91657651
batch 5000 96525151
batch 5000 101427731
batch 5000 106228921
deadline: 400 seconds per coefficient
coeff 8460-11400 89806889 116748956 116748957 151773644 lattice 30747693
p 89806889 116748956 116748957 151773644 lattice 30747693
batch 5000 94835621
batch 5000 99771211
batch 5000 104644391
batch 5000 109495361
deadline: 400 seconds per coefficient
coeff 11460-14400 92202823 119863670 119863671 155822772 lattice 34086455
p 92202823 119863670 119863671 155822772 lattice 34086455
batch 5000 97129031
batch 5000 102034411
batch 5000 106879141
batch 5000 111671281
deadline: 400 seconds per coefficient
coeff 14460-17400 94142778 122385611 122385612 159101295 lattice 37092613
p 94142778 122385611 122385612 159101295 lattice 37092613
batch 5000 99029701
batch 5000 103906051
batch 5000 108716941
batch 5000 113481551
deadline: 400 seconds per coefficient
coeff 17460-20400 95778395 124511913 124511914 161865488 lattice 39845592
p 95778395 124511913 124511914 161865488 lattice 39845592
batch 5000 100656431
batch 5000 105529351
batch 5000 110355061
batch 5000 115103551
deadline: 400 seconds per coefficient
coeff 20460-23400 97195678 126354381 126354382 164260696 lattice 42397991
p 97195678 126354381 126354382 164260696 lattice 42397991
batch 5000 102106021
batch 5000 106951661
batch 5000 111813571
batch 5000 116551601
deadline: 400 seconds per coefficient
coeff 23460-26400 98448290 127982777 127982778 166377611 lattice 44786612
p 98448290 127982777 127982778 166377611 lattice 44786612
batch 5000 103333511
batch 5000 108128021
batch 5000 112930151
batch 5000 117655361
deadline: 400 seconds per coefficient
coeff 26460-29400 99572048 129443662 129443663 168276761 lattice 47038336
p 99572048 129443662 129443663 168276761 lattice 47038336
batch 5000 104386031
batch 5000 109166501
batch 5000 113933591
batch 5000 118596601
deadline: 400 seconds per coefficient
coeff 29460-32400 100592086 130769712 130769713 170000626 lattice 49173485
p 100592086 130769712 130769713 170000626 lattice 49173485
batch 5000 105465691
batch 5000 110300051
batch 5000 115003591
batch 5000 119711881
deadline: 400 seconds per coefficient
coeff 32460-35400 101526751 131984776 131984777 171580210 lattice 51207842
p 101526751 131984776 131984777 171580210 lattice 51207842
batch 5000 106323361
batch 5000 111124841
batch 5000 115872161
batch 5000 120587381
deadline: 400 seconds per coefficient
coeff 35460-38400 102389853 133106808 133106809 173038851 lattice 53153956
p 102389853 133106808 133106809 173038851 lattice 53153956
batch 5000 107232101
batch 5000 112004171
batch 5000 116687111
deadline: 400 seconds per coefficient
coeff 38460-41400 103192057 134149674 134149675 174394577 lattice 55021991
p 103192057 134149674 134149675 174394577 lattice 55021991
batch 5000 107974381
batch 5000 112731391
batch 5000 117443231
batch 5000 122118671
batch 5000 126764951
batch 5000 131375731
deadline: 400 seconds per coefficient
coeff 41460-44400 103941779 135124313 135124314 175661608 lattice 56820314
p 103941779 135124313 135124314 175661608 lattice 56820314
batch 5000 108780781
batch 5000 113579161
batch 5000 118309141
batch 5000 123015041
batch 5000 127655951
deadline: 400 seconds per coefficient
coeff 44460-47400 104645781 136039515 136039516 176851370 lattice 58555909
p 104645781 136039515 136039516 176851370 lattice 58555909
batch 5000 109526671
batch 5000 114322441
batch 5000 119052331
batch 5000 123759101
batch 5000 128477861 <-- last line as of now
[/CODE]

Compared to this, a GPU run using ver. 1.45 (running overnight, ~8 hours) gives hundreds of these lines in the screen output (as expected; the search range starts at 60000 and goes as far as it gets, currently approx. 65000)...

[CODE]poly 11 p 135152767 q 145304629 coeff 19638322667258443
poly 11 p 135208399 q 145270709 coeff 19641819985484891
poly 30 p 135198403 q 145148233 coeff 19623809299871899
poly 33 p 135226127 q 145130767 coeff 19625471529949409
poly 30 p 135228749 q 145321849 coeff 19651691842636901
poly 10 p 135217013 q 145174349 coeff 19630041835999537[/CODE]

...and has found [I]dozens[/I] of polys.

Have the parameters been changed between 1.45 and 1.46, so that 1.46 only outputs [I]superkalifragelistigexpialigoric[/I] polynomials? Or was it just bad luck with 1.46 and good luck with the 1.45 GPU-run?

jasonp 2010-10-06 17:37

None of the degree-5 code has changed between v1.45 and v1.46; the GPU code got somewhat faster, but I think most of the difference is the performance gap between CPU and GPU.

Andi47 2010-10-06 17:54

[QUOTE=jasonp;232739]None of the degree-5 code has changed between v1.45 and v1.46; the GPU code got somewhat faster, but I think most of the difference is the performance gap between CPU and GPU.[/QUOTE]

So is it just strange luck that I have not found a single poly after 86 CPU-hours (coeff. 1 to 47000), but approximately a hundred of them (coeff. 60000 to ~67000) within 8 GPU-hours? Or is the GeForce GTS 250 that much faster than an Intel i7?

BTW: my best poly is:

[code]R0: -8963497409556758629345860972
R1: 19683233448182377
A0: 39013925483873724802274206728137815
A1: 1062511623552559794023800147263
A2: 1519551721902671844300379
A3: -342040625795062587
A4: -273011920586
A5: 64500
skew 1953916.78, size 5.101027e-014, alpha -6.628828, combined = [B]1.101153e-011[/B][/code]

jasonp 2010-10-07 02:52

Well, the CPU code is not that well optimized, so a factor of 10 difference is not impossible. How many of the polynomials found have the same A5 and R1? Polynomials with matching A5/R1 are derived from the same stage 1 hit, so getting lucky once can generate hundreds to tens of thousands of polynomials.
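
One way to answer that question is to count polynomials per (A5, R1) pair. The sketch below assumes the polynomials are available as text blocks laid out like the one quoted earlier in this thread (`R0:`/`R1:`/`A0:`...`A5:` lines closed by a `skew` line); the real msieve.dat.p layout may differ, in which case the regex needs adjusting. The function name and the toy sample values are invented for illustration.

```python
import re
from collections import Counter


def count_stage1_hits(text: str) -> Counter:
    """Count polynomials per (A5, R1) pair in a text dump of polynomials."""
    groups = Counter()
    current = {}
    for line in text.splitlines():
        m = re.match(r"\s*([RA]\d):\s*(-?\d+)\s*$", line)
        if m:
            current[m.group(1)] = int(m.group(2))
        elif line.lstrip().startswith("skew") and current:
            # Assumption: a "skew" line closes one polynomial block.
            groups[(current.get("A5"), current.get("R1"))] += 1
            current = {}
    return groups


# Toy input with made-up coefficients: two polys share A5/R1, one does not.
sample = """R0: -100
R1: 7
A0: 1
A5: 60
skew 1.0
R0: -101
R1: 7
A0: 2
A5: 60
skew 1.1
R0: -200
R1: 11
A0: 3
A5: 120
skew 2.0
"""

hits = count_stage1_hits(sample)
print(f"{sum(hits.values())} polynomials from {len(hits)} distinct A5/R1 pairs")
for (a5, r1), n in hits.most_common():
    print(f"A5={a5} R1={r1}: {n} polynomial(s)")
```

The number of distinct pairs is then the number of stage 1 hits, and the largest count shows how much of the output came from a single lucky hit.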

Andi47 2010-10-07 05:20

1 Attachment(s)
[QUOTE=jasonp;232803]Well, the CPU code is not that well optimized, so a factor of 10 difference is not impossible. How many of the polynomials found have the same A5 and R1? Polynomials with matching A5/R1 are derived from the same stage 1 hit, so getting lucky once can generate hundreds to tens of thousands of polynomials.[/QUOTE]

See my attached msieve.dat.p file (.txt extension added to make attachment possible).

jasonp 2010-10-07 22:17

I count just over 20 hits in your file; about half the polynomials came from just one of those hits. Sounds reasonable to me...

henryzz 2010-12-11 16:30

I am currently running a polynomial selection job on a CPU for a c84, and I am getting loads of polynomials (too many). msieve is outputting loads of 7e-8 polynomials, but it looks like the limit should be much higher: 8e-8 polys are common, and there are still quite a few polys from 9e-8 to 1.176e-7 (probably plenty, at least if those >8.5e-8 are included). Is this huge number of extra polynomials just because I have hit a lucky number, or do the params need adjusting?

jasonp 2010-12-11 19:07

Very likely the parameters need adjusting. The original parameters were derived by experiment: you pick a bound and see how many stage 1 hits you get, then adjust the bound so that only the top ~10% of polynomials are found at all. Unfortunately this method breaks down when computers get faster and/or the code is ported to use a GPU :)

The amount of testing at the very small sizes (< c90) has also been very small.
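
The "keep only the top ~10%" idea can be sketched as a simple percentile cutoff. This is only an illustration: it assumes each hit can be summarised by a single score where larger is better, which is not necessarily the quantity msieve's stage 1 bound is applied to, and the scores below are made-up values in the 7e-8 to 9e-8 range.

```python
def bound_for_top_fraction(scores, keep_fraction=0.10):
    """Return the score cutoff that keeps roughly the best `keep_fraction` of `scores`."""
    if not scores:
        raise ValueError("need at least one score")
    ranked = sorted(scores, reverse=True)
    # Always keep at least one polynomial.
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[keep - 1]


# Illustrative scores only: 20 values from 7.0e-8 to 8.9e-8.
scores = [i * 1e-9 for i in range(70, 90)]

bound = bound_for_top_fraction(scores)
survivors = [s for s in scores if s >= bound]
print(f"bound = {bound:.3e}, {len(survivors)} of {len(scores)} polynomials kept")
```

Rerunning this on the output of a trial search and feeding the result back as the new bound mimics the experimental tuning described above.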

henryzz 2010-12-12 13:24

[QUOTE=jasonp;241330]Very likely the parameters need adjusting. The original parameters were derived by experiment: you pick a bound and see how many stage 1 hits you get, then adjust the bound so that only the top ~10% of polynomials are found at all. Unfortunately this method breaks down when computers get faster and/or the code is ported to use a GPU :)

The amount of testing at the very small sizes (< c90) has also been very small.[/QUOTE]
What tests do you want?
I could run a few dozen CPU runs over Christmas (is it based on CPU time or clock time?).

jasonp 2010-12-12 14:21

There's an additional constraint with c85-size numbers, in that the whole job should take 30 minutes or less with QS, so it may be easier to stop after you find 10 polynomials and just sieve with the best one. The postprocessing would probably take 2-5 minutes, since it has to read several files from disk several times, so it would be best to give over as much time as possible to the sieving.


All times are UTC.