2^859-1 sieving
After a discussion thread which came to a consensus which turned out not to be usable, I've decided to pick a number out of the air. Please finish any jobs you've agreed to run on 109!+1 before starting on this one.
2^859-1 is the smallest Mersenne number whose complete factorisation is unknown; C203 cofactor, SNFS difficulty 258.58.
[b]Reservations closed at 1030 28 May[/b]
Sieve 40M - 120M both sides; use lasieve4I15e; 1MQ a+r will take about 14 days on a 2.4GHz Core2.
Polynomial is the very boring
[code]
n: 40408389115643940521033480968678506700953715205316912938887204062563858647028004670539633098136029193976662157464610911705434456944861732186510748004762758405328979466396236111507945374914053285803350311
c6: 2
c0: -1
Y1: -1
Y0: 11150372599265311570767859136324180752990208
type: snfs
skew: 0.89
rlambda: 2.6
alambda: 2.6
lpbr: 31
lpba: 31
mfbr: 62
mfba: 62
alim: 100000000
rlim: 100000000
[/code]
and the upload directory is the equally predictable M859.
[B]Contributions[/B]
Xyzzy 40M-48M (done 9/5 in 1600 pieces; A 13132578, R 13961984; 11.6Mcpus on ZX-81 cluster)
bsquared 48M-54M (done 19/5; A 9575224, R 10210440; 14Mcpus on K8/1400 cluster)
fivemack 54M-60M (done 26/5; A 9386197, R 9992498; 8453kcpus on various machines)
J.F. 60M-70M (done 6/5) - around 10.3Mcpus on K8/2000 cluster
fivemack 70M-71M (done 11/5) (A 1445551, R 1553368)
fivemack 71M-74M (done 1/5) (A 4396628, R 4763914)
fivemack 74M-75M (done 28/4) (A 1447669, R 1572047) (1118kcpus on C2/3000)
fivemack 75M-77M (done 8/5) (A 2897367, R 3138691) (2252kcpus on C2/3000)
fivemack 77M-78M (done 12/5) (A 1444820, R 1562767) (1128kcpus on C2/3000)
fivemack 78M-80M (done 11/5) (A 2582239, R 2805728)
J.F. 80M-81M (done 21/4) (A 1428284, R 1554945)
J.F. 81M-90M (done 28/4); 10.47Mcpus on K8/2000 cluster
batalov 90M-91M (R received 30/4, A received 2/5)
antiroach 91-92M (done 2/6)
fivemack 92M-96M (done 26/5) (A 5538324, R 5978357) (5499kcpus on Q6600 and K8/2500)
bsquared 96M-100M (arrived 29/5) (A 5483932, R 5921254) (10Ms, K8/1400 cluster)
fivemack 100M-110M (done 31/5) (A 12046700, R 13011689) (<=32 cores, various machines over about a week)
bsquared 110M-115M (done 1/6)
fivemack 115M-120M (done 2/6) (A 6444955, R 6879805) (26 cores, various machines over about five days)
[b]Counts[/b]
16 May: 39MQ, 120831808 relations. 22173399 dup, 98658409 unique. All gone by end of singleton removal.
27 May: 55MQ, 171449528 relations. 42781888 dup, 128667640 unique. 32M/36M left by end of singleton removal. |
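A quick sanity check on the polynomial file, as a minimal Python sketch. It assumes the usual convention that c6 and c0 are coefficients of the algebraic polynomial and that Y1, Y0 describe the linear polynomial Y1*x + Y0; the variable and function names are ours, not the siever's. It checks that the two polynomials share the root m = 2^143, that the algebraic polynomial evaluates to 2^859-1 there, and that the quoted difficulty is log10 of that number.

```python
import math

# Coefficients copied from the polynomial file in the post.
c6, c0 = 2, -1
Y1, Y0 = -1, 11150372599265311570767859136324180752990208


def algebraic(x):
    """Algebraic side: f(x) = c6*x^6 + c0."""
    return c6 * x**6 + c0


# The linear polynomial g(x) = Y1*x + Y0 vanishes at m = -Y0/Y1.
m = -Y0 // Y1

shares_root = (Y1 * m + Y0 == 0)           # g(m) == 0
is_power_of_two = (m == 2**143)            # Y0 is 2^143
hits_target = (algebraic(m) == 2**859 - 1)  # f(m) == 2^859 - 1

# SNFS difficulty taken here as log10 of the number the polynomial represents.
difficulty = 859 * math.log10(2)

print(shares_root, is_power_of_two, hits_target, round(difficulty, 2))
```

Since 6 * 143 = 858, the leading coefficient 2 supplies the last factor of two.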
We'll take 40-48M.
How soon are you looking at having all this done? [b]fivemack:[/b] It's about 30 CPU-months of effort; I suspect I'll be able to scare up about 30 CPUs and so finish sieving mid-May. |
First pancakes are on me!
Fivemack: would you please taste 'm first, before I start 81-90? [b]fivemack:[/b] those appear to be fine pancakes, go ahead with 81-90 |
best packing for FTP
Let's revive the "best packing" topic.
[FONT=Courier New]7z[/FONT] is OK but needs installation etc etc. [FONT=Courier New]bzip2[/FONT] is definitely better than [FONT=Courier New]gzip[/FONT]. But whatever you use, I recommend the following filter (you can call it [FONT=Courier New]lc[/FONT] if you don't have a [FONT=Courier New]tr[/FONT] binary) -- it reformats the file to lowercase letters (and digits of course):
[code]
#!/usr/bin/perl
# Lowercase every input line (stdin or named files) and print it to stdout.
while (<>) {
    print lc($_);
}
[/code]
or use
[code]
tr 'A-Z' 'a-z' < myRels | bzip2 -9c > Axx.bz2
[/code]
Results (on A80-81M.bz2 file):
[code]
76221762 2009-04-25 21:32 a80-81M.bz2 (this file is all lowercase)
78670955 2009-04-21 09:41 A80-81M.bz2
85365026 2009-04-25 21:22 a80-81M.gz (gzip for comparison)
[/code]
[FONT=Courier New]msieve[/FONT] doesn't need the uppercase letters in the reln files. Neither do any other filtering programs. |
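For anyone without perl or tr to hand, the same filter can be sketched in Python. This is only an illustration of the lowercase-then-bzip2 idea: the helper names pack_lowercase and savings_percent are made up here, and the sample line is an invented stand-in for a relation, not real siever output. The two file sizes are the bz2 sizes from the listing in the post.

```python
import bz2


def pack_lowercase(data: bytes, level: int = 9) -> bytes:
    """Lowercase ASCII letters, then bzip2-compress (like tr | bzip2 -9c)."""
    return bz2.compress(data.lower(), compresslevel=level)


def savings_percent(before: int, after: int) -> float:
    """Relative size reduction, in percent."""
    return 100.0 * (before - after) / before


# bz2 sizes quoted in the post: mixed-case file vs all-lowercase file.
mixed_case_bz2 = 78670955
lower_case_bz2 = 76221762
print(f"{savings_percent(mixed_case_bz2, lower_case_bz2):.2f}% smaller")

# Invented sample line; real relation files are far larger.
sample = b"-1234567,89:1A3F,2B7,D:3,5,7F1\n"
packed = pack_lowercase(sample)
print(bz2.decompress(packed))
```

The conversion is lossy only in letter case, which is the point: hex digits read back to the same values either way.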
[QUOTE=Batalov;170988]Let's revive the "best packing" topic.
[FONT=Courier New]7z[/FONT] is OK but needs installation etc etc. [FONT=Courier New]bzip2[/FONT] is definitely better than [FONT=Courier New]gzip[/FONT]. [/QUOTE]By "better" do you mean "uses several times the amount of computation"? At FlyBase we ship multi-gigabyte gzipped files around. The uncompressed data is around 20GB per file. We found that the time taken to transfer a somewhat larger gzipped file was much less than the increased time to bzip2 and bzcat the same data. YMMV Paul |
[QUOTE=xilman;170996]We found that the time taken to transfer a somewhat larger gzipped file was much less than the increased time to bzip2 and bzcat the same data.[/QUOTE]
That is a good point that people should consider. It all depends on your transfer speed, whether you get charged by the GB or have a monthly bandwidth cap, and how many processors you have available to compress/uncompress the data. 7zip gets better compression than zip and its speed isn't bad either: it is much faster than bzip2, and it can use multiple processors, so in certain cases it would be faster than gzip while also compressing better. The lower-case conversion in this case seems like a very fast thing to do in order to get better compression from whatever algorithm you are using. Of course if you are going to use bzip2 and you have multiple cores, you should consider using [URL="http://compression.ca/pbzip2/"]pbzip2[/URL], which can use them. |
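The trade-off in the two posts above can be put into a rough model: total time is compression time plus transfer time plus decompression time. The Python sketch below assumes the three stages do not overlap. The compressed sizes are the lowercase gz and bz2 sizes quoted earlier in the thread, but the CPU times are hypothetical placeholders, not measurements; substitute your own.

```python
def end_to_end_seconds(compressed_bytes, link_bytes_per_s, compress_s, decompress_s):
    """Wall time to compress, send and decompress, with no overlap assumed."""
    return compress_s + compressed_bytes / link_bytes_per_s + decompress_s


def faster_option(options, link_bytes_per_s):
    """Name of the option with the smallest end-to-end time.

    options maps name -> (compressed_bytes, compress_s, decompress_s).
    """
    return min(
        options,
        key=lambda name: end_to_end_seconds(
            options[name][0], link_bytes_per_s, options[name][1], options[name][2]
        ),
    )


# Sizes from the thread; the timings are HYPOTHETICAL placeholders.
options = {
    "gzip": (85365026, 20.0, 3.0),
    "bzip2": (76221762, 60.0, 25.0),
}

for link in (100_000, 10_000_000):  # bytes per second
    print(link, faster_option(options, link))
```

With these placeholder timings the slow link favours bzip2 and the fast link favours gzip, which is the "it all depends on your transfer speed" point in numbers.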
Paul, my message was about using lowercase. (This reduces the alphabet of the file and it will pack better with any compressor, even RLE. :smile:)
And I don't think anyone will argue that [FONT=Courier New]tr[/FONT] takes any appreciable time. Everything else was already discussed before, so it was not my intention to add any oil to that fire. |
[QUOTE=Batalov;171047]Paul, my message was about using lowercase.[/QUOTE]
What's the deal with uppercase in reln files, anyway? I mean, is the siever trying to tell us that some primes are WAY TOO IMPORTANT for lowercase? And airplane food: What's the deal with airplane food? Don't they feed the jets enough when they're on the ground? |
I am pretty sure that there was never any deal with it, just indifference.
Printing lowercase is a side-effect of mpz_out_str (which can print in up to base-62 (or 64?) system, in which case it would have used all digits and letters); for OBASE 16, it prints digits and lowercase. The rest of the primes are printf'ed with "%X", so they are uppercase. I've spoken with Jason (just in case I've missed something) and I've checked both codebases - nothing there assumes upper-lower-case differences. Ha, just checked another thing: the line siever in msieve printfs with "%x", so all is lowercase for line-sieved rels. There is no reason to print in hex, either (I bet pure decimals might compress better!), but it's now well used, so no reason to cha[SIZE=1](lle)[/SIZE]nge this convention. 5% better compression [I]for free[/I] sounds like a good idea to me. Everything else has its pros and cons, and I am not talking about anything else. |
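The "%X" versus "%x" difference is easy to reproduce. Python's % formatting follows the same conversion letters as C's printf, so a minimal sketch (function names are invented here; 1000003 is just a convenient prime with letters in its hex form) shows why lowercasing is harmless to anything that parses the hex back:

```python
def siever_style(p: int) -> str:
    """Uppercase hex, as printf("%X") would print it."""
    return "%X" % p


def line_siever_style(p: int) -> str:
    """Lowercase hex, as printf("%x") would print it."""
    return "%x" % p


prime = 1000003  # example value only
upper = siever_style(prime)
lower = line_siever_style(prime)
print(upper, lower)

# Both spellings parse back to the same integer.
same_value = int(upper, 16) == int(lower, 16) == prime
print(same_value)
```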
81-90M is finished. I was quite surprised by the hint from Batalov, and indeed: 3.0% savings from the tolower conversion before applying bzip2. Didn't do "best" bzip2 (option -9) though because I don't want to stress the cluster front-end server.
80-90M A+R took 10.47M sec on dualcore Opterons @2GHz. |
Reserving 60-70M.
|