mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)
-   -   2^821-1 (https://www.mersenneforum.org/showthread.php?t=10267)

fivemack 2008-05-09 08:34

2^821-1
 
This is the [i]mersenne[/i]forum factoring forum, and we're demonstrably able to do quite large SNFS jobs, so I've reserved the smallest unfactored Mersenne number.

I'm afraid this will be perhaps 30% more work than 3^512+1 was; the SNFS difficulty is only three digits more, but the yield of relations does not seem as good. As for 3^512+1, use gnfs-lasieve4I15e.

[code]
n: 1061244591507170837476394518295406773413033058567971362754037707602701554995003142600106989606370443665400533464086399877589805021812659427420956337676840056998663356824134600362050233184172947122074652274833
skew: 1
c6: 1
c0: -2
Y1: -1
Y0: 174224571863520493293247799005065324265472
lpbr: 31
lpba: 31
mfbr: 62
mfba: 62
rlambda: 2.6
alambda: 2.6
rlim: 80000000
alim: 100000000
[/code]

For this polynomial both rational and algebraic special-Q make sense (gnfs-lasieve4I15e -a or -r); if you're searching below Q=100 million, and that's where I'd recommend sieving, some versions of gnfs-lasieve will complain that Q is less than the rlim or alim value; reduce that value to equal the smallest Q.

One side for a range of a million around Q=10^8 will take about eight days on one Core2/2400 core, and produce about 2.4 million relations, so my estimate is that we'll need 40 to 50 MQ, sieved on both sides. Make reservations in the 40M-100M range, and probably we'll have a respectable matrix before the range is fully covered.

[b]28/5[/b] Don't bother reserving in the 60-70 range; if we've filled 40-60 and 70-100, there will be way more than enough relations, and your computrons are best directed elsewhere

[b]Reservations[/b][code]
* andi47 40-41 R
**Anonymous 40-40.5 A 1417387
**fivemack 40.5-41A 1451939
SMH 41-45 A+R
(** 41-42 2887442A + 3179858R)
(** 42-43 2898467A + 3173150R)
(** 43-44 2871097A + 3148816R)
(** 44-45 2850341A + 3140860R)
** Anonymous 45-45.5 1575277R
* SMH 46-48 A+R
(** 46-47 2830314A)
(** 46-48 final group 4101765 total)
fivemack 48-60 A+R
(** 48-49 2807006A + 3075952R)
(** 49-50 2823541A + 3079696R)
(** 50-51 2964737A + 2878872R)
(** 51-52 2784766A + 3047309R)
(** 52-53 2755609A + 3026947R)
(** 53-54 2861528A + 2931863R)
(** 54-55 2721184A + 2998539R)
(** 55-56 2931885A + 2819319R)
(** 56-57 2909069A + 2815967R)
(** 57-58 2887275A + 2792191R)
(** 58-59 2708219A + 2968525R)
(** 59-60 2842829A + 2772860R)
**bdodson 70-100 A+R
(70- 80A 25430044)
(80- 90A 24789184)
(90-100A 24207823)
(70- 80R 27902110)
(80- 90R 26703849)
(90-100R 25559742)
**fivemack 100-101 2461307A+2417030R
[/code]

[code]
21/05/2008 21:23 110519789 relations; 102720875 unique; about 82378835 large ideals
24/05/2008 12:11 174262339 relations; 154245428 unique; about 91053621 large ideals
matrix is 14624280 x 14624528 (4005.8 MB) with weight 987631043 (67.53/col)
sparse part has weight 903858759 (61.80/col)
memory use 3962.7MB, estimated runtime (1 CPU) = 1230 hours, 51 days, 4-CPU estimate 30d
30/05/2008 08:37 200863237 relations; 172478095 unique; about 93018231 large ideals
matrix is 12152185 x 12152433 (3284.7 MB) with weight 808859623 (66.56/col)
sparse part has weight 739549610 (60.86/col)
memory use 3249.8MB, estimated runtime (1 CPU) = 794 hours (33 days), 4-CPU estimate 20d
07/06/2008 12:30 266645965 relations; 214596314 unique; about 96357373 large ideals
matrix is 10129794 x 10130042 (2798.1 MB) with weight 690519820 (68.17/col)
sparse part has weight 632200045 (62.41/col)
memory use 2989.1MB, estimated runtime (4 CPUs) = 379 hours (15.8 days)
[/code]

bdodson 2008-05-12 16:28

[QUOTE=fivemack;133105]This is the [i]mersenne[/i]forum factoring forum, and we're demonstrably able to do quite large SNFS jobs, so I've reserved the smallest unfactored Mersenne number.

I'm afraid this will be perhaps 30% more work than 3^512+1 was; the SNFS difficulty is only three digits more, but the yield of relations does not seem as good. As for 3^512+1, use gnfs-lasieve4I15e.

...

For this polynomial both rational and algebraic special-Q make sense (gnfs-lasieve4I15e -a or -r); you'll have to reduce one or the other limit if you're searching below Q=100 million.

[b]Reservations[/b]
fivemack 100-101 A+R[/QUOTE]

I'd be willing to take a 30M range on the R-side, if you don't mind
my using a slightly different version of Greg's binary:

[code]
-rwxr-xr-x ... 1129598 Apr 1 06:14 gnfs-10mo-lasieve4I15e
-rwxr-xr-x ... 1128456 Mar 28 19:54 gnfs-lasieve4I15e [/code]

where the March 28 version was what I used for the last of my 512+
relations. Among other changes, this version allows sieving by qs below
the factor base bound (so I wouldn't need to adjust/reduce the limit.
actually, the "one or the other" instruction may be a bit elliptical for
those with standard binaries, I'm not sure myself what is intended. Aren't
almost all of the qs on the R-side, and many/most of the qs on the A-side
under the factorbase limit(s)? sorry for the quibble.).

Do I correctly gather that there isn't a block of width 30M above 100M
(as 512 only ran up to 125M, and this one has both R and A, so the
max would be no larger than 125M?). If that's correct, and the binary
is OK, I'd take 70M-100M on the R-side. No special Lehigh/NFSNET
acknowledgement needed; this would be a toll-free mersenne number
contribution. The 30M would take under 24 hours here (unless other
users show up), and would (hopefully) give some useful sieving without
adding to my diskspace crises.

On that topic (if you don't mind), does msieve actually need
msieve.dat during all of the lanczos job? For example, with -nc2
and/or -ncr (so that msieve wouldn't directly head into the sqrt),
and a full week or more of lanczos left to run, would files including

[code]
-rw-r--r-- ... 446037712 May 12 08:24 msieve.dat.chk
-rw-r--r-- ... 304200968 Apr 25 04:33 msieve.dat.cyc
-rw-r--r-- ... 3740484004 Apr 25 04:34 msieve.dat.mat
-rw-r--r-- ... 429158932 Apr 25 01:14 msieve.dat.s [/code]
but not msieve.dat suffice? If so, I'd keep it on a different
disk (and be able to run all four matrices in /home, which is
faster than /vtmp).

Along the same line, I'm wondering what would be needed to be
saved to another disk to be able to pause or save an msieve computation
for, say, a week or two; perhaps while continuing to run msieve/lanczos.
Looks like only the chk files are being written, so keeping all of the above,
along with the initialization files (work.. .fb) and msieve.dat should
suffice? For example, suppose I were inattentive when restarting after
a system reboot (or crash), and hit -nc2 instead of -ncr. Without a
complete backup of all of the above, I believe I've seen msieve.dat.mat
over-written, after which an older saved .chk might be useless(?).

Apollogies for the muttering and mumbling; I'm looking forward
to having womack/mersenneforum finish this first-hole on the
Cunningham base-2 lists cleared. -Bruce

(ps - I heard a rumor that a certain 18M^2 msieve matrix has finished; and
that the smallest factor isn't quite small enough to be considered
an ecm miss.)

fivemack 2008-05-12 16:58

70-100M on the R side would be fantastic, thanks! The precise version of the lasieve binary is I think immaterial.

Yes, most of the Q I'm looking at are below the factor base bound.

As far as I know the msieve.dat file is not needed during the Lanczos stage; at least, -ncr restarts sufficiently quickly that I'd be rather surprised if it were doing anything with msieve.dat. I would be tempted just to copy files into a temp directory until msieve -ncr starts to work in that directory; the Lanczos stage accesses the disc only to write checkpoints.

jasonp 2008-05-12 17:11

[QUOTE=bdodson;133264]
On that topic (if you don't mind), does msieve actually need
msieve.dat during all of the lanczos job? For example, with -nc2
and/or -ncr (so that msieve wouldn't directly head into the sqrt),
and a full week or more of lanczos left to run, would files including

[code]
-rw-r--r-- ... 446037712 May 12 08:24 msieve.dat.chk
-rw-r--r-- ... 304200968 Apr 25 04:33 msieve.dat.cyc
-rw-r--r-- ... 3740484004 Apr 25 04:34 msieve.dat.mat
-rw-r--r-- ... 429158932 Apr 25 01:14 msieve.dat.s [/code]
but not msieve.dat suffice? If so, I'd keep it on a different
disk (and be able to run all four matrices in /home, which is
faster than /vtmp).
[/QUOTE]
Once the matrix is built, only the .chk and .mat files are needed to restart a paused matrix solve. You are correct that running -nc2 a second time is not guaranteed to generate the same matrix file, and if it does not then an old checkpoint file will not be able to restart properly. You still need to restart with '-s <dat_file_name>' because this provides the filename prefix to use, but the .dat file itself should not be needed.

The filtering requires only the .dat file, the linear algebra requires the .dat and .cyc files to generate the matrix, restarting the linear algebra requires only the .mat and .chk files, and the square root requires the .dat, .cyc, and the .dep file that is the output from the LA.
[QUOTE]
(ps - I heard a rumor that a certain 18M^2 msieve matrix has finished; and
that the smallest factor isn't quite small enough to be considered
an ecm miss.)[/QUOTE]
So coy, sheesh :)

smh 2008-05-14 11:03

Do you have a place to upload the relations?

What file name format should i use?

fivemack 2008-05-14 12:11

Please use names like 2-821.41M-42M.r.gz or 2-821.70M-80M.a.gz, and upload the relations by

[code]
ftp chiark.greenend.org.uk
user anonymous, password {your email address}
cd special/twomack-relations/aarhus
binary
put 2-821.41M-42M.r.gz
[/code]

There ought to be plenty of room on the drive, and there will be even more when I've finished factoring 3^512+1 and stop feeling I need to keep offsite backups of the relations.

mdettweiler 2008-05-14 15:01

I'm interested in helping out here a little bit, but can't get gnfs-lasieve to build on 32-bit Linux (I'm not very experienced in the area of compiling software from source). Can somebody please point me in the general direction of gnfs-lasieveI14e or gnfs-lasieveI15e compiled for 32-bit Linux?

If that doesn't pan out, I can always use the precompiled Windows version through Wine (since Wine usually doesn't have much of an impact, if any, on apps such as this), though that's a bit of a messy workaround and I'd rather avoid it if possible. :smile:

Thanks in advance! :smile:

mdettweiler 2008-05-14 17:04

Well, I just ran a benchmark (using the info provided in the M2376 parameters thread) with gnfs-lasieve4I14e.exe through Wine on my system (Core 2 Duo 2.2Ghz running Ubuntu 32-bit), and the benchmark timings looked competitive, so I think running it through Wine will do just fine. :smile: Of course, it would still be optimal to use a native Linux binary, so any pointers on where to get that are still welcome. :smile:

I'll reserve 40-41 on the A side to get acquainted with how long these things will take on my CPU. :smile:

mdettweiler 2008-05-14 17:16

Oh, BTW, I forgot to ask this: if I need to stop gnfs-lasieve4I14.exe for whatever reason, how do I make it pick up where if left off when I restart?

fivemack 2008-05-14 17:24

See [url]http://www.mersenneforum.org/showthread.php?t=9772:[/url]

'If your computer crashes mid-job, look at the last hex number on the last full line of the output file '55.56', convert it to decimal, and use that as the '-f' when you start again; subtract from '-c' appropriately so that your range ends in the right place.'

Though if you're sieving on the rational side (which you're not, at least this time), it's the last hex number [i]before the colon in the middle[/i] that you need:

[code]
12571943595,90442:da66c29,17455,55543,7,2B,47,3,3,B,B,2625A01:11c54b27,84f9ac1,2A2D7,34565,67553,12B7B9,19CF,148D,5,5,1F,1F3,2,2
^^^^^^^ this one
[/code]

mdettweiler 2008-05-14 17:34

[quote=fivemack;133418]See [URL]http://www.mersenneforum.org/showthread.php?t=9772:[/URL]

'If your computer crashes mid-job, look at the last hex number on the last full line of the output file '55.56', convert it to decimal, and use that as the '-f' when you start again; subtract from '-c' appropriately so that your range ends in the right place.'

Though if you're sieving on the rational side (which you're not, at least this time), it's the last hex number [I]before the colon in the middle[/I] that you need:

[code]
12571943595,90442:da66c29,17455,55543,7,2B,47,3,3,B,B,2625A01:11c54b27,84f9ac1,2A2D7,34565,67553,12B7B9,19CF,148D,5,5,1F,1F3,2,2
^^^^^^^ this one
[/code][/quote]
Okay, that makes sense. Just to make sure I've got this straight, given the following line (the last line in my relations file as of ~half a minute ago):
[code]-22979519,7835560:5682ccd3,28DB,30BF,2DBF3,4D44D,4E36D3,B3DAC9,11,C1,265:6462dc37,30545ad9,5C39,6D3F,9D61,18A91,26261F9[/code]
....I'm assuming the number I'd be looking for would be 26261F9, which, converted to decimal, would be 40002041 (which was roughly the q value that it was at at the time)?

Then, I'm also assuming I'd have to specify a new output file when I restart, or else it would overwrite the original relations file (I know because I tried it when I was only a few minutes into the range and didn't have much work to lose)--then I'd just concatenate all the relations files when the range is done, gzip them up, and upload them?

fivemack 2008-05-14 18:16

Yes, that's right. You do need to specify a new output filename - after one embarrassing incident, I hacked the gnfs-lasieve that I use at home to fail if the output file exists, and there's at least one version which conveniently uses as default filename {polynomial file}.lasieve-{side}.{start}-{end}.

You're right: concatenate the relations files, gzip and upload when you're finished. Good luck with the sieving!

bdodson 2008-05-14 18:22

[QUOTE=fivemack;133268]70-100M on the R side would be fantastic, thanks! [/QUOTE]

Let's make that 70M-100M A+R; adjusting my scripts for writing condor
submit files to use "-a" wasn't as bad as I thought. -Bruce

(I'm alternating 10M ranges for 821- and 949+, 70M-80M "R" was just
about done.)

mdettweiler 2008-05-14 20:22

I was just thinking: should I be using gnfs-lasieve4I15.exe instead of gnfs-lasieve4I14.exe? I know I15 produces more relations in a given range, but would it make my range take longer?

bsquared 2008-05-14 20:26

[quote=Anonymous;133428]I was just thinking: should I be using gnfs-lasieve4I15.exe instead of gnfs-lasieve4I14.exe? I know I15 produces more relations in a given range, but would it make my range take longer?[/quote]

Yes, roughly twice as long. I believe the use of 15e has been requested, yes.

mdettweiler 2008-05-14 20:31

[quote=bsquared;133429]Yes, roughly twice as long. I believe the use of 15e has been requested, yes.[/quote]
Hmm...okay. I'm at about 2.5% through my range right now; I'll switch over to 15e shortly.

mdettweiler 2008-05-14 21:40

Considering the 2x slowdown from using 15e for my range, I think I'll only do 40-40.5 instead of 40-41. This should bring the time to complete the range back in line with what I had originally planned for. :smile:

smh 2008-05-16 13:17

Extended my own reservation to 45M on both sides

Andi47 2008-05-17 10:04

Hmmmm... Where did I say that I reduce my reserved range from 40-41M R to 40-40.5M R?

I still plan to do 40-41M on the R side.

Edit: @ fivemack: I think that you have entered the reducing of Anonymous' range in post #17 into the wrong line.

[b]fivemack[/b]: Yes, I did, sorry; fixed now

mdettweiler 2008-05-22 18:18

40M-40.5M on the A side is complete, now uploading. :smile:

Edit: Upload complete.

mdettweiler 2008-05-24 04:22

I'll do 45M-45.5M on the R side, if nobody minds me having to delay starting it for a couple of days. :smile:

fivemack 2008-05-24 09:25

Dates for the matrix job
 
Thanks to the enormous resources that bdodson has applied to sieving, we're doing pretty well with relations; I'm doing a run at the moment which I expect to generate a matrix. Further relations will reduce the size of the matrix, and I suspect reduce the run-time by more than the time it takes to produce the relations, so we should definitely keep sieving for the next couple of weeks.

I intend to start the matrix job on or about June 7th, so please don't make reservations that you are not sure you will be able to complete by then; I'll do a final call for uploads of relations from unfinished ranges on June 5th.

mdettweiler 2008-05-24 23:28

It looks like there's an error in the reservations table: I reserved 45-45.5 on the R side, not the A side. (However, it really doesn't make much difference to me which side I search, so if it would make things easier I'll just as easily do the A side instead. :smile:)

[b]fivemack[/b]: sorry, fixed the reservations table

smh 2008-05-28 07:06

I'll reserve 46-47 A+R

smh 2008-06-02 17:41

I'll continue 47 - 48 A+R and will see how far i get.

fivemack 2008-06-04 08:27

36 hours to end of sieving
 
Just a reminder that sieving should be finished by tomorrow evening my time (IE about 36 hours from when I post this). We ought to have masses of relations and a nice small matrix, with hope for factors by the start of July.

mdettweiler 2008-06-05 00:25

45M-45.5M R complete, relations uploading as I type. :smile:

Andi47 2008-06-05 05:03

I just started the upload of my relations.

smh 2008-06-07 08:17

Just wondering, did you manage to build a matrix, and what's the ETA?

fivemack 2008-06-08 17:22

[QUOTE=smh;135383]Just wondering, did you manage to build a matrix, and what's the ETA?[/QUOTE]

Yes, I've got a decent matrix (just over 10M on a side), ETA is evening of 24 June. The first half of the oversieving saved significantly more time than it took; the second half perhaps not so much.

The matrix fits very happily on the 4G quad-core:
[code]
nfsslave2@sheep:~$ ps -F 14087
UID PID PPID C SZ RSS PSR STIME TTY STAT TIME CMD
1000 14087 14046 99 886339 3512856 1 18:20 pts/1 Rl 10:53 ../msieve-1.34/msieve -v -ncr -t 4
[/code]

fivemack 2008-06-25 07:15

It's done
 
The cofactor of 2^821-1 splits as
[code]
25348330589061322436059499894718930577798031880081867566547223 * 41866449065688547732334670918138300495432254381696818748989283844419920572286356810694501570678523880104345359476534651949605873238987445628114071
[/code]

The smaller factor is a P62; the 10129794 x 10130042 matrix (2798.1 MB) with weight 690519820 (68.17/col) sparse part has weight 632200045 (62.41/col) took 374:36:58 to run on a Q6600 quad-core with 4G memory, and the factors were found on the first dependency after seven hours.

Sieving effort was Q=40M to 60M and 70M to 100M, both algebraic and rational sides; from my records it was about 220 CPU-hours per million Q per side, so just under three core2/2400-years for the whole job.

R.D. Silverman 2008-06-25 12:08

[QUOTE=fivemack;136569]The cofactor of 2^821-1 splits as 25348330589061322436059499894718930577798031880081867566547223 * 41866449065688547732334670918138300495432254381696818748989283844419920572286356810694501570678523880104345359476534651949605873238987445628114071

The smaller factor is a P62; the 10129794 x 10130042 matrix (2798.1 MB) with weight 690519820 (68.17/col) sparse part has weight 632200045 (62.41/col) took 374:36:58 to run on a Q6600 quad-core with 4G memory, and the factors were found on the first dependency after seven hours.

Sieving effort was Q=40M to 60M and 70M to 100M, both algebraic and rational sides; from my records it was about 220 CPU-hours per million Q per side, so just under three core2/2400-years for the whole job.[/QUOTE]

Nice work.

Allow me to ask: What were the factor base bounds? What was
the size of the sieve region per Q?
Also, I presume the 220 hrs/10^6 Q represents data from multiple
machines? I am currently doing 2,1538M and one of my machines processes
a single q in about 10.3 seconds. --> 2860 hours per million q, not 220.

fivemack 2008-06-25 13:48

[QUOTE=R.D. Silverman;136576]Nice work.

Allow me to ask: What were the factor base bounds? What was
the size of the sieve region per Q?
[/quote]

Factor base bounds are min(8e7, Qrat) on the rational side, min(10e7, Qalg) on the algebraic side. Sieve region is 2^15 * 2^14.

[quote]
Also, I presume the 220 hrs/10^6 Q represents data from multiple
machines? I am currently doing 2,1538M and one of my machines processes
a single q in about 10.3 seconds. --> 2860 hours per million q, not 220.[/QUOTE]

Ah, I say '10^6 Q' to mean 'Q between N and N+10^6', which corresponds to about 55,000 usable Q values. In the 220 hours, that means about 14.4 seconds per ideal.

R.D. Silverman 2008-06-25 14:12

[QUOTE=fivemack;136581]Factor base bounds are min(8e7, Qrat) on the rational side, min(10e7, Qalg) on the algebraic side. Sieve region is 2^15 * 2^14.



Ah, I say '10^6 Q' to mean 'Q between N and N+10^6', which corresponds to about 55,000 usable Q values. In the 220 hours, that means about 14.4 seconds per ideal.[/QUOTE]

Thanks. How much real memory did the siever require for these
parameters?

fivemack 2008-06-25 15:57

[QUOTE=R.D. Silverman;136582]Thanks. How much real memory did the siever require for these
parameters?[/QUOTE]

It uses about 560MB of virtual memory, 240M real memory.


All times are UTC. The time now is 10:29.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.