
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   NFSNET Discussion (https://www.mersenneforum.org/forumdisplay.php?f=17)
-   -   Post processing for 2,757- (https://www.mersenneforum.org/showthread.php?t=1368)

xilman 2003-11-05 18:04

Post processing for 2,757-
 
A brief update on how we are getting on with the factorization of 2,757-, aka M757, since the sieving finished just over three weeks ago.

The raw data came to around 4 gigabytes, so manipulating files of that sort of size was not trivial.

Fifteen days were spent separating the useful data from the useless and then boiling down the former. The merging stage reduced what would have been a matrix with 40 million rows and columns to one which has "only" 7,522,873 rows and 7,525,487 columns.
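
To give a flavour of what the filtering involves, here is a toy Python sketch of one standard step, singleton removal. It is not the code actually used, and the relation format is invented purely for illustration: a prime that appears in only one relation can never pair up in the matrix, so that relation is useless; discarding it can create new singletons, so the pass is repeated until nothing changes.

[CODE]from collections import Counter

def remove_singletons(relations):
    """relations: a list of sets of primes -- a made-up representation,
    used here only to illustrate the idea."""
    while True:
        counts = Counter(p for rel in relations for p in rel)
        kept = [rel for rel in relations if all(counts[p] > 1 for p in rel)]
        if len(kept) == len(relations):   # nothing removed: stable, we are done
            return kept
        relations = kept                  # removals may create new singletons

# The relation containing 11 is dropped first (11 is a singleton),
# which then makes 7 a singleton and removes a second relation.
rels = [{2, 3}, {2, 5}, {3, 5}, {2, 7, 11}, {3, 7}]
print(remove_singletons(rels))            # [{2, 3}, {2, 5}, {3, 5}]
[/CODE]

The real filtering also removes duplicate relations and, roughly speaking, does the "merging" mentioned above: relations sharing a prime are combined so that the prime's column disappears, which is what shrinks 40 million rows down to 7.5 million.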

The linear algebra to find linear dependencies in this 7.5M matrix began on 28th October and is still going, 8 days later. Progress is mostly steady (we had a few interruptions for various reasons which cost us about 8 hours of running time in total) and is expected to take another three weeks or so.

The linear algebra alone will take over a CPU-year, though this is, of course, only a fraction of the resources the sieving used.
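
The "linear dependencies" we are after are just non-empty sets of matrix rows that sum to zero over GF(2). The real run uses an iterative method designed for enormous sparse matrices; the toy Python sketch below (plain Gaussian elimination on bit-packed rows, nothing like the production code) is only meant to show what a dependency is.

[CODE]def find_dependency(rows):
    """rows: list of integers, bit j set = a 1 in column j.
    Returns a set of row indices whose XOR is zero, or None if independent."""
    basis = {}                      # pivot bit -> (reduced row, contributing indices)
    for i, row in enumerate(rows):
        combo = {i}
        while row:
            pivot = row & -row      # lowest set bit of the current row
            if pivot not in basis:
                basis[pivot] = (row, combo)
                break
            b_row, b_combo = basis[pivot]
            row ^= b_row            # eliminate the pivot bit
            combo ^= b_combo        # track which original rows are involved
        else:
            return combo            # row reduced to zero: these rows XOR to 0
    return None

# Four rows over three columns; rows 0, 1 and 2 XOR to zero.
rows = [0b011, 0b110, 0b101, 0b111]
print(find_dependency(rows))        # {0, 1, 2}
[/CODE]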

When the linear algebra has finished, it should be a matter of a few hours before we know the factors.
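
The reason the end-game is quick: each dependency, after the square-root step, gives integers x and y with x^2 == y^2 (mod N), and gcd(x - y, N) is then a non-trivial factor roughly half the time. A toy numerical illustration (tiny N, nothing to do with M757):

[CODE]from math import gcd

N = 8051                  # toy composite, nothing to do with M757
x, y = 90, 7              # 90^2 - 7^2 = 8051, so x^2 == y^2 (mod N)
assert (x * x - y * y) % N == 0

p = gcd(x - y, N)         # 83
q = gcd(x + y, N)         # 97
print(p, q, p * q == N)   # 83 97 True
[/CODE]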


Paul

ET_ 2003-11-05 20:32

Just out of curiosity, may I ask how you can manage data of that size? I mean, what software and resources are needed to handle that huge mass of data?

Luigi

xilman 2003-11-06 12:21

[QUOTE][i]Originally posted by ET_ [/i]
[B]Just out of curiosity, may I ask how you can manage data of that size? I mean, what software and resources are needed to handle that huge mass of data?

Luigi [/B][/QUOTE]

The data was brought from Austin to Cambridge by sftp over the open Internet. Richard has ADSL and 24/7 connectivity; Microsoft Research has adequate capacity ;-) It was transferred in several overnight sessions.

My workstation is a fairly ordinary 2.5GHz box with a 40G disk and a rather larger than average 1G of RAM. It runs XP Pro and so has the NTFS filesystem which supports files much larger than the ones needed here. The 40G disk is a bit limiting and I'll be installing a commodity 160G scratch disk next week. For now, I've been keeping stuff compressed when not needed and I've been dumping files onto other machines for temporary storage.

One stage of the postprocessing needs a large amount of memory, so I used a cluster node. Each node has 2G of RAM and the filter run used 1900M of memory, so it only just fitted.

Summary: the only essentials are a filesystem that supports large files, a few dozen gigabytes of disk and a decent amount of RAM.
Oh, and a degree of patience 8-)

The large-memory filter run could have been avoided, but at the cost of some inefficiency later on, so the entire post-processing up to, but excluding, the linear algebra could have been performed on a commodity PC upgraded to 1G of RAM.


Paul

ET_ 2003-11-06 14:23

:surprised

Thanks Paul!

Luigi

