During Lanczos, every thread has some idle time, so the core temperatures will be quite a bit lower than during LLR. Some other subsystems (memory and memory controller) will be tortured harder. If there's a separate sensor on CPU NB, it will show some heat. As torture goes, these tests are complementary.
|
[QUOTE=Batalov;308016]During Lanczos, every thread has some idle time, so the core temperatures will be quite a bit lower than during LLR. Some other subsystems (memory and memory controller) will be tortured harder. If there's a separate sensor on CPU NB, it will show some heat. As torture goes, these tests are complementary.[/QUOTE]
The temps are lower, but when overclocked the CPU fails faster than with LLR. Do those machines have the ability to be overclocked, or do they run only at stock frequency? Those are blades, etc... |
I would like to see a company that overclocks servers! :rakes:
|
[QUOTE=Batalov;308018]I would like to see a company that overclocks servers! :rakes:[/QUOTE]
I know, it was an ignorant question. |
Note that MPI msieve needs careful tuning when run on a large SMP machine, because it's easy for the OS to not correctly balance the load across all the cores, and easy for the OS to shuffle MPI processes around after they have allocated their memory. frmky has reported that you even have to disable cron processes; fivemack has [url="http://fivemack.livejournal.com/226160.html"]a post[/url] on what he had to do.
|
[QUOTE=jrk;307786]Reserving 44371_43_minus1[/QUOTE]
Done. Factors are in the OPN factors thread. |
I've just started 389_95_minus1 to test how MPI works on a dual-CPU Xeon E5620 server. For a 3.6M^2 matrix I've got:
ETA 19hrs - MPI version (4x4)
ETA 23hrs - 16 threads
ETA 28hrs - 8 threads
ETA 31hrs - 4 threads
When the MPI version works, the server seems *very busy*. |
[QUOTE=unconnected;308119]I've just started 389_95_minus1 to test how MPI works on Dual CPU server Xeon E5620. For matrix 3.6M^2 I've got:
ETA 19hrs - MPI version (4x4)
ETA 23hrs - 16 threads
ETA 28hrs - 8 threads
ETA 31hrs - 4 threads
When the MPI version works, the server seems *very busy*.[/QUOTE] What about running two MPI jobs with the same number of threads each? The E5620 is a 4C/8T processor; when you say threads, do you mean threads or cores? You've got a dual-socket system, so you have 8C/16T... |
When I say threads I mean threads, not cores.
BTW, postprocessing is completed: [CODE]prp75 factor: 268853625856147421086944544035343580437649850239242201121922967793590837891
prp105 factor: 586175223206963889564997276172745574126823744066460572905036034158781676507405862687249466261669679801371[/CODE] |
[QUOTE=unconnected;308119]
ETA 28hrs - 8 threads
ETA 31hrs - 4 threads [/QUOTE] With your results we can say that, in terms of energy efficiency, it's better to dedicate 4 threads to each number: when you double the number of threads, the ETA doesn't halve. I conclude it's better to run factorizations in parallel on a dual-CPU Xeon E5620 server. |
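Those numbers make the scaling, or lack of it, easy to see; a quick sketch using the ETAs quoted above (the thread counts and hours come straight from the post, everything else is just arithmetic):

```python
# LA ETAs (hours) reported above for the 3.6M^2 matrix on the dual Xeon E5620
etas = {4: 31, 8: 28, 16: 23}

base_threads, base_eta = 4, etas[4]
for threads, eta in sorted(etas.items()):
    speedup = base_eta / eta            # relative to the 4-thread run
    ideal = threads / base_threads      # perfect linear scaling
    print(f"{threads:2d} threads: {speedup:.2f}x speedup, "
          f"{100 * speedup / ideal:.0f}% parallel efficiency")
```

Quadrupling the threads only buys about a 1.35x speedup, which is why two independent 4-thread jobs look attractive on paper, subject to the memory-placement caveats raised elsewhere in the thread.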
pinhodecarlos: sparse linear algebra is memory-constrained, and the performance of running two jobs at once is going to be critically dependent on details of memory placement ... I don't know whether the -t4 job is going to allocate on both processors and pull stuff across the (very fast) inter-processor interconnect.
I haven't got the intuition to say anything about this without running experiments, and I haven't got a dual-Xeon to experiment with any more. |
[QUOTE=fivemack;308128]pinhodecarlos: sparse linear algebra is memory-constrained, and the performance of running two jobs at once is going to be critically dependent on details of memory placement ... I don't know whether the -t4 job is going to allocate on both processors and pull stuff across the (very fast) inter-processor interconnect.
I haven't got the intuition to say anything about this without running experiments, and I haven't got a dual-Xeon to experiment with any more.[/QUOTE] Dmitry can experiment for us. |
[QUOTE=fivemack;308128]I haven't got the intuition to say anything about this without running experiments, and I haven't got a dual-Xeon to experiment with any more.[/QUOTE]That need not be a problem. I have two such and we may be able to come to a mutually beneficial arrangement.
You wouldn't be the first MersenneForum member to have ssh access to one or more of my machines for software development. Paul |
Lionel and post-processing helpers,
Be aware that there will be a call to arms for the RSA project for SETI.USA team members. [quote=Fire$torm] This is a friendly admin notice. I am sending out the team mass email for the Sept. SIMAP challenge and included a call to arms for the RSA project. Hopefully we should see additional members on this project soon. [/quote] Fire$torm is the administrator of the SETI.USA forum. They will go for first place, and I suppose there will be a fight for it with team Sicituradastra. I hope there will be plenty of work available and enough post-processing helpers. Tomorrow I will clean up my computer so it can be another one helping out with the post-processing phase. Carlos EDIT: Just saw the "RSALS shutting down at the end of August, please migrate to NFS@Home..." thread on the RSALS forum, so I think it would be better for someone to post a message on the SETI.USA forum telling them to migrate to NFS@Home. |
Indeed, SETI.USA should flee RSALS, and attach their clients to NFS@Home instead.
At the time of this writing, the RSALS server's disk is effectively overcommitted - I mean, it has ~20e9 bytes free, but the results for the WUs in progress will take up more than this. And that's [i]after[/i] I removed the raw relation sets for 389_95_minus1 and 44371_43_minus1. I'm going to let RSALS starve, and fill in more numbers on the NFS@Home side. |
You would want to unqueue C154_4788_5053 - it has already been factored by someone.
|
[QUOTE=jasonp;308027]Note that MPI msieve needs careful tuning when run on a large SMP machine, because it's easy for the OS to not correctly balance the load across all the cores, and easy for the OS to shuffle MPI processes around after they have allocated their memory. frmky has reported that you even have to disable cron processes; fivemack has [url="http://fivemack.livejournal.com/226160.html"]a post[/url] on what he had to do.[/QUOTE]
Memory on our cluster is by nodes; 64Gb/node, with two 8-core Opteron chips/node. As a practical matter, asking for four nodes seems to work best with our scheduler ("PBS/Torque/Maui" with playfair priority), and I'm using 64 cores on an 8x8 grid. So all of the cores are busy, and I don't see any change over time in the performance. What I do see is occasional restarts (with 12-22hr runs) that get an especially bad distribution of the 64 tasks. Here's a good distribution [code]
Thu Aug 16 09:57:46 2012  initialized process (0,0) of 8 x 8 grid
Thu Aug 16 09:57:55 2012  matrix starts at (0, 0)
Thu Aug 16 09:57:55 2012  matrix is 4887089 x 4314690 (365.5 MB) with weight 133662597 (30.98/col)
Thu Aug 16 09:57:56 2012  sparse part has weight 52660698 (12.20/col)
...
Thu Aug 16 09:57:59 2012  matrix is 4887041 x 4314690 (318.5 MB) with weight 58899832 (13.65/col)
Thu Aug 16 09:57:59 2012  sparse part has weight 40345266 ( 9.35/col)
Thu Aug 16 09:57:59 2012  using block size 262144 for processor cache size 10240 kB
Thu Aug 16 09:58:01 2012  commencing Lanczos iteration
Thu Aug 16 09:58:01 2012  memory use: 494.4 MB
Thu Aug 16 09:58:16 2012  restarting at iteration 417488 (dim = 26400057)
Thu Aug 16 09:59:08 2012  linear algebra at 67.5%, ETA 120h52m
Thu Aug 16 09:59:24 2012  checkpointing every 110000 dimensions
[/code] for one of the 64 submatrices of our current [code]
Sat Jul 21 12:36:44 2012  matrix is 39095900 x 39096100 (11555.1 MB) with weight 3371381266 (86.23/col)
Sat Jul 21 12:36:44 2012  sparse part has weight 2638144630 (67.48/col)
[/code] So here are the timings at this week's restarts [code]
mpi00:Wed Aug  8 08:29:51 2012  linear algebra at 44.7%, ETA 205h42m
mpi00:Wed Aug  8 21:11:02 2012  linear algebra at 47.8%, ETA 235h53m
mpi00:Fri Aug 10 09:07:08 2012  linear algebra at 53.0%, ETA 512h18m
mpi00:Fri Aug 10 12:41:58 2012  linear algebra at 53.2%, ETA 369h53m
---
mpi00:Sat Aug 11 13:29:53 2012  linear algebra at 54.6%, ETA 169h37m
mpi00:Mon Aug 13 09:41:02 2012  linear algebra at 57.1%, ETA 159h50m
mpi00:Mon Aug 13 20:07:42 2012  linear algebra at 59.4%, ETA 151h14m
---
mpi00:Wed Aug 15 06:07:29 2012  linear algebra at 65.0%, ETA 130h12m
mpi00:Thu Aug 16 06:08:58 2012  linear algebra at 67.2%, ETA 351h41m
mpi00:Thu Aug 16 09:59:08 2012  linear algebra at 67.5%, ETA 120h52m
[/code] The short ETAs are mostly from when the tasks were scheduled on nodes 14-17, while some of the worst were on nodes 18-21. But the three most recent are all on 14-17 (with node 17 as head node), so this sporadic bad loading doesn't seem to depend on the hardware (these are all ib nodes). I've taken to killing restarts with bad timings, and have so far gotten a good timing from the subsequent restart (the timings for new runs are 12hr - 22hr, depending upon scheduling, so the progress on % isn't uniform). This is our second large matrix (the other 25M^2), with a binary from when our sysadmins were having binaries run with hydra-mpirun. They're saying I should switch to compiling with openmpi (not sure whether it is 1.6 ...), so I'd be interested to hear what we ought to be watching for with the new binary. -Bruce (as in Batalov+Dodson) PS - It is easy to see the difference between good/bad/terrible: [code]
Thu Aug 16 09:59:24 2012  checkpointing every 110000 dimensions  (16 restarts)
Wed Aug  8 21:11:21 2012  checkpointing every 90000 dimensions
Sun Aug  5 09:19:55 2012  checkpointing every 80000 dimensions  (1 restart, each)
Tue Jul 31 23:31:53 2012  checkpointing every 70000 dimensions  (6 restarts)
Sun Jul 22 04:34:32 2012  checkpointing every 60000 dimensions
Fri Aug 10 12:42:30 2012  checkpointing every 50000 dimensions
Fri Aug 10 09:07:58 2012  checkpointing every 40000 dimensions  (2 restarts, each)
[/code] These are msieve's estimate of the number of Lanczos iterations/hour; the good restarts are doing twice as many Lanczos iterations per hour as the terrible ones. |
1 Attachment(s)
Sieving a GNFS 171 task using 30-bit LPs (due to space constraints; otherwise, I'd obviously have used 31-bit LPs!) with the 14e siever is [i]officially[/i] a bad idea :smile:
The output of remdups4 -v is attached. Summary: [code]Found 91563793 unique, 48348274 duplicate (34.6% of total), and 155195 bad relations.[/code] Getting rid of all of those duplicates, and recompressing the result with pbzip2, saves more than 4 GB of disk space... |
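As a sanity check, the 34.6% can be reproduced from the counts in that summary line; it appears remdups4 takes the percentage over unique + duplicate relations, with the bad ones excluded (an assumption inferred from the numbers, not from remdups4's source):

```python
# Counts from the remdups4 summary above
unique, dup, bad = 91563793, 48348274, 155195

pct_all = 100 * dup / (unique + dup + bad)   # over everything read
pct_good = 100 * dup / (unique + dup)        # bad relations excluded
print(f"{pct_all:.1f}% of all relations, {pct_good:.1f}% of good ones")
```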
C171
One of the problems might be the low starting point for special-q. Though it may produce a better yield, one is apt to accumulate more duplicates in the lower ranges. I would think starting around 20-30M might have been a better choice. Of course, I am always subject to being corrected by a higher authority. :-)
|
[QUOTE=Dubslow;307744]Okay, I guess I'll pitch in and do GW_6_301.[/QUOTE]
Sorry it took so long, I've had hardware issues and then I spent two days in Windows getting the entire summer's worth of gaming done in a few days. :razz: [code]PRP54 = 167234023851315043627492602845770131366956631669151511
PRP182 = 35428216977629308709346384897858722957864896521236633920871150047660354226030990937812843100212728231266275403247901894323027521996535525361051619602035845364097644084622293809080053[/code] That was pretty close with the ECM, though perhaps not an ECM "miss" per se. RSALS [URL="http://boinc.unsads.com/rsals/crunching.php"]reports[/URL] ECM to 3t50. Edit: I'll do 160969_43_minus1 (it looks a bit closer to done than the one above it). |
GC_7_280 started
[code]matrix is 10175033 x 10175210 (3065.6 MB) with weight 907103337 (89.15/col)
sparse part has weight 691696737 (67.98/col)
saving the first 48 matrix rows for later
matrix is 10174985 x 10175210 (2932.0 MB) with weight 726345051 (71.38/col)
sparse part has weight 666845488 (65.54/col)
matrix includes 64 packed rows[/code] Factors should be available in a few days. The LA is taking 5.3G RAM and is running 6 threads on an AMD Phenom II X6 1090T clocked at 3.36GHz. Too early to say yet how long the LA will take, but I expect 5-8 days. Paul |
[QUOTE=Dubslow;308500]Sorry it took so long, I've had hardware issues and then I spent two days in Windows getting the entire summer's worth of gaming done in a few days. :razz:
[code]PRP54 = 167234023851315043627492602845770131366956631669151511 PRP182 = 35428216977629308709346384897858722957864896521236633920871150047660354226030990937812843100212728231266275403247901894323027521996535525361051619602035845364097644084622293809080053[/code] That was pretty close with the ECM, though perhaps not an ECM "miss" per se. RSALS [URL="http://boinc.unsads.com/rsals/crunching.php"]reports[/URL] ECM to 3t50. Edit: I'll do 160969_43_minus1 (it looks a bit closer to done than the one above it).[/QUOTE]Thanks for doing this one. Definitely not an ECM miss. The 2/9 rule suggests that a single t53 is enough, and even that would give an expected 30% chance of missing a p53. This one will be in the next GCW update at [url]http://www.leyland.vispa.com/numth/factorization/cullen_woodall/cw.html[/url] Paul |
677_91_minus1 complete
1 Attachment(s)
[CODE]prp56 factor: 21367187717338200706589294332636286671332986980768900097
prp127 factor: 2648243237696352591195326027983782094963915077860212822680189712377204660896309588047058756001799188630166145197247070133619597[/CODE] Log file attached for those interested. Jeff. |
[QUOTE=Jeff Gilchrist;308543]Log file attached for those interested.
Jeff.[/QUOTE] estimated available RAM is 1033941.6 MB?!?!
commencing Lanczos iteration (24 threads)
AMD? |
[QUOTE=pinhodecarlos;308544]AMD?[/QUOTE]
No, 4 x 8 core Intel Xeon. Jeff. |
GC_5_334
Slow start but it's running. 74h32m left in LA.
|
I'll take 1489_71_minus1 next.
|
Tomorrow I would like to start F1887. Lionel, please add the rest of the files, thank you.
|
C171_842592_8038 complete
1 Attachment(s)
Results for C171_842592_8038 are in:
[CODE]prp59 factor: 22550130823172946706007602585567533308467995403760668156907
prp112 factor: 9218104992628609560517410248497397495002686116297356646755578532548455870286214687962418659011768989939675457549[/CODE] Log attached. |
[QUOTE=pinhodecarlos;308802]Tomorrow I would like to start F1887. Lionel, please add the rest of the files, thank you.[/QUOTE]
How do I merge two dat files? |
The best you can do is probably to decompress the large file to a new file, and decompress the small file while appending the contents to the new file.
If msieve took relations in bzip2 format, you could decompress the small file and recompress it with (p)bzip2, then append the compressed file to the large file. |
F1887 files downloaded.
|
Serge is fond of pointing out that gzip'ed files can be concatenated together and will still decompress properly.
In windows, 'copy /b file1+file2+...+filen output_file' should work |
[quote]Serge is fond of pointing out that gzip'ed files can be concatenated together and will still decompress properly.[/quote]Indeed, RSALS has been using that property (shared by bzip2) for more than two years :smile:
But for F1887 and several other numbers, in order to save space on the server (routinely ~2 GB for a 30-bit LPs task), I froze a state of the .dat.gz file, ran remdups4 on it and pbzip2'ed the result, yielding a .dat.bz2 file, not directly usable by msieve. Later, more results returned by the clients get concatenated automatically to a .dat.gz file that restarts from zero. It's best for the post-processor to get both the .dat.bz2 file and the .dat.gz file, but they do not mix and match together directly :smile: |
LA for F1887 started. Lionel, if you want you can delete the main file to free up some space on the server.
Edit: Why do you keep the two main files on the server for 11_339_minus1? |
[quote]LA for F1887 started. Lionel, if you want you can delete the main file to free up some space on the server.[/quote]
Done. [quote]Edit: Why do you keep the two main files on the server for 11_339_minus1?[/quote] Because I forgot to remove one :smile: |
A quick note. I managed to merge the two files after a quick search on the forum on how to do it, but I decompressed both before merging. Sorry for the question. Anyway, I didn't know about concatenating gzip'ed files. Thank you.
|
[QUOTE=Dubslow;308500]I'll do 160969_43_minus1 (it looks a bit closer to done than the one above it).[/QUOTE]
Well... this is awkward. I used the relations file for this number as the test for [URL="http://www.mersenneforum.org/showthread.php?p=308528#post308528"]this post[/URL] about rel file format and compression. I had forgotten to unzip it, and I did that when the LA completed (it was zipped with bzip2). Unfortunately, as always, the un/compression caused some corruption. [code]commencing square root phase
reading relations for dependency 1
read 3407441 cycles
cycles contain 9373226 unique relations
error: relation 45392007 corrupt[/code] I manually looped over all 34 dependencies, and each time there was at least one corrupt relation. The simplest method to fix this would be to remdups the rel file, tossing the bad rels, and redo the LA, but that would take a while and a lot of compute time. A more efficient method would be to remdups the file to get the bad rels, fix them by hand, and then re-add them to the rel file in the right location. This would be rather difficult, and I don't know how to construct rels. To make matters worse, I still need to RMA my graphics card, and Linux won't boot without it (though Windows will), and I'm moving into college tomorrow, so my computer will be out of action anyway. As such, I'm linking all the relevant files here. [URL="http://dubslow.tk/random/screwup.tar.gz"]This[/URL] is a link to the .cyc, .dep, and .fb files (compressed, 174 MB); [URL="http://dubslow.tk/random/msieve.dat"]this[/URL] is a link to the uncompressed rel file (around 11.7 GB). (Edit: Of course, these links won't work tomorrow, but they should work overnight and after tomorrow.) |
[QUOTE=jasonp;308912]Serge is fond of pointing out that gzip'ed files can be concatenated together and will still decompress properly.
In windows, 'copy /b file1+file2+...+filen output_file' should work[/QUOTE] Yes, under the condition that all parts 'gzip -tv' without errors. (Otherwise there will be more hassle later with gzip repair.) The gzip standard was stream-ready by design; a gzip file (or a stream) is like a train of cars (proper chunks, each identified by a valid header record), so if we connect two trains the result is a "train" again. Small gzipped files are single cars, or a few. But if one train was chopped off (or otherwise broken), repairing the combined object will lose more parts: the first car of the second train will stick to the broken-off last car of the first train, and both will be discarded. (The gzip repair kit splits the train into single cars and throws away invalid ones; then the user is free to reuse the valid chunks. It is a bit of a pain to do but is a valuable life experience. Like a scar. ;-) ) |
GC_5_334 splits as ...
[CODE]prp66 = 450956348716753108838639638208075170303357174618993330771280648693
prp169 = 1063506952024717121188111286584965562004890575356949552987937630434522275071306932720481262782749988010907589767937519608529842432033670755953761378482617292207519668293[/CODE] |
I'll take care of 160969_43_minus1.
|
[QUOTE=RichD;308980]GC_5_334 splits as ...
[CODE]prp66 = 450956348716753108838639638208075170303357174618993330771280648693 prp169 = 1063506952024717121188111286584965562004890575356949552987937630434522275071306932720481262782749988010907589767937519608529842432033670755953761378482617292207519668293 [/CODE][/QUOTE]Thanks! Paul |
1489_71_minus1 splits as ...
[CODE]prp77 = 45896447589970773832931784986400490560268820308727411473025645680276882636169
prp146 = 27614776777951268487383853140458483138826979094458150355339368759958690289056222634422197634724899284008789933412583236893293328082409931239962079[/CODE]I'll start the download of 601_83_minus1 later today. |
Hey Xyzzy, did you manage to start L1378?
|
[QUOTE]Hey Xyzzy, did you manage to start L1378?[/QUOTE]We have Msieve compiled and the files downloaded. We hope to start on it in the next hour or so.
:max: |
I'll start L2225B a little later today.
|
11_348_plus1
Some nice chunky factors:
[CODE]prp105 factor: 508013567572801156672056299462377737806650679005620374628466587445783275204131411087667280100956328698633
prp112 factor: 4013868290969815790575330375162654481299708640903818986940628751032884242524268643362045008107973342892599738353[/CODE] |
Please reserve 5_492_plus1 for me. Thank you.
|
I'll postprocess 11_327_minus1
|
L1378, 8 threads on an i7 3770, ETA ~90 hours - sounds right?
We'd like to try more work if there is any available. We have four boxes open ATM. |
[QUOTE=Xyzzy;309120]L1378, 8 threads on an i7 3770, ETA ~90 hours - sounds right?
We'd like to try more work if there is any available. We have four boxes open ATM.[/QUOTE] Something is not right. I'm running F1887 for RSALS, and it is a GNFS(162) compared to yours, which is a GNFS(163), only one digit higher in difficulty, and my ETA was 68 hours. My machine is a Core i5 750, and your machine is twice as fast as mine. Available you have 1886503_37_minus1 and F1761. |
Mike's number may have been (relatively) undersieved. With a bit of deliberate effort, it is possible to construct an enormously hideous matrix for even the easiest of the projects. In other words, on the same number, you can build matrices with runtimes differing by as much as 3x. Or more.
Also, don't assume that the sheer amount of relations will automatically translate into something being easier or harder. There's a correlation; and there are outliers. What are you guys' matrix sizes and densities? "GNFS(162) compared to ... GNFS(163)" is not the answer. |
Looking at the logs of SL1482 SNFS(206.5) and SL1497 SNFS(208.6), the latter has 13.3% fewer unique relations, and its LA took 4.12 hours longer.
|
[QUOTE=Batalov;309122]
What are your guys matrix sizes and densities? "GNFS(162) compared to ... GNFS(163)" is not the answer.[/QUOTE] [code]
Wed Aug 22 15:59:09 2012
Wed Aug 22 15:59:09 2012
Wed Aug 22 15:59:09 2012  Msieve v. 1.50 (SVN Official Release)
Wed Aug 22 15:59:09 2012  random seeds: c8d6ef48 b7464aed
Wed Aug 22 15:59:09 2012  factoring 614498075982934569192989628437165526107017211773738674448031180077589665217853924186174508439757946489221018984287184288961227152807048109727179382893794388813901 (162 digits)
Wed Aug 22 15:59:10 2012  searching for 15-digit factors
Wed Aug 22 15:59:10 2012  commencing number field sieve (162-digit input)
Wed Aug 22 15:59:10 2012  R0: -18941922781727744005511236486230
Wed Aug 22 15:59:10 2012  R1: 3234182066303473171
Wed Aug 22 15:59:10 2012  A0: -4052865153070289851931533744530387869
Wed Aug 22 15:59:10 2012  A1: 376045223241549556494050745707624
Wed Aug 22 15:59:10 2012  A2: 3096068317937794091646347
Wed Aug 22 15:59:10 2012  A3: -133861608524052308366
Wed Aug 22 15:59:10 2012  A4: -1096184976744
Wed Aug 22 15:59:10 2012  A5: 252000
Wed Aug 22 15:59:10 2012  skew 6711853.07, size 1.072e-015, alpha -7.038, combined = 1.156e-012 rroots = 5
Wed Aug 22 15:59:10 2012
Wed Aug 22 15:59:10 2012  commencing relation filtering
Wed Aug 22 15:59:10 2012  estimated available RAM is 8183.1 MB
Wed Aug 22 15:59:10 2012  commencing duplicate removal, pass 1
Wed Aug 22 16:10:56 2012  found 372485 duplicates and [B]94151178 unique relations[/B]
Wed Aug 22 16:10:56 2012  memory use: 261.2 MB
Wed Aug 22 16:10:56 2012  reading ideals above 78839808
Wed Aug 22 16:10:56 2012  commencing singleton removal, initial pass
Wed Aug 22 16:20:05 2012  memory use: 1506.0 MB
Wed Aug 22 16:20:05 2012  reading all ideals from disk
Wed Aug 22 16:20:25 2012  memory use: 1630.2 MB
Wed Aug 22 16:20:29 2012  commencing in-memory singleton removal
Wed Aug 22 16:20:32 2012  begin with 94151178 relations and 80000088 unique ideals
Wed Aug 22 16:21:13 2012  reduce to 57554044 relations and 39346434 ideals in 15 passes
Wed Aug 22 16:21:13 2012  max relations containing the same ideal: 42
Wed Aug 22 16:21:17 2012  reading ideals above 720000
Wed Aug 22 16:21:17 2012  commencing singleton removal, initial pass
Wed Aug 22 16:29:23 2012  memory use: 1378.0 MB
Wed Aug 22 16:29:23 2012  reading all ideals from disk
Wed Aug 22 16:29:48 2012  memory use: 2153.7 MB
Wed Aug 22 16:29:55 2012  keeping 48254037 ideals with weight <= 200, target excess is 303726
Wed Aug 22 16:37:52 2012  reduce to 19201725 relations and 18849402 ideals in 4 passes
Wed Aug 22 16:37:52 2012  max relations containing the same ideal: 100
Wed Aug 22 16:37:56 2012  relations with 0 large ideals: 1000
Wed Aug 22 16:37:56 2012  relations with 1 large ideals: 2238
Wed Aug 22 16:37:56 2012  relations with 2 large ideals: 34106
Wed Aug 22 16:37:56 2012  relations with 3 large ideals: 268596
Wed Aug 22 16:37:56 2012  relations with 4 large ideals: 1148267
Wed Aug 22 16:37:57 2012  relations with 5 large ideals: 2965420
Wed Aug 22 16:37:57 2012  relations with 6 large ideals: 4868509
Wed Aug 22 16:37:57 2012  relations with 7+ large ideals: 9913589
Wed Aug 22 16:37:57 2012  commencing 2-way merge
Wed Aug 22 16:38:09 2012  reduce to 13048367 relation sets and 12696044 unique ideals
Wed Aug 22 16:38:09 2012  commencing full merge
Wed Aug 22 16:41:07 2012  memory use: 1510.5 MB
Wed Aug 22 16:41:09 2012  found 7006430 cycles, need 6958244
Wed Aug 22 16:41:10 2012  weight of 6958244 cycles is about 487117595 (70.01/cycle)
Wed Aug 22 16:41:10 2012  distribution of cycle lengths:
Wed Aug 22 16:41:10 2012  1 relations: 772528
Wed Aug 22 16:41:10 2012  2 relations: 875994
Wed Aug 22 16:41:10 2012  3 relations: 919269
Wed Aug 22 16:41:10 2012  4 relations: 866373
Wed Aug 22 16:41:10 2012  5 relations: 791880
Wed Aug 22 16:41:10 2012  6 relations: 671347
Wed Aug 22 16:41:10 2012  7 relations: 547723
Wed Aug 22 16:41:10 2012  8 relations: 428896
Wed Aug 22 16:41:10 2012  9 relations: 324500
Wed Aug 22 16:41:10 2012  10+ relations: 759734
Wed Aug 22 16:41:10 2012  heaviest cycle: 19 relations
Wed Aug 22 16:41:11 2012  commencing cycle optimization
Wed Aug 22 16:41:18 2012  start with 35882034 relations
Wed Aug 22 16:42:10 2012  pruned 1004740 relations
Wed Aug 22 16:42:10 2012  memory use: 1166.8 MB
Wed Aug 22 16:42:10 2012  distribution of cycle lengths:
Wed Aug 22 16:42:10 2012  1 relations: 772528
Wed Aug 22 16:42:10 2012  2 relations: 896461
Wed Aug 22 16:42:10 2012  3 relations: 955077
Wed Aug 22 16:42:10 2012  4 relations: 891993
Wed Aug 22 16:42:10 2012  5 relations: 815832
Wed Aug 22 16:42:10 2012  6 relations: 682234
Wed Aug 22 16:42:10 2012  7 relations: 550652
Wed Aug 22 16:42:10 2012  8 relations: 421544
Wed Aug 22 16:42:10 2012  9 relations: 311780
Wed Aug 22 16:42:10 2012  10+ relations: 660143
Wed Aug 22 16:42:10 2012  heaviest cycle: 19 relations
Wed Aug 22 16:42:17 2012  RelProcTime: 2587
Wed Aug 22 16:42:17 2012
Wed Aug 22 16:42:17 2012  commencing linear algebra
Wed Aug 22 16:42:18 2012  read 6958244 cycles
Wed Aug 22 16:42:27 2012  cycles contain 19006256 unique relations
Wed Aug 22 16:45:26 2012  read 19006256 relations
Wed Aug 22 16:45:51 2012  using 20 quadratic characters above 1073741468
Wed Aug 22 16:47:14 2012  building initial matrix
Wed Aug 22 16:50:48 2012  memory use: 2650.8 MB
Wed Aug 22 16:51:09 2012  read 6958244 cycles
Wed Aug 22 16:51:11 2012  matrix is 6958067 x 6958244 (2107.8 MB) with weight 657191216 (94.45/col)
Wed Aug 22 16:51:11 2012  sparse part has weight 469038524 (67.41/col)
Wed Aug 22 16:52:12 2012  filtering completed in 2 passes
Wed Aug 22 16:52:15 2012  matrix is 6956928 x 6957105 (2107.7 MB) with weight 657151632 (94.46/col)
Wed Aug 22 16:52:15 2012  sparse part has weight 469029637 (67.42/col)
Wed Aug 22 16:52:55 2012  matrix starts at (0, 0)
Wed Aug 22 16:52:57 2012  matrix is 6956928 x 6957105 (2107.7 MB) with weight 657151632 (94.46/col)
Wed Aug 22 16:52:57 2012  sparse part has weight 469029637 (67.42/col)
Wed Aug 22 16:52:57 2012  saving the first 48 matrix rows for later
Wed Aug 22 16:52:59 2012  matrix includes 64 packed rows
Wed Aug 22 16:53:00 2012  [B]matrix is 6956880 x 6957105 (2027.3 MB) with weight 522677445 (75.13/col)[/B]
Wed Aug 22 16:53:00 2012  sparse part has weight 461877542 (66.39/col)
Wed Aug 22 16:53:00 2012  using block size 65536 for processor cache size 8192 kB
Wed Aug 22 16:53:27 2012  commencing Lanczos iteration (4 threads)
Wed Aug 22 16:53:27 2012  memory use: 1789.7 MB
Wed Aug 22 16:54:24 2012  linear algebra at 0.0%, ETA [B]68h41m[/B]
Wed Aug 22 16:54:43 2012  checkpointing every 100000 dimensions
[/code] |
[QUOTE]Available you have 1886503_37_minus1 and F1761.[/QUOTE]We would like to reserve these if that is okay.
:smile: |
[QUOTE=Xyzzy;309120]L1378, 8 threads on an i7 3770, ETA ~90 hours - sounds right?[/QUOTE]
Did you try 4 threads as well? That might be faster than 8 threads. Otherwise, this looks reasonable to me. |
[QUOTE=Batalov;309122]Mike's number may have been (relatively) undersieved. With a bit of deliberate effort, it is possible to construct an enormously hideous matrix for even the easiest of the projects. In other words, on the same number, you can build matrices with runtimes differing by as much as 3x. Or more.
Also, don't assume that the sheer amount of relations will automatically translate into something being easier or harder. There's a correlation; and there are outliers. [/QUOTE] Ok. I need to study that correlation, and the subject in general, some more. |
[QUOTE=Xyzzy;309126]We would like to reserve these if that is okay.
:smile:[/QUOTE] Lionel is The Boss. |
1 Attachment(s)
[QUOTE]What are your guys matrix sizes and densities?[/QUOTE]We have no idea!
:max: |
[QUOTE]Did you try 4 threads as well? That might be faster than 8 threads.[/QUOTE]Is it safe to stop the program and run it again with a different number of cores? Or does it start from scratch if it is stopped?
[COLOR=green]--> Don't do sudden movements, Mike. This is (just slightly) dangerous.[/COLOR]
[COLOR=green]1. It is sort of safe to kill the running process. We assume that you have the latest binaries (earlier binaries sometimes produced unrestartable savefiles).[/COLOR]
[COLOR=green]2. Copy the whole folder for a backup.[/COLOR]
[COLOR=green]3. [/COLOR][COLOR=darkred]Do not run -nc2 again! (Happened to almost everyone. Never a pleasant experience.)[/COLOR]
[COLOR=green]4. Change [B]-t 8 -nc2[/B] into [B]-t 4 -ncr[/B], reexamine the command line, and then press Enter.[/COLOR]
[COLOR=#008000]5. Of course, there's an option to do nothing and simply wait for 90 hours.[/COLOR] |
Maybe Mike has some el cheapo memory! (just kidding)
Maybe more threads is not always better than fewer threads. Also, maybe he is using MPI where he would be better off without it. ...There are very many variables. It is advanced microwave cookery, that's what it is. [QUOTE][B][COLOR=#0066cc]Rev. Lovejoy[/COLOR][/B]: Coping with senility? [B][COLOR=#0066cc]Jasper[/COLOR][/B]: No. I'm here for Microwave Cookery. No, wait. [[I]pause[/I]] [B][COLOR=#0066cc]Jasper[/COLOR][/B]: [somberly] Coping with senility. [/QUOTE] |
[QUOTE=Batalov;309122]Mike's number may have been (relatively) undersieved. [/QUOTE]
So is it worth sieving more, something like ~10%, to get a smaller matrix? Probably the time to sieve 10% more is longer than the time saved by processing the matrix built from 10% fewer unique relations, right? Or wrong? Can we mathematically determine the optimal number of unique relations to build a matrix for a given number, as a function of G/SNFS difficulty, without spending too much effort on sieving?
Mike's number:
found 186853 duplicates and 89571509 unique relations
8241565 x 8241790 (2395.7 MB) with weight 614375248 (74.54/col)
That explains the 90 hours for linear algebra. My number has more unique relations for a ~7M^2 matrix. |
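A rough way to compare the two runs, assuming block Lanczos cost scales roughly as dimension x total weight (iterations grow with the dimension, each iteration's cost with the nonzeros; this scaling model is my assumption, not something msieve reports, and it ignores the hardware differences mentioned above):

```python
# Mike's L1378 matrix (above) vs. the F1887 matrix from the posted log
mike_dim, mike_weight = 8241565, 614375248
f1887_dim, f1887_weight = 6956880, 522677445

predicted = (mike_dim * mike_weight) / (f1887_dim * f1887_weight)
observed = 90 / 68   # the two quoted ETAs, in hours
print(f"predicted {predicted:.2f}x, observed {observed:.2f}x")
```

The crude model lands in the right neighborhood, which supports "bigger, heavier matrix" over "broken machine" as the explanation.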
[QUOTE=Batalov;309132]
...There are very many variables. It is advanced microwave cookery, that's what it is.[/QUOTE] Most of the time I don't understand your jokes (the dialogues) and I get furious... Tomorrow I'll see your answers; it's just past my bed time (my girlfriend is pointing a gun at me!). |
[QUOTE][COLOR=green]We assume that you have the latest binaries (earlier binaries sometimes produced the unrestartable savefiles).[/COLOR][/QUOTE]Msieve v. 1.51 (SVN 755)
[quote][COLOR=darkred]Do not run -nc2 again! (Happened to almost everyone. Never a pleasant experience.)[/COLOR][/quote]We ran:
[FONT="Courier New"]nice -19 ./msieve -s L1378.dat -l L1378.log -i L1378.ini -nf L1378.fb -nc -t 8 -v[/FONT]
[QUOTE][COLOR=green]Of course there's an option to do nothing and simply wait for 90 hours.[/COLOR][/quote]90 hours works for us! |
If you stop it, you can safely restart it with
[FONT="Courier New"]nice -19 ./msieve -s L1378.dat -l L1378.log -i L1378.ini -nf L1378.fb -ncr -nc3 -t 4 -v[/FONT]
Or you can wait 90 hours. :smile: |
[QUOTE=Xyzzy;309139]
We ran: [FONT="Courier New"]nice -19 ./msieve -s L1378.dat -l L1378.log -i L1378.ini -nf L1378.fb -nc -t 8 -v[/FONT][/QUOTE]
Just adding the r to -nc will work fine.
[FONT="Courier New"]nice -19 ./msieve -s L1378.dat -l L1378.log -i L1378.ini -nf L1378.fb -ncr -t 4 -v[/FONT]
-nc1 will do just the filtering (building the matrix from the relations)
-nc2 will do the matrix
-nc3 will do the part after the matrix
-nc will do all of them
-ncr will resume a matrix
[QUOTE=frmky;309146]If you stop it, you can safely restart it with
nice -19 ./msieve -s L1378.dat -l L1378.log -i L1378.ini -nf L1378.fb -ncr -nc3 -t 4 -v
Or you can wait 90 hours. :smile:[/QUOTE]
-ncr will do the sqrt as well. |
Four threads instead of eight:
[FONT=Courier New]linear algebra completed 427267 of 8241790 dimensions (5.2%, ETA 55h23m)[/FONT] :smile: |
[QUOTE=Xyzzy;309149]Four threads instead of eight:
[FONT=Courier New]linear algebra completed 427267 of 8241790 dimensions (5.2%, ETA 55h23m)[/FONT][/QUOTE] Those virtual cores are getting in the way of progress, it seems.

Jeff. |
Coping with senility
[QUOTE=pinhodecarlos;309137]Most of the time I don't understand your jokes (the dialogues) and I get furious...
Tomorrow I'll see your answers; it's just past my bedtime (my girlfriend is pointing a gun at me!).[/QUOTE] It's [URL="http://www.imdb.com/title/tt0763029/quotes"]the Simpsons[/URL] (Season 4, Episode 16). I even had that as the avatar signature. Aw wait, and then I didn't.
Anyway, try the anagrammatic aligner:
[CODE]Coping with senility
Incites pitying howl
No Syphilitic Twinge
oily twitching penis
[/CODE] |
[QUOTE=Xyzzy;309149]Four threads instead of eight:
[FONT=Courier New]linear algebra completed 427267 of 8241790 dimensions (5.2%, ETA 55h23m)[/FONT] :smile:[/QUOTE] Thar' she blows! |
[QUOTE=Dubslow;309147]-ncr will do the sqrt as well.[/QUOTE]
Really? If so, that differs from the MPI version. |
[QUOTE=pinhodecarlos;309133]Can we prove mathematically what is the optimal number of unique relations to build a matrix for a certain number in function of G/SNFS difficulty without spending too much of effort on sieving?[/QUOTE]
I can envision the Feynman defense here.
[B]Theorem[/B]: There exists a distance from which a female looks most [URL="http://en.wikipedia.org/wiki/Physical_attractiveness"]attractive[/URL].
[B]Proof[/B]: At d=0, we cannot see anything, so the attractiveness = 0.
At d=∞, it's too far to see anything, so the attractiveness = 0.
At 0<d<∞, the attractiveness is a positive smooth function.
????
[STRIKE]PROFIT.[/STRIKE] QED |
1 Attachment(s)
P.S. I should quote Wikipedia more often.
No sooner had I quoted the article (and no more than a minute after) than it was vandalized: |
[QUOTE]Lionel is The Boss.[/QUOTE]We're going to assume it is okay. If it isn't, then we are just out a little CPU time. Sleep eludes us and we need something constructive to do.
1886503_37_minus1
F1761
Apologies in advance if we are stepping on anyone's toes. We can stop the jobs at any time, no problem. :smile: |
All RSALS numbers are now reserved for post-processing :smile:
|
[QUOTE=Xyzzy;309149]Four threads instead of eight:
[FONT=Courier New]linear algebra completed 427267 of 8241790 dimensions (5.2%, ETA 55h23m)[/FONT] :smile:[/QUOTE] [QUOTE=Batalov;309166]Thar' she blows![/QUOTE] After all, I was right... something was not right. Mike, weren't you running with HT off? |
[QUOTE]After all, I was right... something was not right.
Mike, weren't you running with HT off?[/QUOTE]We have HT turned on. Should we turn it off in the BIOS screen? Or does running just four threads have the same effect? |
[QUOTE=Batalov;309174]At 0<d<∞, the attractiveness is a positive smooth function. [/QUOTE]Unwarranted assumption.
|
To continue with off-topic:
[QUOTE=xilman;309191]Unwarranted assumption.[/QUOTE] Not only that, but even with the assumption made, the claim does not follow. (What I would assume is: attractiveness is a lower-semicontinuous function [TEX]A[/TEX] of the distance on [TEX][0,\infty)[/TEX] with [TEX]A(0)=0[/TEX] and [TEX]\limsup_{d \to \infty}A(d)\le 0[/TEX]. However, I can justify neither the lower-semicontinuity nor the loss of attractiveness at infinity.) |
[QUOTE=Xyzzy;309187]We have HT turned on. Should we turn it off in the BIOS screen? Or is running just four threads the same effect?[/QUOTE]
Don't do anything. I am just amazed that, as Jeff Gilchrist put it, "Those virtual cores are getting in the way of progress it seems." When I was testing an i7 laptop (Sandy Bridge) I found that running 8 threads instead of only 4 was a little faster, but going from 90 hours (8T) to ~58 hours (4C) is a reduction I was not expecting at all! Now I don't know whether I should invest in a 3rd-generation Core i5 or Core i7. |
Doing just about anything with sparse matrices is black magic. I've thought that it would be a nice Master's thesis for someone to look at the low-level data structures and the (IMO crappy) core matrix multiply code and really take the time to tune it nicely on modern x86 architectures. The code everyone is using for that was mostly written in 2008.
|
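The "core matrix multiply" jasonp mentions is a sparse matrix times a block of 64 vectors over GF(2). The sketch below only shows the shape of that kernel, not msieve's actual code (which is hand-tuned C over packed words); the row-list representation here is a deliberate simplification.

```python
# Illustrative GF(2) sparse-matrix * 64-vector-block multiply, the
# workhorse operation of block Lanczos. NOT msieve's implementation:
# just the access pattern, in plain Python.

def spmv_gf2(row_entries, x):
    """row_entries[i]: column indices of the 1-bits in row i of the matrix.
    x[j]: a 64-bit word holding bit j of each of 64 right-hand vectors.
    Over GF(2), addition is XOR, so each output word is the XOR of the
    x-words selected by that row's nonzero columns."""
    y = []
    for cols in row_entries:
        acc = 0
        for j in cols:
            acc ^= x[j]     # gather + XOR: memory-bound, hence the tuning pain
        y.append(acc)
    return y

# Tiny example: the 3x3 matrix [[1,1,0],[0,1,0],[1,0,1]].
rows = [[0, 1], [1], [0, 2]]
x = [0b001, 0b010, 0b100]   # only three of the 64 vector slots in use
print(spmv_gf2(rows, x))    # [3, 2, 5]
```

The random gathers through `x` are why this kernel is memory-bound, and why cache-aware blocking of the rows is where the Master's-thesis-sized tuning opportunity would live.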
[QUOTE=jasonp;309199]Doing just about anything with sparse matrices is black magic. I've thought that it would be a nice Master's thesis for someone to look at the low-level data structures and the (IMO crappy) core matrix multiply code and really take the time to tune it nicely on modern x86 architectures. The code everyone is using for that was mostly written in 2008.[/QUOTE]
Maybe it's time for someone to give you access to a remote Ivy Bridge processor so you can start tweaking the code. If I had the knowledge, I would help. |
I wish I had the time to tweak anything...
|
1 Attachment(s)
[QUOTE=pinhodecarlos;308802]Tomorrow I would like to start F1887. Lionel, please add the rest of the files, thank you.[/QUOTE]
Done. [code]prp76 factor: 4917013079084209537679070319124203745778083784210315661783186986368247867961 prp87 factor: 124973854268735122618747041792310619544286379392091899196205253675188127512652629327541 [/code] |
[QUOTE=xilman;309191]Unwarranted assumption.[/QUOTE]
Surely, for some individuals (mothers-in-law?), it could be a discontinuous all-negative function. We will not deal with this special case. It was a solution for a spherical positively-attractive female in vacuum observed with an ideal, objective and infinitely sharp spherical eye ...also in vacuum! :smile: |
[QUOTE=Batalov;309208]Surely, for some individuals (mothers-in-law?), it could be a discontinuous all-negative function. We will not deal with this special case. It was a solution for a spherical positively-attractive female in vacuum observed with an ideal, objective and infinitely sharp spherical eye ...also in vacuum! :smile:[/QUOTE]If the female is in vacuum for any length of time I very strongly doubt your assumption of positively attractive.
|
Are you in for abuse or an argument?
Vacuum does nothing to a spherical female! Nor does time! :no:
_________________
Back to topic: there exists the (range of) matrix of ideal size(s) (spherical, in vacuum, of course) that, for all local constraints (the particular composite, the limits, the polynomial, #cores for sieving, #cores for LA, interconnect quality between cores, risk of electric black/brownouts), minimizes the wall-clock running time. It is too time-consuming to find it precisely (because the function is unknown and different for each set of input parameters). Artillerists among us use the triangulation technique. (Too close, too far, ... bingo. How fun!) |
[QUOTE=Batalov;309210]Are you in for abuse or an argument?
Vacuum does nothing to a spherical female! Nor does time! :no:
_________________
Back to topic: there exists the (range of) matrix of ideal size(s) (spherical, in vacuum, of course) that, for all local constraints (the particular composite, the limits, the polynomial, #cores for sieving, #cores for LA, interconnect quality between cores, risk of electric black/brownouts), minimizes the wall-clock running time. It is too time-consuming to find it precisely (because the function is unknown and different for each set of input parameters). Artillerists among us use the triangulation technique. (Too close, too far, ... bingo. How fun!)[/QUOTE]Stop this! Stop this! It's getting too silly! |
"[URL="http://www.msnbc.msn.com/id/34482178/ns/health-skin_and_beauty/t/ideal-beauty-matter-millimeters-study-says/"]Ideal beauty a matter of millimeters, study says[/URL]" :w00t:
|
[QUOTE=Batalov;309132]It is advanced microwave cookery, that's what it is.[/QUOTE]
[quote]Raindrops keep fallin' on my head But that doesn't mean my eyes will soon be turnin' red Cryin's not for me 'Cause I'm never gonna stop the rain by complainin' Because I'm free Nothin's worryin' me[/quote] He gave up beer for a month! :shock: |
[QUOTE=debrouxl;309183]All RSALS numbers are now reserved for post-processing :smile:[/QUOTE]And GC_7_280 is in the sqrt phase. Unfortunately, it's been there for several hours and is currently working on the seventh dependency. I'm hoping the LA hasn't failed. The log file reports "Sun Aug 26 04:09:22 2012 recovered 38 nontrivial dependencies"
|
[QUOTE=xilman;309263]And GC_7_280 is in the sqrt phase. Unfortunately, it's been there for several hours and is currently working on the seventh dependency. I'm hoping the LA hasn't failed. The log file reports "Sun Aug 26 04:09:22 2012 recovered 38 nontrivial dependencies"[/QUOTE]
That happened to me recently but luckily the factors popped out after the 20th dependency. We will keep our fingers crossed for you. |
Ouch. Not even an intermediate factor printed out during the square root that allows a manual GCD? Does the square root actually complain about dependencies being incorrect? |
|
[QUOTE=jasonp;309265]Ouch. Not even an intermediate factor printed out during the square root that allows a mnual GCD? Does the square root actually complain about dependencies being incorrect?[/QUOTE]It's the old version which doesn't print out any factors until all are available. I should update one day but it's been usable up to now.
Sometimes I get unlucky. Some years ago the CWI suite took ten dependencies to find the first and only factorization. Do enough NFS runs and sooner or later you'll run into this effect. I've done hundreds, possibly a couple of thousand, over the years and so I expect to take 8-10 dependencies occasionally. Now on dependency 8. |
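The occasional 8-10 dependency run has a simple explanation: each dependency yields x, y with x² ≡ y² (mod N), and gcd(x − y, N) only splits N when x ≢ ±y (mod N), which fails roughly half the time per dependency. A toy illustration follows; the numbers are invented and have nothing to do with GC_7_280.

```python
# Each NFS dependency ends in this step: a congruence of squares plus a GCD.
# Toy numbers only -- invented for illustration.
from math import gcd

N = 91                       # 7 * 13, standing in for the real candidate

# A "good" dependency: 10^2 = 100 == 9 = 3^2 (mod 91), and 10 != +/-3 (mod 91).
x, y = 10, 3
assert (x * x - y * y) % N == 0
f = gcd(x - y, N)
print(f, N // f)             # 7 13 -- a nontrivial split

# A "bad" dependency: 88 == -3 (mod 91), so the congruence holds but the
# GCD is trivial and the square root must move on to the next dependency.
x, y = 88, 3
assert (x * x - y * y) % N == 0
print(gcd(x - y, N))         # 1 -- no information
```

With roughly even odds per dependency, a streak of 8-10 failures shows up about once in a few hundred jobs, which matches the "once every few hundred factorizations" experience above.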
[QUOTE=xilman;309266]It's the old version which doesn't print out any factors until all are available. I should update one day but it's been usable up to now.
Sometimes I get unlucky. Some years ago the CWI suite took ten dependencies to find the first and only factorization. Do enough NFS runs and sooner or later you'll run into this effect. I've done hundreds, possibly a couple of thousand, over the years and so I expect to take 8-10 dependencies occasionally. Now on dependency 8.[/QUOTE]That was the successful one: [code] reading relations for dependency 8 read 5089233 cycles cycles contain 13485332 unique relations read 13485332 relations multiplying 13485332 relations multiply complete, coefficients have about 577.74 million bits initial square root is modulo 152533 sqrtTime: 32130 prp54 factor: 237960100936499470070803715680643414397114649125940561 prp131 factor: 66727742898770680536930377566302071940289006468179246773908204720256358555415036260545147328546962892834415255401872878589456692711 elapsed time 171:19:14 [/code] To be expected once every few hundred factorizations. Not an ECM miss, though it would have been nicer if the split had been more equal. Paul |
My very first NFS job that other people cared about took until [URL="http://www.mersenneforum.org/showpost.php?p=297696&postcount=1776"]the 10th dependency[/URL].
|