#1 | |
|
"James Heinrich"
May 2004
ex-Northern Ontario
11·311 Posts |
Quote:
v1.39 doesn't seem to thread very well. Even with -t 12, at best I saw 6 physical cores in use, with each core running at about 60% capacity. v1.49 seems to thread much better: at -t 12 the load is spread evenly across 12 hyperthreaded cores, at about 65-70% on each.
Attached log file includes (in order):
1.39: msieve.exe -v -nc -t 12
1.39: msieve.exe -v -nc -t 6
1.49: msieve.exe -v -nc -t 12
1.49: msieve.exe -v -nc -t 6
1.49: msieve.exe -v -nc -t 3 |
|
|
|
|
|
|
#2 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
3·2,141 Posts |
Something odd is going on here: 1.39 is faster than 1.49 even though it's reporting lower CPU usage and (since filtering has been improved between 1.39 and 1.49) running a larger matrix.
Last fiddled with by fivemack on 2011-12-17 at 09:54 |
|
|
|
|
|
#3 |
|
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
3×17×97 Posts |
Could you also post the specs of your machine (memory speed, motherboard, etc.)?
Thank you in advance. |
|
|
|
|
|
#4 | |
|
"James Heinrich"
May 2004
ex-Northern Ontario
3421₁₀ Posts |
Quote:
|
|
|
|
|
|
|
#5 |
|
Sep 2004
2·5·283 Posts |
Core i5 750@3.6GHz, 8 GB DDR3 1333 MHz.
Code:
Sat Dec 17 15:22:31 2011
Sat Dec 17 15:22:31 2011
Sat Dec 17 15:22:31 2011 Msieve v. 1.50 (SVN unknown)
Sat Dec 17 15:22:31 2011 random seeds: df0f8cb4 76141666
Sat Dec 17 15:22:31 2011 factoring 4173963189037240309713561470980060844301599530724369583171546744398445920275944654104909360191816721597816377363188552009346572473 (130 digits)
Sat Dec 17 15:22:32 2011 searching for 15-digit factors
Sat Dec 17 15:22:32 2011 commencing number field sieve (130-digit input)
Sat Dec 17 15:22:32 2011 R0: -10678489767165051962024349
Sat Dec 17 15:22:32 2011 R1: 202148791722581
Sat Dec 17 15:22:32 2011 A0: 130068846150643433934522883496770
Sat Dec 17 15:22:32 2011 A1: 1272305145809368946430393037
Sat Dec 17 15:22:32 2011 A2: -4820655295278675156855
Sat Dec 17 15:22:32 2011 A3: -39469582197230476
Sat Dec 17 15:22:32 2011 A4: 38243516789
Sat Dec 17 15:22:32 2011 A5: 30060
Sat Dec 17 15:22:32 2011 skew 228570.75, size 1.569e-012, alpha -6.792, combined = 7.496e-011 rroots = 5
Sat Dec 17 15:22:32 2011
Sat Dec 17 15:22:32 2011 commencing relation filtering
Sat Dec 17 15:22:32 2011 estimated available RAM is 8183.1 MB
Sat Dec 17 15:22:32 2011 commencing duplicate removal, pass 1
Sat Dec 17 15:24:32 2011 found 2598477 hash collisions in 13708853 relations
Sat Dec 17 15:24:50 2011 added 1 free relations
Sat Dec 17 15:24:50 2011 commencing duplicate removal, pass 2
Sat Dec 17 15:24:55 2011 found 2526833 duplicates and 11182021 unique relations
Sat Dec 17 15:24:55 2011 memory use: 82.6 MB
Sat Dec 17 15:24:55 2011 reading ideals above 720000
Sat Dec 17 15:24:56 2011 commencing singleton removal, initial pass
Sat Dec 17 15:26:47 2011 memory use: 344.5 MB
Sat Dec 17 15:26:47 2011 reading all ideals from disk
Sat Dec 17 15:26:47 2011 memory use: 330.4 MB
Sat Dec 17 15:26:48 2011 commencing in-memory singleton removal
Sat Dec 17 15:26:49 2011 begin with 11182021 relations and 11719970 unique ideals
Sat Dec 17 15:26:54 2011 reduce to 5210969 relations and 4863630 ideals in 16 passes
Sat Dec 17 15:26:54 2011 max relations containing the same ideal: 96
Sat Dec 17 15:26:56 2011 removing 825491 relations and 719218 ideals in 106273 cliques
Sat Dec 17 15:26:56 2011 commencing in-memory singleton removal
Sat Dec 17 15:26:56 2011 begin with 4385478 relations and 4863630 unique ideals
Sat Dec 17 15:26:58 2011 reduce to 4293130 relations and 4049611 ideals in 10 passes
Sat Dec 17 15:26:58 2011 max relations containing the same ideal: 83
Sat Dec 17 15:27:00 2011 removing 617219 relations and 510946 ideals in 106273 cliques
Sat Dec 17 15:27:00 2011 commencing in-memory singleton removal
Sat Dec 17 15:27:00 2011 begin with 3675911 relations and 4049611 unique ideals
Sat Dec 17 15:27:02 2011 reduce to 3609739 relations and 3470882 ideals in 10 passes
Sat Dec 17 15:27:02 2011 max relations containing the same ideal: 73
Sat Dec 17 15:27:03 2011 relations with 0 large ideals: 486
Sat Dec 17 15:27:03 2011 relations with 1 large ideals: 2707
Sat Dec 17 15:27:03 2011 relations with 2 large ideals: 35991
Sat Dec 17 15:27:03 2011 relations with 3 large ideals: 200230
Sat Dec 17 15:27:03 2011 relations with 4 large ideals: 592581
Sat Dec 17 15:27:03 2011 relations with 5 large ideals: 989462
Sat Dec 17 15:27:03 2011 relations with 6 large ideals: 978596
Sat Dec 17 15:27:03 2011 relations with 7+ large ideals: 809686
Sat Dec 17 15:27:03 2011 commencing 2-way merge
Sat Dec 17 15:27:05 2011 reduce to 2184638 relation sets and 2045781 unique ideals
Sat Dec 17 15:27:05 2011 commencing full merge
Sat Dec 17 15:27:26 2011 memory use: 228.9 MB
Sat Dec 17 15:27:27 2011 found 1094722 cycles, need 1073981
Sat Dec 17 15:27:27 2011 weight of 1073981 cycles is about 75457035 (70.26/cycle)
Sat Dec 17 15:27:27 2011 distribution of cycle lengths:
Sat Dec 17 15:27:27 2011 1 relations: 130527
Sat Dec 17 15:27:27 2011 2 relations: 127368
Sat Dec 17 15:27:27 2011 3 relations: 125795
Sat Dec 17 15:27:27 2011 4 relations: 113119
Sat Dec 17 15:27:27 2011 5 relations: 102993
Sat Dec 17 15:27:27 2011 6 relations: 87941
Sat Dec 17 15:27:27 2011 7 relations: 76759
Sat Dec 17 15:27:27 2011 8 relations: 65165
Sat Dec 17 15:27:27 2011 9 relations: 54780
Sat Dec 17 15:27:27 2011 10+ relations: 189534
Sat Dec 17 15:27:27 2011 heaviest cycle: 21 relations
Sat Dec 17 15:27:27 2011 commencing cycle optimization
Sat Dec 17 15:27:28 2011 start with 6170552 relations
Sat Dec 17 15:27:35 2011 pruned 146025 relations
Sat Dec 17 15:27:35 2011 memory use: 206.5 MB
Sat Dec 17 15:27:35 2011 distribution of cycle lengths:
Sat Dec 17 15:27:35 2011 1 relations: 130527
Sat Dec 17 15:27:35 2011 2 relations: 130418
Sat Dec 17 15:27:35 2011 3 relations: 130196
Sat Dec 17 15:27:35 2011 4 relations: 115818
Sat Dec 17 15:27:35 2011 5 relations: 105171
Sat Dec 17 15:27:35 2011 6 relations: 88903
Sat Dec 17 15:27:35 2011 7 relations: 77655
Sat Dec 17 15:27:35 2011 8 relations: 64752
Sat Dec 17 15:27:35 2011 9 relations: 54165
Sat Dec 17 15:27:35 2011 10+ relations: 176376
Sat Dec 17 15:27:35 2011 heaviest cycle: 20 relations
Sat Dec 17 15:27:36 2011 RelProcTime: 304
Sat Dec 17 15:27:36 2011
Sat Dec 17 15:27:36 2011 commencing linear algebra
Sat Dec 17 15:27:36 2011 read 1073981 cycles
Sat Dec 17 15:27:37 2011 cycles contain 3464474 unique relations
Sat Dec 17 15:28:08 2011 read 3464474 relations
Sat Dec 17 15:28:11 2011 using 20 quadratic characters above 134217498
Sat Dec 17 15:28:25 2011 building initial matrix
Sat Dec 17 15:28:53 2011 memory use: 453.8 MB
Sat Dec 17 15:28:54 2011 read 1073981 cycles
Sat Dec 17 15:28:55 2011 matrix is 1073793 x 1073981 (326.8 MB) with weight 103906487 (96.75/col)
Sat Dec 17 15:28:55 2011 sparse part has weight 72792856 (67.78/col)
Sat Dec 17 15:29:01 2011 filtering completed in 2 passes
Sat Dec 17 15:29:02 2011 matrix is 1071372 x 1071560 (326.6 MB) with weight 103797093 (96.87/col)
Sat Dec 17 15:29:02 2011 sparse part has weight 72757804 (67.90/col)
Sat Dec 17 15:29:03 2011 matrix starts at (0, 0)
Sat Dec 17 15:29:04 2011 matrix is 1071372 x 1071560 (326.6 MB) with weight 103797093 (96.87/col)
Sat Dec 17 15:29:04 2011 sparse part has weight 72757804 (67.90/col)
Sat Dec 17 15:29:04 2011 saving the first 48 matrix rows for later
Sat Dec 17 15:29:04 2011 matrix includes 64 packed rows
Sat Dec 17 15:29:04 2011 matrix is 1071324 x 1071560 (314.1 MB) with weight 82570139 (77.06/col)
Sat Dec 17 15:29:04 2011 sparse part has weight 71622900 (66.84/col)
Sat Dec 17 15:29:04 2011 using block size 65536 for processor cache size 8192 kB
Sat Dec 17 15:29:08 2011 commencing Lanczos iteration (4 threads)
Sat Dec 17 15:29:08 2011 memory use: 268.3 MB
Sat Dec 17 15:29:13 2011 linear algebra at 0.1%, ETA 0h47m
Sat Dec 17 15:29:15 2011 checkpointing every 1210000 dimensions |
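For anyone comparing logs like the one above, the timestamps can be used to time each phase. What follows is only a minimal sketch, not part of msieve: `phase_seconds` is a hypothetical helper, and it assumes every log line begins with a 24-character timestamp in the format shown above. The sample lines are copied from that log.

```python
from datetime import datetime

STAMP_LEN = 24  # length of a stamp such as "Sat Dec 17 15:22:32 2011"
STAMP_FMT = "%a %b %d %H:%M:%S %Y"

def phase_seconds(log_lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the
    first later line containing end_marker (hypothetical helper)."""
    start = None
    for line in log_lines:
        stamp, message = line[:STAMP_LEN], line[STAMP_LEN:].strip()
        if start is None and start_marker in message:
            start = datetime.strptime(stamp, STAMP_FMT)
        elif start is not None and end_marker in message:
            return (datetime.strptime(stamp, STAMP_FMT) - start).total_seconds()
    raise ValueError("markers not found in order")

# A few lines taken from the log above.
sample = [
    "Sat Dec 17 15:22:32 2011 commencing relation filtering",
    "Sat Dec 17 15:27:36 2011 RelProcTime: 304",
    "Sat Dec 17 15:27:36 2011 commencing linear algebra",
    "Sat Dec 17 15:29:08 2011 commencing Lanczos iteration (4 threads)",
]

filtering = phase_seconds(sample, "commencing relation filtering",
                          "commencing linear algebra")
matrix_build = phase_seconds(sample, "commencing linear algebra",
                             "commencing Lanczos iteration")
print(f"filtering: {filtering:.0f} s, matrix build: {matrix_build:.0f} s")
```

On this log the filtering interval comes out equal to the figure on the `RelProcTime: 304` line, which is a handy sanity check on the parsing.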
|
|
|
|
|
#6 |
|
Sep 2004
2·5·283 Posts |
Anyone here with more benches?
|
|
|
|
|
|
#7 |
|
"Ben"
Feb 2007
2²×3×293 Posts |
Here is some ETA-based data for two different machines. I can confirm that 1.39 appears to be faster than 1.49, although it doesn't scale quite as well. Utilization with 8 cores on machine 1 using 1.39 hovered around 55%, while with 1.49 it was around 85%.
machine 1 (hyperthreading disabled):
Dual Intel(R)Xeon(R) CPU X5687 @ 3.60GHz (4 cores)
triple channel DDR3 @ 1333 MHz (per CPU) + QPI CPU interconnect
Code:
v1.49 (win 7 pro, 64 bit):
threads, ETA (min) (after settling)
2, 66
4, 41
8, 30

v1.39 (win 7 pro, 64 bit):
threads, ETA (min) (after settling)
2, 48
4, 32
8, 29
Dual Intel(R)Xeon(R) CPU X5680 @ 3.33GHz (6 cores)
triple channel DDR3 @ 1333 MHz (per CPU) + QPI CPU interconnect
Code:
v1.50 (linux, 64 bit)
threads, ETA (min) (after settling)
4, 58
6, 45
8, 41
12, 39
24, 53 |
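To put a number on "faster but doesn't scale quite as well", the machine 1 ETAs above can be turned into speedup and parallel efficiency relative to the 2-thread run. This is just a rough sketch: `scaling` is a hypothetical helper, and ETAs read off a progress line are only approximate timings.

```python
# ETA in minutes keyed by thread count, copied from the machine 1 tables above.
eta_v149 = {2: 66, 4: 41, 8: 30}
eta_v139 = {2: 48, 4: 32, 8: 29}

def scaling(eta_by_threads):
    """Return {threads: (speedup, efficiency)} relative to the smallest
    thread count measured (hypothetical helper)."""
    base_threads = min(eta_by_threads)
    base_eta = eta_by_threads[base_threads]
    result = {}
    for threads, eta in sorted(eta_by_threads.items()):
        speedup = base_eta / eta          # how much faster than the baseline run
        ideal = threads / base_threads    # speedup under perfect scaling
        result[threads] = (round(speedup, 2), round(speedup / ideal, 2))
    return result

for name, data in (("v1.39", eta_v139), ("v1.49", eta_v149)):
    print(name, scaling(data))
```

Going from 2 to 8 threads, v1.49 gains about 2.2x while v1.39 gains only about 1.66x, yet v1.39 still has the lower ETA at every thread count measured.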
|
|
|
|
|
#8 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
3×2,141 Posts |
Do you have the ability to try out the MPI version on that dual-hex-Xeon machine? It was a distinct improvement on my (now xilman's) dual-quad-Opteron, see
http://www.mersenneforum.org/showpos...9&postcount=60
http://www.mersenneforum.org/showpos...9&postcount=65 |
|
|
|
|
|
#9 | |
|
"Ben"
Feb 2007
2²·3·293 Posts |
Quote:
|
|
|
|
|
|
|
#10 |
|
Tribal Bullet
Oct 2004
DD7₁₆ Posts |
Other than adding MPI, the only other major change between v1.39 and v1.50 is that vector-vector multiplies are now handled by multiple threads. Even if that causes cache thrashing, it shouldn't cost 25% of the total performance...
Both MPI and multithreaded vector-vector code were added in v1.46. Can someone see if the regression happens between v1.45 and v1.46? Unfortunately the SF page only has v1.41 and up; I can provide older releases if necessary.
Last fiddled with by jasonp on 2011-12-29 at 19:38 |
|
|
|
|
|
#11 |
|
"Frank <^>"
Dec 2004
CDP Janesville
4112₈ Posts |
Ooops....I thought I had put 1.49 on my hexcore to test, but looking at the log, I only had 1.39 on there, so I'll have to re-run the benches again. But just for yucks, here is a log for the LA on my box for 1.39.
Summary of run times for 1-6 threads with nothing else running:
Code:
elapsed time 02:56:55
elapsed time 01:39:01
elapsed time 01:15:45
elapsed time 01:05:25
elapsed time 01:00:09
elapsed time 00:59:16
Full specs on the box:
|
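The 1.39 run times in the post above are easier to compare once converted to seconds and expressed as speedup over the single-thread run. A small sketch, assuming the six entries are listed in order from 1 to 6 threads (the post says "1-6 threads" but does not label each line); `to_seconds` is a hypothetical helper.

```python
# Elapsed times copied from the post above, assumed to be in thread order 1..6.
elapsed = ["02:56:55", "01:39:01", "01:15:45", "01:05:25", "01:00:09", "00:59:16"]

def to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

seconds = [to_seconds(t) for t in elapsed]
speedups = [round(seconds[0] / s, 2) for s in seconds]

for threads, (s, x) in enumerate(zip(seconds, speedups), start=1):
    print(f"{threads} thread(s): {s} s, {x}x vs 1 thread")
```

Under that ordering assumption, the step from 5 to 6 threads saves under a minute, so on this box 1.39 has essentially stopped scaling by the time all six cores are in use.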
|
|
|