i7-3930K benchmarks
1 Attachment(s)
[QUOTE=fivemack]I see you have a 3930K now. Would you be prepared to do what was done at [url]http://www.mersenneforum.org/showpost.php?p=152301&postcount=7[/url]
(though -t 3, -t 6, -t 12 are more sensible thread-counts to test on a six-core) and also running msieve149.exe from msieve149.zip downloaded from [url]http://sourceforge.net/projects/msieve/files/msieve/Msieve%20v1.49/[/url] Please post the answers in the msieve forum [url]http://www.mersenneforum.org/forumdisplay.php?f=83[/url] I am extremely interested in how fast the four-channel memory controller is for msieve linear algebra as the thread count increases; if it is interestingly fast then I'll probably get a 3930K next month.[/QUOTE]
I'm just posting here at [i]fivemack[/i]'s request; I don't actually understand what these results mean. :smile:
v1.39 doesn't seem to thread very well. Even with -t 12, at best I saw 6 physical cores in use, with each core running at about 60% capacity. v1.49 seems to thread much better: at -t 12 the load is spread evenly across all 12 logical (hyperthreaded) cores, at about 65-70% each.
Attached log file includes (in order):[list][*]1.39: msieve.exe -v -nc -t 12[*]1.39: msieve.exe -v -nc -t 6[*]1.49: msieve.exe -v -nc -t 12[*]1.49: msieve.exe -v -nc -t 6[*]1.49: msieve.exe -v -nc -t 3[/list] |
Something odd is going on here: 1.39 is faster than 1.49 even though it's reporting lower CPU usage and (since filtering has been improved between 1.39 and 1.49) running a larger matrix.
[TEX]
\begin{tabular}{|l|l|l|l|}
\hline
version & time (3) & time (6) & time (12) \\
\hline
1.39 & & 1245 & 1301 \\
1.49 & 2363 & 1653 & 1571 \\
\hline
\end{tabular}
[/TEX] |
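To put a number on the gap in the table above, here is a minimal Python sketch that compares the two versions at the thread counts both were run with. The figures are copied from the table; the table states no unit, so only ratios are computed. The names `times` and `slowdown` are made up for this illustration.

```python
# Timings copied from the table above, keyed by msieve version and thread count.
# The table gives no unit, so only ratios are computed here.
times = {
    "1.39": {6: 1245, 12: 1301},
    "1.49": {3: 2363, 6: 1653, 12: 1571},
}

def slowdown(threads):
    """Return how many times longer v1.49 took than v1.39 at this thread count."""
    return times["1.49"][threads] / times["1.39"][threads]

# Only compare thread counts that were measured for both versions.
for t in sorted(set(times["1.39"]) & set(times["1.49"])):
    print(f"-t {t}: 1.49 takes {slowdown(t):.2f}x as long as 1.39")
```

On these figures 1.49 takes about 1.33x as long as 1.39 at -t 6 and about 1.21x as long at -t 12.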
Could you also post the spec of your machine? (memory speed, motherboard etc)
Thank you in advance. |
[QUOTE=pinhodecarlos;282568]Could you also post the spec of your machine? (memory speed, motherboard etc)[/QUOTE]Sorry, yes:[list][*]CPU: Intel i7-3930K @ 4500MHz[*]RAM: Corsair Vengeance 4*8GB [quad-channel] (CMZ32GX3M4X1600C10) DDR3-1600 10-10-10-27-2T[*]motherboard: Asus P9X79 PRO[/list]Let me know if I missed anything.
|
Core i5 750@3.6GHz, 8 GB DDR3 1333 MHz.
[code]Sat Dec 17 15:22:31 2011 Sat Dec 17 15:22:31 2011 Sat Dec 17 15:22:31 2011 Msieve v. 1.50 (SVN unknown) Sat Dec 17 15:22:31 2011 random seeds: df0f8cb4 76141666 Sat Dec 17 15:22:31 2011 factoring 4173963189037240309713561470980060844301599530724369583171546744398445920275944654104909360191816721597816377363188552009346572473 (130 digits) Sat Dec 17 15:22:32 2011 searching for 15-digit factors Sat Dec 17 15:22:32 2011 commencing number field sieve (130-digit input) Sat Dec 17 15:22:32 2011 R0: -10678489767165051962024349 Sat Dec 17 15:22:32 2011 R1: 202148791722581 Sat Dec 17 15:22:32 2011 A0: 130068846150643433934522883496770 Sat Dec 17 15:22:32 2011 A1: 1272305145809368946430393037 Sat Dec 17 15:22:32 2011 A2: -4820655295278675156855 Sat Dec 17 15:22:32 2011 A3: -39469582197230476 Sat Dec 17 15:22:32 2011 A4: 38243516789 Sat Dec 17 15:22:32 2011 A5: 30060 Sat Dec 17 15:22:32 2011 skew 228570.75, size 1.569e-012, alpha -6.792, combined = 7.496e-011 rroots = 5 Sat Dec 17 15:22:32 2011 Sat Dec 17 15:22:32 2011 commencing relation filtering Sat Dec 17 15:22:32 2011 estimated available RAM is 8183.1 MB Sat Dec 17 15:22:32 2011 commencing duplicate removal, pass 1 Sat Dec 17 15:24:32 2011 found 2598477 hash collisions in 13708853 relations Sat Dec 17 15:24:50 2011 added 1 free relations Sat Dec 17 15:24:50 2011 commencing duplicate removal, pass 2 Sat Dec 17 15:24:55 2011 found 2526833 duplicates and 11182021 unique relations Sat Dec 17 15:24:55 2011 memory use: 82.6 MB Sat Dec 17 15:24:55 2011 reading ideals above 720000 Sat Dec 17 15:24:56 2011 commencing singleton removal, initial pass Sat Dec 17 15:26:47 2011 memory use: 344.5 MB Sat Dec 17 15:26:47 2011 reading all ideals from disk Sat Dec 17 15:26:47 2011 memory use: 330.4 MB Sat Dec 17 15:26:48 2011 commencing in-memory singleton removal Sat Dec 17 15:26:49 2011 begin with 11182021 relations and 11719970 unique ideals Sat Dec 17 15:26:54 2011 reduce to 5210969 relations and 4863630 ideals in 16 passes Sat 
Dec 17 15:26:54 2011 max relations containing the same ideal: 96 Sat Dec 17 15:26:56 2011 removing 825491 relations and 719218 ideals in 106273 cliques Sat Dec 17 15:26:56 2011 commencing in-memory singleton removal Sat Dec 17 15:26:56 2011 begin with 4385478 relations and 4863630 unique ideals Sat Dec 17 15:26:58 2011 reduce to 4293130 relations and 4049611 ideals in 10 passes Sat Dec 17 15:26:58 2011 max relations containing the same ideal: 83 Sat Dec 17 15:27:00 2011 removing 617219 relations and 510946 ideals in 106273 cliques Sat Dec 17 15:27:00 2011 commencing in-memory singleton removal Sat Dec 17 15:27:00 2011 begin with 3675911 relations and 4049611 unique ideals Sat Dec 17 15:27:02 2011 reduce to 3609739 relations and 3470882 ideals in 10 passes Sat Dec 17 15:27:02 2011 max relations containing the same ideal: 73 Sat Dec 17 15:27:03 2011 relations with 0 large ideals: 486 Sat Dec 17 15:27:03 2011 relations with 1 large ideals: 2707 Sat Dec 17 15:27:03 2011 relations with 2 large ideals: 35991 Sat Dec 17 15:27:03 2011 relations with 3 large ideals: 200230 Sat Dec 17 15:27:03 2011 relations with 4 large ideals: 592581 Sat Dec 17 15:27:03 2011 relations with 5 large ideals: 989462 Sat Dec 17 15:27:03 2011 relations with 6 large ideals: 978596 Sat Dec 17 15:27:03 2011 relations with 7+ large ideals: 809686 Sat Dec 17 15:27:03 2011 commencing 2-way merge Sat Dec 17 15:27:05 2011 reduce to 2184638 relation sets and 2045781 unique ideals Sat Dec 17 15:27:05 2011 commencing full merge Sat Dec 17 15:27:26 2011 memory use: 228.9 MB Sat Dec 17 15:27:27 2011 found 1094722 cycles, need 1073981 Sat Dec 17 15:27:27 2011 weight of 1073981 cycles is about 75457035 (70.26/cycle) Sat Dec 17 15:27:27 2011 distribution of cycle lengths: Sat Dec 17 15:27:27 2011 1 relations: 130527 Sat Dec 17 15:27:27 2011 2 relations: 127368 Sat Dec 17 15:27:27 2011 3 relations: 125795 Sat Dec 17 15:27:27 2011 4 relations: 113119 Sat Dec 17 15:27:27 2011 5 relations: 102993 Sat Dec 17 
15:27:27 2011 6 relations: 87941 Sat Dec 17 15:27:27 2011 7 relations: 76759 Sat Dec 17 15:27:27 2011 8 relations: 65165 Sat Dec 17 15:27:27 2011 9 relations: 54780 Sat Dec 17 15:27:27 2011 10+ relations: 189534 Sat Dec 17 15:27:27 2011 heaviest cycle: 21 relations Sat Dec 17 15:27:27 2011 commencing cycle optimization Sat Dec 17 15:27:28 2011 start with 6170552 relations Sat Dec 17 15:27:35 2011 pruned 146025 relations Sat Dec 17 15:27:35 2011 memory use: 206.5 MB Sat Dec 17 15:27:35 2011 distribution of cycle lengths: Sat Dec 17 15:27:35 2011 1 relations: 130527 Sat Dec 17 15:27:35 2011 2 relations: 130418 Sat Dec 17 15:27:35 2011 3 relations: 130196 Sat Dec 17 15:27:35 2011 4 relations: 115818 Sat Dec 17 15:27:35 2011 5 relations: 105171 Sat Dec 17 15:27:35 2011 6 relations: 88903 Sat Dec 17 15:27:35 2011 7 relations: 77655 Sat Dec 17 15:27:35 2011 8 relations: 64752 Sat Dec 17 15:27:35 2011 9 relations: 54165 Sat Dec 17 15:27:35 2011 10+ relations: 176376 Sat Dec 17 15:27:35 2011 heaviest cycle: 20 relations Sat Dec 17 15:27:36 2011 RelProcTime: 304 Sat Dec 17 15:27:36 2011 Sat Dec 17 15:27:36 2011 commencing linear algebra Sat Dec 17 15:27:36 2011 read 1073981 cycles Sat Dec 17 15:27:37 2011 cycles contain 3464474 unique relations Sat Dec 17 15:28:08 2011 read 3464474 relations Sat Dec 17 15:28:11 2011 using 20 quadratic characters above 134217498 Sat Dec 17 15:28:25 2011 building initial matrix Sat Dec 17 15:28:53 2011 memory use: 453.8 MB Sat Dec 17 15:28:54 2011 read 1073981 cycles Sat Dec 17 15:28:55 2011 matrix is 1073793 x 1073981 (326.8 MB) with weight 103906487 (96.75/col) Sat Dec 17 15:28:55 2011 sparse part has weight 72792856 (67.78/col) Sat Dec 17 15:29:01 2011 filtering completed in 2 passes Sat Dec 17 15:29:02 2011 matrix is 1071372 x 1071560 (326.6 MB) with weight 103797093 (96.87/col) Sat Dec 17 15:29:02 2011 sparse part has weight 72757804 (67.90/col) Sat Dec 17 15:29:03 2011 matrix starts at (0, 0) Sat Dec 17 15:29:04 2011 matrix is 1071372 x 
1071560 (326.6 MB) with weight 103797093 (96.87/col) Sat Dec 17 15:29:04 2011 sparse part has weight 72757804 (67.90/col) Sat Dec 17 15:29:04 2011 saving the first 48 matrix rows for later Sat Dec 17 15:29:04 2011 matrix includes 64 packed rows Sat Dec 17 15:29:04 2011 matrix is 1071324 x 1071560 (314.1 MB) with weight 82570139 (77.06/col) Sat Dec 17 15:29:04 2011 sparse part has weight 71622900 (66.84/col) Sat Dec 17 15:29:04 2011 using block size 65536 for processor cache size 8192 kB Sat Dec 17 15:29:08 2011 commencing Lanczos iteration ([B]4 threads[/B]) Sat Dec 17 15:29:08 2011 memory use: 268.3 MB Sat Dec 17 15:29:13 2011 linear algebra at 0.1%, ETA 0h[B]47[/B]m Sat Dec 17 15:29:15 2011 checkpointing every 1210000 dimensions[/code] |
Anyone here with more benches?
|
Here is some ETA-based data for two different machines. I can confirm that 1.39 appears to be faster than 1.49, although it doesn't scale quite as well. Utilization with 8 cores on machine 1 using 1.39 hovered around 55%, while with 1.49 it was around 85%.
machine 1 (hyperthreading disabled): [B]Dual[/B] Intel(R)Xeon(R) CPU X5687 @ 3.60GHz (4 cores) triple channel DDR3 @ 1333 MHz (per CPU) + QPI CPU interconnect
[CODE]
v1.49 (win 7 pro, 64 bit):
threads, ETA (min) (after settling)
2, 66
4, 41
8, 30

v1.39 (win 7 pro, 64 bit):
threads, ETA (min) (after settling)
2, 48
4, 32
8, 29
[/CODE]
machine 2 (hyperthreading enabled): [B]Dual[/B] Intel(R)Xeon(R) CPU X5680 @ 3.33GHz (6 cores) triple channel DDR3 @ 1333 MHz (per CPU) + QPI CPU interconnect
[CODE]
v1.50 (linux, 64 bit)
threads, ETA (min) (after settling)
4, 58
6, 45
8, 41
12, 39
24, 53
[/CODE] |
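The "faster but doesn't scale as well" observation can be checked directly from the ETA figures. The following is a rough Python sketch, not part of the original post: it takes the smallest thread count tested in each run as the baseline and computes speedup and parallel efficiency relative to it. The names `eta` and `scaling` are invented for illustration.

```python
# ETA figures (minutes) copied from the post above: {threads: ETA}.
eta = {
    "machine 1, v1.49": {2: 66, 4: 41, 8: 30},
    "machine 1, v1.39": {2: 48, 4: 32, 8: 29},
    "machine 2, v1.50": {4: 58, 6: 45, 8: 41, 12: 39, 24: 53},
}

def scaling(run):
    """Speedup and parallel efficiency relative to the smallest thread count tested."""
    base_threads = min(run)
    base_eta = run[base_threads]
    rows = []
    for threads in sorted(run):
        speedup = base_eta / run[threads]
        # Efficiency: achieved speedup divided by the increase in thread count.
        efficiency = speedup / (threads / base_threads)
        rows.append((threads, speedup, efficiency))
    return rows

for label, run in eta.items():
    print(label)
    for threads, speedup, efficiency in scaling(run):
        print(f"  {threads:2d} threads: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

On machine 1, going from 2 to 8 threads gives v1.49 a speedup of about 2.2 against about 1.66 for v1.39, even though v1.39 has the lower ETA at every thread count tested.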
Do you have the ability to try out the MPI version on that dual-hex-Xeon machine? It was a distinct improvement on my (now xilman's) dual-quad-Opteron, see
[url]http://www.mersenneforum.org/showpost.php?p=263129&postcount=60[/url] [url]http://www.mersenneforum.org/showpost.php?p=263129&postcount=65[/url] |
[QUOTE=fivemack;283949]Do you have the ability to try out the MPI version on that dual-hex-Xeon machine? It was a distinct improvement on my (now xilman's) dual-quad-Opteron, see
[URL]http://www.mersenneforum.org/showpost.php?p=263129&postcount=60[/URL] [URL]http://www.mersenneforum.org/showpost.php?p=263129&postcount=65[/URL][/QUOTE] As of now, no; mpi isn't installed and I don't have admin privs on the machine. Though our admin knows what I occasionally use these machines for and might be willing to install it for me. I'll check when he's back next week. |
Other than adding MPI, the only other major change between v1.39 and v1.50 is that now vector-vector multiplies are handled by multiple threads. Even if that causes cache thrashing, it shouldn't cost 25% of the total performance...
Both MPI and multithreaded vector-vector code were added in v1.46. Can someone see if the regression happens between v1.45 and v1.46? Unfortunately the SF page only has v1.41 and up; I can provide older releases if necessary. |
1 Attachment(s)
Oops... I thought I had put 1.49 on my hexcore to test, but looking at the log, I only had 1.39 on there, so I'll have to re-run the benches. But just for yucks, here is a log for the LA on my box for 1.39.
Summary of run times for 1-6 threads with nothing else running:[code]
elapsed time 02:56:55
elapsed time 01:39:01
elapsed time 01:15:45
elapsed time 01:05:25
elapsed time 01:00:09
elapsed time 00:59:16
[/code]I'll re-run the tests and then see what else I can run on the system without impacting msieve too much. Full specs on the box:[LIST][*]Gigabyte M68MT-S2P MOBO[*]8 GB RAM (2x-4GB DDR3 PC3-1066 1333 MHz Patriot)[*]AMD Phenom II X6 1090T BE 3.2 GHz[/LIST] |
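For a quick read of how the 1-6 thread run times above scale, here is a small Python sketch that converts each elapsed time to seconds and reports speedup and efficiency against the single-thread run. The times are copied from the post in the order given (1 to 6 threads); the helper name `to_seconds` is an assumption made for this example.

```python
# Elapsed times copied from the post above, in order for 1 to 6 threads.
elapsed = ["02:56:55", "01:39:01", "01:15:45", "01:05:25", "01:00:09", "00:59:16"]

def to_seconds(hms):
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

seconds = [to_seconds(t) for t in elapsed]
# Speedup of each run relative to the single-thread run.
speedups = [seconds[0] / s for s in seconds]

for threads, (s, speedup) in enumerate(zip(seconds, speedups), start=1):
    print(f"{threads} threads: {s:5d} s, speedup {speedup:.2f}, "
          f"efficiency {speedup / threads:.0%}")
```

The sixth thread saves under a minute over five (01:00:09 down to 00:59:16), and the overall speedup from 1 to 6 threads comes out just under 3x.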