mersenneforum.org  

Old 2008-03-21, 18:43   #1
SlashDude
 
Aug 2002
Minneapolis, MN

2²·3·19 Posts
Quad Quad-cores

FYI-
I've had the opportunity to evaluate an HP DL580 G5 and an IBM System x3850 M2.

Both systems ran Windows Server 2003 Enterprise x64 on 4 x quad-core CPUs (16 cores). The HP server had 2.4GHz CPUs, a 1066MHz FSB and 32GB RAM; the IBM server had 2.98GHz CPUs, a 1333MHz FSB and 64GB RAM. (Note - IBM states a ~30% memory speed increase.)

I ran manual tests using exponents in the 47,000,000 range.
Here are the results:
I used 0-F (hex) for the core names. Cores 0-3, 4-7, 8-B, and C-F each share a chip.

HP DL 580:
Code:
Cores running                         Time per iteration (sec)
0                                         0.067
0,8                                       0.067
2,6,A                                     0.068
0,4,8,C                                   0.069
2,6,A,C                                   0.069
0,2                                       0.072
0,2,8,A                                   0.074
0,2,4,6,8,A                               0.075
2,6,8,A,C                                 0.075   .069 on 2,6,C
0,1                                       0.076
2,4,6,8,A,C                               0.076   .070 on 2,C
0,2,4,6,8,A,C                             0.08    .072 on C
0,2,4,6,8,A,C,E                           0.085
0,1,2                                     0.09    .077 on 2
0,1,4,5,8,9,C,D                           0.093
0,1,2,3                                   0.103
0,1,2,3,4,5,6,7                           0.114
0,1,2,3,8,9,A,B                           0.114
0,2,4,5,6,8,A,C,E                         0.12    .093 on 6                 .086 on 0,2,8,A,C,E
0,2,4,5,6,8,9,A,C,E                       0.124   .096 on 6,A               .089 on 0,2,C,E
0,1,2,3,4,5,6,7,8,9,A,B                   0.125
0,1,2,4,5,6,8,9,A,C,E                     0.13    .098 on 2,6,A,E           .091 on C
0,1,2,4,5,6,8,9,A,C,D,E                   0.134   .100 on 2,6,A and E
0,1,2,4,5,6,7,8,9,A,C,D,E                 0.155   .135 on 1,2,9,A,C,D       .101 on 2, A and E
0,1,2,4,5,6,7,8,9,A,B,C,D,E               0.156   .138 on 1,2,C,D           .102 on 2 and E
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E             0.158   .141 on C,D               .103 on E
All                                       0.16
IBM System x3850 M2:
Code:
Cores running                         Time per iteration (sec)
0                                         0.061
0,8                                       0.061
2,6,A                                     0.061
0,4,8,C                                   0.061
2,6,A,C                                   0.061
0,2                                       0.064
0,2,8,A                                   0.065
0,2,4,6,8,A                               0.066
2,6,8,A,C                                 0.066   0.061
0,1                                       0.068
2,4,6,8,A,C                               0.066   0.061
0,2,4,6,8,A,C                             0.066   0.061
0,2,4,6,8,A,C,E                           0.066
0,1,2                                     0.077   0.07
0,1,4,5,8,9,C,D                           0.079
0,1,2,3                                   0.087
0,1,2,3,4,5,6,7                           0.088
0,1,2,3,8,9,A,B                           0.088
0,2,4,5,6,8,A,C,E                         0.079   0.072   0.067
0,2,4,5,6,8,9,A,C,E                       0.08    0.073   0.067
0,1,2,3,4,5,6,7,8,9,A,B                   0.091
0,1,2,4,5,6,8,9,A,C,E                     0.083   0.074   0.068
0,1,2,4,5,6,8,9,A,C,D,E                   0.085   0.075
0,1,2,4,5,6,7,8,9,A,C,D,E                 0.1     0.087   0.077
0,1,2,4,5,6,7,8,9,A,B,C,D,E               0.103   0.09    0.078
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E             0.107   0.093   0.08
All                                       0.111
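
One way to read the two tables above is as aggregate throughput: if n cores each take t seconds per iteration, the box completes roughly n / t iterations per second across all tests. The Python sketch below applies that to the "0" and "All" rows. The function names are illustrative only (nothing here comes from Prime95), and it assumes every core runs at the listed time.

```python
def throughput(cores: int, sec_per_iter: float) -> float:
    """Aggregate iterations per second when `cores` independent tests
    each take `sec_per_iter` seconds per iteration (assumed uniform)."""
    return cores / sec_per_iter


def scaling_efficiency(t_one: float, cores: int, t_all: float) -> float:
    """Fraction of ideal linear scaling achieved with `cores` tests running."""
    return throughput(cores, t_all) / (cores * throughput(1, t_one))


# Times taken from the "0" and "All" rows of the two tables.
hp = scaling_efficiency(0.067, 16, 0.160)
ibm = scaling_efficiency(0.061, 16, 0.111)
print(f"HP DL580 G5:  {hp:.0%} of linear scaling with all 16 cores busy")
print(f"IBM x3850 M2: {ibm:.0%} of linear scaling with all 16 cores busy")
```

Under those assumptions the HP keeps roughly 42% of linear scaling with all 16 cores loaded, and the IBM roughly 55%.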
The real reason for these tests was a VMware ESX Server evaluation.
Here are the numbers for the same tests run inside virtual machines on the above hardware (times averaged across all running clients). Each guest OS was configured with a single CPU and 512MB RAM. I didn't have an x64 OS image, so these tests were done using the x86 client on Windows Server 2003 Standard.
Code:
HP (sec)	IBM (sec)	Faster%
0.068	0.062	9.68%
0.069	0.062	11.29%
0.070	0.062	12.90%
0.074	0.062	19.35%
0.075	0.063	18.35%
0.076	0.068	11.76%
0.081	0.068	18.45%
0.090	0.071	26.67%
0.102	0.079	29.94%
0.101	0.083	21.65%
0.110	0.084	30.31%
0.119	0.088	35.42%
0.129	0.09	43.36%
0.139	0.097	42.57%
0.144	0.103	39.50%
0.155	0.112	38.62%
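
The Faster% column appears to be (HP / IBM - 1) x 100. That formula is inferred from the first rows and is not stated in the post. The later rows differ from it by a fraction of a percent, presumably because the printed times are rounded. A minimal Python sketch of the calculation, with a hypothetical helper name:

```python
def faster_pct(hp_sec: float, ibm_sec: float) -> float:
    """Percent by which the HP iteration time exceeds the IBM iteration time.

    Assumed formula: (HP / IBM - 1) * 100, inferred from the table.
    """
    return (hp_sec / ibm_sec - 1.0) * 100.0


# First, second and fourth rows of the VM table (HP sec, IBM sec).
rows = [(0.068, 0.062), (0.069, 0.062), (0.074, 0.062)]
for hp_sec, ibm_sec in rows:
    print(f"{hp_sec:.3f}\t{ibm_sec:.3f}\t{faster_pct(hp_sec, ibm_sec):.2f}%")
```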
Hopefully this will be of interest to someone else too.

-SD
LLR Note:
I had a chance to run LLR on the HP, and found that a single test in the x*2^~540000-1 range took 668 seconds to complete. With 16 tests running, each took 673 seconds. (A 5 second slowdown per test - less than 1% per test!)
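
The LLR figures can be checked with a couple of lines of arithmetic. This is only a sketch; the helper names are made up for illustration, and it assumes all 16 tests finished in the same 673 seconds.

```python
def slowdown_pct(single_sec: float, loaded_sec: float) -> float:
    """Percent increase in run time per test when the box is fully loaded."""
    return (loaded_sec - single_sec) / single_sec * 100.0


def throughput_gain(single_sec: float, loaded_sec: float, tests: int) -> float:
    """How many times more work gets done per unit time with `tests` running."""
    return tests * single_sec / loaded_sec


print(f"Slowdown per test: {slowdown_pct(668, 673):.2f}%")
print(f"Throughput gain with 16 tests: {throughput_gain(668, 673, 16):.2f}x")
```

That works out to about a 0.75% slowdown per test and close to 15.9 times the single-test throughput.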

<Edit - Fixed code boxes>

Last fiddled with by SlashDude on 2008-03-21 at 18:58
Old 2008-03-22, 16:53   #2
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

4,457 Posts

Well, if I can convince my wife to forgo new windows in the house and a winter vacation next year, I can get one ....

Last fiddled with by petrw1 on 2008-03-22 at 16:53
Old 2008-03-22, 23:16   #3
lycorn
 
Sep 2002
Oeiras, Portugal

1,423 Posts

Have you run, or are you considering running, any tests using the multithreading capabilities of the 25.x version of Prime95 (running a single LL test on multiple cores to see how well it scales)?
Old 2008-03-23, 23:58   #4
lpmurray
 
Sep 2002

89 Posts

Would you, or anyone with a 4-core machine, consider trying a 100-million-digit number on 4 cores of a single processor using 25.x to see how long it takes to finish? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual quad-core Intel motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.
Old 2008-03-24, 08:16   #5
Cruelty
 
May 2005

2×809 Posts

Quote:
Originally Posted by lpmurray
Would you, or anyone with a 4-core machine, consider trying a 100-million-digit number on 4 cores of a single processor using 25.x to see how long it takes to finish? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual quad-core Intel motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.
Which exponent do you want to try?
Old 2008-03-24, 10:42   #6
lpmurray
 
Sep 2002

89 Posts

Any one of the first 100-million-digit exponents is fine. I just want to know how much faster it can be done. When I first started doing 10-million-digit numbers, they would take 13 months when I upgraded my dual PII 350s to PIII 550s. Now they get done in under a month. I would just like to see the iteration time of an LL test on a 100-million-digit number. I think 100-million-digit numbers are worth doing if they can be done in less than 18 months. I figure that by the time the first ones are done, the speed at which to do them should be about 600+% faster, depending on the speed of the first 32nm chips.
Old 2008-03-24, 18:25   #7
SlashDude
 
Aug 2002
Minneapolis, MN

2²·3·19 Posts

Quote:
Originally Posted by lycorn
Have you run, or are you considering running, any tests using the multithreading capabilities of the 25.x version of Prime95 (running a single LL test on multiple cores to see how well it scales)?
I didn't know about this!

Some quick numbers:

Running 1 test on 16 cores (on the HP DL580) runs at .020 seconds per iteration on an exponent in the 47,000,000 range. This has the box running at ~51% total.
Single test: (CPU affinity set to run on the first CPU; each additional thread took the CPUs in order: 0,1,2,3,4,5...)
Code:
Cores  Iteration   Box%
16     .020 sec    51%
8      .022 sec    40%
7      .023 sec    36%
6      .024 sec    32%
5      .029 sec    26%
4      .029 sec    23%
3      .034 sec    18%
2      .037 sec    13%
1      .067 sec     7%
Code:
Single test: (CPU Affinity set to run on any)
Cores  Iteration   Box%
16     .024 sec    41-51%
8      .024 sec    31%
7      .046 sec    20%
6      .051 sec    18%
5      .055 sec    17%
4      .029 sec    25%
3      .035 sec    19%
2      .050 sec    13%
1      .067 sec     7%
Code:
Two tests: (CPU Affinity set to run on 0 and 8)
Cores  Iteration   Box%
16     .036 sec    52% (.045 sec on Worker #2)
8      .024 sec    80%
7      .025 sec    72%
6      .027 sec    64%
5      .031 sec    55%
4      .031 sec    46%
3      .036 sec    35%
2      .039 sec    25%
1      .067 sec    13%
Code:
Two tests: (CPU Affinity set to Smart Assignments)
Cores  Iteration   Box%
16     .044 sec    56%
8      .045 sec    50%
7      .046 sec    45%
6      .038 sec    52%
5      .045 sec    43%
4      .034 sec    48%
3      .046 sec    34%
2      .051 sec    25%
1      .067 sec    13%
Code:
Two tests: (CPU Affinity set to run on any)
Cores  Iteration   Box%
16     .048 sec    45-55%
8      .048 sec    44%
7      .046 sec    45%
6      .040 sec    50%
5      .045 sec    42%
4      .034 sec    50%
3      .049 sec    33%
2      .051 sec    25%
1      .067 sec    13%
Four tests: (CPU Affinity set to run on 0, 4, 8 and C)
Code:
Cores  Iteration   Box%
6      .050 sec    63% (.73 on Worker #2 and Worker #3)
5      .036 sec    57% (.153 on Worker #1 and .233 on Worker #3)
4      .043 sec    93%
3      .045 sec    70%
2      .045 sec    50%
1      .069 sec    25%
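
To put the multithreaded numbers in perspective, here is a rough Python sketch of speedup and per-thread efficiency for the first single-test table (affinity assigned in core order). The names are illustrative only, and the definitions used (speedup = t1 / tn, efficiency = speedup / threads) are the usual textbook ones rather than anything Prime95 reports.

```python
# (threads -> seconds per iteration) from the first single-test table
timings = {1: 0.067, 2: 0.037, 4: 0.029, 8: 0.022, 16: 0.020}


def speedup(t_one: float, t_n: float) -> float:
    """How many times faster the test runs than on a single thread."""
    return t_one / t_n


def efficiency(t_one: float, t_n: float, threads: int) -> float:
    """Speedup divided by thread count (1.0 would be perfect scaling)."""
    return speedup(t_one, t_n) / threads


for threads, sec in timings.items():
    s = speedup(timings[1], sec)
    e = efficiency(timings[1], sec, threads)
    print(f"{threads:2d} threads: {s:.2f}x speedup, {e:.0%} efficiency")
```

On these figures, 16 threads give about a 3.35x speedup on one test, which is why running one test per core yields more total throughput.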
Let me know if you want any other multi-threaded tests run.
-SD
Old 2008-03-24, 20:43   #8
SlashDude
 
Aug 2002
Minneapolis, MN

228₁₀ Posts

Quote:
Originally Posted by lpmurray
Would you, or anyone with a 4-core machine, consider trying a 100-million-digit number on 4 cores of a single processor using 25.x to see how long it takes to finish? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual quad-core Intel motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.
Here is a run from 1 to 16 cores on a 100 million number:

Single test: M73006597 (CPU affinity set to run on the first CPU; each additional thread took the CPUs in order: 0,1,2,3,4,5...) (Hopefully I got the math correct on "Days")
Code:
HP DL580 G5 (2.4GHz 1066 FSB)
Cores  Iteration   Box%  Days
16     .030 sec    57%   25.35
15     .031 sec    54%   26.19
14     .031 sec    50%   26.19
13     .031 sec    60%   26.19
12     .031 sec    55%   26.19
11     .032 sec    47%   27.04
10     .032 sec    43%   27.04
9      .034 sec    40%   28.73
8      .034 sec    41%   28.73
7      .037 sec    36%   31.26
6      .040 sec    32%   33.80
5      .047 sec    27%   39.71
4      .048 sec    24%   40.56
3      .056 sec    18%   47.32
2      .061 sec    13%   51.54
1      .109 sec     7%    92.10
Code:
IBM 3850 M2 (2.98GHz, 1333 FSB)
Cores  Iteration   Box%  Days
16     .031 sec    52%   26.19
15     .032 sec    48%   27.04
14     .032 sec    43%   27.04
13     .032 sec    43%   27.04
12     .032 sec    41%   27.04
11     .033 sec    41%   27.88
10     .034 sec    40%   28.73
9      .036 sec    47%   30.42
8      .036 sec    46%   30.42
7      .034 sec    36%   28.73
6      .036 sec    32%   30.42
5      .043 sec    27%   36.33
4      .050 sec    22%   42.25
3      .065 sec    18%   54.92
2      .055 sec    13%   46.47
1      .098 sec     7%    82.81
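
The "Days" column in both tables is consistent with exponent x iteration time / 86,400. A small Python sketch of that arithmetic follows. The function name is made up for illustration, and it uses the exponent itself as the iteration count (an LL test of M_p needs p - 2 iterations, which makes no visible difference at this size).

```python
SECONDS_PER_DAY = 86_400


def days_to_finish(exponent: int, sec_per_iter: float) -> float:
    """Estimated days for one LL test, taking ~`exponent` iterations."""
    return exponent * sec_per_iter / SECONDS_PER_DAY


# M73006597 on the HP: 16 cores (.030 sec) and 1 core (.109 sec)
print(f"{days_to_finish(73_006_597, 0.030):.2f} days")
print(f"{days_to_finish(73_006_597, 0.109):.2f} days")
```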
-SD
Old 2008-03-24, 23:15   #9
lpmurray
 
Sep 2002

89 Posts

Sorry, but the number you ran is only 21,977,176 digits long. For a 100-million-digit number, the exponent must be 9 digits long: M332192809 is exactly 100 million digits long. You must run that exponent or larger to be testing a 100-million-digit number. Also, thank you for running the tests.
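
The digit counts quoted above follow from the standard formula for the number of decimal digits of 2^p - 1, which is floor(p x log10(2)) + 1. A short Python sketch (the function name is just for illustration):

```python
import math


def mersenne_digits(p: int) -> int:
    """Number of decimal digits of the Mersenne number 2**p - 1."""
    return math.floor(p * math.log10(2)) + 1


print(mersenne_digits(73_006_597))   # the exponent tested earlier in the thread
print(mersenne_digits(332_192_809))  # the exponent suggested above
```

Floating-point precision is sufficient here because neither product lands close to a whole number.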
Old 2008-03-24, 23:36   #10
Cruelty
 
May 2005

2×809 Posts

Iteration time on C2Q @ 3GHz / FSB @ 1333 / RAM @ DDR2-1000 = 0.172 sec. I tested M332192831 for a couple of minutes and the ETA was 13.10.2027.
Old 2008-03-25, 03:44   #11
lpmurray
 
Sep 2002

89 Posts

According to your test number 332192831 and iteration time of .172, it seems like it should take about 21 months to finish the number.
(332192831 * .172) / 31557600 sec in a year = 1.8 years. To me the status should say you will be done in Nov. 2009, not 2027. I can't figure out why there is such a big difference, or where (or if) I'm going wrong... 13.10.2027 seems way too long to run that number on that system.
P.S. Thanks for taking the time... If it takes only 21 months to do a 100-million-digit number, that should be cut way down over the next 2 years. I wonder how much faster it would be throwing 16 cores at that puppy instead of 4?
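
The estimate above can be turned into a calendar date with a few lines of Python. This is only a sketch of the same arithmetic: the start date (the day the test was posted) is an assumption, the function name is made up, and it says nothing about how Prime95 itself computes its ETA.

```python
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 31_557_600  # 365.25 days, the figure used in the post


def estimate_finish(exponent: int, sec_per_iter: float, start: datetime):
    """Return (years, finish date) assuming ~`exponent` iterations at a
    constant `sec_per_iter`, running non-stop from `start`."""
    total_sec = exponent * sec_per_iter
    return total_sec / SECONDS_PER_YEAR, start + timedelta(seconds=total_sec)


# Assumed start: 2008-03-24, the date of the 0.172 sec measurement post.
years, finish = estimate_finish(332_192_831, 0.172, datetime(2008, 3, 24))
print(f"{years:.2f} years, finishing around {finish:%b %Y}")
```

With a constant 0.172 sec per iteration, this gives about 1.8 years, far short of the 2027 ETA that was reported.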
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.