mersenneforum.org


SlashDude 2008-03-21 18:43

Quad Quad-cores
 
FYI-
I've had the opportunity to evaluate an [URL="http://h10010.www1.hp.com/wwpc/us/en/en/WF05a/15351-15351-3328412-241644-3328422-3454575.html"]HP DL580 G5[/URL] and an [URL="http://www-03.ibm.com/systems/x/hardware/enterprise/x3850m2/index.html"]IBM System x3850 M2[/URL].

Both systems ran Windows Server 2003 Enterprise x64 with 4 x quad-core CPUs (16 cores). The HP server was 2.4GHz with a 1066MHz FSB and 32GB RAM, and the IBM server was 2.98GHz with a 1333MHz FSB and 64GB RAM. (Note: IBM states a ~30% memory speed increase.)

I ran manual tests using exponents in the 47,000,000 range.
Here are the results (times are in seconds per iteration):
I used 0-F (hex) for the core names. Cores 0-3, 4-7, 8-B, and C-F are each on the same chip.

HP DL 580:
[CODE]
Cores running Time per iteration
0 0.067
0,8 0.067
2,6,A 0.068
0,4,8,C 0.069
2,6,A,C 0.069
0,2 0.072
0,2,8,A 0.074
0,2,4,6,8,A 0.075
2,6,8,A,C 0.075 .069 on 2,6,C
0,1 0.076
2,4,6,8,A,C 0.076 .070 on 2,C
0,2,4,6,8,A,C 0.08 .072 on C
0,2,4,6,8,A,C,E 0.085
0,1,2 0.09 .077 on 2
0,1,4,5,8,9,C,D 0.093
0,1,2,3 0.103
0,1,2,3,4,5,6,7 0.114
0,1,2,3,8,9,A,B 0.114
0,2,4,5,6,8,A,C,E 0.12 .093 on 6 .086 on 0,2,8,A,C,E
0,2,4,5,6,8,9,A,C,E 0.124 .096 on 6,A .089 on 0,2,C,E
0,1,2,3,4,5,6,7,8,9,A,B 0.125
0,1,2,4,5,6,8,9,A,C,E 0.13 .098 on 2,6,A,E .091 on C
0,1,2,4,5,6,8,9,A,C,D,E 0.134 .1 on 2,6,A and E
0,1,2,4,5,6,7,8,9,A,C,D,E 0.155 .135 on 1,2,9,A,C,D .101 on 2, A and E
0,1,2,4,5,6,7,8,9,A,B,C,D,E 0.156 .138 on 1,2,C,D .102 on 2 and E
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E 0.158 .141 on C,D .103 on E
All 0.16
[/CODE]

IBM System x3850 M2:
[CODE]
Cores running Time per iteration
0 0.061
0,8 0.061
2,6,A 0.061
0,4,8,C 0.061
2,6,A,C 0.061
0,2 0.064
0,2,8,A 0.065
0,2,4,6,8,A 0.066
2,6,8,A,C 0.066 0.061
0,1 0.068
2,4,6,8,A,C 0.066 0.061
0,2,4,6,8,A,C 0.066 0.061
0,2,4,6,8,A,C,E 0.066
0,1,2 0.077 0.07
0,1,4,5,8,9,C,D 0.079
0,1,2,3 0.087
0,1,2,3,4,5,6,7 0.088
0,1,2,3,8,9,A,B 0.088
0,2,4,5,6,8,A,C,E 0.079 0.072 0.067
0,2,4,5,6,8,9,A,C,E 0.08 0.073 0.067
0,1,2,3,4,5,6,7,8,9,A,B 0.091
0,1,2,4,5,6,8,9,A,C,E 0.083 0.074 0.068
0,1,2,4,5,6,8,9,A,C,D,E 0.085 0.075
0,1,2,4,5,6,7,8,9,A,C,D,E 0.1 0.087 0.077
0,1,2,4,5,6,7,8,9,A,B,C,D,E 0.103 0.09 0.078
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E 0.107 0.093 0.08
All 0.111
[/CODE]
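
One way to read the two tables above is as aggregate throughput: if each of n cores needs t seconds per iteration, the box completes n / t iterations per second in total. The sketch below uses only the "0" and "All" rows. The helper name `aggregate_throughput` is made up for illustration, and it assumes every core in the "All" row runs at the listed time.

```python
def aggregate_throughput(cores: int, sec_per_iter: float) -> float:
    """Total iterations per second when `cores` independent tests each
    take `sec_per_iter` seconds per iteration."""
    return cores / sec_per_iter

# (single-core time, all-16-cores time) taken from the tables above
systems = {
    "HP DL580 G5": (0.067, 0.160),
    "IBM x3850 M2": (0.061, 0.111),
}

scaling = {}
for name, (one_core, all_cores) in systems.items():
    base = aggregate_throughput(1, one_core)
    full = aggregate_throughput(16, all_cores)
    # How many "single cores worth" of work the fully loaded box delivers
    scaling[name] = full / base
    print(f"{name}: {base:.1f} -> {full:.1f} iter/sec ({scaling[name]:.1f}x one core)")
```

On these numbers, 16 independent tests deliver roughly 6.7 times the single-core throughput on the HP and roughly 8.8 times on the IBM.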

Now the true reason for these tests was a VMware ESX server evaluation.
Here are the numbers for the same tests run inside virtual machines on the above hardware (times averaged across all running clients). Each guest OS was configured with a single CPU and 512MB RAM. I didn't have an x64 OS image, so these tests were done using the x86 client on Windows Server 2003 Standard.
[CODE]
HP IBM Faster%
0.068 0.062 9.68%
0.069 0.062 11.29%
0.070 0.062 12.90%
0.074 0.062 19.35%
0.075 0.063 18.35%
0.076 0.068 11.76%
0.081 0.068 18.45%
0.090 0.071 26.67%
0.102 0.079 29.94%
0.101 0.083 21.65%
0.110 0.084 30.31%
0.119 0.088 35.42%
0.129 0.09 43.36%
0.139 0.097 42.57%
0.144 0.103 39.50%
0.155 0.112 38.62%
[/CODE]
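
The Faster% column appears to be the HP time divided by the IBM time, minus one. That formula is inferred from the numbers rather than stated in the post, and the function name is illustrative. The first four rows reproduce from the rounded times shown; several later rows differ slightly, presumably because the percentages were computed from unrounded averages.

```python
def faster_pct(hp_sec: float, ibm_sec: float) -> float:
    """Percent by which the IBM time beats the HP time (assumed formula)."""
    return (hp_sec / ibm_sec - 1.0) * 100.0

# First four rows of the table above: (HP, IBM) seconds per iteration
rows = [(0.068, 0.062), (0.069, 0.062), (0.070, 0.062), (0.074, 0.062)]
for hp, ibm in rows:
    print(f"{hp:.3f} {ibm:.3f} {faster_pct(hp, ibm):.2f}%")
```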

Hopefully this will be of interest to someone else too. :smile:

-SD
LLR Note:
I had a chance to run LLR on the HP, and found that a single test in the x*2^~540000-1 range took 668 seconds to complete. With 16 tests running, it took 673 seconds (a 5-second slowdown per test, less than 1% per test!).
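
The LLR figures can be checked with a few lines of arithmetic. This is only a sketch using the two timings quoted above; the variable names are illustrative.

```python
single_test_sec = 668  # one LLR test running alone
loaded_test_sec = 673  # per-test time with 16 tests running at once
tests = 16

# Per-test slowdown under full load, in percent
slowdown_pct = (loaded_test_sec - single_test_sec) / single_test_sec * 100

# Work completed per unit time with 16 tests, relative to one test alone
throughput_gain = tests * single_test_sec / loaded_test_sec

print(f"slowdown per test: {slowdown_pct:.2f}%")
print(f"throughput vs. one test: {throughput_gain:.2f}x")
```

That works out to a slowdown of about 0.75% per test and close to 15.9 times the throughput of a single test.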

<Edit - Fixed code boxes>

petrw1 2008-03-22 16:53

Well if I can convince my wife to forgo new windows in the house and a winter vacation next year I can get one ....

lycorn 2008-03-22 23:16

Have you performed, or are you considering performing, any test using the multithreading capabilities of the 25.x version of Prime95 (running a single LL test on multiple cores to see how well it scales)?

lpmurray 2008-03-23 23:58

Would you, or anyone with a 4-core, consider trying a 100 million number running on 4 cores of a single processor using 25.x and seeing how long it takes to finish a number? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual 4-core Intel processor motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.

Cruelty 2008-03-24 08:16

[QUOTE=lpmurray;129535]Would you, or anyone with a 4-core, consider trying a 100 million number running on 4 cores of a single processor using 25.x and seeing how long it takes to finish a number? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual 4-core Intel processor motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.[/QUOTE]
Which exponent do you want to try?

lpmurray 2008-03-24 10:42

Any one of the first 100-million-digit numbers is fine. I just want to know how much faster it can be done. When I first started doing 10-million-digit numbers, they took 13 months once I had upgraded my dual PII 350s to PIII 550s. Now they get done in under a month. I would just like to see the iteration time of a 100-million-digit number doing an LL test. I think 100-million-digit numbers are worth doing if they can be done in less than 18 months. I figure that by the time the first ones are done, the speed at which they can be done should be about 600+% faster, depending on the speed of the first 32nm chips.

SlashDude 2008-03-24 18:25

[QUOTE=lycorn;129472]Have you performed, or are you considering performing, any test using the multithreading capabilities of the 25.x version of Prime95 (running a single LL test on multiple cores to see how well it scales)?[/QUOTE]
I didn't know about this! :smile:

Some quick numbers:

Running 1 test on 16 cores (on the HP DL580) runs at .020 seconds per iteration on an exponent in the 47,000,000 range. This has the box running at ~51% total.
Single test: (CPU affinity set to run on the first CPU; each additional thread took the CPUs in order: 0,1,2,3,4,5...)
[CODE]Cores Iteration Box%
16 .020 sec 51%
8 .022 sec 40%
7 .023 sec 36%
6 .024 sec 32%
5 .029 sec 26%
4 .029 sec 23%
3 .034 sec 18%
2 .037 sec 13%
1 .067 sec 7%[/CODE]
[CODE]Single test: (CPU Affinity set to run on any)
Cores Iteration Box%
16 .024 sec 41-51%
8 .024 sec 31%
7 .046 sec 20%
6 .051 sec 18%
5 .055 sec 17%
4 .029 sec 25%
3 .035 sec 19%
2 .050 sec 13%
1 .067 sec 7%[/CODE]
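
To judge how well a single multithreaded test scales, the first single-test table (affinity pinned, threads taking the cores in order) can be turned into speedup and per-core efficiency. A rough sketch with illustrative names; it takes the 1-thread time as the baseline.

```python
# Seconds per iteration by thread count, from the first single-test table above
sec_per_iter = {1: 0.067, 2: 0.037, 3: 0.034, 4: 0.029, 5: 0.029,
                6: 0.024, 7: 0.023, 8: 0.022, 16: 0.020}

baseline = sec_per_iter[1]

# Speedup: how many times faster than one thread
speedup = {n: baseline / t for n, t in sec_per_iter.items()}
# Efficiency: speedup divided by the number of threads used
efficiency = {n: speedup[n] / n for n in sec_per_iter}

for n in sorted(sec_per_iter):
    print(f"{n:2d} threads: {speedup[n]:.2f}x speedup, "
          f"{efficiency[n]:.0%} per-core efficiency")
```

On this data, 16 threads give about a 3.35x speedup over one thread, or about 21% efficiency per core.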
[CODE]Two tests: (CPU Affinity set to run on 0 and 8)
Cores Iteration Box%
16 .036 sec 52% (.045 sec on Worker #2)
8 .024 sec 80%
7 .025 sec 72%
6 .027 sec 64%
5 .031 sec 55%
4 .031 sec 46%
3 .036 sec 35%
2 .039 sec 25%
1 .067 sec 13%[/CODE]
[CODE]Two tests: (CPU Affinity set to Smart Assignments)
Cores Iteration Box%
16 .044 sec 56%
8 .045 sec 50%
7 .046 sec 45%
6 .038 sec 52%
5 .045 sec 43%
4 .034 sec 48%
3 .046 sec 34%
2 .051 sec 25%
1 .067 sec 13%[/CODE]

[CODE]Two tests: (CPU Affinity set to run on any)
Cores Iteration Box%
16 .048 sec 45-55%
8 .048 sec 44%
7 .046 sec 45%
6 .040 sec 50%
5 .045 sec 42%
4 .034 sec 50%
3 .049 sec 33%
2 .051 sec 25%
1 .067 sec 13%[/CODE]
Four tests: (CPU Affinity set to run on 0, 4, 8 and C)
[CODE]Cores Iteration Box%
6 .050 sec 63% (.73 on Worker #2 and Worker #3)
5 .036 sec 57% (.153 on Worker #1 and .233 on Worker #3)
4 .043 sec 93%
3 .045 sec 70%
2 .045 sec 50%
1 .069 sec 25%[/CODE]

Let me know if you want any other multi-threaded tests run.
-SD

SlashDude 2008-03-24 20:43

[QUOTE=lpmurray;129535]Would you, or anyone with a 4-core, consider trying a 100 million number running on 4 cores of a single processor using 25.x and seeing how long it takes to finish a number? My dual Pentium Xeon 2.4GHz says it will take 17 years. I am thinking of upgrading to a dual 4-core Intel processor motherboard. Also, my front-side bus is single-channel 533MHz; I think 1333MHz dual-channel should make a big difference. My iteration time is 1.070 sec.[/QUOTE]

Here is a run from 1 to 16 cores on a 100 million number:

Single test: M73006597 (CPU affinity set to run on the first CPU; each additional thread took the CPUs in order: 0,1,2,3,4,5...) (Hopefully I got the math correct on “Days”)
[CODE]HP DL580 G5 (2.4GHz 1066 FSB)
Cores Iteration Box% Days
16 .030 sec 57% 25.35
15 .031 sec 54% 26.19
14 .031 sec 50% 26.19
13 .031 sec 60% 26.19
12 .031 sec 55% 26.19
11 .032 sec 47% 27.04
10 .032 sec 43% 27.04
9 .034 sec 40% 28.73
8 .034 sec 41% 28.73
7 .037 sec 36% 31.26
6 .040 sec 32% 33.80
5 .047 sec 27% 39.71
4 .048 sec 24% 40.56
3 .056 sec 18% 47.32
2 .061 sec 13% 51.54
1 .109 sec 7% 92.10[/CODE]
[CODE]IBM 3850 M2 (2.98GHz, 1333 FSB)
Cores Iteration Box% Days
16 .031 sec 52% 26.19
15 .032 sec 48% 27.04
14 .032 sec 43% 27.04
13 .032 sec 43% 27.04
12 .032 sec 41% 27.04
11 .033 sec 41% 27.88
10 .034 sec 40% 28.73
9 .036 sec 47% 30.42
8 .036 sec 46% 30.42
7 .034 sec 36% 28.73
6 .036 sec 32% 30.42
5 .043 sec 27% 36.33
4 .050 sec 22% 42.25
3 .065 sec 18% 54.92
2 .055 sec 13% 46.47
1 .098 sec 7% 82.81[/CODE]
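
The "Days" column is consistent with exponent × iteration time ÷ 86,400. A minimal sketch of that arithmetic, assuming roughly one iteration per unit of the exponent (a Lucas-Lehmer test needs p - 2 squarings) and a constant iteration time; `ll_days` is an illustrative name, not something from Prime95.

```python
SECONDS_PER_DAY = 86_400

def ll_days(exponent: int, sec_per_iter: float) -> float:
    """Estimated days for an LL test, assuming about `exponent`
    iterations at a constant `sec_per_iter`."""
    return exponent * sec_per_iter / SECONDS_PER_DAY

EXPONENT = 73_006_597  # M73006597, the exponent used in the tables above

# Spot-check a few rows from the tables: HP 16 cores, HP 1 core, IBM 1 core
for sec in (0.030, 0.109, 0.098):
    print(f"{sec:.3f} sec/iter -> {ll_days(EXPONENT, sec):.2f} days")
```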

-SD

lpmurray 2008-03-24 23:15

Sorry, but the number you did was only 21,977,176 digits long. For a hundred-million-digit number the exponent must be 9 digits long: M332192809 is exactly 100 million digits long. You must run that exponent or larger to be running a 100-million-digit number. Also, thank you for running the test.
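
The digit counts above can be checked without computing the number itself. Since 2^p is never a power of 10, 2^p - 1 has the same number of decimal digits as 2^p, which is floor(p × log10(2)) + 1. The sketch below uses floating point, which is adequate for these exponents but is not an exact method in general; the function name is illustrative.

```python
import math

def mersenne_digits(p: int) -> int:
    """Number of decimal digits of 2**p - 1, via floor(p * log10(2)) + 1.

    Uses floating point, so it could be off by one if p * log10(2)
    lands extremely close to an integer.
    """
    return math.floor(p * math.log10(2)) + 1

for p in (73_006_597, 332_192_809):
    print(f"M{p}: {mersenne_digits(p):,} digits")
```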

Cruelty 2008-03-24 23:36

Iteration time on C2Q @ 3GHz / FSB @ 1333 / RAM @ DDR2-1000 = 0.172 sec. I tested M332192831 for a couple of minutes and the ETA was 13.10.2027 :smile:

lpmurray 2008-03-25 03:44

According to your test number 332192831 and iteration time of .172, it seems like it should take 21 months to finish the number.
(332192831 * .172) / 31557600 sec in a year = 1.8 years. To me, the status should say you will be done in Nov. 2009, not 2027. I can't figure out why there is such a big difference, or where (or if) I'm going wrong. 13.10.2027 seems way too long to run that number on that system.
P.S. Thanks for taking the time. If it takes only 21 months to do a 100-million-digit number, that should be cut way down over the next 2 years. I wonder how much faster it would be throwing 16 cores at that puppy instead of 4?
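
The estimate above can be reproduced directly. This sketch assumes the iteration time stays at .172 sec for the whole run and that the machine runs around the clock, which may not match the assumptions behind the ETA the program displays.

```python
SECONDS_PER_YEAR = 31_557_600  # 365.25 days, the figure used above

exponent = 332_192_831
sec_per_iter = 0.172

# Total run time if every iteration takes the same time
years = exponent * sec_per_iter / SECONDS_PER_YEAR
months = years * 12

print(f"{years:.2f} years, or about {months:.1f} months")
```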


All times are UTC.