mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   Dual CPU performance hit (https://www.mersenneforum.org/showthread.php?t=7)

worknplay 2002-09-13 08:30

Mivacca2 wrote:

[quote]i am going to imagine someone is going to build a motherboard that uses dual P4's on it. [/quote]
and

[quote]is hyperthreading between multiple CPU's going to affect speeds any?[/quote]
1. Don't count on it. I doubt Intel will enable SMP on hyperthreaded P4s - or on any P4 for that matter. Why? Two reasons: a) Xeon; and b) Xeon. Remember that since the introduction of the P4, we've seen a determined effort by Intel to deny SMP to the masses - if you want P4 power in SMP .. buy Xeon .. or go back to the P[I]!!![/I].
Remember how scared Intel was when those who cared realized the first Celerons were SMP capable? (I'm writing this from a BP6 w/ 2xCel366 @550 which has performed without incident since 1999). Even P[I]!!![/I] SMP was barely tolerable .. because it was affordable to the masses. SMP P4? Not a chance. Hyperthreading is the closest thing we'll see to SMP on the P4 for the masses.
Besides, think about it - two hyperthreaded P4s = four (potential) processes. You'd need to be running a multi-processor kernel of Linux (are they all multi-processor kernels?) or Win2k/XP $erver. Win2k/XP Pro would only be able to take advantage of one CPU which is "fooling" it into thinking there are two .. the second physical CPU would be occupying wasted space. So don't hold your breath for dual P4s .. you might turn a very deep shade of purple ;)

2. Hyperthreading - at first glance - in and of itself will only benefit multithreaded applications - those written to take advantage of SMP. Remember, hyperthreading "fools" the OS - and any multithreaded application - into thinking that there are two processors in a single-CPU system.
Since Prime95 is a single-process application - at present - I doubt you'll see any performance improvement other than that of raw speed - 3GHz+. Mind you, would you be able to get away with running two processes of Prime95 on a hyperthreaded P4?
I can only imagine one thing .. can you say [I]overhead[/I]?
This is because even with multithreaded applications, Intel claims the performance improvement would be only 25-30% over a non-hyperthreaded CPU at the same speed - not that there will be any.
[URL]http://www.theregister.co.uk/content/3/27039.html[/URL]
However, if George or someone else at GIMPS were to write into Prime95 the ability to "split" an LL test between two processes - which, if I recall, he indicated was either not workable or inefficient on SMP systems - in a way that was efficient and at least able to take full advantage of hyperthreading, then I can only imagine some brilliant benchmarks indeed :D
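
For context on why splitting one LL test between two processes is awkward: a Lucas-Lehmer test is a chain of squarings in which each iteration needs the result of the one before it. The sketch below is a minimal illustration using Python's built-in big integers; it is not Prime95's code, which uses FFT-based multiplication, and the function name is made up for this example.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1          # the Mersenne number under test
    s = 4                     # standard starting value
    for _ in range(p - 2):
        # Every step consumes the previous s, so the p - 2 steps
        # form a serial chain and cannot simply be dealt out to two CPUs.
        s = (s * s - 2) % m
    return s == 0


exponents = [3, 5, 7, 11, 13, 17, 19, 23, 31]
print([p for p in exponents if lucas_lehmer(p)])
```

Any parallelism therefore has to come from inside each squaring (the big multiplication itself), which is a much finer-grained split than running two independent tests side by side.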

In the final analysis it appears that hyperthreading will be the P4's SMP for the masses or at the very least, the poor person's version - if not a test - of multithreading CPUs which Intel will introduce within the next 24 months +/- to the Itanium core .. in the end .. wouldn't you really prefer a multithreaded IA-64 processor:question: 8) ... with a really, really big cache? :D

Xyzzy 2002-09-13 11:00

Stormblade ran some benchmarks on a dual cpu Xeon system earlier this year... He ran benchies with hyper-threading enabled and disabled... I can't remember the exact result, but I think the hyper-threading turned out to not be a good idea... I'll email him the link to this post and maybe he can clarify it for us...

Edit:

Here is what I can find from the mailing list...

[quote]Date: Thu, 7 Mar 2002 17:14:33 -0500
Subject: RE: Mersenne: Hyper-threading

As far as Hyperthreading goes, It is already available in the P4 Xeon chips.
I have personally tested the effectiveness with Prime95, and it does nothing
for our testing. The 2 instances ran normally, but each one took twice as
long as 1 instance would have. I think the only benefit Hyperthreading adds
to any application, is that if a pipeline becomes stalled, another thread
can use the 2nd pipeline, where on a standard P4, if one pipeline is
stalled, the other has to wait for it.[/quote]

And here is from Ars Technica...

[quote]Notes:
We used Advanced:Time rather than the benchmark.
0 and 1 are the physical CPUs, 2 and 3 are the "virtual" CPUs created by hyperthreading.
System = 2 x 2.2GHz P4 Xeon "Prestonia" CPUs. 4GB PC800 Rambus memory. OS = Win2k
Exponent = 10M (8.97M to 10.24M (512K) on Mersenne benchmark page)

Test Case 1: Running a single instance of Prime95 on each CPU

CPU #   Iteration time (average, ms)
0       24.081
1       24.024
2       24.042
3       24.044

Test Case 2: Running 2 instances of Prime95 on various CPUs

CPU #   Iteration time (average, ms)
0       25.188
1       25.161

0       25.217
2       25.127

0       48.380
3       47.821

1       47.538
2       46.838

1       25.185
3       25.126

2       25.161
3       25.191

Test Case 3: Running 3 instances of Prime95

CPU #   Iteration time (average, ms)
0       25.603
1       50.352
2       50.807

0       50.386
2       25.591
3       50.849

Test Case 4: Running 4 instances of Prime95

CPU #   Iteration time (average, ms)
0       56.343
1       55.881
2       55.795
3       56.448

Test Case 5: Running 4 instances of Prime95, running LL on "Primary" CPUs and factoring on "Virtual" CPUs.

CPU #   Iteration time (average, ms)
0       39.589
1       39.667
2       Factoring
3       Factoring

Test Case 6: Running 4 instances of Prime95, running LL on 1 CPU and factoring on the rest.

CPU #   Iteration time (average, ms)
0       39.003
1       Factoring
2       Factoring
3       Factoring[/quote]

:shock:
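
One way to read the table is to turn each iteration time into a rate and add up the LL instances that ran at the same time. The sketch below does that for a few of the configurations; the labels and the function name are made up for this example, and it assumes Test Case 1 ran one instance at a time (the post does not say so explicitly).

```python
# Average LL iteration times in ms, copied from the benchmark tables above.
# Assumption: the Test Case 1 figure is a single instance running alone.
runs = {
    "1 instance (CPU 0)":         [24.081],
    "2 instances (CPUs 0 and 1)": [25.188, 25.161],
    "2 instances (CPUs 0 and 3)": [48.380, 47.821],
    "3 instances (CPUs 0, 1, 2)": [25.603, 50.352, 50.807],
    "4 instances (all four)":     [56.343, 55.881, 55.795, 56.448],
}


def iterations_per_second(times_ms):
    """Combined LL rate of all instances running at once."""
    return sum(1000.0 / t for t in times_ms)


throughput = {label: iterations_per_second(t) for label, t in runs.items()}
for label, rate in throughput.items():
    print(f"{label:28s} {rate:6.1f} iterations/s")
```

On these figures the combined rate is highest with two instances on CPUs 0 and 1, and four instances together deliver less than that. The pairs 0+3 and 1+2 land near the single-instance rate, which suggests those are the logical CPUs that share a physical processor.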

Stormblade 2002-09-14 03:03

Xyzzy found all the info I had on the testing. If anyone doesn't understand the results, I would be glad to explain them.

:)

Edit:

Woot! I got a Shrek Pic! cool! (I do need to stop by more often....)

lpmurray 2002-09-15 10:17

Dual CPU performance hit

I understand that 2 LL tests running under hyperthreading each take twice as long, but how about 1 LL plus 1 factoring: will the LL take a hit or not? THE USER WITH THE XEON, PLEASE LET ME KNOW... THANKS... Larry

Xyzzy 2002-09-15 12:01

Larry-

Look at test cases 2 and 5 in Stormblade's chart...

[quote]Test Case 2: Running 2 instances of Prime95 on various CPUs

CPU #   Iteration time (average, ms)
0       25.188
1       25.161

0       25.217
2       25.127

0       48.380
3       47.821

1       47.538
2       46.838

1       25.185
3       25.126

2       25.161
3       25.191

Test Case 5: Running 4 instances of Prime95, running LL on "Primary" CPUs and factoring on "Virtual" CPUs.

CPU #   Iteration time (average, ms)
0       39.589
1       39.667
2       Factoring
3       Factoring[/quote]

So it looks like running the CPUs as if they were *not* hyperthreaded, by running a single LL test on each physical CPU, results in ~25ms... If you add factoring to each physical CPU, by using hyperthreading, the LL iteration rate drops to ~40ms...

This just illustrates that hyperthreading doesn't do much for a CPU that is already maxed out... It is more geared for a CPU that is stalled waiting for I/O...

The reason we asked Stormblade for all possible combinations was due to our belief that the multiple processes would also slow down due to bandwidth limitations... The Xeon bus is identical to the P4 bus, which means that a single P4 running Prime95 has 3.2GB/s available to it, but the dual Xeon only has 1.6GB/s per CPU...
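
The two figures in this post can be checked with a little arithmetic. The sketch below uses only numbers quoted in the thread (Test Case 2 on CPUs 0 and 1, Test Case 5, and the 3.2GB/s bus figure); the variable names are made up for this example, and an even split of the bus between the two CPUs is an assumption.

```python
from statistics import mean

# LL iteration times in ms, taken from the quoted tables.
ll_only_ms = mean([25.188, 25.161])            # Test Case 2, CPUs 0 and 1
ll_plus_factoring_ms = mean([39.589, 39.667])  # Test Case 5, LL on CPUs 0 and 1

# How much longer each LL iteration takes once factoring shares the CPU.
slowdown = ll_plus_factoring_ms / ll_only_ms
# Fraction of the LL iteration rate given up as a result.
ll_rate_lost = 1 - 1 / slowdown

# Bus bandwidth quoted in the post, assumed here to be split evenly.
SHARED_BUS_GB_S = 3.2
PHYSICAL_CPUS = 2
per_cpu_gb_s = SHARED_BUS_GB_S / PHYSICAL_CPUS

print(f"LL slowdown with factoring on the virtual CPU: {slowdown:.2f}x")
print(f"LL iteration rate lost: {ll_rate_lost:.0%}")
print(f"Bus bandwidth per physical CPU: {per_cpu_gb_s:.1f} GB/s")
```

On these numbers each LL test takes roughly 1.6 times as long per iteration when factoring runs on the same physical CPU. Whether the factoring done in exchange is worth it cannot be judged from the table, since no factoring rate was recorded.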

willmore 2002-09-16 01:06

But, if I remember right, Ernst Mayer is still working on the dual floating/integer LL tester program. If I understand his approach correctly, he's doing the 'hyperthreading' manually--by interleaving the code for the two tests.

I think he expects some speed improvement. Is that just due to the processors he's targeting? AXP and such?

Mivacca2 2002-09-17 23:08

[quote]
wouldn't you really prefer a multithreaded IA-64 processor:question: 8) ... with a really, really big cache? :D
[/quote]


Would I prefer one? Hell yes, who wouldn't? :) Unfortunately.... I would prefer the price of a P4 to that of an IA-64 chip *cringes in fear* :( It'll be ok though... We will just have to see what happens when they change the core and the socket configuration with the next Pentium....

MiVacca2

willmore 2002-09-24 17:14

If we're just dreaming, I'll take a 2GHz EV8. Hyperthreading? Naw, SMT. More execution units need more issuable instructions. *drool*

Too bad it's dead. Thank you Compaq/HP. HP, putting the risk in RISC.

ewmayer 2002-09-24 23:59

[quote="willmore"]But, if I remember right, Ernst Mayer is still working on the dual floating/integer LL tester program. If I understand his approach correctly, he's doing the 'hyperthreading' manually--by interleaving the code for the two tests.

I think he expects some speed improvement. Is that just due to the processors he's targeting? AXP and such?[/quote]

David, see my recent posting on this topic in the Math section.

-Ernst

BigRed 2002-10-09 20:39

Best mix for Dual P4?
 
I was reading earlier in this thread about memory contention with dualies and wanted some suggestions for the best mix of crunching for my situation.
I have 5 dual P4 Xeon boxes from Dell. 2 are 2.2GHz with 1GB RAM and 3 are 1.7GHz with 512MB RAM. All RAM is PC-800. Nothing is overclocked (or very overclockable, since they're Dells). Everything's running Red Hat Linux with 2.4.x kernels.
My goal is to have good LL stats and have a chance of finding a prime number. It'd be nice to find a 10 million digit prime too but I'm not holding my breath. I'm not eager to do a lot of manual work moving exponents around after P-1 checking but I could be persuaded to. I'm not currently a member of TPR but could be convinced to join if it would make my life easier.
I've also got 7 single CPU P4s that use the older PC-133 RAM and 3 dual P3s. I'd been running regular LL tests on both processors of all the duals. For the moment I've switched to one LL and one double check for those boxen.
So, any suggestions for the best mix of tests to maximize use of my available CPU power?

garo 2002-10-10 00:32

If you are interested in finding a prime, go with 2 LL tests on each of the P4 dualies. A DC is just like an LL test in terms of the performance penalty on dualies. The optimum mix is one CPU on factoring and the other on an LL test, but on P4s it might just be better to go with 2 LL tests.

