|
|
#1 |
|
Mar 2003
32×5 Posts |
--
Has anyone tried the "Affinity" option in Prime95 to run a separate instance of Prime95 on each logical processor of a Hyperthreading machine? It would be interesting to see if Hyperthreading is powerful enough to get any performance gains when trying to use it as a true dual machine. -- |
|
|
|
|
|
#2 |
|
Jan 2003
far from M40
53 Posts |
Hi Rick,
the Hyperthreading ability of P4s has no relevance for Prime95, as it is so highly optimized that it already uses every spare CPU cycle. Running multiple copies of Prime95 would result in about the same overall throughput as running a single one. Perhaps the throughput would even be slightly lower, as Windows would have to spend some time minding another task. Maybe Hyperthreading has a certain impact on other tasks running along with Prime95 on a P4 system, but I doubt that, as Prime95 runs at idle priority by default. Cheers, Benjamin |
|
|
|
|
|
#3 |
|
Aug 2002
2·101 Posts |
My suggestion would be to run one LL and one factoring process per CPU (or even one LL and three factoring for a dual). LL saturates the memory bus, but factoring runs out of the cache. Factoring can easily step up and take the CPU whenever the LL process is waiting on a memory fetch.
|
|
|
|
|
|
#4 |
|
10000001010112 Posts |
Hello
I am posting here because this thread is the closest match to what I need. I have a P4C 3 GHz 800FSB Canterwood system. I am running Prime95 in stress mode to check that the machine is 100% rock solid. I have HT ON in the BIOS and Windows detects 2 CPUs. I can see that while Prime95 is running it only uses 50% of the CPU (one thread only). I was wondering how to make Prime95 use 100% of the CPU. I have been looking at the affinity options, but they are not very clear to me. Do I have to run Prime95 twice and change the affinity in each one to run on a different CPU (e.g. 0 and 1)? I feel that if Prime is only running in one of the threads of the HT CPU, I am not stressing the system enough to be sure whether it is 100% stable or not. I have been running Prime for 10 hours without a failure, with 3dMark2001SE running in a loop at the same time. Any help is greatly appreciated. |
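For readers unsure what "a different CPU (0 and 1)" means in practice: on Windows, process affinity is expressed as a bitmask in which bit n selects logical CPU n. The sketch below only computes such masks for two instances on a 2-logical-CPU HT machine. The helper name `affinity_mask` is hypothetical, and how Prime95's own Affinity option expects its value is not stated in this thread, so treat this as an illustration of the mask arithmetic rather than of Prime95's settings.

```python
def affinity_mask(cpus):
    """Return a bitmask with bit n set for each logical CPU index n.

    Hypothetical helper for illustration; it does not change any
    process's affinity, it only computes the mask value.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# One mask per instance on a machine that exposes 2 logical CPUs.
masks = {f"instance {i + 1}": affinity_mask([i]) for i in range(2)}

for name, mask in masks.items():
    # Show the mask both as a decimal number and as binary digits.
    print(name, "-> mask", mask, format(mask, "#04b"))
```

A mask with both bits set (`affinity_mask([0, 1])`) corresponds to the default case where the scheduler may place the process on either logical CPU.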
|
|
|
#5 |
|
Aug 2002
23×32 Posts |
On a single-processor system with hyperthreading, only one thread of Prime95 is needed to saturate the processor. Also, running 3DMark or another such utility will do more to stress the AGP bus and the video card than the processor. It is good to run Prime95 for a while on its own, to stress just the memory and processor, after you run the 3D demo loop.
|
|
|
|
|
|
#6 | |
|
"Patrik Johansson"
Aug 2002
Uppsala, Sweden
52·17 Posts |
Quote:
|
|
|
|
|
|
|
#7 | ||
|
Aug 2002
2·3·53 Posts |
Quote:
Quote:
As Tasuke says, 1 instance of Prime95 with HT turned off will stress your computer enough to judge stability. |
||
|
|
|
|
|
#8 |
|
Apr 2003
Berlin, Germany
192 Posts |
I think the problem is that Prime95 is running as an idle task and won't get full utilization of an HT CPU because the OS doesn't know that Prime should get all available free time. Maybe while it uses 1 CPU, the other is filled with the original idle task, and thus they have to share their time. Increasing Prime's priority could help.
If 2 full copies of Prime were running, they would hinder each other rather than use the full resources. Prime95 needs as much cache and memory bandwidth as it can get. So it's better to have one client getting 100% of the available power than 2 getting 40%. HT is useful for tasks which don't need all available resources and have a different instruction mix. 2 programs using the FP/SIMD units most of the time cannot lift FP/SIMD usage above 100%, since there are only as many units as in stock P4 CPUs. I'd like to see some benchmarks of one/two clients on HT CPUs, also with different priority settings. Regards, Matthias |
|
|
|
|
|
#9 |
|
Aug 2002
4816 Posts |
HT on a P4 is not the ability to double work. It WILL make preemptive multitasking somewhat faster, as there can now be 2 separate threads in the pipe at once. This saves some clocks when you have an int op and an fp op, or two int ops, lined up for the processor.
The P4 has one FPU (with add and mul branches) but several GP integer units. This makes running e-mail and Word with Photoshop while browsing seem a little faster. But a really tightly coded app cannot find more processor just from HT being enabled. In fact, it will see a decline. Something else can use the pipe at the same time as the well-coded app, and at 20 stages, that amounts to 5%. Add in the prefetch failing once or twice and you get to the 6% to 10% hit in single-process performance that has been seen in some benchmarks, most notably in high-memory-access and 3D apps. The P4 HT needs a more intelligent prefetcher in order to keep the 2 processes' RAM accesses from causing a problem. This is one reason why the P4 sees performance increases from each memory system advance. The damn thing cannot keep up with itself, and running HT on a P4 makes it slightly worse. Hence no 3.2 GHz until the 800 mem bus. The processor was starving. So 2 instances of Prime on an HT system would see greater than a 2-5% decline per thread (normalizing one thread to 50%), as the memory accesses would keep hitting each other. I will try to find benchmarks. BTW, Prime95 would not hit a memory bandwidth limit on such a system, but the latency for each thread would go up, causing the hit in performance.
|
|
|
|
|
|
#10 |
|
Aug 2002
5910 Posts |
I've been running Prime on an HT-enabled P4 for a while and I can try to give some meaningful numbers to all the theory that's being thrown around here.
Using the benchmark module probably doesn't give useful data, since the program seems to pause between iterations. For example, with the CPU at 3.3 GHz running two instances of v23.3 at priority 2, the 1024k FFT benchmark is 34.320ms for one instance and 47.039ms for the other. A single instance takes 31.461ms per iteration. That would seem to indicate a 60% increase in throughput, which is suspect. Running a normal LL in both instances gives better figures. I have a 1792k exponent in one instance with average iteration times around .113s. In the other instance, I have a 512k exponent, with iterations around .038s. The single-instance times for each of these should be .0615s and .016s respectively. That makes the total throughput about 96% of running a single instance. Oddly, turning on round-off checking gives about 103% of single-instance throughput, so HT must allow some extra work to be done during the round-off check. As expected, HT doesn't provide much benefit for Prime95, but it doesn't hurt it much either. Since Prime doesn't do exclusively FP instructions (after all, it has to do a lot of loads and stores), a system with more memory bandwidth and a larger cache might see an overall improvement with HT. As a side note, Win XP's handling of pre-emption with HT isn't very efficient. If I run a multi-threaded program at normal priority with Prime95 running in the background (at idle priority), Prime can still steal cycles from the other program. If you're using SMT-aware software and need the best possible performance with it, you should probably disable Prime while that application is running. |
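The percentages in this post can be reproduced with a short calculation. The sketch below assumes that an instance whose iteration time rises from `s` (running alone) to `d` (running alongside another instance) delivers `s / d` of its single-instance work rate, and that the combined throughput is the sum over both instances. It also assumes the pairing that matches the quoted 96% figure: the 1792k run at 0.0615 s alone versus 0.113 s shared, and the 512k run at 0.016 s alone versus 0.038 s shared. The function name `relative_throughput` is made up for this illustration.

```python
def relative_throughput(single_times, dual_times):
    """Combined throughput of concurrent instances, relative to one
    instance running alone (1.0 means no gain and no loss).

    single_times[i]: per-iteration time of workload i running alone
    dual_times[i]:   per-iteration time of workload i when sharing the CPU
    """
    if len(single_times) != len(dual_times):
        raise ValueError("need one single-instance time per dual-instance time")
    return sum(s / d for s, d in zip(single_times, dual_times))


# 1024k FFT benchmark: same workload in both instances (times in ms).
benchmark = relative_throughput([31.461, 31.461], [34.320, 47.039])

# Real LL runs: 1792k and 512k FFT sizes (times in seconds).
ll_run = relative_throughput([0.0615, 0.016], [0.113, 0.038])

print(f"benchmark: {benchmark:.1%} of single-instance throughput")
print(f"LL runs:   {ll_run:.1%} of single-instance throughput")
```

Under these assumptions the benchmark figure comes out near 1.59 (the "60% increase" called suspect above) and the LL figure near 0.965 (the "about 96%").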
|
|
|
|
|
#11 |
|
Jun 2003
The Computer
401 Posts |
How do you do affinity?
Even with the password it is still grayed out. Prime95 is all I use that computer for now anyway (I write these messages on a different computer). I noticed that the computer makes a noise when it is busy, but all the noise I hear is from the screensaver; apparently it doesn't make sound because it's only using idle memory. I have it at priority 5 to outprioritize the screensaver, but priority doesn't speed it up (the software said it wouldn't). Maybe FactorOverride would help too. Clowns789 |
|
|
|