#12 | |
|
Jan 2003
North Carolina
2·3·41 Posts |
Quote:
A time slice is given to each thread as it becomes computable. If the low prio thread runs to completion of that time slice without interruption from the high prio thread, then XP is doing round-robin, not pre-emptive, scheduling. Unless Win XP's scheduler goes out of its way to stop the low prio thread dead in its tracks, that thread completes its time slice even if a higher prio thread is available for execution. If all that is correct, I can't say that prime95 is the problem; the OS doesn't have the intelligence to stop the low prio thread dead in its tracks and replace it with the higher prio thread. Under most circumstances you want the OS scheduler to be round robin, since pre-emptive scheduling overhead could climb through the roof, leaving very little productive CPU work to be done (although it should/could appear that the higher prio threads are running faster than what you see today; what an irony!). It's a CPU scheduler design compromise. Do I maximize the use of the CPU with minimal overhead, or do I maximize the response time the user sees with (potentially) a lot of wasted CPU scheduling overhead? Obviously it is the former, since MS knows in its infinite wisdom that 99% of all CPU cycles are wasted, except for us GIMPsters, et al. Personal opinion is that a 2GHz+ machine would handle the pre-emptive scheduling with minimal impact. But alas, from what you state, round robin is in the scheduling code path you observe. |
|
#13 |
|
Aug 2002
223 Posts |
Two quick things:
1) Taskmgr on a dual box scales its total to 100% across BOTH CPUs, so if one CPU is pegged and the other is idle, Taskmgr displays 50% utilization.
2) Setting affinity is documented in the txt files that come with Prime95. :) |
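A minimal sketch of the arithmetic behind point 1, assuming the overall figure is a plain average of the per-CPU loads. The function name `overall_utilization` is made up for illustration and is not part of Taskmgr or any Windows API.

```python
def overall_utilization(per_cpu_percent):
    """Average the per-CPU loads into one overall percentage (assumed model)."""
    return sum(per_cpu_percent) / len(per_cpu_percent)


# One CPU pegged by Prime95, the other idle.
print(overall_utilization([100.0, 0.0]))
```

So a reading of 50% on a dual box can mean one fully busy CPU, not two half-busy ones.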
|
#14 |
|
Dec 2003
Texas
38 Posts |
Intel(R) Pentium(R) 4 CPU 3.20GHz
CPU speed: 3191.83 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.7, RdtscTiming=1

W2K Pro, running as a file server in a p2p, 8 user net.
HT active
SOYO Dragon 2 Platinum MB
System Bus 800MHz
Corsair Platinum Twinx 400 DDR XMS 3200 (2 512 Sticks, dual channel)
Antec True-550 low noise power supply (fan speed 1565)
Dual Maxtor 200GB Serial ATA (raid, mirrored)
Koolance PC2 Liquid-Cooling System
CPU runs at 34c
One 90mm case fan running at 2220
Chassis temp 40c
vcore 1.47V, 3.29V, 5.05V
Dram 2.56V
AGP 1.47V

Prime A with affinity set to 0
Prime B is halted
Prime95 Results for prime A
Best time for 384K FFT length: 11.224 ms.
Best time for 448K FFT length: 13.382 ms.
Best time for 512K FFT length: 15.250 ms.
Best time for 640K FFT length: 18.187 ms.
Best time for 768K FFT length: 22.068 ms.
Best time for 896K FFT length: 26.252 ms.
Best time for 1024K FFT length: 29.467 ms.
Best time for 1280K FFT length: 38.685 ms.
Best time for 1536K FFT length: 47.614 ms.
Best time for 1792K FFT length: 56.342 ms.
Best time for 2048K FFT length: 63.821 ms.

Prime A with affinity set to any cpu
Prime B is halted
Prime95 Results for prime A
Best time for 384K FFT length: 11.271 ms.
Best time for 448K FFT length: 18.674 ms.
Best time for 512K FFT length: 21.086 ms.
Best time for 640K FFT length: 25.116 ms.
Best time for 768K FFT length: 30.669 ms.
Best time for 896K FFT length: 36.408 ms.
Best time for 1024K FFT length: 40.667 ms.
Best time for 1280K FFT length: 53.852 ms.
Best time for 1536K FFT length: 65.949 ms.
Best time for 1792K FFT length: 78.736 ms.
Best time for 2048K FFT length: 88.409 ms.

Prime A with affinity set to 0, running benchmark
Prime B with affinity set to 1, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 11.271 ms.
Best time for 448K FFT length: 18.674 ms.
Best time for 512K FFT length: 21.086 ms.
Best time for 640K FFT length: 25.116 ms.
Best time for 768K FFT length: 30.669 ms.
Best time for 896K FFT length: 36.408 ms.
Best time for 1024K FFT length: 40.667 ms.
Best time for 1280K FFT length: 53.852 ms.
Best time for 1536K FFT length: 65.949 ms.
Best time for 1792K FFT length: 78.736 ms.
Best time for 2048K FFT length: 88.409 ms.

Prime A with affinity set to any cpu, running benchmark
Prime B with affinity set to any cpu, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 20.725 ms.
Best time for 448K FFT length: 24.463 ms.
Best time for 512K FFT length: 28.649 ms.
Best time for 640K FFT length: 35.109 ms.
Best time for 768K FFT length: 41.827 ms.
Best time for 896K FFT length: 52.986 ms.
Best time for 1024K FFT length: 59.494 ms.
Best time for 1280K FFT length: 79.947 ms.
Best time for 1536K FFT length: 96.102 ms.
Best time for 1792K FFT length: 116.078 ms.
Best time for 2048K FFT length: 132.234 ms.

Prime A with affinity set to 0, running benchmark
Prime B with affinity set to any cpu, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 11.351 ms.
Best time for 448K FFT length: 18.752 ms.
Best time for 512K FFT length: 21.117 ms.
Best time for 640K FFT length: 25.401 ms.
Best time for 768K FFT length: 33.551 ms.
Best time for 896K FFT length: 39.940 ms.
Best time for 1024K FFT length: 49.909 ms.
Best time for 1280K FFT length: 67.462 ms.
Best time for 1536K FFT length: 87.204 ms.
Best time for 1792K FFT length: 101.409 ms.
Best time for 2048K FFT length: 116.812 ms.

The following benchmarks were run as simultaneously as possible. This is "average" of 5 sets.

Prime A with affinity set to 0, running benchmark
Prime B with affinity set to 1, running benchmark
Prime95 Results for prime A
Best time for 384K FFT length: 11.220 ms.
Best time for 448K FFT length: 13.391 ms.
Best time for 512K FFT length: 23.290 ms.
Best time for 640K FFT length: 27.669 ms.
Best time for 768K FFT length: 33.755 ms.
Best time for 896K FFT length: 44.602 ms.
Best time for 1024K FFT length: 58.193 ms.
Best time for 1280K FFT length: 59.085 ms.
Best time for 1536K FFT length: 72.465 ms.
Best time for 1792K FFT length: 86.623 ms.
Best time for 2048K FFT length: 97.101 ms.

Prime95 Results for prime B
Best time for 384K FFT length: 17.016 ms.
Best time for 448K FFT length: 20.447 ms.
Best time for 512K FFT length: 23.088 ms.
Best time for 640K FFT length: 27.494 ms.
Best time for 768K FFT length: 33.495 ms.
Best time for 896K FFT length: 39.946 ms.
Best time for 1024K FFT length: 49.722 ms.
Best time for 1280K FFT length: 66.220 ms.
Best time for 1536K FFT length: 80.628 ms.
Best time for 1792K FFT length: 95.117 ms.
Best time for 2048K FFT length: 88.393 ms. |
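To make the first two runs in this post easier to compare (Prime A pinned to CPU 0 versus Prime A allowed on any CPU, with Prime B halted in both), here is a rough sketch that divides one set of timings by the other. The numbers are copied from the post; the names `PINNED_MS`, `ANY_CPU_MS` and `slowdown_by_fft` are only illustrative.

```python
# FFT lengths (in K) and best times (ms) copied from the two "Prime B is halted" runs.
FFT_K = [384, 448, 512, 640, 768, 896, 1024, 1280, 1536, 1792, 2048]
PINNED_MS = [11.224, 13.382, 15.250, 18.187, 22.068, 26.252,
             29.467, 38.685, 47.614, 56.342, 63.821]
ANY_CPU_MS = [11.271, 18.674, 21.086, 25.116, 30.669, 36.408,
              40.667, 53.852, 65.949, 78.736, 88.409]


def slowdown_by_fft(baseline_ms, other_ms):
    """Map each FFT length to other/baseline; values above 1.0 mean slower."""
    return {k: o / b for k, b, o in zip(FFT_K, baseline_ms, other_ms)}


slowdown = slowdown_by_fft(PINNED_MS, ANY_CPU_MS)
for k, ratio in slowdown.items():
    print(f"{k}K: {ratio:.2f}x")
```

The same helper can be pointed at any other pair of runs from the post by swapping in the corresponding lists.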
|
#15 |
|
Aug 2002
Minneapolis, MN
2²×3×19 Posts |
Dual P4 Hyper-Threading benchmark results:
All tests were done 3 times and averaged. Nothing else was running on the machine. The machine is a Compaq ML530 with dual P4 2.4GHz (400MHz FSB) HT Xeons, 4GB ECC RAM, and SAN / fiber disk, running Windows 2003 Server Standard Edition.

Running the benchmark, I saw some really strange results. They were actually misleading. Here is a link to the "benchmark" results (the numbers are misleading!): http://www.15k.org/mersenne/results.xls

After running the benchmarks, I wanted to see what happened when running real prime tests. I ran a few tests on a 33M exponent, 33559517 (same machine as above). With multiple copies running, I averaged the time from all running clients. Here is what I found:

1 copy of Prime95 running, per iteration time: 0.114 sec. (With or without affinity set)
2 copies of Prime95 running, per iteration time: 0.129 sec. (With affinity set to 0 for the first copy, and set to 1 or 3 for the second copy)
2 copies of Prime95 running, per iteration time: 0.245 sec. (With affinity set to 0 for the first copy, and set to 2 for the second copy. This is the same thing as running a single HT CPU system)
2 copies of Prime95 running, per iteration time: 0.129 to 0.247, average = 0.177 sec. (With affinity set to run on any CPU)
3 copies of Prime95 running, per iteration time: 0.132 to 0.264, average = 0.210 sec.

From this, having the affinity set has a big impact when running 2 copies! Running the 2nd copy causes a ~13% speed decrease on both running copies. That's 3 hours per day longer, per copy, or 6 "extra" hours of work for the 2 copies.

Last fiddled with by SlashDude on 2003-12-04 at 14:25 |
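The ~13% and "3 hours per day" figures in this post can be checked with a few lines of arithmetic. This is only a sketch using the per-iteration times quoted above; the function names are illustrative, and "extra hours per day" is read here as the extra wall-clock time needed to finish what a single copy would do in 24 hours.

```python
# Per-iteration times (seconds) quoted in the post.
SINGLE_COPY_S = 0.114          # 1 copy
TWO_COPIES_PINNED_S = 0.129    # 2 copies, affinity 0 and 1 (or 3)
TWO_COPIES_SAME_CPU_S = 0.245  # 2 copies, affinity 0 and 2


def percent_slower(baseline_s, observed_s):
    """How much longer one iteration takes, as a percentage of the baseline."""
    return (observed_s / baseline_s - 1.0) * 100.0


def extra_hours_per_day(baseline_s, observed_s):
    """Extra hours needed to finish 24 baseline-hours of work (assumed reading)."""
    return (observed_s / baseline_s - 1.0) * 24.0


def combined_throughput(copies, per_iteration_s):
    """Total iterations per second across all running copies."""
    return copies / per_iteration_s


print(f"{percent_slower(SINGLE_COPY_S, TWO_COPIES_PINNED_S):.1f}% slower per copy")
print(f"{extra_hours_per_day(SINGLE_COPY_S, TWO_COPIES_PINNED_S):.1f} extra hours per day, per copy")
print(f"{combined_throughput(1, SINGLE_COPY_S):.2f} iter/s with 1 copy")
print(f"{combined_throughput(2, TWO_COPIES_PINNED_S):.2f} iter/s with 2 pinned copies")
print(f"{combined_throughput(2, TWO_COPIES_SAME_CPU_S):.2f} iter/s with 2 copies on affinity 0 and 2")
```

On these numbers each copy slows down a little, but two properly pinned copies together still get through more iterations per second than a single copy does, while the 0-and-2 pairing does not.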