mersenneforum.org  

Old 2003-06-10, 16:28   #12
nomadicus
 
Jan 2003
North Carolina

2·3·41 Posts

Quote:
Originally Posted by NookieN
As a side note, Win XP's handling of pre-emption with HT isn't very efficient. If I run a multi-threaded program at normal priority with Prime95 running in the background (at idle priority), Prime can still steal cycles from the other program. If you're using SMT-aware software and need the best possible performance with it, you should probably disable Prime while that application is running.
As you noted, Win XP's pre-emption scheduling isn't very efficient. Rather than inefficient, it's more like XP probably doesn't do pre-emptive scheduling at all.

A time slice is given to each thread as it becomes computable. If the low prio thread runs to the end of that time slice without being interrupted by the high prio thread, then XP is doing round-robin scheduling, not pre-emptive scheduling.

Unless Win XP's scheduler goes out of its way to stop the low prio thread dead in its tracks, that thread completes its time slice even if a higher prio thread is available for execution. If all that is correct, I can't say that prime95 is the problem; the OS doesn't have the intelligence to stop the low prio thread dead in its tracks and replace it with the higher prio thread. Under most circumstances you want the OS scheduler to be round robin, since pre-emptive scheduling overhead could climb through the roof, leaving very little productive CPU work to be done (although the higher prio threads should/could appear to run faster than what you see today -- what an irony!).

It's a CPU scheduler design compromise. Do I maximize the use of the CPU with minimal overhead, or do I minimize the response time the user sees at the cost of (potentially) a lot of wasted CPU scheduling overhead? Obviously it is the former, since MS knows in its infinite wisdom that 99% of all CPU cycles are wasted -- except for us GIMPsters, et al.

My personal opinion is that a 2GHz+ machine would handle pre-emptive scheduling with minimal impact. But alas, from what you state, round robin is in the scheduling code path you observe.
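
A toy sketch of the trade-off described above, for illustration only. It is not a model of the actual XP scheduler: the function name, the tick-based quantum, and the single-CPU setup are all assumptions. It counts how long a high prio thread waits when it becomes runnable part-way through a low prio thread's time slice.

```python
def wait_for_high_prio(arrival_tick, quantum, preemptive):
    """Ticks a high-priority thread waits before it first runs.

    Toy model (assumed, not XP's real behaviour): one CPU, a low-priority
    thread starts a fresh slice of `quantum` ticks at every multiple of
    `quantum`, and the high-priority thread becomes runnable at
    `arrival_tick`.
    """
    if preemptive:
        # The scheduler interrupts the low-priority thread immediately.
        return 0
    # Otherwise the low-priority thread finishes its current slice first.
    elapsed_in_slice = arrival_tick % quantum
    if elapsed_in_slice == 0:
        return 0  # arrived exactly on a slice boundary
    return quantum - elapsed_in_slice


for arrival in (0, 3, 9):
    print(arrival,
          wait_for_high_prio(arrival, quantum=10, preemptive=False),
          wait_for_high_prio(arrival, quantum=10, preemptive=True))
```

In this model the non-pre-emptive wait is bounded by one quantum, which is the "low prio thread completes its time slice" case; the pre-emptive case trades that wait for extra context switches, which the sketch does not try to cost.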
Old 2003-06-12, 02:09   #13
Paulie
 
Aug 2002

223 Posts

Two quick things:

1) Taskmgr on a dual box treats 100% as BOTH CPUs fully loaded, so if one is pegged and the other is idle, it's displayed as 50% utilization in taskmgr.

2) Setting affinity is documented in the txt files that come with prime95. :)
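
Both points can be put into a few lines of arithmetic. This is only a sketch: the function names are made up here, and the bitmask form (bit n stands for logical CPU n) is the general convention Windows uses for affinity, not a description of how Prime95 stores its own setting, which is covered in its txt files.

```python
def overall_utilization(per_cpu_percent):
    """Average the per-CPU load into one overall percentage."""
    return sum(per_cpu_percent) / len(per_cpu_percent)


def affinity_mask(cpus):
    """Bitmask with bit n set for each logical CPU n in `cpus`."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


print(overall_utilization([100, 0]))   # one CPU pegged, one idle
print(bin(affinity_mask([0])))         # CPU 0 only
print(bin(affinity_mask([0, 1])))      # CPUs 0 and 1
```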
Old 2003-12-04, 07:19   #14
TheKelster
 
Dec 2003
Texas

38 Posts
HT Benchmark trials

Intel(R) Pentium(R) 4 CPU 3.20GHz
CPU speed: 3191.83 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.7, RdtscTiming=1

W2K Pro, running as a file server in a p2p, 8 user net.
HT active
SOYO Dragon 2 Platinum MB
System Bus 800 MHz
Corsair Platinum Twinx 400 DDR XMS 3200 (2 512 Sticks, dual channel)
Antec True-550 low noise power supply (fan speed 1565)
Dual Maxtor 200GB Serial ATA (raid, mirrored)
Koolance PC2 Liquid-Cooling System
CPU runs at 34c
One 90mm case fan running at 2220
Chassis temp 40c
vcore 1.47V, 3.29V, 5.05V
Dram 2.56V
AGP 1.47V

Prime A with affinity set to 0
Prime B is halted
Prime95 Results for prime A
Best time for 384K FFT length: 11.224 ms.
Best time for 448K FFT length: 13.382 ms.
Best time for 512K FFT length: 15.250 ms.
Best time for 640K FFT length: 18.187 ms.
Best time for 768K FFT length: 22.068 ms.
Best time for 896K FFT length: 26.252 ms.
Best time for 1024K FFT length: 29.467 ms.
Best time for 1280K FFT length: 38.685 ms.
Best time for 1536K FFT length: 47.614 ms.
Best time for 1792K FFT length: 56.342 ms.
Best time for 2048K FFT length: 63.821 ms.

Prime A with affinity set to any cpu
Prime B is halted
Prime95 Results for prime A
Best time for 384K FFT length: 11.271 ms.
Best time for 448K FFT length: 18.674 ms.
Best time for 512K FFT length: 21.086 ms.
Best time for 640K FFT length: 25.116 ms.
Best time for 768K FFT length: 30.669 ms.
Best time for 896K FFT length: 36.408 ms.
Best time for 1024K FFT length: 40.667 ms.
Best time for 1280K FFT length: 53.852 ms.
Best time for 1536K FFT length: 65.949 ms.
Best time for 1792K FFT length: 78.736 ms.
Best time for 2048K FFT length: 88.409 ms.

Prime A with affinity set to 0, running benchmark
Prime B with affinity set to 1, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 11.271 ms.
Best time for 448K FFT length: 18.674 ms.
Best time for 512K FFT length: 21.086 ms.
Best time for 640K FFT length: 25.116 ms.
Best time for 768K FFT length: 30.669 ms.
Best time for 896K FFT length: 36.408 ms.
Best time for 1024K FFT length: 40.667 ms.
Best time for 1280K FFT length: 53.852 ms.
Best time for 1536K FFT length: 65.949 ms.
Best time for 1792K FFT length: 78.736 ms.
Best time for 2048K FFT length: 88.409 ms.

Prime A with affinity set to any cpu, running benchmark
Prime B with affinity set to any cpu, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 20.725 ms.
Best time for 448K FFT length: 24.463 ms.
Best time for 512K FFT length: 28.649 ms.
Best time for 640K FFT length: 35.109 ms.
Best time for 768K FFT length: 41.827 ms.
Best time for 896K FFT length: 52.986 ms.
Best time for 1024K FFT length: 59.494 ms.
Best time for 1280K FFT length: 79.947 ms.
Best time for 1536K FFT length: 96.102 ms.
Best time for 1792K FFT length: 116.078 ms.
Best time for 2048K FFT length: 132.234 ms.

Prime A with affinity set to 0, running benchmark
Prime B with affinity set to any cpu, testing exp 33634787
Prime95 Results for prime A
Best time for 384K FFT length: 11.351 ms.
Best time for 448K FFT length: 18.752 ms.
Best time for 512K FFT length: 21.117 ms.
Best time for 640K FFT length: 25.401 ms.
Best time for 768K FFT length: 33.551 ms.
Best time for 896K FFT length: 39.940 ms.
Best time for 1024K FFT length: 49.909 ms.
Best time for 1280K FFT length: 67.462 ms.
Best time for 1536K FFT length: 87.204 ms.
Best time for 1792K FFT length: 101.409 ms.
Best time for 2048K FFT length: 116.812 ms.

The following benchmarks were run as simultaneously as possible. These are the "average" of 5 sets.
Prime A with affinity set to 0, running benchmark
Prime B with affinity set to 1, running benchmark
Prime95 Results for prime A
Best time for 384K FFT length: 11.220 ms.
Best time for 448K FFT length: 13.391 ms.
Best time for 512K FFT length: 23.290 ms.
Best time for 640K FFT length: 27.669 ms.
Best time for 768K FFT length: 33.755 ms.
Best time for 896K FFT length: 44.602 ms.
Best time for 1024K FFT length: 58.193 ms.
Best time for 1280K FFT length: 59.085 ms.
Best time for 1536K FFT length: 72.465 ms.
Best time for 1792K FFT length: 86.623 ms.
Best time for 2048K FFT length: 97.101 ms.
Prime95 Results for prime B
Best time for 384K FFT length: 17.016 ms.
Best time for 448K FFT length: 20.447 ms.
Best time for 512K FFT length: 23.088 ms.
Best time for 640K FFT length: 27.494 ms.
Best time for 768K FFT length: 33.495 ms.
Best time for 896K FFT length: 39.946 ms.
Best time for 1024K FFT length: 49.722 ms.
Best time for 1280K FFT length: 66.220 ms.
Best time for 1536K FFT length: 80.628 ms.
Best time for 1792K FFT length: 95.117 ms.
Best time for 2048K FFT length: 88.393 ms.
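
For anyone who wants to compare these runs without eyeballing them, here is a rough sketch that parses the "Best time" lines and prints the slowdown ratio between two runs. The regular expression and the names `parse_times`, `pinned`, and `unpinned` are made up for this example; the three rows per run are copied from the "affinity set to 0" and "affinity set to any cpu" results above, both with Prime B halted.

```python
import re

# Matches lines like "Best time for 512K FFT length: 15.250 ms."
LINE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\.")


def parse_times(text):
    """Map FFT length (in K) to best time (in ms)."""
    return {int(k): float(ms) for k, ms in LINE.findall(text)}


# Prime A with affinity set to 0, Prime B halted.
pinned = parse_times("""
Best time for 384K FFT length: 11.224 ms.
Best time for 512K FFT length: 15.250 ms.
Best time for 1024K FFT length: 29.467 ms.
""")

# Prime A with affinity set to any cpu, Prime B halted.
unpinned = parse_times("""
Best time for 384K FFT length: 11.271 ms.
Best time for 512K FFT length: 21.086 ms.
Best time for 1024K FFT length: 40.667 ms.
""")

for fft in sorted(pinned):
    ratio = unpinned[fft] / pinned[fft]
    print(f"{fft}K: any-cpu run takes {ratio:.2f}x the affinity-0 time")
```

Pasting the full result blocks into the two strings extends the comparison to every FFT length.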
Old 2003-12-04, 14:23   #15
SlashDude
 
Aug 2002
Minneapolis, MN

2²×3×19 Posts

Dual P4 Hyper-Threading benchmark results:
All tests were done 3 times, and averaged.
Nothing else was running on the machine.
The machine is a Compaq ML530 with dual P4 2.4GHz (400MHz FSB) HT Xeons, 4GB ECC RAM, and SAN / fiber disk, running Windows 2003 Server Standard Edition.

Running the benchmarks, I saw some really strange results. They were actually misleading. Here is a link to the "benchmark" results (the numbers are misleading!): http://www.15k.org/mersenne/results.xls

After running the benchmarks, I wanted to see what happened when running real prime tests.

Here is what I found. With multiple copies running, I averaged the time from all running clients.

I ran a few tests on a 33M exponent, 33559517 (same machine as above), and here is what I found:
1 copy of Prime95 running, Per iteration time: 0.114 sec. (With or without affinity set)
2 copies of Prime95 running, Per iteration time: 0.129 sec. (With Affinity set to 0 for the first copy, and set to 1 or 3 for the second copy)
2 copies of Prime95 running, Per iteration time: 0.245 sec. (With Affinity set to 0 for the first copy, and set to 2 for the second copy - This is the same thing as running a single HT CPU system)
2 copies of Prime95 running, Per iteration time: 0.129 to 0.247 - average = 0.177 sec. (With Affinity set to run on any CPU)
3 copies of Prime95 running, Per iteration time: 0.132 to 0.264 - average = 0.210 sec.
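
The affinity timings above suggest that on this machine logical CPUs 0 and 2 sit on one physical CPU and 1 and 3 on the other. A small sketch of that mapping follows. It is an inference from these timings only: the function names are invented here, and the numbering can differ on other systems.

```python
def physical_cpu(logical_cpu, physical_count=2):
    """Physical CPU for a logical CPU, ASSUMING logical CPUs are numbered
    round-robin across the physical CPUs (0, 1, 0, 1 on this machine)."""
    return logical_cpu % physical_count


def share_physical_cpu(a, b, physical_count=2):
    """True if logical CPUs a and b are HT siblings under that assumption."""
    return physical_cpu(a, physical_count) == physical_cpu(b, physical_count)


for second in (1, 2, 3):
    print(f"copy 1 on CPU 0, copy 2 on CPU {second}: "
          f"same physical CPU = {share_physical_cpu(0, second)}")
```

Under that assumption, only the 0 and 2 pairing puts both copies on one physical CPU, which matches the 0.245 sec result.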

From this, having the affinity set has a big impact on running 2 copies!
Running the 2nd copy causes a ~13% speed decrease on both running copies - that's 3 hours per day longer, per copy - 6 "extra" hours of work for the 2 copies.
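
The arithmetic behind those figures, as a quick sketch using the per-iteration times listed above. The variable and function names are made up for the example, and the combined-throughput lines are an extra way of looking at the same numbers rather than something measured separately.

```python
SINGLE = 0.114   # sec/iteration, 1 copy running
PAIRED = 0.129   # sec/iteration per copy, 2 copies with affinity 0 and 1 (or 3)
SHARED = 0.245   # sec/iteration per copy, 2 copies with affinity 0 and 2

# Relative slowdown of each copy once the second copy is started.
slowdown = PAIRED / SINGLE - 1
extra_hours_per_day = 24 * slowdown


def iterations_per_sec(copies, sec_per_iteration):
    """Combined throughput of `copies` clients at the same iteration time."""
    return copies / sec_per_iteration


print(f"slowdown per copy: {slowdown:.1%}")
print(f"extra hours per day, per copy: {extra_hours_per_day:.1f}")
print(f"1 copy:                {iterations_per_sec(1, SINGLE):.2f} iter/sec")
print(f"2 copies, affinity 0+1: {iterations_per_sec(2, PAIRED):.2f} iter/sec")
print(f"2 copies, affinity 0+2: {iterations_per_sec(2, SHARED):.2f} iter/sec")
```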

Last fiddled with by SlashDude on 2003-12-04 at 14:25

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
