Prime95 24.14 and hyperthreading
I know this has been asked many times, but:
Some time ago, I noticed that my task manager was displaying two processors even though I only have one. From what I hear, this is due to hyperthreading. While it appears that Prime95 is using only 50% of the CPU resources, it is actually using 100%. Does this mean that it is not possible to run another instance of Prime95? |
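The 50% reading can be checked with a little arithmetic. The sketch below assumes that the monitoring tool averages utilisation over all logical processors, so a single busy thread on a two-logical-CPU machine is displayed as 50%. The function name `displayed_cpu_percent` is made up for illustration; `os.cpu_count()` is the standard Python call for the number of logical CPUs.

```python
import os


def displayed_cpu_percent(busy_threads: int, logical_cpus: int) -> float:
    """Overall utilisation a monitor would show, ASSUMING it averages
    over logical CPUs (hypothetical helper, not part of Prime95)."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    # A thread can occupy at most one logical CPU at a time.
    occupied = min(max(busy_threads, 0), logical_cpus)
    return 100.0 * occupied / logical_cpus


if __name__ == "__main__":
    # One hyperthreaded core appears as two logical CPUs.
    print(displayed_cpu_percent(busy_threads=1, logical_cpus=2))  # 50.0
    print(displayed_cpu_percent(busy_threads=2, logical_cpus=2))  # 100.0
    # Logical CPU count on the current machine (may be None on some platforms).
    print(os.cpu_count())
```

A displayed 50% therefore only says that one of the two logical processors is busy; it says nothing about how much of the physical core's execution resources are in use.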
You can, but I don't think you will get any more throughput from it.
|
[QUOTE=ixfd64;145423]I know this has been asked many times, but:
Some time ago, I noticed that my task manager was displaying two processors even though I only have one. From what I hear, this is due to hyperthreading. While it appears that Prime95 is using only 50% of the CPU resources, it is actually using 100%. Does this mean that it is not possible to run another instance of Prime95?[/QUOTE]
You could run another instance of Prime95, but hyperthreading provides a 'second processor' that uses the resources not being used by the program running on the main processor. Since Prime95 is exceptionally well-written and uses all the available resources about as efficiently as possible, there's almost nothing left for the 'second processor'. |
[QUOTE=fivemack;145437]You could run another instance of Prime95 but... ..there's almost nothing left for the 'second processor'[/QUOTE]
I run two instances of Prime95 on a hyperthreaded P4. I reckon I get about 5-10% more throughput than with a single instance, but it's slightly more work to make sure the two threads don't argue with each other. Richard |
You can switch to 25.7 and use 2 threads on the same number.
|
I've heard that running two instances on a hyperthreaded machine will, as was stated once or twice already in this thread, squeeze another 5-10% of throughput out of your machine, though usually at a slight expense in CPU temperature. I personally would probably run two instances most of the time, though back when I had a hyperthreaded P4 (since upgraded to a Core 2 Duo) I would sometimes just run one instance for convenience' sake. :smile:
|
On my Northwood P4 I have found that for LL testing with large FFT sizes there is less than 1% throughput to be gained, but for very small FFT sizes with ECM there can be more than 25% gain from running two hyperthreaded processes.
Best to try it out on your own machine and make the comparison for yourself. |
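One way to make that comparison is to note the per-iteration time with a single instance, then the per-iteration time of each instance with two running, and compare total iterations per unit time. The sketch below is only arithmetic; the function name and the timings in the example are hypothetical, not measurements from this thread, and it assumes both instances are doing comparable work (same kind of test, similar FFT size).

```python
def throughput_gain(single_ms_per_iter: float, dual_ms_per_iter: list[float]) -> float:
    """Relative throughput change from running several instances instead of one.

    single_ms_per_iter: per-iteration time with one instance running alone.
    dual_ms_per_iter:   per-iteration time of each instance when all run together.
    Returns e.g. 0.05 for a 5% gain, or a negative value for a loss.
    """
    if single_ms_per_iter <= 0 or any(t <= 0 for t in dual_ms_per_iter):
        raise ValueError("timings must be positive")
    single_rate = 1.0 / single_ms_per_iter               # iterations per ms, one instance
    combined_rate = sum(1.0 / t for t in dual_ms_per_iter)  # iterations per ms, all instances
    return combined_rate / single_rate - 1.0


if __name__ == "__main__":
    # Hypothetical timings: 50 ms alone, 95 ms each with two instances.
    gain = throughput_gain(50.0, [95.0, 95.0])
    print(f"{gain:.1%}")  # 5.3%
```

If each of the two instances takes exactly twice as long per iteration as the single instance did, the gain is zero and the second instance buys nothing.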
[quote=geoff;145507]On my Northwood P4 I have found that for LL testing with large FFT sizes there is less than 1% throughput to be gained, but for very small FFT sizes with ECM there can be more than 25% gain from running two hyperthreaded processes.
Best to try it out on your own machine and make the comparison for yourself.[/quote]
Hmm...interesting. It would seem that there is less of a difference between one thread and two threads for large FFTs than there is for small FFTs. This would be consistent with my personal experience, as the vast majority of the FPU-intensive distributed computing work that I have done has involved small FFTs (such as LLR, PRP, etc., all of which primarily use small FFTs at their current overall testing levels). Anyone else notice a similar pattern of diminishing returns from hyperthreading as FFT sizes get bigger? |
[quote=mdettweiler;145514]Anyone else notice a similar pattern of diminishing returns from hyperthreading as FFT sizes get bigger?[/quote]
If so, that would be in accordance with the increasing percentage of overall time that the software spends in the main FP compute loops for larger FFTs. A pipeline-full of floating-point instructions for one hyperthread would allow non-FP overhead instructions to be done expeditiously in the second hyperthread, for as long as there were enough non-FP instructions to be executed there before the second hyperthread loaded up the FP pipeline from its side or the first one's FP-pipe finished, that is.
|
[quote=cheesehead;145572]If so, that would be in accordance with the increasing percentage of overall time that the software spends in the main FP compute loops for larger FFTs. A pipeline-full of floating-point instructions for one hyperthread would allow non-FP overhead instructions to be done expeditiously in the second hyperthread, for as long as there were enough non-FP instructions to be executed there before the second hyperthread loaded up the FP pipeline from its side or the first one's FP-pipe finished, that is.[/quote]
So...with this in mind, I wonder how one thread vs. two threads would compare for low-FPU-usage tasks such as trial factoring or sieving? Has anyone ever compared these? |
[QUOTE=cheesehead;145572]If so, that would be in accordance with the increasing percentage of overall time that the software spends in the main FP compute loops for larger FFTs. A pipeline-full of floating-point instructions for one hyperthread would allow non-FP overhead instructions to be done expeditiously in the second hyperthread, for as long as there were enough non-FP instructions to be executed there before the second hyperthread loaded up the FP pipeline from its side or the first one's FP-pipe finished, that is.[/QUOTE]
Hyperthreading can sometimes work well even when both processes are executing the same type of instructions. For example, if two 32-bit processes are executing instructions that operate on the lower 32 bits of an %xmm register, then I think hyperthreading can combine both instructions and execute them together as one 2x32-bit vector operation. |