mersenneforum.org  

Old 2005-09-26, 02:52   #1
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,529 Posts
Hyperthreaded Machines & V24.15

I'm adding a feature for hyperthreaded CPUs and need help testing it - I don't have a hyperthreaded CPU.

The problem is that when a foreground process runs at normal priority, the OS schedules prime95 on the virtual CPU even though its priority is low. Once both tasks are scheduled to run, the CPU treats them as equal priority. Thus, prime95 steals cycles from the foreground task, reducing responsiveness.

If you have a hyperthreaded CPU, please try out ftp://mersenne.org/gimps/p95tst.zip. It tries to get around this problem by pausing prime95 for 30 seconds if two successive iterations are 40% slower than a normal iteration.

Let me know if this helps foreground task responsiveness, and whether you think the pause occurs too frequently or too infrequently.
Old 2005-09-26, 13:16   #2
Peter Nelson
 
Oct 2004

23² Posts

I have a 3GHz P4 Northwood which supports hyperthreading. I could run this test on it.

Sorry I didn't really follow your explanation of what this enhancement does.

I assume this is designed for a single CPU with two hyperthreads.

So is the idea that one thread always runs normally?

And the other thread will pause when the user does something in another app? Or do both instances get paused?

I don't understand how you measure the 40% slowdown: against what standard iteration time is it compared?

Sorry if I'm being dense but it's not yet clear to me what you are trying to achieve or how.
Old 2005-09-26, 14:09   #3
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,529 Posts

This feature is for hyperthreaded machines running a single instance of prime95.

For example, you aren't using the machine and prime95 chugs along at 0.040 seconds per iteration. When you start using the machine to run a game (or spreadsheet or whatever), the OS will run both your game and prime95 on the two logical CPUs. As soon as Prime95 sees two consecutive iterations of 0.056 seconds (40% slower than normal), it will pause for 30 seconds to give your game full use of the CPU. After 30 seconds, if you get another iteration of 0.056 seconds or more, it will pause for another 30 seconds, and so on until you finish your game.
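The rule described above can be sketched roughly as follows. This is only an illustration in Python, not the actual prime95 code: the class name `SlowdownDetector`, the `run` helper and the constants are assumptions taken from the numbers in this post (0.040 s baseline, 40% slowdown, 30 s pause).

```python
import time

BASELINE_SECONDS = 0.040   # normal iteration time from the example in the post
SLOWDOWN_FACTOR = 1.40     # "40% slower" threshold
PAUSE_SECONDS = 30.0       # pause length from the post


class SlowdownDetector:
    """Hypothetical detector: decides when to pause from iteration times alone."""

    def __init__(self, baseline=BASELINE_SECONDS, factor=SLOWDOWN_FACTOR):
        self.threshold = baseline * factor
        self.slow_needed = 2   # two consecutive slow iterations before the first pause
        self.slow_seen = 0

    def should_pause(self, iteration_seconds):
        if iteration_seconds >= self.threshold:
            self.slow_seen += 1
        else:
            # A normal-speed iteration: go back to the "machine is idle" state.
            self.slow_seen = 0
            self.slow_needed = 2
        if self.slow_seen >= self.slow_needed:
            # After a pause, one more slow iteration is enough to pause again.
            self.slow_seen = 0
            self.slow_needed = 1
            return True
        return False


def run(iteration_times, detector, sleep=time.sleep):
    """Feed measured iteration times to the detector; returns the number of pauses."""
    pauses = 0
    for seconds in iteration_times:
        if detector.should_pause(seconds):
            sleep(PAUSE_SECONDS)
            pauses += 1
    return pauses
```

In a real client the iteration times would come from timing each iteration, and the baseline would presumably be measured on an idle machine rather than hard-coded.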
Old 2005-09-27, 14:39   #4
Peter Nelson
 
Oct 2004

1021₈ Posts

Thanks George, I understand what you are trying to do now.

I think if I was going to run a game I would do it for more time, at least several minutes.

Perhaps you could change 30 secs to every 10 mins without ill effect (on average you would lose at most 5 minutes of priming time after the game ended).
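The 5-minute figure can be checked with a little arithmetic. The sketch below assumes the game ends at a uniformly random moment within a pause, so the average idle time afterwards is half the pause length; it also counts how often the client would wake up to probe during a session. Both function names are made up for illustration.

```python
def expected_lost_seconds(pause_seconds):
    """Average idle time after the game ends, assuming it ends at a
    uniformly random point within a pause (an assumption, not a measurement)."""
    return pause_seconds / 2


def wakeups_during(task_seconds, pause_seconds):
    """Roughly how many times the client wakes up to probe during the game,
    ignoring the short time each probe iteration itself takes."""
    return int(task_seconds // pause_seconds)
```

For a one-hour game this gives 120 wake-ups with a 30-second pause against 6 with a 10-minute pause, at the cost of an average 5 minutes (rather than 15 seconds) of lost work once the game ends.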

Anyway, it's not just about CPU usage but also memory availability.

When I play Counter-Strike: Source (based on Valve's Half-Life 2) I always shut down prime95 first, then play the game, then restart the p95 client afterwards. The Source engine seems to grab lots of memory, use it and the CPU intensively, then release it afterwards.

Perhaps, on detecting another process using the CPU, prime95 could stay resident but save its state (to disk?) and release its allocated memory to the OS. After a while it could request memory again and restore the test state to continue. I suppose the halt/continue should work a little like the "laptop running on battery" option.
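A save-and-release cycle like the one suggested here might look roughly like this. It is a toy sketch, not how prime95 stores its save files: the `Worker` class, its buffer and the pickle-based checkpoint format are all assumptions for illustration.

```python
import pickle


class Worker:
    """Hypothetical stand-in for a long-running test with a large working buffer."""

    def __init__(self, checkpoint_path, size=1024):
        self.checkpoint_path = checkpoint_path
        self.iteration = 0
        self.buffer = [0] * size   # stands in for the memory the test has allocated

    def step(self):
        """Placeholder for one iteration of real work."""
        self.iteration += 1
        self.buffer[self.iteration % len(self.buffer)] += 1

    def suspend(self):
        """Write the state to disk, then drop the in-memory buffer."""
        with open(self.checkpoint_path, "wb") as f:
            pickle.dump({"iteration": self.iteration, "buffer": self.buffer}, f)
        self.buffer = None   # the runtime can now reclaim this memory

    def resume(self):
        """Reload the saved state and carry on where the test left off."""
        with open(self.checkpoint_path, "rb") as f:
            state = pickle.load(f)
        self.iteration = state["iteration"]
        self.buffer = state["buffer"]
```

Dropping the reference does not guarantee that the memory is returned to the OS immediately; that depends on the runtime and allocator, so a native client would have to free or unmap its buffers explicitly.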

Oh, and I am surprised at the idea of running a SINGLE instance on a hyperthread-capable machine (although *I* usually do this because I don't like HT). I know some people run TWO p95 instances to use both threads for p95. If so, would this detection method not result in one instance going to sleep unless you start them almost simultaneously?

For this reason maybe this 30sec (or whatever) sleeping behaviour should be toggled by a menu option rather than be the default mode of operation.

Would this apply just to LL iterations, or will you also put this enhancement into the trial factoring iteration times?
Old 2005-09-27, 14:44   #5
Cruelty
 
May 2005

2³·7·29 Posts

I'm not sure if this is what you actually mean, but prime95 already has something like that:
PauseWhileRunning=prog1,prog2,prog3,etc

Check "undoc.txt" in your prime95 directory for details.
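To make the idea concrete, here is a rough sketch of how a setting of that shape could be interpreted. It is not taken from prime95: the function names are invented, the list of running processes is supplied by the caller, and case-insensitive substring matching is an assumption (undoc.txt is the authority on the real behaviour).

```python
def parse_pause_list(line):
    """Parse a 'PauseWhileRunning=a,b,c' line into a list of lowercase names."""
    key, _, value = line.partition("=")
    if key.strip() != "PauseWhileRunning":
        return []
    return [name.strip().lower() for name in value.split(",") if name.strip()]


def should_pause(pause_list, running_processes):
    """True if any listed name occurs in the name of a running process.
    Substring matching is an assumption made for this sketch."""
    running = [proc.lower() for proc in running_processes]
    return any(name in proc for name in pause_list for proc in running)
```

For example, with the list `prog1,prog2,prog3`, a running `PROG2.EXE` would trigger a pause while `explorer.exe` alone would not.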
Old 2005-09-27, 15:05   #6
Peter Nelson
 
Oct 2004

529₁₀ Posts

Thanks Cruelty, I'd come across that option before.

I might try it for playing Counter-Strike :-)

I think what George intends is to make it more generic, so that you don't have to know in advance the name of the game you will play.
Old 2005-09-27, 18:39   #7
PhilF
 
Feb 2005
Colorado

2²·163 Posts

George, if you do implement your idea, please make it an option and not automatic. Personally, I like the fact that when I am watching video, for example, Prime95 still gets a good piece of CPU time because of Hyper Threading. If I ever do anything that needs all of the CPU, I just do as Peter does and stop Prime95.

I'm not sure about this, but I think on Hyper Threaded machines Prime95 "steals" fewer clock cycles from other programs if the affinity is set to 1 or 2 instead of zero. I say this because Prime95 suffers less of a slowdown when I am running other CPU-intensive programs and Prime95's affinity is set to zero. I leave mine set to zero, since in my case this "stealing" causes no problems with the other programs I run. I can provide you with some actual numbers if you like.
Old 2005-09-28, 02:20   #8
Peter Nelson
 
Oct 2004

1021₈ Posts

OK George, I'm running your test binary on my hyperthreading machine.

It's Win XP 32-bit with SP2.

P4 Northwood @ 3GHz.

I allowed it to do self-testing, which passed.

It's now working on a Mersenne number.

At the moment it's doing P-1 stage 1.

I have set the screen iterations to every 100.

It reports on-screen timings of about 8 seconds.

Task Manager shows this as 48, 49 or 50% CPU, with the activity in the left-hand (first virtual CPU) performance graph.

Now I run my CS:Source game and the graph of the second virtual CPU goes right up.

The reported time rises to 13 seconds.

But it just keeps running.

The game is noticeably impaired (subjectively by about 30 fps, with slower reactions, which means I get killed more).

Now, perhaps your modifications don't apply in P-1, only in actual LL test iterations.

I will leave it going and report timings and behaviour during the LL stage.

It might be worthwhile if your binary said something in the window like "going to sleep for 30 seconds to allow other tasks" and "processing resumed".

This would allow us to see when the functionality kicks in and out (if at all).

Right now, in P-1, it appears just to keep running.
Old 2005-09-28, 02:55   #9
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,529 Posts

The new code doesn't kick in until you are LL testing.
Old 2005-09-28, 14:23   #10
Mystwalker
 
Jul 2004
Potsdam, Germany

3×277 Posts

First, I want to state that I consider this feature a brilliant one. I have already had some problems running DC programs on HT systems. E.g. compile times are very high, so I typically turn off the programs beforehand. When running 2 programs simultaneously to make use of HT, even videos stutter from time to time!

Taking one of Peter's ideas further, a number of users run two different DC applications to make best use of hyperthreading.
Maybe there could be an option similar to PauseWhileRunning that lists programs (the other DC program(s) of choice) which should be ignored when deciding whether to pause.

Maybe it would be feasible to check the need for CPU power directly from a CPU load source of the OS?
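As a rough illustration of that idea: given two samples of cumulative CPU counters, the busy fraction over the interval is a simple ratio. How the counters are obtained is platform-specific and deliberately left out; the function names, the tuple layout and the 25% limit below are assumptions for this sketch, not an existing prime95 or OS API.

```python
def busy_fraction(earlier, later):
    """Fraction of CPU time spent busy between two (busy_ticks, total_ticks)
    samples, summed over all logical CPUs."""
    busy = later[0] - earlier[0]
    total = later[1] - earlier[1]
    if total <= 0:
        return 0.0
    return busy / total


def others_are_busy(system_fraction, own_fraction, limit=0.25):
    """True if load not caused by this program exceeds `limit` of total capacity.
    The 25% default is an arbitrary illustrative threshold."""
    return (system_fraction - own_fraction) > limit
```

On a single hyperthreaded CPU, one DC client alone shows up as roughly 50% system load, so a system load of 90% with the client's own share at 50% would suggest another program wants the CPU.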
Old 2005-09-28, 16:00   #11
Peter Nelson
 
Oct 2004

23² Posts
MULTI INSTANCES INTEGRATION and ESTIMATING COMPLETION DATES

I think it is possible that a second instance of P95 might cause a first to pause (undesirable).

I agree P95 could ask the OS about load.

Also, the software can and should query the system to discover how many physical and/or virtual processors exist on the hardware. There are standard, documented ways to do this easily.

If the software knew this, it could adapt its sleeping behaviour accordingly.
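A minimal sketch of that query, in Python for brevity: `os.cpu_count()` reports logical CPUs (hyperthreads count separately) and does not distinguish physical from virtual processors, which needs platform-specific calls not shown here. The policy function is purely hypothetical.

```python
import os


def logical_cpu_count():
    """Number of logical CPUs the OS reports; falls back to 1 if unknown."""
    return os.cpu_count() or 1


def pause_heuristic_enabled(logical_cpus, instances_running):
    """Hypothetical policy: use the slowdown-based pause only when a single
    instance runs on a machine with at least two logical CPUs, so that a
    second instance is never mistaken for a foreground task."""
    return logical_cpus >= 2 and instances_running == 1
```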

I would PREFER to see effort going into a client which can invoke and manage multiple threads.

ie one prime95 menu to set options etc for all instances (which may not necessarily be the same settings).

eg menu

VP0 is running LL on 1234...... 23% complete using 25% cpu load
VP1 is running TF on 9876..... to 63 bits 85% complete using 25% cpu load
VP2 is running LL on 2345..... 99% complete using 25% cpu load
VP3 is running LL on 3455 .... 4% complete using 25% cpu load

Select processor thread to work with: 3

Currently testing 3455........ (LL test)

Would you like to pause this process?
Would you like to return this exponent? N

The STOP, Continue menu options could default to operating on all four threads.

This is an example for a machine with 2 physical processors, each with 2 virtual processors, for a total of 4 threads.

i.e. ONE PROGRAM manages all FOUR threads (which may continue to have their config files in their own directories etc).
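The one-program-many-workers layout could be sketched like this. It only shows the control structure (per-worker pause and resume, with stop/continue defaulting to all workers); the class names are invented, the "work" is a placeholder, and a real client would use native threads or separate processes, since Python threads do not run CPU-bound work in parallel.

```python
import threading


class Worker:
    """One worker thread; `running` is cleared to pause it and set to resume."""

    def __init__(self, name, task):
        self.name = name
        self.task = task                      # e.g. "LL" or "TF"
        self.iterations = 0
        self.running = threading.Event()
        self.running.set()
        self.stopped = threading.Event()
        self.thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        while not self.stopped.is_set():
            # Blocks while paused; wakes periodically to notice a stop request.
            if self.running.wait(timeout=0.05):
                self.iterations += 1          # placeholder for one real iteration
                self.stopped.wait(0.001)      # stands in for the time the work takes


class Manager:
    """Single front end that starts, pauses, resumes and stops every worker."""

    def __init__(self):
        self.workers = []

    def add(self, name, task):
        worker = Worker(name, task)
        self.workers.append(worker)
        worker.thread.start()
        return worker

    def _select(self, index):
        return self.workers if index is None else [self.workers[index]]

    def pause(self, index=None):
        """Pause one worker, or all of them when no index is given."""
        for worker in self._select(index):
            worker.running.clear()

    def resume(self, index=None):
        for worker in self._select(index):
            worker.running.set()

    def stop_all(self):
        for worker in self.workers:
            worker.stopped.set()
            worker.running.set()              # wake paused workers so they can exit
        for worker in self.workers:
            worker.thread.join()

    def status(self):
        return [
            f"{w.name} is {'running' if w.running.is_set() else 'paused'} {w.task}"
            for w in self.workers
        ]
```

With four workers added as VP0 to VP3, `pause(3)` pauses only the last one, while `pause()` and `resume()` with no argument act on all four, matching the menu default described above.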

I appreciate this might not be simple, but it is the way to go, as we have dual-core CPUs (with or without HT) now and multi-core CPUs are coming soon.

This might be best implemented by separating the configuration menu and instance launcher from the p95 "engine" that does the actual work.

Also, by knowing that a machine can run 2 LL tests (say) in parallel (e.g. one per CPU), the client would be better able to calculate completion times for queued workloads.

You know the clock time taken to test one exponent (through timing iterations), BUT because the client is currently focussed on ONE thread only, it doesn't know that the next two exponents will be worked on in parallel rather than sequentially. The finish time is therefore sooner than predicted. The client could retrieve work from the server as one process and pass it out to each thread. It would be better this way because the menu would know what work is on the whole machine rather than just in its own process.
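The difference between the sequential and parallel estimates can be sketched as below. The function is illustrative only: it assumes each job's duration is already known for the case where all workers are busy, and it ignores the slowdown that two hyperthreads sharing one physical CPU impose on each other.

```python
import heapq


def completion_times(durations, workers):
    """Finish time of each queued job when `workers` jobs run in parallel.
    Jobs start in queue order on whichever worker becomes free first."""
    free_at = [0.0] * workers          # time at which each worker becomes free
    heapq.heapify(free_at)
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished
```

With three 30-day exponents queued, one worker finishes them at days 30, 60 and 90, while two workers finish them at days 30, 30 and 60.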

In time the server might also be improved (v5?) to show more accurate expected completion dates. They are currently miscalculated even for a single-threaded machine, because the estimate is based on when the machine requested the number rather than when it is due to begin working on it. This needs fixing!

What do you think?
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.