mersenneforum.org  


View Poll Results: 4 cores on 1 worker, or 4 workers on 1 core each?

Option                                                      Votes   Share
4 cores-1 worker                                            2       16.67%
It's pretty much the same                                   1       8.33%
It depends on the number and the method (elaborate below)   9       75.00%
4 workers 1 core each                                       0       0%

Multiple Choice Poll. Voters: 12.

2016-02-04, 09:16   #1
arbiter21
Feb 2016
8₁₆ Posts

Is it faster to run 1 worker or many?

Hi, I'm new to GIMPS and I saw this post http://www.mersenneforum.org/showthread.php?t=20570 and I would like a more detailed answer.
I have a 4-core i7 4790. Is it faster to run all 4 cores on 1 worker, or to have 4 workers working on 1 core each, and why?
I did a little experiment and it seemed the same: with 4 exponents at about 40M, when running 4 workers each needed about 205 hours, while 1 worker with 4 cores needed just 50 hours per exponent.
2016-02-04, 09:40   #2
axn
Jun 2003
13×359 Posts

Quote:
Originally Posted by arbiter21
I did a little experiment and it seemed the same: with 4 exponents at about 40M, when running 4 workers each needed about 205 hours, while 1 worker with 4 cores needed just 50 hours per exponent.

A poll? This is not a matter of public opinion - it is a matter of fact.

Test it. Run each configuration for a while. Go with whatever works for you.
2016-02-04, 10:21   #3
arbiter21
Feb 2016
2³ Posts

Quote:
Originally Posted by axn
A poll? This is not a matter of public opinion - it is a matter of fact.

Test it. Run each configuration for a while. Go with whatever works for you.

I liked it xD. Btw, you didn't answer the question: I tested it and saw it was exactly the same, but I read that it's not... that's why I asked.
2016-02-04, 10:50   #4
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts

Mostly, it's a matter of memory bandwidth contention. Modern processors are so fast, and the code so well optimized, that they crunch the data faster than they can read it from memory. So unless you have quad-channel or DDR4 or high-frequency memory (or some combination of those), there's a good chance that there's memory contention. And if there's memory contention, it can sometimes be worthwhile to run fewer workers with more cores per worker. The trade-off is that more cores per worker is typically less efficient at the actual crunching, while fewer workers use less memory bandwidth. So if you have a modern processor where the data crunching far outstrips the memory bandwidth, then yes, the better trade-off may be to run two or four cores per worker.

Like axn said though, it's not a matter of opinion, nor is it the same answer for every person or hardware configuration. The only real way to know is to test it for yourself, which you had already done before posting, hence the mild snark.

It does seem that, in your case, four cores all on one worker is more efficient than four workers with a core each. But, have you tried two workers with two cores each? The other question is, have you disabled hyperthreading? Prime95 is one of perhaps a handful of applications out there where the code is so well tuned that hyperthreading actually hurts rather than helps, and it may be influencing your tests.
2016-02-04, 10:51   #5
axn
Jun 2003
13·359 Posts

Quote:
Originally Posted by arbiter21
I tested it and I saw it was exactly the same, but I read it's not...

Who you gonna believe -- your lyin' eyes or what the internet tells you?

The amount of calculation required is not going to change whichever way you do it, so in some fundamental sense, these ought to be the same. However, when multiple tests are running, they all fight for the cache, but when only one test is running (multi-threaded), there is the overhead of thread synchronization and whatnot. Which of these is the lesser of the two evils depends on the exponent size, number of cores, L3 cache size, memory bandwidth, etc.

The only general comment is to test your specific setup for a given FFT size and select whichever one seems to give the best throughput.

Last fiddled with by axn on 2016-02-04 at 10:52
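
To make "best throughput" concrete, here is a minimal Python sketch that compares the two configurations by exponents finished per day rather than by hours per exponent. The timings are the ones reported in post #1; the function and variable names are made up for this illustration and are not part of Prime95.

```python
def exponents_per_day(workers: int, hours_per_exponent: float) -> float:
    """Throughput when each worker finishes one exponent every `hours_per_exponent` hours."""
    return workers * 24.0 / hours_per_exponent


# Timings from post #1 (about 40M exponents on a 4-core i7 4790).
configs = {
    "4 workers x 1 core": exponents_per_day(4, 205.0),
    "1 worker x 4 cores": exponents_per_day(1, 50.0),
}

# Pick the configuration with the highest throughput.
best = max(configs, key=configs.get)

for name, rate in configs.items():
    print(f"{name}: {rate:.3f} exponents/day")
print("best:", best)
```

With these particular numbers the two setups differ by only a few percent, which fits the "it seemed the same" observation; on other hardware or FFT sizes the gap may be larger or reversed, so the timings need to be measured per system.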
2016-02-04, 11:58   #6
arbiter21
Feb 2016
2³ Posts

Thank you both for your answers. Something else I noticed, which is kinda weird:
I have 8 threads; if I give all 8 of them to 1 worker, it needs 48h to complete.
If I give it 4, it again needs 48h to complete. I notice that when cpu1 works at 100%, cpu2 is at 0%, and then cpu1 goes to 0% and cpu2 to 100%; the same thing happens with cpu 3-4, 5-6 and 7-8.
What's going on here? 50% CPU usage but the same time as with 100%?
2016-02-04, 12:06   #7
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts

That's the hyperthreading I mentioned. There are only 4 cores, and if you're in Windows, then what the task manager shows as cores 1 and 2 is actually two threads running on the same physical core. Ditto 3-4 is one physical core, 5-6, and 7-8.
2016-02-04, 12:09   #8
arbiter21
Feb 2016
8₁₆ Posts

Quote:
Originally Posted by Dubslow
That's the hyperthreading I mentioned. There are only 4 cores, and if you're in Windows, then what the task manager shows as cores 1 and 2 is actually two threads running on the same physical core. Ditto 3-4 is one physical core, 5-6, and 7-8.

So how does it work, can you elaborate? How come with 50% CPU it's doing the same work as with 100%? Thanks!
2016-02-04, 12:19   #9
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts

If Prime95 is using either virtual core 1 or 2, then physical core 1 is being fully utilized, even if Windows only sees one of the two threads as "in use".

So if Prime95 is running 4 threads, one each on virtual cores 1, 3, 5 and 7, then all four physical cores are being fully utilized, even though Windows sees virtual cores 2, 4, 6 and 8 as being "empty", and so shows 50%.
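
The arithmetic behind that 50% figure can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the thread: virtual cores are numbered from 1 and paired as 1-2, 3-4, 5-6, 7-8, and a physical core counts as busy as soon as one of its two virtual cores is busy. The `utilization` function is hypothetical and does not query the operating system.

```python
from typing import Iterable, Tuple


def utilization(busy_virtual: Iterable[int],
                physical_cores: int = 4,
                threads_per_core: int = 2) -> Tuple[float, float]:
    """Return (percent of virtual cores busy, percent of physical cores busy).

    Virtual cores are numbered from 1; cores 1-2 share physical core 0,
    3-4 share physical core 1, and so on (an assumption from the thread).
    """
    busy = set(busy_virtual)
    total_virtual = physical_cores * threads_per_core
    virtual_pct = 100.0 * len(busy) / total_virtual
    # Map each busy virtual core to the physical core it lives on.
    busy_physical = {(cpu - 1) // threads_per_core for cpu in busy}
    physical_pct = 100.0 * len(busy_physical) / physical_cores
    return virtual_pct, physical_pct


# Prime95 with 4 threads on virtual cores 1, 3, 5 and 7.
four_threads = utilization({1, 3, 5, 7})
# Prime95 with 8 threads, one per virtual core.
eight_threads = utilization(range(1, 9))

print("4 threads:", four_threads)
print("8 threads:", eight_threads)
```

In this simplified model the task-manager style figure doubles between the two cases while the number of busy physical cores stays the same, which is why the run time barely changes.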
2016-02-04, 12:29   #10
arbiter21
Feb 2016
8₁₆ Posts
Quote:
Originally Posted by Dubslow
If Prime95 is using either virtual cores 1 or 2, then physical core 1 is being fully utilized, even if Windows only sees one of two threads as "in use".

So if Prime95 is running 4 threads, one each on the virtual cores 1, 3, 5 and 7, then all four physical processors are being fully utilized, even though Windows sees virtual cores 2, 4, 6 and 8 as being "empty", and so shows 50%.

That's interesting... So when I, for example, watch a video and CPU usage goes from 50% to 60%, Prime95 actually throttles to let me use that extra 10% CPU, right? (Not sure if "throttles" is the right word.)
Moreover, when running 4 threads instead of 8 I notice a small difference in ms/iter time: with 4 threads it's about 4 ms/iter, while with 8 it's 4.5 ms/iter. Is that because of what you said about multithreading? (By the way, thank you for your time and information.)

Last fiddled with by arbiter21 on 2016-02-04 at 12:30
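
For a sense of what that ms/iter difference means over a whole test, here is a rough Python estimate. It assumes a Lucas-Lehmer test, which needs p - 2 squarings for the exponent p, an exponent of exactly 40,000,000 (the posts only say "about 40M"), and a constant iteration time; `ll_test_hours` is a name invented for this sketch.

```python
def ll_test_hours(exponent: int, ms_per_iter: float) -> float:
    """Estimated wall-clock hours for a Lucas-Lehmer test of 2^exponent - 1."""
    iterations = exponent - 2  # LL test performs p - 2 squarings
    seconds = iterations * ms_per_iter / 1000.0
    return seconds / 3600.0


EXPONENT = 40_000_000  # assumed; the thread only says "about 40M"

hours_4_threads = ll_test_hours(EXPONENT, 4.0)  # 4 threads, about 4 ms/iter
hours_8_threads = ll_test_hours(EXPONENT, 4.5)  # 8 threads, about 4.5 ms/iter

print(f"4.0 ms/iter: {hours_4_threads:.1f} hours")
print(f"4.5 ms/iter: {hours_8_threads:.1f} hours")
```

Both estimates land in the same ballpark as the roughly 48 to 50 hours reported earlier in the thread, and the half-millisecond gap adds up to several hours per exponent.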
2016-02-04, 13:48   #11
LaurV
Romulan Interpreter
Jun 2011
Thailand
5×11×157 Posts

@OP:
1. Go to the "options" menu in P95,
2. select "benchmark",
3. let it finish,
4. analyze the numbers (they are saved in the result file too, where you can use your favorite viewer)
5. ask when you don't know...
...
31. Profit

Every system is different.

Last fiddled with by LaurV on 2016-02-04 at 13:48

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.