mersenneforum.org  

2015-10-24, 00:57   #1
saeres
 
Oct 2015

20₁₀ Posts
Multiple Core for 1 assignment

Is there a program that lets one use multiple CPU cores to complete a single assignment? I read that someone managed to verify a Mersenne prime using 32 server cores, and I would like to take advantage of such a utility.
2015-10-24, 01:12   #2
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2²·2,767 Posts

The current version of Prime95 can use many cores on a single assignment.
2015-10-24, 02:03   #3
saeres
 
Oct 2015

2²×5 Posts

I'm running 3 assignments on 1 quad core and together they might reach 80% of total CPU power. I'm looking for 1 assignment to be computed by nearly 100% of the CPU (1 assignment might use 30% max).
2015-10-24, 02:37   #4
saeres
 
Oct 2015

2²×5 Posts

Never mind. For some reason Prime95 showed that I had only a dual core; now it's registering all 4 cores and allowing me to put the assignment on all four cores.
2015-10-24, 05:40   #5
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2²·2,767 Posts

Just so that you know:

You will get more assignments done for a given time (assuming it is long enough to be significant) by having each core work on its own assignment. Having 4 cores work on a single assignment will not get 4x the speed of a single core. It will be closer to 2.7x faster. There are others that have the data to demonstrate this. Run the tests. Use a single core and note the iteration times, then 2 cores, then 3, then 4.
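
The test described above can be tabulated with a few lines of Python. This is only a sketch: the function name `scaling_table` is made up for illustration, and the iteration times are hypothetical placeholders (picked so that 4 cores come out near 2.7x), not measurements from any real machine. Replace them with the times Prime95 reports on your own system.

```python
def scaling_table(ms_per_iter):
    """Return (cores, speedup, efficiency) rows from per-iteration times.

    ms_per_iter maps the number of cores given to ONE worker to the
    iteration time (in milliseconds) that worker reported.
    """
    base = ms_per_iter[1]  # single-core time is the reference point
    rows = []
    for cores in sorted(ms_per_iter):
        speedup = base / ms_per_iter[cores]
        # Efficiency is the fraction of ideal (linear) scaling achieved.
        rows.append((cores, speedup, speedup / cores))
    return rows


# Hypothetical timings, NOT measurements: substitute your own numbers.
example = {1: 20.0, 2: 11.0, 3: 8.5, 4: 7.4}

for cores, speedup, eff in scaling_table(example):
    print(f"{cores} core(s): {speedup:.2f}x speedup, {eff:.0%} efficiency")
```

An efficiency well below 100% at 4 cores is what "closer to 2.7x" looks like in practice.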
2016-02-03, 02:48   #6
billvau
 
Feb 2016

7 Posts

Uncwilly - just saw your post on assigning cores. I'm a newbie who just started 2 days ago. I thought that when setting up the program I had to tell it the number of CPUs to use, but now I can't find that setting. Can you point me to the configuration documentation? Or do you know a "best" configuration?

Thanks much,
Bill
2016-02-03, 05:17   #7
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts

To set workers, go to (menu) Test>Worker windows. Welcome to the project!
2016-02-03, 13:30   #8
billvau
 
Feb 2016

7 Posts

Thanks, kladner!
2016-02-03, 20:19   #9
cuBerBruce
 
Aug 2012
Mass., USA

100111110₂ Posts

Quote:
Originally Posted by Uncwilly
Just so that you know:

You will get more assignments done for a given time (assuming it is long enough to be significant) by having each core work on its own assignment. Having 4 cores work on a single assignment will not get 4x the speed of a single core. It will be closer to 2.7x faster. There are others that have the data to demonstrate this. Run the tests. Use a single core and note the iteration times, then 2 cores, then 3, then 4.
Beware of what Uncwilly is saying. He is comparing 1 worker with 4 cores versus 1 worker with 1 core, rather than comparing 4 workers with 1 core each versus 1 worker with 4 cores.

On my Haswell system, I get much less than 4x throughput when running 4 workers on 1 core each versus running 1 worker on a single core, just like Uncwilly's example. However, I get basically the same throughput running 1 worker with 4 cores as I do running 4 workers using 1 core each. My throughput seems to depend only on the number of physical cores being used, with essentially no significant effect from the number of cores per worker (as long as only 1 thread per physical core is being used).
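
To make the comparison concrete, total throughput can be computed as workers divided by iteration time. The sketch below assumes all workers run exponents of similar size, so that their iterations are comparable. The helper `throughput` and every timing in `configs` are hypothetical, chosen only to mirror the pattern described above, not taken from a real run.

```python
def throughput(workers, ms_per_iter):
    """Total iterations per second across all workers.

    Assumes every worker reports roughly the same iteration time and
    works on an exponent of similar size.
    """
    return workers * 1000.0 / ms_per_iter


# Hypothetical timings (ms per iteration), NOT measurements.
configs = {
    "1 worker x 1 core": (1, 20.0),
    "4 workers x 1 core": (4, 29.0),
    "1 worker x 4 cores": (1, 7.4),
}

baseline = throughput(*configs["1 worker x 1 core"])
for name, (workers, ms) in configs.items():
    total = throughput(workers, ms)
    print(f"{name}: {total:.1f} iter/s ({total / baseline:.2f}x baseline)")
```

With numbers like these, both 4-core setups land well short of 4x the single-core baseline and close to each other, which is the fair comparison to make.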

You also need to make sure each thread is running on a separate physical core. I have given up on relying on "smart assignment" to do the right thing, and simply edit my local.txt to make sure affinity is set up properly.
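
Before setting affinity by hand, it helps to know which logical CPUs share a physical core. The sketch below is Linux-only and is not Prime95's local.txt syntax (the exact settings are not shown in this thread). It reads the kernel's `thread_siblings_list` files under sysfs and picks one logical CPU per physical core. The helper names are made up for illustration.

```python
import glob


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,4' or '0-1' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def one_cpu_per_core(sibling_lists):
    """Pick the lowest-numbered logical CPU from each distinct sibling group."""
    groups = {frozenset(parse_cpu_list(s)) for s in sibling_lists}
    return sorted(min(g) for g in groups if g)


def read_sibling_lists():
    """Read the sibling list of every logical CPU from sysfs (Linux only)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                lists.append(f.read())
        except OSError:
            pass  # skip entries we are not allowed to read
    return lists


if __name__ == "__main__":
    # On non-Linux systems the glob finds nothing and the list is empty.
    print("one logical CPU per physical core:", one_cpu_per_core(read_sibling_lists()))
```

On a hyperthreaded quad core whose siblings are paired as 0,4 / 1,5 / 2,6 / 3,7, this selects CPUs 0 to 3. Those are the logical CPUs you would want the four threads pinned to.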

Also, do not use CPU usage percentage as a measure of how much work is actually getting done.
2016-02-03, 20:40   #10
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2C6E₁₆ Posts

Quote:
Originally Posted by cuBerBruce
You also need to make sure each thread is running on a separate physical core. I have given up on relying on "smart assignment" to do the right thing, and simply edit my local.txt to make sure affinity is set up properly.
I completely agree with you. After a great deal of experimentation, I found that running 4 tests on 4 real CPUs generated about the same net throughput as running 1 test on 4 real CPUs.

Can anyone say memory/cache bottleneck?

However, I find it a bit disappointing that Prime95 / mprime can't seem to figure out the optimal affinity on its own. When you have threads jumping around all the available CPUs (including the virtual ones) your net throughput is significantly affected.
2016-02-04, 01:52   #11
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

41×251 Posts

Quote:
Originally Posted by cuBerBruce
I get basically the same throughput running 1 worker with 4 cores as I do running 4 workers using 1 core each
This only shows that your system does not have a memory bandwidth bottleneck for the range of exponents (FFT sizes) you are doing, or that if it has one, it manifests itself in both cases. In the former case, increasing the exponent will start showing differences in favor of 1 worker with 4 cores; in the latter, decreasing the exponent will start showing differences in favor of 4 workers with 1 core each. On all the systems I have ever played with, the highest throughput I could get was (almost) always n workers with 1 core each, where n is the number of physical cores. With very rare exceptions, on strange systems, you can sometimes gain a few percent by using HT (doubling the number of workers, or using 2 cores for each worker), at the cost that your system runs much hotter, is less responsive and less stable, the CPU consumes almost double the power, etc. But generally, n workers with 1 core each is best, provided your system has enough memory bandwidth. Again, this depends on the exponent, cache size, etc.
[edit: for the record, I use water cooling mostly, and I do overclock]
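
A very rough way to see why exponent and cache size matter is to compare the size of the FFT data with the last-level cache. The sketch below rests on simplifying assumptions: 8 bytes (one double) per FFT element, one independent FFT per worker, and nothing else competing for the cache. It ignores lookup tables and everything else the program keeps in memory. The FFT lengths and the 8 MiB cache are illustrative values, not figures from this thread.

```python
BYTES_PER_DOUBLE = 8  # assumption: one double-precision value per FFT element


def working_set_mib(fft_length, workers):
    """Approximate FFT data size in MiB when each worker has its own FFT."""
    return workers * fft_length * BYTES_PER_DOUBLE / 2**20


def fits_in_cache(fft_length, workers, cache_mib):
    """True if the combined FFT data would fit in a cache of cache_mib MiB."""
    return working_set_mib(fft_length, workers) <= cache_mib


CACHE_MIB = 8  # hypothetical last-level cache size

for fft_k in (256, 512, 4096):  # FFT lengths in units of 1024 elements
    n = fft_k * 1024
    for workers in (1, 4):
        size = working_set_mib(n, workers)
        verdict = "fits" if fits_in_cache(n, workers, CACHE_MIB) else "spills to RAM"
        print(f"{fft_k}K FFT, {workers} worker(s): {size:.0f} MiB, {verdict}")
```

Under these assumptions, four separate workers have four times the data of one worker sharing a single FFT across four cores, so they hit the memory bus sooner as the FFT size grows.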

Last fiddled with by LaurV on 2016-02-04 at 01:59

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.