mersenneforum.org  

Old 2010-11-18, 04:03   #12
otutusaus
 
Nov 2010
Ann Arbor, MI

2×47 Posts

Quote:
Originally Posted by Uncwilly View Post
TF does not gain through-put by trying to do it on multiple cores. You will turn in more results per time period by having one core per test.
I don't see why. One factor can be checked in every core; when a core is done with one factor, it takes the next on the list. I think that can be efficient and it doesn't seem difficult to implement.
Old 2010-11-18, 06:11   #13
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

265A₁₆ Posts

Quote:
Originally Posted by Uncwilly View Post
TF does not gain through-put by trying to do it on multiple cores. You will turn in more results per time period by having one core per test.
Quote:
Originally Posted by otutusaus View Post
I don't see why. One factor can be checked in every core; when a core is done with one factor, it takes the next on the list. I think that can be efficient and it doesn't seem difficult to implement.
Let me clarify: Splitting one TF test across multiple cores does not improve throughput. Running one TF per core (all working on separate numbers) gives the best throughput.
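For readers following along, here is a minimal sketch of what a single TF job does. It assumes the standard facts that any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1 and satisfies q ≡ ±1 (mod 8). The name `trial_factor`, the exponent list, and the tiny 24-bit limit are illustrative only; this is not Prime95's code. Each exponent is an independent job, so each call could run on its own core with no coordination between them.

```python
def trial_factor(p, max_bits):
    """Return the smallest factor of 2**p - 1 below 2**max_bits, or None."""
    limit = 1 << max_bits
    k = 1
    while True:
        q = 2 * k * p + 1          # candidates have the form 2kp + 1
        if q >= limit:
            return None            # nothing found up to this bit level
        # q divides 2**p - 1 exactly when 2**p is congruent to 1 mod q
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        k += 1

# Hypothetical work list: one independent job per exponent.
exponents = [11, 23, 29, 37, 41]
results = {p: trial_factor(p, 24) for p in exponents}
print(results)
```

Because no job needs anything from another, running one such job per core loses nothing to coordination.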
Old 2010-11-18, 06:32   #14
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

1100001101001₂ Posts

Quote:
Originally Posted by otutusaus View Post
I don't see why. One factor can be checked in every core; when a core is done with one factor, it takes the next on the list. I think that can be efficient and it doesn't seem difficult to implement.
In principle, yes. However, in practice, there is a certain latency associated with interprocess communication between cores. The TF code (as I understand it) does not straightforwardly do one factor candidate, then the next, then the next; it takes advantage of certain algorithmic shortcuts that entail doing the factor candidates within a particular bit level out of order. To split this up requires periodic (on the order of milliseconds) communication between threads to coordinate their effort. (Don't ask me why this is, I don't fully understand the specifics.)

You are correct, though, that TF does naturally lend itself better to multithreading than other worktypes. Similar programs used to search for other (non-Mersenne) types of primes have implemented such multithreading to great effect. However, even the best-optimized multithreaded programs will still have some performance loss compared to running separate jobs on each core--ideally this is kept down to <1-2% or so, but there is a loss nonetheless. This is why single-exponent multithreaded TF hasn't been a priority at GIMPS to date; as individual TF bit-level assignments take only a few hours, there would be very little benefit at this point to splitting them over multiple cores.
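To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The model is an assumption, not a measurement: it applies a fixed fractional loss (`overhead`) when one assignment is split across all cores, and the figures of 4 cores, 3 hours per bit-level assignment, and a 2% loss are hypothetical. The function name `compare` is made up for illustration.

```python
def compare(cores, hours_per_job, overhead):
    """Compare separate jobs per core with one job split over all cores."""
    # Separate jobs: every core completes one job per hours_per_job.
    separate_rate = cores / hours_per_job
    # Split job: ideal speedup of `cores`, reduced by the coordination loss.
    split_rate = cores * (1.0 - overhead) / hours_per_job
    # Wall-clock time until the first result arrives in each mode.
    separate_latency = hours_per_job
    split_latency = hours_per_job / (cores * (1.0 - overhead))
    return separate_rate, split_rate, separate_latency, split_latency

sep_rate, split_rate, sep_lat, split_lat = compare(4, 3.0, 0.02)
print(f"jobs/hour: separate={sep_rate:.3f}, split={split_rate:.3f}")
print(f"hours to first result: separate={sep_lat:.2f}, split={split_lat:.2f}")
```

Under these assumptions, splitting only shortens the wait for a single result; total results per hour go down slightly, which is why there is little to gain when an assignment already takes only a few hours.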
Old 2010-11-18, 13:50   #15
otutusaus
 
Nov 2010
Ann Arbor, MI

2·47 Posts

Quote:
Originally Posted by mdettweiler View Post
there is a certain latency associated with interprocess communication between cores.
Quote:
Originally Posted by mdettweiler View Post
However, even the best-optimized multithreaded programs will still have some performance loss compared to running separate jobs on each core--ideally this is kept down to <1-2% or so, but there is a loss nonetheless.
Whatever associated loss there is, it's already there now. I am not an expert, but when I run a TF I can see in the task manager that the job is already shared between cores (amounting to a total of not more than a single core's worth of work). So the "interprocess communication between cores" is already happening!
Overall I don't see why extending the process to all cores should slow it down much more.

Last fiddled with by otutusaus on 2010-11-18 at 13:51
Old 2010-11-18, 14:38   #16
R.D. Silverman
 
Nov 2003

2²×5×373 Posts

Quote:
Originally Posted by otutusaus View Post
I don't see why. One factor can be checked in every core; when a core is done with one factor, it takes the next on the list. I think that can be efficient and it doesn't seem difficult to implement.
Why would you want to?

If you are doing TF on (say) 10 different Mersenne candidates, it is
even MORE efficient to devote a single core to each candidate.

Ask yourself if you can make a (piece of) string longer by cutting it into
pieces and tying the pieces together.
Old 2010-11-18, 14:39   #17
R.D. Silverman
 
Nov 2003

2²·5·373 Posts

Quote:
Originally Posted by mdettweiler View Post
In principle, yes. However, in practice, there is a certain latency associated with interprocess communication between cores. The TF code (as I understand it) does not straightforwardly do one factor candidate, then the next, then the next; it takes advantage of certain algorithmic shortcuts that entail doing the factor candidates within a particular bit level out of order. To split this up requires periodic (on the order of milliseconds) communication between threads to coordinate their effort. (Don't ask me why this is, I don't fully understand the specifics.)

You are correct, though, that TF does naturally lend itself better to multithreading than other worktypes. Similar programs used to search for other (non-Mersenne) types of primes have implemented such multithreading to great effect. However, even the best-optimized multithreaded programs will still have some performance loss compared to running separate jobs on each core--ideally this is kept down to <1-2% or so, but there is a loss nonetheless. This is why single-exponent multithreaded TF hasn't been a priority at GIMPS to date; as individual TF bit-level assignments take only a few hours, there would be very little benefit at this point to splitting them over multiple cores.
Reading common sense is so pleasant!
Old 2010-11-18, 14:46   #18
otutusaus
 
Nov 2010
Ann Arbor, MI

2×47 Posts

Quote:
Originally Posted by R.D. Silverman View Post
Why would you want to?

If you are doing TF on (say) 10 different Mersenne candidates, it is
even MORE efficient to devote a single core to each candidate.

Ask yourself if you can make a (piece of) string longer by cutting it into
pieces and tying the pieces together.
Already answered in previous post:

Quote:
Originally Posted by otutusaus View Post
Whatever associated loss there is, it's already there now. I am not an expert, but when I run a TF I can see in the task manager that the job is already shared between cores (amounting to a total of not more than a single core's worth of work). So the "interprocess communication between cores" is already happening!
Overall I don't see why extending the process to all cores should slow it down much more.
Old 2010-11-18, 14:57   #19
R.D. Silverman
 
Nov 2003

2²×5×373 Posts

Quote:
Originally Posted by otutusaus View Post
Already answered in previous post:
It was not answered.

You should rename yourself obtuseosaurus.

If you want to be argumentative, go somewhere else. Your question
was answered by several different people.
Old 2010-11-18, 15:02   #20
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
Originally Posted by otutusaus View Post
Whatever associated loss there is, it's already there now. I am not an expert, but when I run a TF I can see in the task manager that the job is already shared between cores
I'm not sure how to explain the observed behavior. (Is it really running on one core and the task manager is somehow wrong? Is it a single thread switching between cores? I don't know, but in any case it's still just one thread, and is appropriately fast. If you want to experiment, tell Prime95 to put that worker on a specific core and see what happens to the speed and what appears in the task manager.) But just accept the fact that it is slower to run a multi-threaded job than many single-threaded jobs. Why has already been explained quite nicely.

Last fiddled with by Mini-Geek on 2010-11-18 at 15:07
Old 2010-11-18, 15:14   #21
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

3·1,993 Posts

Quote:
Originally Posted by Mini-Geek View Post
I'm not sure how to explain the observed behavior. (Is it really running on one core and the task manager is somehow wrong? Is it a single thread switching between cores? I don't know, but in any case it's still just one thread, and is appropriately fast. If you want to experiment, tell Prime95 to put that worker on a specific core and see what happens to the speed and what appears in the task manager.) But just accept the fact that it is slower to run a multi-threaded job than many single-threaded jobs. Why has already been explained quite nicely.
It's a single thread that the operating system is switching between cores. If you like, you can set processor affinity for the thread and see if there's a performance difference; I doubt there will be one.
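As a rough illustration of that experiment outside Prime95, the sketch below pins the current process to one core using Python's `os.sched_setaffinity`. That call is only available on some platforms (Linux, for example), so the helper falls back to doing nothing elsewhere; on Windows the same thing is usually done from the task manager. The helper name `pin_to_one_core` is made up for this example.

```python
import os

def pin_to_one_core():
    """Pin this process to one allowed core; return the core, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this API (e.g. Windows, macOS)
    allowed = sorted(os.sched_getaffinity(0))  # cores we may currently run on
    core = allowed[0]
    os.sched_setaffinity(0, {core})            # 0 means "the calling process"
    return core

core = pin_to_one_core()
print("pinned to core:", core)
```

With the process pinned, the scheduler can no longer move it around, so any load shown in a per-core monitor should stay on that one core.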
Old 2010-11-18, 15:32   #22
otutusaus
 
Nov 2010
Ann Arbor, MI

2×47 Posts

Quote:
Originally Posted by R.D. Silverman View Post
You should rename yourself obtuseosaurus.

If you want to be argumentative, go somewhere else. Your question
was answered by several different people.
Mr. Silverman, I started posting less than a week ago and I am still getting familiar with how the Prime95 software works and with the maths behind the prime search.
I don't intend to be a burden to the forum; I just want to learn (maths, programming) and suggest ways to improve our overall efforts.
I regret having to read posts like yours. Please be more respectful and tolerant of other people's ignorance.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.