
2008-03-25, 04:34   #12
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

7·331 Posts

It looks like StarQwest has got competition!

2008-03-25, 11:27   #13
Cruelty

May 2005

3122₈ Posts

I also watched CPU utilization during this short test, and Prime95 was using 90-95%, which is not bad. I wonder how much the newer C2Q will improve these timings, given the larger L2 cache, FSB @ 1600 (X48 chipset) and some fast DDR3 memory...

2008-03-25, 14:08   #14
lycorn

Sep 2002
Oeiras, Portugal

1423₁₀ Posts

Quote:
 Originally Posted by SlashDude
I didn't know about this!

Some quick numbers:
Running 1 test on 16 cores (on the HP DL580) runs at .020 seconds per iteration on an exponent in the 47,000,000 range. This has the box running at ~51% total.

Single test: (CPU Affinity set to run on first CPU - Each additional thread took the CPUs in order - 0,1,2,3,4,5...)
Code:
Cores	Iteration	Box%
16	.020 sec	51%
8	.022 sec	40%
7	.023 sec	36%
6	.024 sec	32%
5	.029 sec	26%
4	.029 sec	23%
3	.034 sec	18%
2	.037 sec	13%
1	.067 sec	7%

Single test: (CPU Affinity set to run on any)
Code:
Cores	Iteration	Box%
16	.024 sec	41-51%
8	.024 sec	31%
7	.046 sec	20%
6	.051 sec	18%
5	.055 sec	17%
4	.029 sec	25%
3	.035 sec	19%
2	.050 sec	13%
1	.067 sec	7%

Thank you for running the tests.
I think that in this case the poor scalability is not just a matter of memory contention (if that plays a part at all); it also has something to do with the way the FFT computation is performed, that is, how the work is shared between the different CPUs.
It would be good to have George Woltman's word on this.
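
To put a number on the scaling in the quoted table, here is a minimal sketch that turns the pinned-affinity timings into speedup and parallel efficiency. The figures are copied from SlashDude's post; the function names are made up for illustration and are not part of Prime95.

```python
# Timings from the quoted post (affinity pinned, exponent in the
# 47,000,000 range): cores -> seconds per iteration.
timings = {1: 0.067, 2: 0.037, 3: 0.034, 4: 0.029, 5: 0.029,
           6: 0.024, 7: 0.023, 8: 0.022, 16: 0.020}


def speedup(cores, table=timings):
    """Speedup relative to the single-core iteration time."""
    return table[1] / table[cores]


def efficiency(cores, table=timings):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup(cores, table) / cores


if __name__ == "__main__":
    for n in sorted(timings):
        print(f"{n:2d} cores: speedup {speedup(n):.2f}x, "
              f"efficiency {efficiency(n):.1%}")
```

With these numbers, 16 cores come out at roughly 3.35 times the single-core speed, so only about a fifth of the ideal 16x.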

2008-03-25, 17:32   #15
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

7158₁₀ Posts

I'm hardly surprised prime95 scales this poorly. I'd have to research this more, but I suspect the problem is in pass 1 of the FFT. Whereas in pass 2 the 16 threads can all operate independently, in pass 1 the carry propagation requires all the threads to cooperate closely. Also, the more threads you throw at an FFT, the more you negate the benefits of memory prefetching. Getting better scaling might require a complete rethinking of memory layouts and thread cooperation.
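
As a rough illustration of why a step that forces threads to cooperate limits scaling, the sketch below fits Amdahl's law to two of the timings posted earlier in the thread (.067 sec on 1 core, .020 sec on 16 cores). This is a textbook model applied from the outside, not an analysis of Prime95's code: it lumps carry propagation, lost prefetching and everything else into a single "serial fraction", and the function names are hypothetical.

```python
def serial_fraction(t1, tn, n):
    """Solve Amdahl's law T(n) = T(1) * (s + (1 - s) / n) for s."""
    ratio = tn / t1
    return (ratio - 1.0 / n) / (1.0 - 1.0 / n)


def predicted_time(t1, s, n):
    """Iteration time on n cores predicted by Amdahl's law."""
    return t1 * (s + (1.0 - s) / n)


if __name__ == "__main__":
    t1, t16 = 0.067, 0.020  # seconds per iteration, from the thread
    s = serial_fraction(t1, t16, 16)
    print(f"implied serial fraction: {s:.1%}")
    print(f"speedup ceiling under this model: {1.0 / s:.2f}x")
    for n in (2, 4, 8):
        print(f"{n} cores: model predicts {predicted_time(t1, s, n):.3f} sec")
```

Under this model about a quarter of each iteration behaves as if it were serial, which would cap the speedup at around 4x no matter how many cores are added.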
2008-03-25, 19:55   #16
SlashDude

Aug 2002
Minneapolis, MN

11100100₂ Posts

Quote:
 Originally Posted by lpmurray Sorry, but the number you ran was only 21,977,176 digits long; for a hundred-million-digit number the exponent must be 9 digits long. M332192809 is exactly 100 million digits long. You must run that exponent or a larger one to be testing a 100-million-digit number. Also, thank you for running.

Here is a test using M332192831:

Code:
Threads	HP 2.4GHz	IBM 3.98GHz
16	0.172	0.17
15	0.174	0.176
14	0.172	0.174
13	0.176	0.182
12	0.173	0.179
11	0.177	0.187
10	0.176	0.185
9	0.186	0.2
8	0.189	0.189
7	0.198	0.204
6	0.21	0.22
5	0.251	0.262
4	0.262	0.304
3	0.309	0.38
2	0.359	0.33
1	0.661	0.586
-SD
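
The digit counts mentioned in the quote can be checked with a short calculation: 2^p - 1 has floor(p * log10(2)) + 1 decimal digits. The sketch below assumes double precision is accurate enough for exponents of this size, which holds here because the fractional parts are nowhere near an integer boundary. The function name is made up for illustration.

```python
import math


def mersenne_digits(p):
    """Number of decimal digits of 2**p - 1.

    2**p is never a power of 10, so subtracting 1 does not change
    the digit count and the usual log10 formula applies.
    """
    return math.floor(p * math.log10(2)) + 1


if __name__ == "__main__":
    for p in (332192809, 332192831):
        print(f"M{p} has {mersenne_digits(p):,} digits")
```

So the exponent from the quote gives exactly 100,000,000 digits, and M332192831, the one tested above, is slightly longer.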

2008-03-25, 21:54   #17
Cruelty

May 2005

2×809 Posts

Looks like the 16-core system matches my overclocked C2Q.

2008-11-16, 13:15   #18
tekkamichael

Sep 2008

5 Posts

Is there any news about the scalability of P95?

2009-01-17, 21:03   #19
joblack

Oct 2008
n00bville

2D5₁₆ Posts

Quote:
 Originally Posted by tekkamichael Is there any news about the scalability of P95?
I'm interested in that as well. Without better scalability, an oct-core doesn't make any sense.

2009-01-18, 06:05   #20
jasong

"Jason Goatcher"
Mar 2005

DB1₁₆ Posts

I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water.

What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts.

I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm, basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture.

Think about other cultures, older cultures. Not all of them used base-10, some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever) it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.

2009-01-18, 09:06   #21
ET_
Banned

"Luigi"
Aug 2002
Team Italia

13×367 Posts

Quote:
 Originally Posted by jasong I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water. What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts. I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm, basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture. Think about other cultures, older cultures. Not all of them used base-10, some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever) it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.
If the new architecture you are talking about has no basement and allows only one (beautiful) floor with nice windows and arches, you just can't build a four-floor building on it.

Luigi

2009-01-18, 10:00   #22
joblack

Oct 2008
n00bville

725₁₀ Posts

Even if a graphics card version is released, it is still a good idea to optimize the parallel prime95 version (quad and especially oct cores will be normal in two years).

Quote:
 Originally Posted by jasong I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water. What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts. I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm, basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture. Think about other cultures, older cultures. Not all of them used base-10, some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever) it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.
