mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   Quad Quad-cores (https://www.mersenneforum.org/showthread.php?t=10120)

ixfd64 2008-03-25 04:34

It looks like StarQwest has got competition! :grin:

Cruelty 2008-03-25 11:27

I have also observed CPU utilization during this short test, and Prime95 was using 90-95%, which is not bad :tu: I wonder how the newer C2Q will improve these timings given the larger L2 cache, FSB @ 1600 (X48 chipset) and some fast DDR3 memory...

lycorn 2008-03-25 14:08

[QUOTE=SlashDude;129592]I didn't know about this! :smile:

Some quick numbers:

1 test on 16 cores (on the HP DL580) runs at .020 seconds per iteration on an exponent in the 47,000,000 range. This has the box running at ~51% total.
Single test: (CPU Affinity set to run on first CPU - Each additional thread took the CPUs in order - 0,1,2,3,4,5...)
[CODE]Cores  Iteration  Box%
16     .020 sec   51%
8      .022 sec   40%
7      .023 sec   36%
6      .024 sec   32%
5      .029 sec   26%
4      .029 sec   23%
3      .034 sec   18%
2      .037 sec   13%
1      .067 sec   7%[/CODE]
[CODE]Single test: (CPU Affinity set to run on any)
Cores  Iteration  Box%
16     .024 sec   41-51%
8      .024 sec   31%
7      .046 sec   20%
6      .051 sec   18%
5      .055 sec   17%
4      .029 sec   25%
3      .035 sec   19%
2      .050 sec   13%
1      .067 sec   7%[/CODE]
[/QUOTE]

Thank you for running the tests.
I think that in this case the scalability is not just a matter of memory contention (if at all); it also has something to do with the way the computation of the FFTs is performed (how the work is shared between the different CPUs).
It would be good to have George Woltman's word on this.
George, any comments?
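
For reference, the quoted timings can be turned into speedup and parallel-efficiency figures. The sketch below is only illustrative: the numbers are copied from SlashDude's pinned-affinity table, and the helper name `scaling` is made up for this example. On those figures, 16 cores come out at roughly 3.35x the single-core speed, i.e. about 21% efficiency.

```python
# Seconds per iteration, copied from SlashDude's pinned-affinity table above.
pinned = {1: 0.067, 2: 0.037, 3: 0.034, 4: 0.029, 5: 0.029,
          6: 0.024, 7: 0.023, 8: 0.022, 16: 0.020}


def scaling(timings):
    """Return {cores: (speedup, efficiency)} relative to the 1-core timing.

    speedup    = t(1 core) / t(n cores)
    efficiency = speedup / n   (1.0 would be perfect scaling)
    """
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}


for cores, (speedup, eff) in scaling(pinned).items():
    print(f"{cores:>2} cores: speedup {speedup:4.2f}x, efficiency {eff:5.1%}")
```

The same function can be fed the "run on any" column to compare the two affinity settings.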

Prime95 2008-03-25 17:32

I'm hardly surprised prime95 scales this poorly. I'd have to research this more, but I suspect the problem is in pass 1 of the FFT. Whereas in pass 2 the 16 threads can all operate independently, in pass 1 the carry propagation requires all the threads to cooperate closely. Also, the more threads you throw at an FFT, the more you negate the benefits of memory prefetching. Getting better scaling might require a complete rethinking of memory layouts and thread cooperation.
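
To make the carry-propagation point concrete, here is a toy sketch. It is not prime95's code; the base-10 digits, the chunking scheme and the function names are all assumptions chosen for readability. Each chunk can be normalized on its own, but no chunk is final until the carry from its lower neighbour has arrived, and that hand-off is the part where threads would have to wait for each other.

```python
BASE = 10  # toy base, chosen only to keep the example readable


def normalize(digits, carry_in=0):
    """Propagate carries through little-endian digits; return (digits, carry_out)."""
    out, carry = [], carry_in
    for d in digits:
        carry, digit = divmod(d + carry, BASE)
        out.append(digit)
    return out, carry


def chunked_normalize(digits, n_chunks):
    """Normalize each chunk separately, then pass carries across chunk boundaries."""
    size = -(-len(digits) // n_chunks)  # ceiling division
    chunks = [digits[i:i + size] for i in range(0, len(digits), size)]
    # Step 1: independent per-chunk work (this part could run in parallel).
    partial = [normalize(c) for c in chunks]
    # Step 2: sequential hand-off; every chunk needs its lower neighbour's carry.
    result, carry = [], 0
    for chunk_digits, chunk_carry in partial:
        fixed, extra = normalize(chunk_digits, carry)
        result.extend(fixed)
        carry = chunk_carry + extra
    return result, carry


# Unnormalized "digits", as they might look after a multiplication step.
values = [27, 45, 99, 13, 8, 91, 60, 5]
print(chunked_normalize(values, 4) == normalize(values))
```

The chunked version gives the same answer as the plain sequential pass, but only because step 2 runs in order from the lowest chunk upward.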

SlashDude 2008-03-25 19:55

[QUOTE=lpmurray;129627]Sorry, but the number you tested was only 21,977,176 digits long. For a hundred-million-digit number the exponent must be 9 digits long: exponent 332192809 gives a number exactly 100 million digits long. You must run that exponent or a larger one to be testing a 100-million-digit number. Also, thank you for running the tests.[/QUOTE]


Oops, my bad :redface:

Here is a test using M332192831:

[CODE]Threads  HP       IBM
         2.4GHz   3.98GHz
16       0.172    0.17
15       0.174    0.176
14       0.172    0.174
13       0.176    0.182
12       0.173    0.179
11       0.177    0.187
10       0.176    0.185
9        0.186    0.2
8        0.189    0.189
7        0.198    0.204
6        0.21     0.22
5        0.251    0.262
4        0.262    0.304
3        0.309    0.38
2        0.359    0.33
1        0.661    0.586[/CODE]
-SD
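
The digit threshold lpmurray mentions can be checked with a few lines of Python. This is a minimal sketch using the standard digit-count formula floor(p * log10(2)) + 1; the function name `mersenne_digits` is made up here, and the result relies on double-precision floats being accurate enough at this size, which should hold comfortably for these exponents.

```python
import math


def mersenne_digits(p):
    """Decimal digits of 2**p - 1.

    2**p is never a power of 10, so subtracting 1 does not change
    the digit count and floor(p * log10(2)) + 1 applies directly.
    """
    return math.floor(p * math.log10(2)) + 1


for p in (332192809, 332192831):
    print(p, mersenne_digits(p))
```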

Cruelty 2008-03-25 21:54

Looks like the 16-core system matches my overclocked C2Q :smile:

tekkamichael 2008-11-16 13:15

Is there any news about the scalability of P95?

joblack 2009-01-17 21:03

[quote=tekkamichael;149521]Is there any news about the scalability of P95?[/quote]

I'm interested in that as well. Without the scalability, an oct-core doesn't make any sense.

jasong 2009-01-18 06:05

I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water.

What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts.

I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm; basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture. Think about other cultures, older cultures. Not all of them used base-10; some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever), it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.

ET_ 2009-01-18 09:06

[QUOTE=jasong;159208]I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water.

What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts.

I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm; basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture. Think about other cultures, older cultures. Not all of them used base-10; some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever), it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.[/QUOTE]

If the new architecture you are talking about has no basement and allows only one (beautiful) floor with nice windows and arches, you just can't build a four-floor building on it.

Luigi

joblack 2009-01-18 10:00

Even if a graphics card version is released, it is still a good idea to optimize the parallel prime95 version (quad and especially oct cores will be normal in two years).

[quote=jasong;159208]I know I'm going to sound like a broken record, but when a graphics card prime number testing program finally becomes public, it's going to blow the cpu-based ones out of the water.

What people don't seem to understand is that if an implementation of an algorithm quits working, or doesn't work for a new situation, then maybe it's time to re-think how things are done. Some of the first cars were powered by steam. If you'd told those people that someday people would be in their cars and gone in under a minute, with potential speeds of over 100mph, they would have thought you were nuts.

I don't understand the math, and I'm not going to pretend like I have the skills to do what I'm talking about. I'm not even going to suggest that someone "should" attempt it. But think about it. We're dealing with a very specialized algorithm; basically you're doing the same thing over and over again an ungodly number of times. It's just (I wish there was a more humble word to put here) that you're dealing with a brand new architecture. Think about other cultures, older cultures. Not all of them used base-10; some used base-5, base-20, base-120, but the basic truths of mathematics and logical thinking hold true no matter what base you use. If you express the value pi in base-7 (or whatever), it's still the same number in a sense. And in the case of the problem we're dealing with, you're still dealing with base-2. It's simply (again, I wish there was a more humble word to put here) a matter of adjusting to a new architecture.[/quote]


All times are UTC.