#14
Mar 2003
Melbourne
5×103 Posts

Quote:
My GT430 runs about 35-40 GHz-days/day.

Quote:
The time for PS3s to be useful has come and gone. The legal issues are just icing on the cake to not use them.

-- Craig
#15
Sep 2006
The Netherlands
3×269 Posts
Quote:
CELL-2 is more powerful from a double-precision viewpoint. Those chippies deliver good throughput, whereas the multiplication unit on the x64 chips, for example, is notoriously ugly. A 64x64-bit integer multiply on the i7 delivers one result every 3.75 cycles, which is so bad that it kills all the integer NTTs running on x64; that is why we have dug them up again for the GPUs, to see how they do there. AMD K8 and newer need 2.25 cycles per multiplication (throughput, not to be confused with latency), and both the Intel and the AMD chips block the other execution units a lot while multiplying (the first and last cycle in AMD's case, worse for Intel).

So a chippie that sustains a throughput of one multiply instruction per cycle can do some damage. That is how CELL originally positioned itself, besides having 8 cores in total. The release of the CELL chip for the PS3 was years too late, if you ask me. Realize that nobody else offered 8 cores at a low budget at the time. Of course it was a disappointment to learn that the PS3 had only 6 of its 8 PEs available. Huge loss.

By today's standards all of this is outdated junk; the GPUs with their thousands of cores are now the new kid on the block. That said, never say IBM can't surprise again.
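For anyone who wants to sanity-check multiply-throughput figures like these on their own hardware, here is a minimal C sketch (not from the original post; the multiplier constants and iteration count are arbitrary choices). It times four independent 64x64-bit multiply chains so the CPU can overlap them, which approximates throughput rather than latency.

Code:
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define ITERS 100000000ULL   /* total multiplies; divisible by 4 */

int main(void)
{
    /* four independent dependency chains keep the multiplier saturated */
    uint64_t a = 3, b = 5, c = 7, d = 11;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint64_t i = 0; i < ITERS / 4; i++) {
        /* odd constants, so the chains never collapse to zero mod 2^64 */
        a *= 0x9E3779B97F4A7C15ULL;
        b *= 0xC2B2AE3D27D4EB4FULL;
        c *= 0xD6E8FEB86659FD93ULL;
        d *= 0xA0761D6478BD642FULL;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    /* printing the checksum stops the compiler from deleting the loop */
    printf("checksum %016llx\n", (unsigned long long)(a ^ b ^ c ^ d));
    printf("%.3f ns per 64x64 multiply\n", secs * 1e9 / ITERS);
    return 0;
}

Compile with -O2; multiply the nanoseconds per multiply by your clock rate in GHz to get cycles per multiply, the number being quoted above.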
#16
Jan 2008
France
3×199 Posts
No, the SPU is a dual-issue in-order CPU. OTOH, FP computations (except for reciprocal and reciprocal sqrt) only go through pipe 0, but that doesn't prevent you from doing something else with pipe 1, for instance ld/st. More information is in Appendix B of the CellBE Handbook.

BTW the PS3 has 7 SPUs enabled, but one is reserved when running under Linux.
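As a rough illustration of that dual-issue point (a sketch under the pipe assignments stated above, not code from the handbook), an SPU loop built from the standard spu_intrinsics.h intrinsics can keep both pipes busy: the floating-point madd issues on pipe 0 while the quadword loads for the next iteration issue on pipe 1.

Code:
#include <spu_intrinsics.h>

/* compile with spu-gcc; n is the number of 4-float vectors */
vector float dot4(const vector float *a, const vector float *b, int n)
{
    vector float acc = spu_splats(0.0f);
    for (int i = 0; i < n; i++)
        acc = spu_madd(a[i], b[i], acc); /* madd: pipe 0; loads: pipe 1 */
    return acc;
}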
#17
Sep 2006
The Netherlands
1447₈ Posts
Quote:
The PS3 is from 2005 and has CELL-1; CELL-2 came years later. I remember very well how Sato from the University of Tokyo complained loudly about the single-instruction-per-cycle throughput of the CELL, which was his explanation for why it did not outperform a Core 2 Duo for him. Now I'm not going to dig up the handbooks online, as I don't have time for that, but I'm sure you are referring to CELL-2 rather than CELL-1; the latter is the chip that is in the PS3. The much improved CELL-2 is the chip that was used in a number of supercomputers, getting roughly 150 Gflops out of each rackmount, which holds 2 of those chips. That was quite OK at the time.

Regards,
Vincent
#18
Jan 2008
France
597₁₀ Posts
Quote:
If you made the effort of downloading the .pdf, you could check for yourself.
#19
Mar 2003
Melbourne
1003₈ Posts
Quote:
Yeah, the new supercomputer from China with off-the-shelf GPUs beating IBM's machine must have hurt IBM's pride. IBM puts huge research into this; big hardware is their pride and joy. I don't discount IBM by any stretch of the imagination.

-- Craig
#20
Sep 2006
The Netherlands
327₁₆ Posts
Quote:
Several Chinese researchers report that their software runs both on Nvidia and on AMD, and this is software from even before OpenCL existed. So they have something that works basically at the assembler level of those GPUs; today's researchers probably make heavy use of CUDA and AMD CAL. The IPCs they report have also always been confirmed in experiments by quite good programmers over here. Yet only a few bigger companies here have invested in GPU technology, and they simply don't want to talk much about it. It's not me who is then going to post publicly that "company XYZ is using GPUs massively". They do use them.

As I already posted at some other spot, I feel the first-world nations are the ones hurting themselves most by not investing in manycore technology. In this manner the third-world nations will completely overrun the first-world nations in research as well. While a few geniuses of their time can predict to some extent what will be important in the future, there are so few of them, and science has become so big, that the vast majority of progress on this planet by capable people happens by calculating your way into the future: what they see is what they realize, and what they realize they can analyze and take into account. This is of course a somewhat lower form of intelligence than the first, genius group, yet it works, and it is what happens in overwhelming quantity. GPUs can play a large role there, as their bigger calculation capacity allows looking further into the future. So not having focused on programming those GPUs is a big mistake, and a mistake that happened mainly in first-world nations.

Now, where the N*SA-type organisations will figure out a way to do their calculations (reports from already 10 years ago about a 500-1000 mm^2 chip with a dozen memory controllers, eating hundreds of watts, huh?), the real victims of all this are the students in the first world. They do have the budget to get the GPUs, unlike the vast majority of students in third-world nations, yet the GPU companies have really limited the first-world students, either by not releasing good software support to run fast on those GPUs, or by not releasing enough information. AMD/ATI has the first problem, Nvidia the second problem more than AMD, and both lack the proper documentation of their GPUs that you can get for CPUs. So the students and researchers in the first world are not able to look far ahead into the future, and thereby miss an advantage over the third-world students, namely a better understanding of science for the average student (again, the geniuses in the third world will always manage, of course, as they don't need to be confronted with the data to realize what is out there).

If we look objectively at CPUs, it is ancient technology: the CPU's RAM can basically keep up with the CPU speed, and that with totally ancient RAM. Today we have arrived at DDR3, while GPUs have already used GDDR5 for half a decade. Now I'm not a hardware expert, yet what I do imagine is that it is rather easy to move lots of data around within a piece of silicon, whereas getting off that chip, over copper wires on the mainboard, to some off-chip RAM is a tricky and slow journey. It makes sense to have a chip whose calculation power is far superior to the RAM speed. GPUs travel that path nowadays, yet it's not so many generations ago that they started travelling it. So where CPUs have already been optimized completely to deliver good performance, manycore technology is still rather new.
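To put rough numbers on that compute-versus-RAM point, here is a back-of-envelope sketch; the bandwidth and flops figures are illustrative 2011-era values picked for this example (the 4 flops/cycle SSE2 figure is the one quoted later in this thread), not measurements.

Code:
#include <stdio.h>

int main(void)
{
    /* assumed 6-core CPU with dual-channel DDR3-1333 */
    double cpu_gflops = 6 * 3.4 * 4;  /* cores * GHz * flops/cycle = 81.6 */
    double cpu_gbps   = 21.3;         /* 2 channels * 10.67 GB/s          */

    /* assumed high-end GPU with GDDR5 */
    double gpu_gflops = 1000.0;
    double gpu_gbps   = 170.0;

    /* flops the chip can do per byte it can fetch from off-chip RAM */
    printf("CPU: %.1f flops/byte\n", cpu_gflops / cpu_gbps);  /* ~3.8 */
    printf("GPU: %.1f flops/byte\n", gpu_gflops / gpu_gbps);  /* ~5.9 */
    return 0;
}

Both chips can calculate faster than they can feed themselves from RAM, but the GPU leans much further in that direction, in absolute bandwidth as well as in the ratio.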
I'd argue that the CELL was a bit of an in-between stage: not a mature manycore chip, yet already offloading calculations to PEs that deliver more performance. A kind of hybrid between a CPU and a manycore GPU. Of course raw calculation power is not everything; we also need caches that to some extent keep up with the PE speeds.

Now a question is, of course, how long GPUs will dominate crunching. That has yet to be seen. But one advantage we can all see easily: they deliver that crunching dirt cheap, whereas interesting CPUs are really expensive. If we look at the total production cost of a laptop and compare it with the price of the CPU inside, it's obvious that CPUs are too expensive and therefore outdated from a crunching viewpoint. They're not obsolete, as they do have their uses: think of the latency they deliver, which matters to some, and think of the ease of writing software for CPUs.

That last point does not hold for the SSE2 part, of course. I'd argue it's more complicated to write SSE2 code than it is to write good code for a GPU. Writing code for a GPU is still rather high-level: sure, the understanding needs to be low-level, yet the actual writing is high-level. SSE2 really only works well in assembler, as George Woltman has proven here. SSE2 extended the life of CPUs for too long, I'd argue.

Last fiddled with by diep on 2011-05-12 at 14:41
#21
Sep 2009
2464₁₀ Posts
Quote:
I remember a post saying they are much faster than PCs for stage 1, but can only run stage 2 by brute force to low limits. I think it was by Bob Silverman, but I can't find it by searching on PS3 or PlayStation. And can they run any other factoring software better than a PC?

Chris K
#22
Sep 2006
The Netherlands
3×269 Posts
Quote:
You're speaking of a PS3 Cell chip from the year 2005 or so? Don't confuse those 2005 PS3 Cell chippies with the later CELL-2 chip, which isn't in the PS3. Codenamed PowerXCell 8i, this chippie was put by IBM into some supercomputers... Another huge difference is that the PS3 has 6 PEs available, clocked at 3.17 GHz and executing a single instruction per cycle, whereas the supercomputer CELL-2 chip has 8 PEs available. This is a huge performance difference.

Modern PC processors, such as cheapo AMD 6-core CPUs or the somewhat more expensive Intel i7s with 6 cores, execute hands-down 4 flops per cycle per core using SSE2 and newer, clocked at 3.4 GHz for little money and with the same power consumption as the PS3. Yet comparing 2005 technology with 2011 technology isn't really fair, you know.

IBM also sells rackmounts with 2 of the newer CELL processors, which according to IBM get around 150 Gflops. Maybe you can get those second hand; it'll be $1200k or so for each rackmount, with similar power consumption to the PS3. Some years ago this PowerXCell 8i wasn't bad; see for example the Roadrunner supercomputer in Los Alamos, which uses them: http://www.lanl.gov/

Right now this all gets overwhelmed by GPUs, which deliver 0.5 to 1 Tflop per card.

Regards,
Vincent
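As a quick check of the peak-rate arithmetic in this post, here is a sketch using only the figures quoted above (the post's claims, not measurements):

Code:
#include <stdio.h>

int main(void)
{
    /* 6 cores * 3.4 GHz * 4 flops/cycle (the SSE2 claim above) */
    double pc_gflops = 6 * 3.4 * 4;
    /* IBM's quoted 150 Gflops per rackmount, with 2 CELL-2 chips inside */
    double cell2_chip_gflops = 150.0 / 2;

    printf("6-core PC peak:  %.1f Gflops\n", pc_gflops);         /* 81.6 */
    printf("CELL-2 per chip: %.1f Gflops\n", cell2_chip_gflops); /* 75.0 */
    printf("GPU card quoted: 500-1000 Gflops\n");
    return 0;
}

So by these peak numbers a cheap 2011 six-core roughly matches one PowerXCell 8i, and a single GPU card overwhelms both, which is the point being made here.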