mersenneforum.org  

2004-11-30, 00:21   #1
georgekh
Oct 2004
2×3^3 Posts
New "Cell" Chips coming out?

OK guys, check this article out:

http://www.geek.com/news/geeknews/20...1129028023.htm

If I'm not mistaken, there are going to be chips coming out in a few years clocked at 6.4 GHz.

Let me know what you think.

2004-12-07, 21:54   #2
E_tron
Sep 2002
Austin, TX
1061_8 Posts

I'm not excited about it, because I won't be allowed to program for it. For all I care, it doesn't exist.

Same thing with the Emotion Engine. It's a great architecture, but I'm not allowed to program for it.

2005-06-13, 18:00   #3
ewmayer
2ω=0
Sep 2002
República de California
2^2·5·11·53 Posts

Saw this interesting tidbit about the Cell in a recent NY Times article about Apple's decision to ditch the PowerPC processor (emphasis mine):

Quote:
As it happens, Intel's was not the only alternative chip design that Apple had explored for the Mac. An executive close to Sony said that last year Mr. Jobs met in California with both Nobuyuki Idei, then the chairman and chief executive of the Japanese consumer electronics firm, and with Kenichi Kutaragi, the creator of the Sony PlayStation.

Mr. Kutaragi tried to interest Mr. Jobs in adopting the Cell chip, which is being developed by I.B.M. for use in the coming PlayStation 3, in exchange for access to certain Sony technologies. Mr. Jobs rejected the idea, telling Mr. Kutaragi that he was disappointed with the Cell design, which he believes will be even less effective than the PowerPC.
One is left to speculate what "less effective" means. Apparently the major reason Apple is saying bye-bye to the PPC is that it's been trending in the wrong direction in terms of performance-per-watt, which is the reason you won't see a G5 in a laptop. If that is Jobs' main metric for Apple's CPU roadmap, it would imply that the Cell has relatively poor performance for its power consumption.

2005-08-25, 07:31   #4
T.Rex
Feb 2004
France
918_10 Posts
Cell Broadband Engine documentation

IBM has just pre-published several documents (~520 pages) presenting the architecture of the Cell: Cell Broadband Engine documentation.
Tony

2005-08-25, 09:04   #5
T.Rex
Feb 2004
France
2·3^3·17 Posts
Cell Double Floating Instructions

There is a mistake on their page: the total is rather more than 750 pages.
There are 7 double-precision floating-point instructions:
add
multiply
subtract
multiply and add
multiply and subtract
negative multiply and add
negative multiply and subtract
and 17 single-precision floating-point instructions.
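
For anyone not used to the multiply-add family, here is a rough sketch of what the four fused variants in the list compute. The function names and sign conventions are assumptions that follow the usual multiply-add naming; they are not copied from the IBM documents. The single rounding of a fused operation is emulated with exact rational arithmetic, so this only illustrates the arithmetic, not the hardware.

```python
from fractions import Fraction


def _round_once(exact: Fraction) -> float:
    # Fraction -> float conversion rounds the exact value once, to nearest.
    return float(exact)


def fma(a: float, b: float, c: float) -> float:
    """multiply and add: a*b + c, rounded once."""
    return _round_once(Fraction(a) * Fraction(b) + Fraction(c))


def fms(a: float, b: float, c: float) -> float:
    """multiply and subtract: a*b - c, rounded once."""
    return _round_once(Fraction(a) * Fraction(b) - Fraction(c))


def fnma(a: float, b: float, c: float) -> float:
    """negative multiply and add: -(a*b + c), rounded once (assumed convention)."""
    return _round_once(-(Fraction(a) * Fraction(b) + Fraction(c)))


def fnms(a: float, b: float, c: float) -> float:
    """negative multiply and subtract: -(a*b - c) = c - a*b (assumed convention)."""
    return _round_once(Fraction(c) - Fraction(a) * Fraction(b))


# Where fusing matters: the exact product is 1 - 2**-60, which a separate
# multiply rounds to 1.0 before the add ever sees it.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
unfused = a * b + c      # two roundings
fused = fma(a, b, c)     # one rounding
```

The unfused result loses the low-order part of the product, while the fused one keeps it. That extra accuracy is one reason multiply-add matters for FFT butterflies.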
Tony

2005-08-26, 06:52   #6
ColdFury
Aug 2002
2^6·5 Posts

Apparently the Cells do support IEEE double-precision for the most part, however not all operations obey the standard.

The small local memory is still a problem when it comes to GIMPS. I suppose one could stream parts of the FFT in and out of the units using DMA, but you'd have to find enough operations to cover the time the DMA transfers take.
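
The streaming idea above is usually called double buffering: start the transfer of the next chunk, then compute on the chunk that has already arrived. A minimal sketch, assuming a worker thread as a stand-in for the DMA engine; `dma_fetch`, `compute` and `process_double_buffered` are hypothetical names, and nothing here is Cell-specific code.

```python
from concurrent.futures import ThreadPoolExecutor


def dma_fetch(main_memory, start, size):
    """Stand-in for a DMA transfer: copy one chunk into the 'local store'."""
    return main_memory[start:start + size]


def compute(chunk):
    """Stand-in for the FFT work done on a chunk held in the local store."""
    return [x * x for x in chunk]


def process_double_buffered(main_memory, chunk_size):
    """Overlap the fetch of chunk k+1 with the computation on chunk k."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = None  # transfer of the chunk we will compute on next
        for start in range(0, len(main_memory), chunk_size):
            # Kick off the next transfer before touching the current chunk.
            upcoming = dma.submit(dma_fetch, main_memory, start, chunk_size)
            if pending is not None:
                results.extend(compute(pending.result()))
            pending = upcoming
        if pending is not None:
            # Last chunk: nothing left to prefetch, just compute.
            results.extend(compute(pending.result()))
    return results
```

The transfer is only hidden if `compute` on one chunk takes at least as long as fetching the next, which is exactly the "enough operations to cover the DMA time" condition.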

2005-08-26, 18:40   #7
ewmayer
2ω=0
Sep 2002
República de California
10110110001100_2 Posts

Quote:
Originally Posted by ColdFury
Apparently the Cells do support IEEE double-precision for the most part, however not all operations obey the standard.
The non-fully-IEEE-compliant part shouldn't be an issue for decently-written FFT code. So what if underflows flush to zero? That sort of thing is more important in e.g. linear algebra, especially with ill-conditioned and near-singular matrices. The data that occur in an FFT-based big-integer MUL tend to be extremely well-conditioned, especially when balanced-digit representation is used for the whole-number input digits. I was mainly concerned about rounding mode, but in this respect the Cell SPU is actually better w.r.t. double than single precision: double-floats round to nearest by default (though chopping can also be invoked), whereas for single-floats only chopping is available. Similarly, the lack of sNaN and Infinity isn't a problem for carefully written big-FFT code.
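
To make the balanced-digit point concrete, here is a small sketch of an FFT-based big-integer multiply that also reports the worst round-off error seen when the convolution outputs are rounded back to integers. It is a plain radix-2 complex FFT with zero padding, written for illustration only; it is not the transform any GIMPS client actually uses, and all function names are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def balanced_digits(x, bits, n):
    """Split x >= 0 into n digits base 2**bits, each in [-base/2, base/2)."""
    base = 1 << bits
    digits = []
    for _ in range(n):
        d = x % base
        x //= base
        if d >= base // 2:
            d -= base   # make the digit negative ...
            x += 1      # ... and carry one into the next digit
        digits.append(d)
    if x != 0:
        raise ValueError("n is too small to hold x")
    return digits


def fft_multiply(x, y, bits=8):
    """Return (x*y, max round-off error) for non-negative ints; bits >= 2."""
    need = (x.bit_length() + y.bit_length()) // bits + 4
    n = 1
    while n < need:
        n *= 2  # zero-pad so the cyclic convolution does not wrap around
    fa = fft([complex(d) for d in balanced_digits(x, bits, n)])
    fb = fft([complex(d) for d in balanced_digits(y, bits, n)])
    conv = fft([p * q for p, q in zip(fa, fb)], invert=True)
    product, max_err = 0, 0.0
    for i, c in enumerate(conv):
        v = c.real / n           # inverse-FFT normalization
        r = round(v)             # round to nearest integer
        max_err = max(max_err, abs(v - r))
        product += r << (bits * i)
    return product, max_err
```

If `max_err` stays well below 0.5 the rounded outputs can be trusted. Balanced digits help because the signed inputs average near zero, which keeps the convolution outputs, and hence the round-off, small.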

Quote:
The small local memory is still a problem when it comes to GIMPS. I suppose one could stream parts of the FFT in and out of the units using DMA, but you'd have to find enough operations to cover the time the DMA transfers take.
Also shouldn't be a problem in principle, though one will likely need to restructure one's FFT slightly to make it more small-local-data-chunk friendly. The fact that DMA transfers between the local stores of different SPUs are fast is also helpful. Basically one will need tricks similar to those used for mitigating main-memory-access latencies on all the other currently-deployed cache-based microprocessors, just with a view to dealing with multiple processors, each with a small local cache and fast communication with its neighbors, but very slow communication with main memory.
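
One standard way to make a big FFT "small-chunk friendly" is the so-called four-step decomposition: a length n1*n2 transform becomes n1 independent transforms of length n2, a twiddle multiply, and n2 independent transforms of length n1. The sketch below is only an illustration of that idea, not the restructuring any particular program uses; the naive `dft` stands in for whatever small in-local-store transform one would really run, and `n1`, `n2` are hypothetical parameters.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, standing in for a small transform that fits local store."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """Length n1*n2 DFT built only from length-n2 and length-n1 transforms."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: n1 independent length-n2 transforms on strided slices of x.
    rows = [dft(x[j1::n1]) for j1 in range(n1)]
    # Step 2: multiply element (j1, k2) by the twiddle factor w_n^(j1*k2).
    for j1 in range(n1):
        for k2 in range(n2):
            rows[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)
    # Step 3: n2 independent length-n1 transforms down the columns.
    out = [0j] * n
    for k2 in range(n2):
        col = dft([rows[j1][k2] for j1 in range(n1)])
        # Step 4: scatter to the final (transposed) output order.
        for k1 in range(n1):
            out[k2 + n2 * k1] = col[k1]
    return out
```

Each small transform in steps 1 and 3 touches only one row or column, so it could be shipped to one processing unit and worked on entirely in local memory. The strided gathers and the final scatter are the parts that would go through DMA.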

Decent profiling tools (in order to track down bottlenecks, rather than just poking around half-blind and having to guess what might be happening to slow down one's code, as one winds up doing with 90% of compiler/CPU combos, especially with freeware compilers and systems one only has remote access to) will be crucial for code development on the Cell. IBM's website has some links to soon-to-be-available Cell simulators, but I'm always leery of simulators in terms of gauging how well code will run on the real hardware. The weird thing to me is, it seems that IBM developers are already building, profiling and running code on real Cell processors (e.g. this Big-FFT paper linked by Matthias (a.k.a. Dresdenboy) in the "Gimps Awaiting a Major Transition?" thread certainly gives one that impression), so why not make actual Cell-based systems available for code development?

Last fiddled with by ewmayer on 2005-08-27 at 18:41 Reason: Bad URL - thanks to Tony Reix for pointing this out

2005-08-27, 20:57   #8
T.Rex
Feb 2004
France
2×3^3×17 Posts
mprime and future multi-core machines

Hi,
It seems that the future of PCs is multi-core machines.
On a machine with only 2 cores, mprime could run on one core while the end-user's applications run mainly on the other.
With 4 cores or more, it could be interesting for mprime to use 2 or more of them.

So, my question is:
Is it possible/interesting to modify the architecture of mprime so that it can run 2 or more threads, like GLucas does?

Tony

2005-08-29, 05:58   #9
ColdFury
Aug 2002
101000000_2 Posts

That paper claims a 100x speed-up on a 16 MB FFT, most impressive. I wonder if such a speed-up is achievable in practice.

2005-08-29, 20:24   #10
ewmayer
2ω=0
Sep 2002
República de California
2^2·5·11·53 Posts

Quote:
Originally Posted by ColdFury
That paper claims a 100x speed-up on a 16 MB FFT, most impressive. I wonder if such a speed-up is achievable in practice.
The IBM engineers actually *did* achieve this speedup, i.e. it was achieved in practice. Note however that this was for a single-precision FFT (Apple used to love these for showing off its AltiVec SIMD unit as well, since that plays to the SIMD hardware's strengths), which isn't useful for large-integer arithmetic. For double-precision FFTs (and other kinds of DP computations as well) the potential speedup is more modest, but still potentially in the 10x realm.

2005-09-04, 15:42   #11
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2·3·19·67 Posts

Quote:
Originally Posted by T.Rex
Is it possible/interesting to modify the architecture of mprime so that it can run 2 or more threads, like GLucas does?
It is possible. I might try coding up such an FFT if I buy a Pentium D machine.

However, it is probably the case that testing two different exponents will have more throughput.
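
The "two different exponents" option needs no shared FFT at all: each core runs its own complete test. A toy sketch of that arrangement, assuming a textbook Lucas-Lehmer loop on Python integers and one worker process per exponent; it is in no way Prime95's actual code, and `run_independent_tests` is a hypothetical helper.

```python
from concurrent.futures import ProcessPoolExecutor


def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime; p must be an odd prime."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m   # the squaring is where a real client uses its FFT
    return s == 0


def run_independent_tests(exponents, workers=2):
    """One complete test per worker process; the workers never communicate."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(zip(exponents, pool.map(lucas_lehmer, exponents)))
```

Calling `run_independent_tests([1279, 2203])` from a script (under an `if __name__ == "__main__":` guard, which process pools need on some platforms) keeps two cores busy with no synchronization between them. A multi-threaded FFT, by contrast, has to exchange data between threads on every squaring.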
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.