mersenneforum.org  

Old 2005-02-12, 05:50   #23
georgekh
 
Oct 2004

2·3³ Posts
Exclamation

Hey guys, just to let you all know, I might be getting an early shipment of the Cell chips when they are finalized for PC use, and I will put them up for auction on eBay. I will keep everyone posted on the situation.

PS: It's nice to know people in North America and Asia
Old 2005-02-18, 13:56   #24
Dresdenboy
 
Apr 2003
Berlin, Germany

19² Posts
Default

Quote:
Originally Posted by dsouza123
The Intel P4 can do 8 single precision ops using SSE2, (using two SSE2 registers), so the cell processor isn't ahead of it, yet.

As paulie mentioned GIMPS needs double precision so it doesn't help.

That is 8 single-precision ops per 2 cycles, i.e. 4 SP ops per cycle. As with SSE2 in general, each P4 FPU unit (FP add/mul) handles only one 64-bit half per cycle (but different pipeline stages work in parallel).



@all:
Let's discuss whether buyers of PS3s (not those who'd buy them specifically for Prime95) would see some use in letting the machine do some calculations while they aren't using it.
Old 2005-03-17, 04:55   #25
E_tron
 
 
Sep 2002
Austin, TX

3×11×17 Posts
Default

About this single-precision stuff: couldn't we just increase the FFT length to improve accuracy?

Even if we wanted to do this, we would have to approach Sony for the DRM keys to unlock the PS3 hardware. I don't think Sony will do this, because they make all their money on the software. Sony would want over 10 USD for each copy of P95 for PS3.
Old 2005-03-17, 06:36   #26
blackguard
 
Jan 2005
Singapore

13 Posts
Default

Quote:
Originally Posted by Dresdenboy
Let's discuss whether buyers of PS3s (not those who'd buy them specifically for Prime95) would see some use in letting the machine do some calculations while they aren't using it.

People don't normally leave their consoles running unless they are playing, so P95 wouldn't get much effective running time. Given the difficulties involved, it is probably not worth investing the time to port P95 to these machines. That is, unless for some reason they start replacing Intel/AMD processors in general-purpose PCs, which doesn't look very likely. Anyway, just my 2 cents...
Old 2005-03-17, 22:15   #27
moo
 
Jul 2004
Nowhere

809₁₀ Posts
Default

Not really. In the early era of the PS2 there were Linux distros for it. They're still available, but only the Japanese versions... wait, here:
http://www.us.playstation.com/periph...?id=SCPH-97047
http://blackrhino.xrhino.com/main.php?page=home Oh, there is a free distro, I think.

Interesting: net booting
http://playstation2-linux.com/projects/diskless

Last fiddled with by moo on 2005-03-17 at 22:21
Old 2005-03-18, 02:02   #28
ColdFury
 
Aug 2002

140₁₆ Posts
Default

Quote:
Originally Posted by E_tron
About this single-precision stuff: couldn't we just increase the FFT length to improve accuracy?

Even if we wanted to do this, we would have to approach Sony for the DRM keys to unlock the PS3 hardware. I don't think Sony will do this, because they make all their money on the software. Sony would want over 10 USD for each copy of P95 for PS3.

Someone posted about this once: using single precision would bloat the FFT to totally unreasonable sizes.

The FFTs can't fit in the Cell's memory anyway. People don't realize they don't have MMUs and use DMA to perform memory transfers. Imagine running Prime95 with 256 KB of RAM and paging everything to and from the hard drive.
Old 2005-03-18, 07:27   #29
Dresdenboy
 
Apr 2003
Berlin, Germany

19² Posts
Default

Quote:
Originally Posted by ColdFury
Someone posted about this once: using single precision would bloat the FFT to totally unreasonable sizes.

The FFTs can't fit in the Cell's memory anyway. People don't realize they don't have MMUs and use DMA to perform memory transfers. Imagine running Prime95 with 256 KB of RAM and paging everything to and from the hard drive.

A Cell SPE (like the PPE) can do double-precision math, but at a slower rate than SP (something like a factor of 10, IIRC). For the FFT it is much better to work with such slow double precision and have enough mantissa bits for the calculation than to do a huge FFT that can use only a few bits per SP number.
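
To put rough numbers on "only a few bits per SP number", here is a small sketch. The error model is my own rule of thumb (2*b bits for the product, about log2(N)/2 bits of accumulated round-off, plus a safety margin), not Prime95's real error analysis, and the exponent is just a hypothetical example value.

```python
def bits_per_word(mantissa_bits, log2_n, safety_bits=2):
    """Heuristic number of usable bits per FFT element.

    Assumption: the product needs 2*b bits, round-off grows by about
    log2(N)/2 bits, and a few safety bits are kept in reserve.
    """
    return int((mantissa_bits - 0.5 * log2_n - safety_bits) // 2)


def min_log2_fft_length(exponent, mantissa_bits):
    """Smallest k such that a 2**k-element FFT can hold `exponent` bits."""
    for k in range(1, 40):
        b = bits_per_word(mantissa_bits, k)
        if b >= 1 and (1 << k) * b >= exponent:
            return k
    return None


EXPONENT = 25_000_000  # hypothetical exponent, not taken from the thread

for name, mantissa in (("double", 53), ("single", 24)):
    k = min_log2_fft_length(EXPONENT, mantissa)
    print(name, "log2(N) =", k, "bits/word =", bits_per_word(mantissa, k))
```

Under these assumptions the single-precision transform needs several times as many elements, each carrying only a handful of bits. Real limits depend on the actual FFT implementation.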

The next thing is: why would the FFT have to fit into the Cell's memory? It usually doesn't fit into the caches of a K7, K8, P4 or Pentium-M CPU either. Instead, Cell has a dual-channel XDR memory controller delivering 25 GB/s. That's a hell of a lot more than what we get with the MCT of a K8 (although it already reaches ~98% of the maximum bandwidth of 6103 MiB/s for 2×DDR400 RAM) or with DDR2 on a newer P4 board.
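
For reference, the 6103 MiB/s figure can be reproduced with simple arithmetic. This sketch assumes the usual 64-bit (8-byte) DDR channel width and takes the 25 GB/s Cell number as quoted above.

```python
# Peak bandwidth of dual-channel DDR400: channels x transfers/s x bytes per transfer.
CHANNELS = 2
TRANSFERS_PER_SEC = 400e6   # DDR400 = 400 million transfers per second
BYTES_PER_TRANSFER = 8      # assumption: standard 64-bit channel

ddr400_bytes = CHANNELS * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER
ddr400_mib = ddr400_bytes / 2**20

CELL_XDR_BYTES = 25e9       # the 25 GB/s figure quoted above

print("2xDDR400 peak:", int(ddr400_mib), "MiB/s")
print(f"Cell / 2xDDR400 peak ratio: {CELL_XDR_BYTES / ddr400_bytes:.1f}x")
```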

FFTs can be calculated in parallel very well if the interconnect bandwidth is high enough. And the algorithms are very straightforward: while executing the first instruction, you could actually say what will happen 1000 instructions later. An FFT algorithm for a certain size has a fixed pattern for how it is executed and when and where it reads and stores data. That's the perfect job for an SPE on Cell. Even the fact that the local memory is not a cache is not as bad as it may seem, since it has low latency (6 cycles, because it's SRAM like in a cache) and its behaviour is predictable (unlike a cache), since it does nothing on its own. It's like a cache without logic. And because of the access patterns mentioned above, you can easily load the data 6 cycles before it is used.
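
To illustrate the fixed access pattern, here is a minimal sketch that lists the index pairs touched by a textbook in-place radix-2 FFT. This is a generic pattern chosen for illustration, not the layout Prime95 actually uses, and `butterfly_schedule` is a made-up helper name.

```python
def butterfly_schedule(n):
    """Return the (stage, i, j) index pairs of an in-place radix-2 FFT of size n.

    The list depends only on n, never on the data values, so every load
    and store can be planned (and prefetched) ahead of time.
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    schedule = []
    stage, half = 0, 1
    while half < n:
        for start in range(0, n, 2 * half):
            for k in range(half):
                schedule.append((stage, start + k, start + k + half))
        stage += 1
        half *= 2
    return schedule


sched = butterfly_schedule(1 << 14)
# Operands of the first butterfly and of the one 1000 steps later,
# both known before any data has been read:
print(sched[0], sched[1000])
```

Since the schedule is a pure function of the size, a DMA transfer for a later block can be issued while the current block is still being processed.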

And even the time spent exchanging the local memory's data will be small thanks to the EIB. The SPEs can also access the L2 and the external XDR memory. The 256 kB local SRAM should be good for possibly up to 14 levels of the Prime95 FFT (it also needs space for code and some tables).
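
A quick size check for the "up to 14 levels" estimate. It assumes that "levels" means log2 of the number of points kept in the local store and that each point is a double-precision value (8 bytes real, 16 bytes complex); both are my reading, so treat the result as a sketch.

```python
LOCAL_STORE_BYTES = 256 * 1024  # SPE local SRAM
DOUBLE_BYTES = 8


def footprint(levels, complex_data=True):
    """Bytes needed to hold 2**levels double-precision points."""
    per_point = DOUBLE_BYTES * (2 if complex_data else 1)
    return (1 << levels) * per_point


for levels in (12, 13, 14):
    used = footprint(levels)
    share = 100 * used / LOCAL_STORE_BYTES
    print(levels, "levels:", used // 1024, "KiB,", f"{share:.0f}% of local store")
```

With complex points, 14 levels would fill the local store completely, which is why the space needed for code and tables matters for the "possibly".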

Some links (although already mentioned in some threads):
Understanding the Cell Microprocessor
ISSCC 2005: The Cell Microprocessor
Introducing the IBM/Sony/Toshiba Cell Processor — Part I
Introducing the IBM/Sony/Toshiba Cell Processor — Part II
Old 2005-03-18, 09:12   #30
TauCeti
 
TauCeti's Avatar
 
Mar 2003
Braunschweig, Germany

2×113 Posts
Default

In addition to Dresdenboy's comments, I'd like to point you to the excellent AnandTech article Understanding the Cell Microprocessor.

The article covers the implications of the Cell's cacheless, in-order architecture.

Tau
Old 2005-03-18, 12:45   #31
Dresdenboy
 
Apr 2003
Berlin, Germany

551₈ Posts
Default

An addition regarding DP capabilities:
David Wang wrote (2nd link in my earlier posting):
"Given this estimate, the peak DP FP throughput of an 8 SPE CELL processor is approximately 25~30 GFlops when the DP FP capability of the PPE is also taken into consideration."

Let's look at a NetBurst CPU at 3.4 GHz as an example: 6.8 GFlops.
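
The comparison as a quick calculation. The 2 DP flops per cycle is an assumption, namely the rate that the 6.8 GFlops figure at 3.4 GHz implies. The Cell range is the one quoted from David Wang above.

```python
def peak_gflops(clock_ghz, dp_flops_per_cycle):
    """Peak double-precision throughput in GFlops."""
    return clock_ghz * dp_flops_per_cycle


# Assumption: 2 DP flops per cycle, as implied by 6.8 GFlops at 3.4 GHz.
netburst = peak_gflops(3.4, 2)

for cell in (25.0, 30.0):  # range quoted from David Wang's article
    print(f"Cell at {cell:.0f} GFlops is {cell / netburst:.1f}x "
          f"NetBurst at {netburst:.1f} GFlops")
```

These are peak numbers only. Sustained FFT throughput would depend on memory traffic and on how well the code maps to the SPEs.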

What is left to say is:
Cell (and similar MPUs) should currently give the best bang for the buck for LLR testing or even TF. No FPGA, GPU or general-purpose CPU could currently deliver more, because of high price, lack of universality or insufficient FP throughput.

Last fiddled with by Dresdenboy on 2005-03-18 at 12:51
Old 2005-04-02, 09:26   #32
kim
 
Apr 2005

28 Posts
Default

I wonder what this will give...

A viral processor that builds itself. 50 nm

http://www.spectrum.ieee.org/WEBONLY...3/1103bio.html
Old 2005-04-03, 09:48   #33
Dresdenboy
 
Apr 2003
Berlin, Germany

19² Posts
Default

There is a Cell Presentation from GDC 2005 online, which sheds further light on the capabilities of this class of MPUs.

IMO the Cell's SPE FP and other capabilities look even more useful for algorithms like FFTs than before.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.