mersenneforum.org  

2005-03-07, 23:10   #34
PrimeCruncher
Sep 2003
Borg HQ, Delta Quadrant
2·3³·13 Posts

Quote:
Originally Posted by dsouza123
AMD has the Turion 64, a 64-bit processor with 1MB L2 cache,
1.6 GHz, using Socket 754, 90 nm technology, 1.056 V.

A mobile CPU. Hmm. Might be worth looking into when it becomes available, if it's priced right.
2005-03-09, 21:55   #35
T.Rex
Feb 2004
France
3²×103 Posts
Cell processor

What about (some day in the future) using a PS3 (with the IBM/Sony/Toshiba Cell processor inside) for running Prime95?
Does anyone have an idea whether the Cell processor will beat AMD and Intel processors at running Prime95?

Tony
2005-03-10, 14:33   #36
Jeff Gilchrist
Jun 2003
Ottawa, Canada
10010010101₂ Posts

I think someone already commented that the current Cell design uses single-precision hardware, which might not be enough for Prime95. That doesn't mean it won't change in the future, though...
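
To put a rough number on the precision concern, here is a minimal Python sketch (not Prime95's code) that squares a vector of 8-bit words with an FFT convolution twice: once in ordinary double precision, and once with every intermediate result rounded to single precision. The helper names (`to_f32`, `c32`, `fft`, `max_roundoff`), the unsigned 8-bit word size and the length of 256 are all assumptions chosen for illustration, and the single-precision run is only an emulation built on `struct`.

```python
import cmath
import random
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def c32(z):
    """Round both parts of a complex number to single precision."""
    return complex(to_f32(z.real), to_f32(z.imag))

def fft(a, invert, rnd):
    """Recursive radix-2 FFT; rnd() is applied after every arithmetic step."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert, rnd)
    odd = fft(a[1::2], invert, rnd)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = rnd(cmath.exp(sign * 2j * cmath.pi * k / n))
        t = rnd(w * odd[k])
        out[k] = rnd(even[k] + t)
        out[k + n // 2] = rnd(even[k] - t)
    return out

def max_roundoff(digits, rnd):
    """Square a digit vector by cyclic FFT convolution and return the largest
    distance of any output from the exact integer result."""
    n = len(digits)
    f = fft([rnd(complex(d)) for d in digits], False, rnd)
    sq = [rnd(v * v) for v in f]
    res = [v.real / n for v in fft(sq, True, rnd)]
    exact = [sum(digits[i] * digits[(k - i) % n] for i in range(n))
             for k in range(n)]
    return max(abs(r - e) for r, e in zip(res, exact))

random.seed(1)
digits = [random.randrange(256) for _ in range(256)]  # 256 words of 8 bits
err_double = max_roundoff(digits, lambda z: z)        # no extra rounding
err_single = max_roundoff(digits, c32)                # emulated single precision
print("double:", err_double, "single:", err_single)
```

FFT multiplication only works if every output can be rounded to the correct integer, so the error has to stay well below 0.5. Real LL tests use far longer transforms than this toy length, which makes the single-precision case harder still.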
2005-03-10, 16:38   #37
rogue
"Mark"
Apr 2003
Between here and the
1A19₁₆ Posts

I thought I had posted this in this thread, but I don't see it. The discussion of Cell processors is in this other thread:

http://www.mersenneforum.org/showthread.php?t=3686
2005-03-17, 04:44   #38
E_tron
Sep 2002
Austin, TX
3·11·17 Posts

Quote:
Originally Posted by T.Rex
What about (some day in the future) using a PS3 (with the IBM/Sony/Toshiba Cell processor inside) for running Prime95?
Does anyone have an idea whether the Cell processor will beat AMD and Intel processors at running Prime95?

Tony

Cell is largely single precision. It won't be accurate enough for LL testing.
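
For anyone unfamiliar, the LL (Lucas-Lehmer) test itself is a short recurrence: start with s = 4, apply s → s² − 2 (mod 2^p − 1) a total of p − 2 times, and 2^p − 1 is prime exactly when the result is 0. Below is a minimal Python sketch using exact integers (the function name `lucas_lehmer` is just illustrative). The squaring is the step that FFT-based programs carry out in floating point, which is where the precision question comes in.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m  # the big squaring that FFT code speeds up
    return s == 0

# Exponents among these small primes that give Mersenne primes.
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 31) if lucas_lehmer(p)])
# prints [3, 5, 7, 13, 17, 19, 31]
```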

Regardless, we would never get that far: the Digital Rights Management in the PS3 looks like a nightmare! Homebrew software might never be able to execute on this platform.

The Xbox2 looks like a more desirable platform for LL testing, because it uses a more traditional architecture with many versatile processors. Plus the DRM looks weaker than PS3's DRM.

It’s too bad that Sony is abandoning the Emotion Engine architecture. Naughty Dog and Insomniac have refined their code for the Emotion Engine so well.
2005-03-17, 08:08   #39
Dresdenboy
Apr 2003
Berlin, Germany
19² Posts

The Cell's SPEs are able to do double precision calculations, but at a much slower rate than single precision. I assume they simply do some emulation.

But as long as there is still enough raw power to do single precision calculations, even these emulated DP calculations could boost the FFTs compared to a P4.
2005-03-18, 01:59   #40
ColdFury
Aug 2002
2⁶·5 Posts

Quote:
Originally Posted by Dresdenboy
The Cell's SPEs are able to do double precision calculations, but at a much slower rate than single precision. I assume they simply do some emulation.

But as long as there is still enough raw power to do single precision calculations, even these emulated DP calculations could boost the FFTs compared to a P4.

The FFTs can't even fit in the cells' memory, so it's a moot point.
2005-03-18, 04:17   #41
ewmayer
2ω=0
Sep 2002
República de California
2DDB₁₆ Posts

Quote:
Originally Posted by ColdFury
The FFTs can't even fit in the cells' memory, so it's a moot point.

No, the full FFT vector for a big FFT can't fit into the Cell's *cache* - crucial distinction. A 256 kB cache (which each Cell sub-CPU apparently has) isn't *that* small - a typical PC processor doesn't have much more cache than that - does the LL test run grindingly slow on your PC because of it? Of course not, because careful coding has minimized the number of accesses to main memory and hidden the large main-memory latency from you. Careful coding would allow one to break the full-length FFT into Cell-sized chunks, each of which could then be processed independently, aside from a single pass (e.g. that pass might do the final radix of the inverse FFT of the current iteration, the round/carry step and the initial radix of the forward FFT for the next iteration). The bigger restriction IMO is the single precision, but with enough SIMD crunching power that could still be useful. You're just not going to see an Nx speedup with an N-subprocessor Cell CPU vs. a conventional CPU that has one or two double-precision FPUs, so there's the question of whether a (say) 2x per-cycle speedup vs. a Pentium would be worth the large coding effort that would likely be needed.
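
As a rough illustration of the chunking idea, here is a minimal Python sketch of the textbook "four-step" decomposition, in which a length n1·n2 transform is built only from independent length-n1 and length-n2 transforms plus one twiddle multiply. It is not how any existing LL program is written: the names `dft` and `chunked_fft` are made up for this sketch, and the direct O(n²) `dft` merely stands in for whatever small transform would run inside one sub-CPU's memory.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT, standing in for a small chunk-sized transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def chunked_fft(x, n1, n2):
    """Length n1*n2 DFT made of independent length-n1 and length-n2 DFTs."""
    n = n1 * n2
    assert len(x) == n
    # Pass 1: n2 independent transforms of length n1 (stride-n2 slices).
    cols = [dft(x[c::n2]) for c in range(n2)]
    # Twiddle step: scale element (k1, c) by exp(-2*pi*i*k1*c/n).
    for c in range(n2):
        for k1 in range(n1):
            cols[c][k1] *= cmath.exp(-2j * cmath.pi * k1 * c / n)
    # Pass 2: n1 independent transforms of length n2.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[c][k1] for c in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out

x = [complex(i % 7, (3 * i) % 5) for i in range(32)]
worst = max(abs(a - b) for a, b in zip(chunked_fft(x, 4, 8), dft(x)))
print("largest difference from the direct DFT:", worst)
```

Each small transform touches only n1 or n2 elements at a time, so the chunk size can be picked to fit the available memory. For scale, 256 kB holds at most 16384 double-precision complex values at 16 bytes each, before leaving any room for code or twiddle factors.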

Last fiddled with by ewmayer on 2005-03-18 at 04:20
2005-03-19, 01:54   #42
ColdFury
Aug 2002
320₁₀ Posts
Quote:
Originally Posted by ewmayer
No, the full FFT vector for a big FFT can't fit into the Cell's *cache* - crucial distinction. A 256 kB cache (which each Cell sub-CPU apparently has) isn't *that* small - a typical PC processor doesn't have much more cache than that - does the LL test run grindingly slow on your PC because of it? Of course not, because careful coding has minimized the number of accesses to main memory and hidden the large main-memory latency from you. Careful coding would allow one to break the full-length FFT into Cell-sized chunks, each of which could then be processed independently, aside from a single pass (e.g. that pass might do the final radix of the inverse FFT of the current iteration, the round/carry step and the initial radix of the forward FFT for the next iteration). The bigger restriction IMO is the single precision, but with enough SIMD crunching power that could still be useful. You're just not going to see an Nx speedup with an N-subprocessor Cell CPU vs. a conventional CPU that has one or two double-precision FPUs, so there's the question of whether a (say) 2x per-cycle speedup vs. a Pentium would be worth the large coding effort that would likely be needed.

The Cells do not have cache. They have 256 kB of memory each, which is not shared between Cells. If you want to move something from main memory to a Cell's memory, you need to use a DMA transfer. They do not have memory management units.
2005-03-20, 15:48   #43
dsouza123
Sep 2002
2·331 Posts

If the hurdle of getting the program to run on the PS3 can be overcome,
and given the limitation that double precision runs at 1/10 the speed of single precision,
then even if LL testing doesn't work out, it could be used for TF (Trial Factoring),
which has much lower memory requirements, allowing work on multiple numbers in parallel.

There are supposed to be workstations that will use Cell processors,
so even if the PS3 is a dead end, there would still be a reason for a program
that uses Cell processors at some time in the future.
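
To show why TF needs so little memory, here is a minimal Python sketch of trial factoring a Mersenne number 2^p − 1. It relies on two standard facts: for an odd prime p, every factor q has the form 2kp + 1, and q ≡ ±1 (mod 8). The function name `trial_factor` and the `max_k` limit are assumptions made for this example, and production TF code adds sieving and hand-tuned modular arithmetic that are left out here.

```python
def trial_factor(p, max_k=100_000):
    """Look for a small factor of 2**p - 1, where p is an odd prime.

    Returns the first factor found, or None if there is none among the
    candidates q = 2*k*p + 1 with k <= max_k and q*q <= 2**p - 1.
    """
    m = (1 << p) - 1
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        if q * q > m:
            break                 # past sqrt(2**p - 1); nothing left to find
        if q % 8 not in (1, 7):
            continue              # factors must be +1 or -1 mod 8
        if pow(2, p, q) == 1:     # q divides 2**p - 1
            return q
    return None

print(trial_factor(11), trial_factor(23), trial_factor(29), trial_factor(13))
# prints 23 47 233 None
```

Each candidate only needs a handful of integers about the size of q, and candidates are independent of one another, so many of them (or many exponents) can be tested in parallel.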
2005-03-20, 18:06   #44
Dresdenboy
Apr 2003
Berlin, Germany
361₁₀ Posts

As I explained in another thread, the memory requirements shouldn't be a problem. The same goes for DMA. What's better: 25 GB/s with DMA or 6 GB/s with an MMU?
All times are UTC.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.