mersenneforum.org  

2005-04-29, 04:38   #34
dsouza123
Sep 2002
1226₈ Posts

Other processing possibilities.

AGEIA Technologies Inc. PhysX chip, a dedicated Physics Processing Unit (PPU)
Expect to see PPU-enabled systems and boards in time for the 2005 Christmas buying season.
Native hardware support for the NovodeX physics engine.
2 terabits/second of bandwidth is presently contemplated for the internal memories, facilitating data movement to/from the FPE (Floating Point Engine).
The internal memory structure has no "set associativity" limitations.
The PPU provides a library of common linear algebra and physics-related algorithms implemented using the DME (Data Movement Engine) and FPE.
However, application-specific or custom algorithms may also be defined within the PPU for execution by the DME and FPE.


Xbox 2/Xbox Next/Xenon/Xbox 360 (which name?)
"Xenon's CPU has three 3.0 GHz PowerPC cores.
Each core is capable of two instructions per cycle
and has an L1 cache with 32 KB for data and 32 KB for instructions.
The three cores share 1 MB of L2 cache."
Expected to ship before the end of 2005 (?), in two versions, one of them with a hard drive.

2005-07-12, 12:21   #35
Cruelty
May 2005
2²×11×37 Posts

Would it be possible/feasible to run LL tests on graphics hardware? The latest graphics chips from nV and ATi have a lot of processing power; the question is whether it could be used this way.
Any thoughts on that?
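
For readers new to the thread, here is a minimal sketch of the Lucas-Lehmer (LL) test being asked about. It uses Python's built-in big integers, so it only shows the recurrence itself; the function name `lucas_lehmer` is just an illustrative choice, and production code such as Prime95 performs the squaring step with floating-point FFTs instead.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p is itself prime; p == 2 is handled as a special case.
    """
    if p == 2:
        return True
    m = (1 << p) - 1          # the Mersenne number M_p = 2**p - 1
    s = 4                     # standard starting value of the sequence
    for _ in range(p - 2):    # p - 2 squarings modulo M_p
        s = (s * s - 2) % m
    return s == 0             # M_p is prime exactly when the residue is 0


for p in (3, 5, 7, 11, 13):
    print(p, lucas_lehmer(p))
```

Almost all of the running time is in the repeated squaring of a p-bit number, which is why the arithmetic capabilities of the hardware matter so much for this workload.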

2005-07-12, 15:48   #36
PrimeCruncher
Sep 2003
Borg HQ, Delta Quadrant
1276₈ Posts

Quote:
Originally Posted by Cruelty
Would it be possible/feasible to run LL tests on graphics hardware? The latest graphics chips from nV and ATi have a lot of processing power; the question is whether it could be used this way.
Any thoughts on that?
This question about GPUs has been asked about a thousand times in these forums. Unfortunately, the answer remains no. These GPUs will do floating point math, but only single-precision. Prime95 requires double-precision.
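
To make the precision gap concrete, here is a small sketch using only the Python standard library. It rounds values through IEEE 754 single precision with `struct` (the helper name `as_float32` is made up for this example) and shows where exact integer representation stops in each format. FFT-based multiplication has to round its results back to exact integers, so the narrower mantissa leaves far less room per digit.

```python
import struct


def as_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


SINGLE_EXACT_LIMIT = 2 ** 24   # single precision has a 24-bit significand
DOUBLE_EXACT_LIMIT = 2 ** 53   # double precision has a 53-bit significand

# Above these limits, consecutive integers can no longer all be represented.
single_collapses = as_float32(float(SINGLE_EXACT_LIMIT + 1)) == float(SINGLE_EXACT_LIMIT)
double_collapses = float(DOUBLE_EXACT_LIMIT + 1) == float(DOUBLE_EXACT_LIMIT)
print(single_collapses, double_collapses)

# The product of two 16-bit digits is exact in double but not in single.
product = 65535.0 * 65535.0
print(as_float32(product) == product)
```

This only illustrates representable range, not a full error analysis of an FFT; the rounding errors that accumulate inside the transform shrink the usable digit size further.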

2005-07-12, 21:23   #37
T.Rex
Feb 2004
France
3²×103 Posts
Cell: DP

Have you seen this presentation of the Cell architecture?
There is a slide about floating point: Slide 15.
Does "2 ways DP" mean double precision?

But slide 17 says: "SIMD FLoat only".

So is it single- or double-precision floating point?

Tony

2005-07-12, 21:41   #38
T.Rex
Feb 2004
France
3²·103 Posts
Cell: Single or Double Floating Point?

Quote:
Originally Posted by Paulie
Unfortunately Cell is geared to single precision SIMD. GIMPS needs double precision.
What about this paper, which discusses double precision: ISSCC?

Also, look at: IBM.

Tony

2005-07-13, 06:27   #39
cheesehead
"Richard B. Woods"
Aug 2002
Wisconsin USA
2²×3×641 Posts

Quote:
Originally Posted by T.Rex
Does "2 ways DP" mean double precision?
I think so.

Quote:
But slide 17 says: "SIMD FLoat only".
Read that slide's table right-to-left.

2005-07-13, 06:28   #40
Dresdenboy
Apr 2003
Berlin, Germany
19² Posts

Quote:
Originally Posted by T.Rex
Does "2 ways DP" mean double precision?
Yes.

Quote:
Originally Posted by T.Rex
But slide 17 says: "SIMD FLoat only" .
But not in the "SPE" column (Cell), which states "SIMD int, float, double". "VU" seems to be just some DSP's vector unit, which is being compared to Cell's SPEs.

The double precision capabilities of Cell (more the SPE ones than the PPE's) have already been discussed in this thread. Just read above.

2005-08-20, 14:11   #41
Dresdenboy
Apr 2003
Berlin, Germany
19² Posts

While reading an article in a Linux magazine about the possibilities of running Linux on Cell, I found an interesting document mentioned:
Unleashing the power: A programming example of large FFTs on Cell

The original source is here:
http://www.power.org/news/events/barcelona/

And the document is here:
http://www.power.org/news/events/barcelona/11_chow.pdf

It covers single-precision FFTs, but that doesn't matter, since it addresses nearly all the important factors that would be relevant to implementing an LL test on Cell. They say their FFT implementation is already close to being computationally bound, so this would be even more the case with double precision.
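
As a rough illustration of why FFTs matter for an LL test, here is a minimal pure-Python sketch of squaring a large integer with a floating-point FFT. It is a plain recursive radix-2 transform over 8-bit digits, chosen for readability; it is not the algorithm from the linked document and not the weighted transform that Prime95 uses, and the names `fft` and `fft_square` are illustrative only.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + twiddle * odd[k]
        out[k + n // 2] = even[k] - twiddle * odd[k]
    return out


def fft_square(x: int, digit_bits: int = 8) -> int:
    """Square a non-negative integer via FFT convolution of its digits."""
    if x == 0:
        return 0
    mask = (1 << digit_bits) - 1
    digits = []
    while x:
        digits.append(x & mask)
        x >>= digit_bits
    size = 1
    while size < 2 * len(digits):      # room for the full product
        size *= 2
    padded = [complex(d) for d in digits] + [0j] * (size - len(digits))
    spectrum = fft(padded)
    convolved = fft([v * v for v in spectrum], invert=True)
    result = 0
    for i, v in enumerate(convolved):
        coeff = round(v.real / size)   # must round to the exact integer
        result += coeff << (i * digit_bits)
    return result


n = 12345678901234567890
print(fft_square(n) == n * n)
```

The `round` call is where floating-point precision becomes critical: if the accumulated error reaches 0.5, the recovered digit is wrong. An LL test would follow each such squaring with a subtraction of 2 and a reduction modulo 2^p - 1, which this sketch leaves out.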

2005-08-21, 04:03   #42
JHagerson
May 2005
Naperville, IL, USA
11000100₂ Posts

Will the SSE3 instructions bail out Intel or does the AMD implementation of SSE3 keep AMD in the lead for serious number crunching?

2005-08-21, 13:30   #43
Dresdenboy
Apr 2003
Berlin, Germany
361₁₀ Posts

Quote:
Originally Posted by JHagerson
Will the SSE3 instructions bail out Intel or does the AMD implementation of SSE3 keep AMD in the lead for serious number crunching?
The new SSE3 instructions are useful for complex-number math, but the FADD/FMUL throughput of the instructions that matter for Prime95 stays about the same, so SSE3 wouldn't help. Additionally, Prime95 already avoids the need for horizontal operations by doing the complex math "by hand" in two separate sets of calculations running in the lower and upper halves of the SSE2 registers.
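
To sketch what doing the complex math "by hand" can look like, the following models a 128-bit SSE2 register as a pair of doubles and multiplies two independent complex numbers at once, one per lane. The data layout (real parts in one register, imaginary parts in another) is an assumption made for this illustration, not something taken from Prime95's source, and all function names are hypothetical.

```python
from typing import Tuple

# Models one SSE2 register: (lower half, upper half), each a double.
Reg = Tuple[float, float]


def mul(a: Reg, b: Reg) -> Reg:
    """Lane-wise multiply, in the spirit of MULPD."""
    return (a[0] * b[0], a[1] * b[1])


def add(a: Reg, b: Reg) -> Reg:
    """Lane-wise add, in the spirit of ADDPD."""
    return (a[0] + b[0], a[1] + b[1])


def sub(a: Reg, b: Reg) -> Reg:
    """Lane-wise subtract, in the spirit of SUBPD."""
    return (a[0] - b[0], a[1] - b[1])


def complex_mul_lanes(a_re: Reg, a_im: Reg, b_re: Reg, b_im: Reg) -> Tuple[Reg, Reg]:
    """Multiply two pairs of complex numbers, one pair per lane.

    Every step is purely vertical (lane i only touches lane i), so no
    horizontal add or add/subtract instruction is needed.
    """
    re = sub(mul(a_re, b_re), mul(a_im, b_im))
    im = add(mul(a_re, b_im), mul(a_im, b_re))
    return re, im


# Lane 0: (1+2j) * (0.5+4j); lane 1: (3-1j) * (2+2j)
re, im = complex_mul_lanes((1.0, 3.0), (2.0, -1.0), (0.5, 2.0), (4.0, 2.0))
print(re, im)
```

With this kind of layout, each complex multiply costs four multiplies and two add/subtract operations per register pair, all of them ordinary SSE2-style vertical operations.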

However, here is the data I collected from the relevant optimization manuals for the register-to-register variants of these instructions. Intel didn't give numbers for the case where memory operands are involved and are delivered from the lowest-latency cache (as is often the case in Prime95). For the K8, these instructions have 1 cycle (HADDPx/HSUBPx) or 2 cycles (ADDSUBPx, MOVxDUP) longer latency in that case.

Code:
Prescott/Nocona:
Instruction(s)          Latency / Throughput    Units involved
ADDSUBPD/ADDSUBPS       5 / 2                   FP_ADD
HADDPD/HADDPS           13 / 4                  FP_ADD, FP_MISC
HSUBPD/HSUBPS           13 / 4                  FP_ADD, FP_MISC
MOVDDUP xmm1, xmm2      4 / 2                   FP_MOVE
MOVSHDUP xmm1, xmm2     6 / 2                   FP_MOVE
MOVSLDUP xmm1, xmm2     6 / 2                   FP_MOVE

K8 Stepping E:
Instruction(s)          Latency / Throughput    Units involved
ADDSUBPD/ADDSUBPS       5 / 2                   FADD
HADDPD/HADDPS           5 / 2                   FADD
HSUBPD/HSUBPS           5 / 2                   FMUL (maybe for parallel execution)
MOVDDUP xmm1, xmm2      2 / 2                   FMUL
MOVSHDUP xmm1, xmm2     3 / 2                   FMUL
MOVSLDUP xmm1, xmm2     3 / 2                   FADD

Latency and throughput are in cycles; throughput is given as "cycles between instruction issue".

As you can see, SSE3 wouldn't change the situation.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
