mersenneforum.org  

2007-04-19, 10:06   #1
schnaader
 

BBB16 Posts
Prime95 software for PS3/GPU?

Given the success Folding@home has had with PS3/GPU support, something similar would be nice for Prime95, too. This has already come up in some threads; the argument against it is that these platforms support only 32-bit floating point, while the FFT that Prime95 uses needs 64-bit floating point numbers. But there are two ways to work around this, or am I wrong?

1. Use integer-FFT instead of floating point FFT.

Quote from "The math" page ( http://www.mersenne.org/math.htm ):

Quote:
Although GIMPS uses a floating point FFT for reasons specific to the Intel Pentium architecture, Peter Montgomery showed that an all-integer weighted transform can also be used.
Does a "normal" version of Prime95 with integer-FFT exist already?
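
To make the idea of an all-integer transform concrete, here is a minimal sketch in Python of multiplying two integers with a number-theoretic transform (NTT), i.e. an FFT carried out over the integers modulo a prime. This is a generic textbook NTT, not Montgomery's weighted transform and not anything taken from Prime95; the modulus 998244353 with primitive root 3, the base-10 digit splitting, and the function names `ntt` and `multiply` are all choices made for this illustration.

```python
# Illustrative only: a generic number-theoretic transform, not Prime95 code.
P = 998244353   # prime of the form 119 * 2**23 + 1 (assumed for this sketch)
G = 3           # a primitive root modulo P


def ntt(a, invert=False):
    """Radix-2 transform over the integers mod P; len(a) must be a power of 2."""
    a = list(a)
    n = len(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        half = length // 2
        w_len = pow(G, (P - 1) // length, P)   # root of unity of order `length`
        if invert:
            w_len = pow(w_len, P - 2, P)       # its modular inverse
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % P        # product of two values below P
                a[k] = (u + v) % P
                a[k + half] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)
        a = [x * n_inv % P for x in a]
    return a


def digits(v, base):
    """Little-endian digits of a non-negative integer."""
    out = []
    while v:
        v, d = divmod(v, base)
        out.append(d)
    return out or [0]


def multiply(x, y, base=10):
    """Multiply non-negative integers by convolving their digits with the NTT."""
    dx, dy = digits(x, base), digits(y, base)
    n = 1
    while n < len(dx) + len(dy):
        n <<= 1
    fx = ntt(dx + [0] * (n - len(dx)))
    fy = ntt(dy + [0] * (n - len(dy)))
    conv = ntt([p * q % P for p, q in zip(fx, fy)], invert=True)
    # Each coefficient must stay below P, which holds for inputs of modest size.
    # Summing with powers of the base performs the carry propagation.
    return sum(c * base**i for i, c in enumerate(conv))


print(multiply(123456789, 987654321) == 123456789 * 987654321)
```

Because every step is exact modular arithmetic, there is no rounding error to control. The cost is that each butterfly needs a product of two values below the modulus (up to about 60 bits here) reduced modulo the prime, so the speed depends on how wide an integer multiply the hardware offers.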

2. "Emulate" 64-bit floating point support.

64-bit integers can be emulated using 32-bit integers (with some shift operations); similarly, higher-precision floating point can be emulated using 32-bit floating point numbers (there are also some papers on how to do this, at least for NVIDIA GPUs). However, this would definitely reduce the speed. Still, even if the theoretical 100-500 GFlop/s dropped to 10-50 GFlop/s, it would be a lot faster than the CPU alone.
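
One way to picture this emulation is the "double-single" trick: store each value as an unevaluated sum of two 32-bit floats and use error-free additions to keep the bits that a single float would round away. The Python sketch below is an illustration under assumptions, not the method of any particular paper: the helper names `f32`, `two_sum`, and `ds_add` are invented here, and single precision is imitated by rounding through the standard `struct` module.

```python
import struct


def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Knuth's error-free addition: s is the rounded sum, e the rounding error."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e


def ds_add(x, y):
    """Add two (hi, lo) pairs whose parts are all 32-bit floats."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))   # fold in the low-order parts
    hi = f32(s + e)                 # renormalise so that lo is small next to hi
    lo = f32(e - f32(hi - s))
    return hi, lo


b = f32(1e-8)
print(f32(1.0 + b))                  # plain 32-bit addition loses the small term
print(ds_add((1.0, 0.0), (b, 0.0)))  # the pair keeps it in the lo part
```

In this sketch one emulated addition costs eleven single-precision operations, and multiplication without a fused multiply-add costs more still, which gives a feel for the slowdown.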

Greetings,
schnaader
2007-04-19, 13:38   #2
jasonp
Tribal Bullet
 
Oct 2004

3·1,181 Posts

Quote:
Originally Posted by schnaader
1. Use integer-FFT instead of floating point FFT.

Does a "normal" version of Prime95 with integer-FFT exist already?

2. "Emulate" 64-bit floating point support.
Regarding number 1, nothing much has changed since this thread. If your CPU has a pipelined 64-bit multiplier then you have a shot at an all-integer transform that's fast enough. What's the widest multiply you can do on a Cell engine? If it's like the AltiVec PowerPC architecture, you can do four 16-bit multiplies at a time. Plus, Montgomery's algorithm requires access to carry bits in order to be efficient.
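
As a rough illustration of why multiplier width matters, the sketch below builds a 32x32-bit product out of 16x16-bit multiplies, the widest primitive assumed in the AltiVec comparison. The function name `mul32_from_16` is made up for this example, and Python integers stand in for machine registers.

```python
MASK16 = 0xFFFF


def mul32_from_16(a, b):
    """32x32 -> 64-bit product using only 16x16 -> 32-bit multiplies."""
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16
    ll = a_lo * b_lo          # each of these four products fits in 32 bits
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi
    mid = lh + hl             # can need 33 bits: this is where a carry appears
    return ll + (mid << 16) + (hh << 32)


print(mul32_from_16(0xFFFFFFFF, 0xFFFFFFFF) == 0xFFFFFFFF * 0xFFFFFFFF)
```

Four narrow multiplies plus shifts and adds replace one wide multiply, and the sum of the two middle terms can overflow 32 bits, which is where access to a carry bit helps. A 64x64-bit product built the same way would need sixteen 16-bit multiplies.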

Regarding point 2, emulating floating point with a 53-bit mantissa using floating point with a 24-bit mantissa will still limit you to an 8-bit exponent, and this may not have enough range to handle an FFT with a million elements. Plus, of course, it's 10x slower. FFTW in single precision on the Cell, with seven engines, manages 20 GFlops for moderate-size transforms. I doubt you can achieve that rate when the floating point primitives are 10x slower.

Edit: the biggest hurdle is convincing George Woltman (or whoever is interested in this) that there will be enough PlayStations sold to prime number enthusiasts to justify a port to a completely different architecture. I sincerely doubt the number of consoles is going to rival the number of PCs anytime soon.

jasonp

Last fiddled with by jasonp on 2007-04-19 at 13:40

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.