![]() |
/. Video processor compiler
Slashdot is running an article on a compiler for processors of video cards
[URL=http://developers.slashdot.org/developers/03/12/21/169200.shtml?tid=152&tid=185]Slashdot article[/URL] |
Wow. Only works on a few vid cards apparently (GeForceFX and above and ATI 97/9800). Also, briefly reading through the documentation, there appears to be a stumbling block:
From [URL=http://graphics.stanford.edu/projects/brookgpu/lang.html]this page[/URL]: [QUOTE]Examples of legal reductions are sum, product, min/max, OR, AND, and XOR bit operations. Examples of invalid reductions include subtraction and dividision.[/QUOTE] I might be wrong about this since I don't have much experience in C. Perhaps a more experienced programmer could comment on this? |
Unfortunately, all my farm boxes use VGA cards that I bought for literally $1 each on eBay.
:razz: |
Once upon a time, Nick Craig-Wood wrote an integer Lucas-Lehmer tester for the ARM chip.
[url]http://www.craig-wood.com/nick/armprime/[/url] The ARM chip had no divide instruction, but he found an ingenious way to calculate x mod p without division. Perhaps this code could be revived for use on graphics cards. Unfortunately it doesn't seem like he's worked on it since 1999. |
If division isn't available then subtraction of multiples x from p until the amount left is less then x.
If divistion and subtraction aren't available then add multiples of x while less than p. Shifts can be used for multiplying and dividing by 2. I think integers in 2's complement will add negatives. |
For the LL test everything is done mod Mp, so there are other tricks to replace division and subtraction.
-- Tho I'm not sure adding (Mp - 2) instead of subtracting 2 is reasonable. :wink: |
[URL=http://www.cs.unm.edu/~kmorel/documents/fftgpu/]http://www.cs.unm.edu/~kmorel/documents/fftgpu/[/URL]
FFT's can be done on a GPU. However, as usual, the problem is precision. Anyone know any useful tricks to use two singles as a double? |
There is an older thread about using GPUs for calculations:
[url]http://mersenneforum.org/showthread.php?s=&threadid=432[/url] Even something simple like game of life is hard to port to a GPU by using an efficient algorithm (like calculating one life cell per bit) |
[QUOTE][i]Originally posted by GP2 [/i]
[B]Once upon a time, Nick Craig-Wood wrote an integer Lucas-Lehmer tester for the ARM chip. [url]http://www.craig-wood.com/nick/armprime/[/url] The ARM chip had no divide instruction, but he found an ingenious way to calculate x mod p without division.[/B][/QUOTE] The LL test itself requires no divide, since the modulus has a convenient binary form, and because (since 1994) the Crandall/Fagin DWT algorithm allows us to further effect the mod automatically while doing the trasform-based multiply. In Nick's case, any mods that would have been needed would have been with respect to the prime he used for his all-integer transform, p = 2^64 - 2^32 + 1. That prime has a nice binary form that allows efficient modding without division. For general primes of no special form, we can use the well-known Montgomery division-less mod, which effects the mod via a clever precomputation of the inverse of p modulo a power of two, typically one chosen to coincide with the natural wordsize boundary of the hardware. This inverse can be efficiently computed using a Newton-style iteration of the same kind used to quickly find floating-point inverses without dividing, and using the fact that the standard hardware integer multiply of two (say) 64-bit ints is really mod-2^64 multiply. Once one has p^(-1) mod 2^n, a few more muls (including several which require a full double-wide integer product) suffice to return x*y*2^(-n) mod p, which can easily be converted to the desired x*y mod p. My sieve-based factoring code uses the Montgomery mod. I believe George's uses a completely different (but similarly fast, especially on hardware which doesn't support double-wide integer muls) way to get x*y mod p. |
BrookGPU
General Programming of GPUs with C:
[url]http://developers.slashdot.org/arti...052&tid=185[/url] [url]http://www.xbitlabs.com/news/video/...1222075229.html[/url] [url]http://graphics.stanford.edu/projects/brookgpu/[/url] [url]http://sourceforge.net/projects/brook[/url] maybe we could use this to speed up Prime95/MPrime ?????? |
Re: BrookGPU
[QUOTE][i]Originally posted by optim [/i]
[B]maybe we could use this to speed up Prime95/MPrime ?????? [/B][/QUOTE] Maybe if we can't use the GPU for [I]every[/I] instruction, maybe we can use it for [I]some[/I] and speed things up. George? |
| All times are UTC. The time now is 08:01. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.