#1 |
"Mihai Preda"
Apr 2015
2×677 Posts |
In light of Nvidia's new GPU launch, it appears we need to find a way of doing big convolutions using single-precision floating point (FP32). This has been an elusive task in the past.
That new GPU has 2x the FP32 throughput relative to INT32, and 64x relative to FP64.
#2 |
"Mihai Preda"
Apr 2015
2×677 Posts |
AKA "The Holy Grail" :)
#3 |
Random Account
Aug 2009
19×101 Posts |
Sorry, I cannot make a connection to 24-bit. FP32 seems to represent 32-bit. FP64 is 64-bit. 2x FP32 suggests 64-bit as well. Would you care to elaborate a little?
#4 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²×5×251 Posts |
Last fiddled with by kriesel on 2020-09-19 at 00:26
#5 |
"/X\(‘-‘)/X\"
Jan 2013
3·977 Posts |
Normal FP32 has 23 bits for the fraction component, 8 for the exponent, and 1 for the sign (+/-). Whether the exponent bits are all zero or not determines an implicit leading bit, which effectively gives one more bit of precision. That makes a 24-bit significand, meaning FP32 can do INT24 math exactly.
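To make the 24-bit limit concrete, here is a minimal sketch that rounds values to FP32 using Python's standard `struct` module and checks which integers survive the round trip. The helper names `to_f32` and `f32_holds_exactly` are made up for this illustration.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_holds_exactly(n: int) -> bool:
    """True if the integer n survives a round trip through FP32."""
    return to_f32(float(n)) == float(n)

LIMIT = 2 ** 24  # 23 stored fraction bits + 1 implicit leading bit

print(f32_holds_exactly(LIMIT - 1))  # True
print(f32_holds_exactly(LIMIT))      # True (a power of two)
print(f32_holds_exactly(LIMIT + 1))  # False: needs a 25th significand bit
```

Every integer with magnitude up to 2^24 is exact in FP32; beyond that, only some are.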
#6 |
"Mihai Preda"
Apr 2015
54A₁₆ Posts |
Some previous discussion:
https://www.mersenneforum.org/showthread.php?t=23926 |
#7 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
5857₁₀ Posts |
It sounds like the additional memory usage (and hence memory bandwidth) may be an issue. Would a 64x FP32-to-FP64 ratio be enough to make arithmetic using double-floats useful?
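For context on double-floats: a value is stored as an unevaluated sum of two FP32 numbers (hi + lo), so it takes 8 bytes like an FP64 but needs several FP32 operations per arithmetic step. Below is a minimal sketch of the basic building block, Knuth's TwoSum, with FP32 rounding emulated in Python through `struct`. The names `f32` and `two_sum` are illustrative only, and this is not code from any existing GIMPS program.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a: float, b: float) -> tuple:
    """Knuth's TwoSum on FP32 inputs.

    Returns (s, e) where s is the FP32-rounded sum and e is the
    rounding error, so that a + b == s + e exactly.
    Each line below stands for one FP32 add or subtract (6 in total).
    """
    s = f32(a + b)
    bb = f32(s - a)            # the part of b that made it into s
    aa = f32(s - bb)           # the part of a that made it into s
    err_a = f32(a - aa)
    err_b = f32(b - bb)
    e = f32(err_a + err_b)
    return s, e

# 2**24 + 1 does not fit in one FP32, but fits in a (hi, lo) pair.
print(two_sum(16777216.0, 1.0))  # (16777216.0, 1.0)
```

The pair (s, e) is what a double-float addition would then renormalize and carry forward, which is where the extra FP32 operations per step come from.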
#8 |
P90 years forever!
Aug 2002
Yeehaw, FL
3²×823 Posts |
Years ago I toyed with using two or three 32-bit ints to create a 64 or 96-bit float (no exponent bits -- all mantissa).
I did enough work to prove to myself it was feasible and, at the time, would be about as fast as a double-precision FFT. As nVidia has lowered and lowered the DP-to-SP ratio, it would be a substantial winner now. An awful lot of code to write though.
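As a rough illustration of the "all mantissa" idea (and only that, since the post does not describe the actual design), here is a minimal sketch of adding two 64-bit values held as pairs of 32-bit limbs with manual carry propagation. The names `to_limbs`, `from_limbs` and `add_2x32` are hypothetical, and Python integers are masked to mimic 32-bit registers.

```python
MASK32 = 0xFFFFFFFF

def to_limbs(n: int) -> tuple:
    """Split a 64-bit value into (hi, lo) 32-bit limbs."""
    return (n >> 32) & MASK32, n & MASK32

def from_limbs(limbs: tuple) -> int:
    """Recombine (hi, lo) 32-bit limbs into one integer."""
    hi, lo = limbs
    return (hi << 32) | lo

def add_2x32(a: tuple, b: tuple) -> tuple:
    """Add two (hi, lo) values modulo 2**64 using only 32-bit results."""
    lo = (a[1] + b[1]) & MASK32
    carry = 1 if lo < a[1] else 0   # the low limb wrapped around
    hi = (a[0] + b[0] + carry) & MASK32
    return hi, lo

x = to_limbs(0xFFFFFFFF)
y = to_limbs(1)
print(hex(from_limbs(add_2x32(x, y))))  # 0x100000000
```

Multiplication needs many more limb operations than addition, which gives a feel for why this would be a lot of code to write.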
#9 |
"Composite as Heck"
Oct 2017
3×263 Posts |
This post ( https://mersenneforum.org/showpost.p...4&postcount=85 ) suggests that FP32 throughput doubled because the INT32 units were upgraded to also do FP32. If INT32 and FP32 operations can be freely mixed, or if the workload can be split into INT32-only and FP32-only operations, then there are more bits up for grabs. A split solution should also work on the 20 series, since that can do FP32 and INT32 concurrently, but that series is highly memory limited, so there may not be a benefit.
#10 | |
"Mihai Preda"
Apr 2015
54A₁₆ Posts |
Quote:
#11 |
Random Account
Aug 2009
19×101 Posts |
I got to thinking about color palettes. A 24-bit palette is capable of 16,777,216 unique values. This has been in use a long time. Before that there was 16-bit, capable of only 65,536 colors. Whether this is in any way relevant to the discussion here, I don't know.