2020-09-18, 21:46  #1
"Mihai Preda"
Apr 2015
2×677 Posts 
The future is 24-bit
In light of Nvidia's new GPU launch, it appears we need to find a way of doing big convolutions using SP FP (FP32). This has been an elusive task in the past.
That new GPU has 2x the FP32 throughput of INT32, and 64x the FP32 throughput of FP64.
2020-09-18, 22:48  #2
"Mihai Preda"
Apr 2015
2×677 Posts 
AKA "The Holy Grail" :)

2020-09-18, 23:44  #3
Random Account
Aug 2009
19×101 Posts 
Sorry, I cannot make a connection to 24-bit. FP32 seems to represent 32-bit. FP64 is 64-bit. 2x FP32 suggests 64-bit as well. Would you care to elaborate a little?

2020-09-19, 00:26  #4
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^{2}×5×251 Posts 
Last fiddled with by kriesel on 2020-09-19 at 00:26
2020-09-19, 00:26  #5
"/X\(‘‘)/X\"
Jan 2013
3·977 Posts 
Normal FP32 has 23 bits for the fraction component, 8 for the exponent, and 1 for the sign (+/-). The exponent bits effectively give one more bit of precision: the leading significand bit is implied by whether the exponent is all zero or not, so FP32 carries 24 significant bits and can do INT24 math.
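
To make the 24-bit claim concrete, here is a minimal Python sketch. It is not from the thread; the helper names `to_f32` and `f32_bits` are made up for illustration. It uses the standard `struct` module to round Python floats (FP64) to FP32 and to pull apart the sign, exponent, and fraction fields. It checks that integers up to 2^24 survive the round trip, while 2^24 + 1 does not.

```python
import struct

def to_f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_bits(x):
    """Return the (sign, exponent, fraction) fields of x stored as FP32."""
    (raw,) = struct.unpack("<I", struct.pack("<f", x))
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF

LIMIT = 1 << 24  # 2**24 = 16,777,216

# A sample of integers just below and up to 2**24: all are exact in FP32.
exact_below = all(to_f32(float(n)) == n for n in range(LIMIT - 1000, LIMIT + 1))

# 2**24 + 1 needs 25 significant bits, so FP32 has to round it.
first_inexact = to_f32(float(LIMIT + 1))

# 2**24 - 1 uses all 23 fraction bits plus the implicit leading 1.
fields_max_int24 = f32_bits(float(LIMIT - 1))
```

Note that this only covers storing integers and adding or subtracting them while the result stays below 2^24. The product of two 24-bit integers can need up to 48 bits, so it is generally not exact in FP32.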

2020-09-19, 03:14  #6
"Mihai Preda"
Apr 2015
54A_{16} Posts 
Some previous discussion:
https://www.mersenneforum.org/showthread.php?t=23926 
2020-09-19, 05:31  #7
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
5857_{10} Posts 
It sounds like the additional memory usage (and hence memory bandwidth) may be an issue. Would 64x be enough that arithmetic using double-floats would be useful?
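
For readers unfamiliar with double-floats: the idea is to represent one value as the unevaluated sum of two FP32 numbers `(hi, lo)`, giving roughly 48 significant bits at the cost of two words per value. Below is a minimal sketch of the addition step, assuming Knuth's error-free two-sum as the building block. The names `f32`, `two_sum`, and `df_add` are illustrative and do not come from any GPU library. FP32 is emulated here by rounding each FP64 result through `struct`; on a GPU these would be native float operations, and the compiler would have to be prevented from re-associating them.

```python
import struct

def f32(x):
    """Round an FP64 result to FP32, emulating a single FP32 operation."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a, b):
    """Error-free addition: returns (s, e) with s = fl32(a + b) and a + b == s + e."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def df_add(x, y):
    """Add two double-floats given as (hi, lo) pairs; simple, non-rigorous variant."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))   # fold in the low words
    hi = f32(s + e)                 # renormalize so that |lo| is small
    lo = f32(e - f32(hi - s))
    return hi, lo

# Plain FP32 loses the +1; the (hi, lo) pair keeps it in the low word.
plain = f32(16777216.0 + 1.0)
pair = df_add((16777216.0, 0.0), (1.0, 0.0))
```

Each double-float addition here costs about ten FP32 operations, and multiplication costs more, which is why the FP32-to-FP64 throughput ratio matters for whether this pays off.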

2020-09-19, 05:35  #8
P90 years forever!
Aug 2002
Yeehaw, FL
3^{2}×823 Posts 
Years ago I toyed with using two or three 32-bit ints to create a 64- or 96-bit float (no exponent bits, all mantissa).
I did enough work to prove to myself it was feasible and, at the time, would be about as fast as a double-precision FFT. As nVidia has lowered and lowered the DP-to-SP ratio, it would be a substantial winner now. An awful lot of code to write though.
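
As a rough illustration of the "all mantissa" idea (this is not the code described above, only a sketch under my own assumptions), a 64- or 96-bit value can be held as two or three 32-bit limbs and multiplied schoolbook-style, with each 32x32 partial product fitting in 64 bits. The helper names `to_limbs`, `from_limbs`, and `mul_limbs` are hypothetical. Python integers are unbounded, so the masks below stand in for 32-bit registers.

```python
MASK32 = 0xFFFFFFFF

def to_limbs(n, count):
    """Split a non-negative integer into `count` 32-bit limbs, least significant first."""
    return [(n >> (32 * i)) & MASK32 for i in range(count)]

def from_limbs(limbs):
    """Reassemble an integer from 32-bit limbs, least significant first."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook multiply of two limb arrays; result has len(a) + len(b) limbs."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # At most (2**32 - 1)**2 + 2 * (2**32 - 1) = 2**64 - 1, so 64 bits suffice.
            t = out[i + j] + x * y + carry
            out[i + j] = t & MASK32
            carry = t >> 32
        out[i + len(b)] = carry
    return out

x = 0xFFFFFFFFFFFFFFFFFFFFFFFF   # a 96-bit value in three limbs
y = 0x123456789ABCDEF0FEDCBA98
product = from_limbs(mul_limbs(to_limbs(x, 3), to_limbs(y, 3)))
```

A three-limb multiply needs nine 32x32 partial products plus carry handling, which gives a sense of why this would be "an awful lot of code" once it is threaded through an entire FFT.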
2020-09-19, 08:48  #9
"Composite as Heck"
Oct 2017
3×263 Posts 
This post ( https://mersenneforum.org/showpost.p...4&postcount=85 ) suggests that the doubling of fp32 is because they upgraded the int32 units to also do fp32. If int32 and fp32 operations can be freely mixed, or if the workload can be split into int32-only and fp32-only operations, then there are more bits up for grabs. A split solution should also work on the 20 series, as that can do fp32 and int32 concurrently, but it is highly memory-limited so there may not be a benefit.

2020-09-19, 09:11  #10
"Mihai Preda"
Apr 2015
54A_{16} Posts 
Quote:


2020-09-19, 16:11  #11
Random Account
Aug 2009
19×101 Posts 
I got to thinking about color palettes. A 24-bit palette is capable of 16,777,216 unique values. This has been in use a long time. Before that was 16-bit, capable of only 65,536 colors. Whether this is in any way relevant to the discussion here, I don't know.
