2020-07-14, 21:07   #6
ewmayer
Sep 2002
República de California

11625₁₀ Posts

Intel could easily reduce their transistor budget for SIMD support and provide the much-improved integer-math functionality Linus Torvalds yearns for if they weren't so crazy-biased towards FP support and thought more about having multiple kinds of instructions share the same transistors insofar as possible. Let's consider the notoriously transistor-hungry case of multiply: instead of first offering only AVX-512 FP multiply and low-width vector integer mul, then later adding another half-measure (using those FP-mul units to generate the low and high 52 bits of a 52x52-bit integer product), plunk down a bunch of full 64x64->128-bit integer multipliers, supporting a vector analog (at long last) of the longstanding scalar integer MUL instructions. Then design things so those units can be used for both integer and FP operands. Need the bottom 64 bits of a 64x64-bit integer mul? Just discard the high product halves, and maybe shave a few cycles. Signed vs unsigned high half of a 64x64-bit product? Easily handled via a tiny bit of extra logic. Vector-DP product, either high-53-bits or full-width FMA style? No problem, just use the usual FP-operand preprocessing logic, feed the resulting 53-bit integer mantissas to the multi-purpose vector-MUL unit, then run the usual postprocessing pipeline stages to properly deal with the resulting 106-bit product.
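To make the sharing idea above concrete, here is a minimal software sketch (one lane only, in Python, and in no way a hardware design) of a single 64x64->128-bit unsigned multiplier serving all four cases: low half, unsigned high half, signed high half, and double-precision multiply. All function names are hypothetical, and the DP path assumes normal finite inputs, a normal result, and round-to-nearest-even.

```python
import struct

MASK64 = (1 << 64) - 1

def mul_64x64_128(a: int, b: int) -> int:
    """Stand-in for the shared unit: unsigned 64x64 -> 128-bit product."""
    assert 0 <= a <= MASK64 and 0 <= b <= MASK64
    return a * b

def mul_lo(a: int, b: int) -> int:
    """Low 64 bits: just discard the high half."""
    return mul_64x64_128(a, b) & MASK64

def mul_hi_unsigned(a: int, b: int) -> int:
    """High 64 bits of the unsigned product."""
    return mul_64x64_128(a, b) >> 64

def mul_hi_signed(a: int, b: int) -> int:
    """High 64 bits of the signed product; a and b are two's-complement
    bit patterns. The 'extra logic' is two conditional subtractions."""
    hi = mul_hi_unsigned(a, b)
    if a >> 63:          # a is negative: subtract b from the high half
        hi -= b
    if b >> 63:          # b is negative: subtract a from the high half
        hi -= a
    return hi & MASK64

def dp_fields(x: float):
    """Split an IEEE-754 double into (sign, biased exponent, fraction)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

def dp_mul(x: float, y: float) -> float:
    """DP multiply routed through the integer multiplier.
    Assumption: normal inputs and normal result only (no zeros,
    subnormals, infinities, or NaNs are handled)."""
    sx, ex, fx = dp_fields(x)
    sy, ey, fy = dp_fields(y)
    assert 0 < ex < 0x7FF and 0 < ey < 0x7FF
    # Preprocessing: restore the hidden bit to get 53-bit mantissas.
    mx, my = fx | (1 << 52), fy | (1 << 52)
    prod = mul_64x64_128(mx, my)        # 105- or 106-bit product
    exp = ex + ey - 1023
    # Postprocessing: normalize to 53 bits, then round to nearest even.
    shift = 52
    if prod >> 105:
        shift, exp = 53, exp + 1
    mant = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant >> 53:                  # rounding carried out of 53 bits
            mant >>= 1
            exp += 1
    assert 0 < exp < 0x7FF
    bits = ((sx ^ sy) << 63) | (exp << 52) | (mant & ((1 << 52) - 1))
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Every path calls the same `mul_64x64_128`. Only the cheap pre- and postprocessing differs, which is the point of sharing the big multiplier array.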

The HPC part comes into the above context this way: very few programs are gonna need *both* high-perf integer and FP mul - the ones that do are *truly* outliers, unlike Torvalds' inane labeling of all HPC as some kind of fringe community. Using the same big-block transistor budget to support multiple data types is a big-picture win, even if it leads to longer pipelines: the 32 AVX-512 vector registers are more than enough to allow coders to do a good job at latency hiding even with fairly long instruction pipelines.

Last fiddled with by ewmayer on 2020-07-14 at 21:08