2009-02-20, 18:33   #2
jasonp

Maybe PMULUDQ was designed for fixed-point operations, where the result is expected to be rounded and shifted right to discard the low bits. Splitting the 64-bit product across two registers would make that very painful...
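
To make the guess concrete, here is a rough scalar sketch in C of what I mean. It only models one 64-bit lane of PMULUDQ (low 32 bits of each input, full 64-bit unsigned product); the Q16.16 format and the names pmuludq_lane and fixmul_q16 are made up for illustration, not anything from Intel's documentation.

```c
#include <stdint.h>

/* Scalar model of a single PMULUDQ lane: only the low 32 bits of each
   64-bit input are used, and the full 64-bit product is kept. */
static uint64_t pmuludq_lane(uint64_t a, uint64_t b)
{
    return (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
}

/* Hypothetical fixed-point multiply of two unsigned Q16.16 values.
   The Q16.16 format is an assumption chosen for this example. */
static uint32_t fixmul_q16(uint32_t a, uint32_t b)
{
    uint64_t p = pmuludq_lane(a, b);  /* product has 32 fraction bits */
    p += (uint64_t)1 << 15;           /* add half an output LSB to round */
    return (uint32_t)(p >> 16);       /* shift right, discarding low bits */
}
```

For instance, fixmul_q16(0x00018000, 0x00020000), i.e. 1.5 times 2.0, gives 0x00030000. Because the whole product sits in one 64-bit lane, the rounding add and the shift are each a single operation; with the high and low halves in separate registers you would have to propagate the carry and stitch the halves back together by hand.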

I also think many of the quirks in SSE2 instructions boil down to a constrained instruction encoding space. The real problem is that multiple-precision arithmetic was not on the agenda when this stuff was designed, so we'll just have to make do with what we have.