C and the scarry flag
Is there a way to handle the carry flag in C or C++?
It would be nice to do something like this: [code]uint64_t *aptr, *bptr; ... for (i=start;i<end;i++) aptr[i] +c= bptr[i];[/code] where +c= is the same as += but uses add with carry. The only way I can think of actually doing this in C would be to handle the carries explicitly, essentially reimplementing at the high level what the hardware is perfectly capable of doing far more efficiently by itself. |
[QUOTE=Mr. P-1;335075]Is there a way to handle the carry flag in C or C++?[/QUOTE]No. Unless you use inline asm.[QUOTE=Mr. P-1;335075]The only way I can think of actually doing this in c would be to handle the carries explicitly, essentially reimplementing at the high level what the hardware is perfectly capable of doing far more efficiently by itself.[/QUOTE]Yes. Unless you use inline asm.
BTW: You could use inline asm. |
[QUOTE=retina;335079]No. Unless you use inline asm.Yes. Unless you use inline asm.
BTW: You could use inline asm.[/QUOTE] I was hoping to avoid having to use inline asm. |
[QUOTE=Mr. P-1;335084]I was hoping to avoid having to use inline asm.[/QUOTE]
If you just want to avoid having the inline asm everywhere you could use an inline asm macro. If it is that you don't want to be compiling asm then no can do. |
[QUOTE=Mr. P-1;335084]I was hoping to avoid having to use inline asm.[/QUOTE]Create your own language based upon C but supporting access to the hardware CPU carry flag. Call it C+c, or something else more catchy.
BTW: You need to be careful about making sure the CPU you compile your new language to uses the carry in the same way as you expect. Different CPUs generate the carry differently (like ARM vs x86 for instance). |
[QUOTE=henryzz;335085]If you just want to avoid having the inline asm everywhere you could use an inline asm macro. If it is that you don't want to be compiling asm then no can do.[/QUOTE]
I haven't written any serious asm in a long, long time, and it wasn't Intel. It was for a Motorola M68010, which was a beautiful processor from the programmer's point of view. It had a flat address space and a largely orthogonal instruction set, at a time when the comparable Intel chip, probably an 80286 or thereabouts, was a total mess in terms of its segmented address space and which register you could use for what. I've always thought it a crying shame that Intel and not Motorola won the processor war. But that's as maybe. The point is that I'm not familiar enough with the architecture of a 21st century Core i5 to feel confident of doing it right, and still less of doing it even halfway optimally. One thing I could do would be to write the program without the carry functionality. Just ignore it. Then disassemble the resulting code and replace the add instructions with add-with-carry instructions. Dunno if that would work, but it's a plan. |
I imagine there are people on the forum willing to help with a good macro.
|
Accessing the carry flag is easy using inline asm, but it's less easy to come up with the asm that does exactly what you want. For example, you have a loop:
[code]for (i = 0; i < n; i++) { asm(add_with_carry_here) }[/code]
It does what you want, but using the carry flag will only work inside the loop, and if it's not implemented exactly right then the arithmetic for handling 'i' will silently mess up the carry flag. Even if it's implemented correctly there could be a lot of register shuffling around the magic use-the-carry instruction. Likewise, since the carry is implicit on x86 you can use it, but actually saving it will take one or two more instructions that may negate the benefit of dropping to this level.
Of course, many processors don't have a carry flag at all (MIPS, Alpha), some have one carry flag but lots of other flag registers (PowerPC), and ARM has a carry flag that works the opposite way sometimes.
For an example of inline asm that has an entire multiple precision add loop in it, where the arrays are of fixed size, see the mp_add and mp_sub functions [url="http://msieve.svn.sourceforge.net/viewvc/msieve/trunk/common/mp.c?revision=286&view=markup"]here[/url]. |
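To make the trade-off above concrete, here is a minimal sketch of a one-limb add-with-carry wrapper, assuming GCC or Clang extended asm on x86-64. The names add_limb_carry and mp_add_asm are made up for this example and are not taken from msieve; on any other compiler or target the function falls back to plain C. Note how the carry has to be loaded into the flag and saved again around every adc, which is the "one or two more instructions" cost mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: *a += b + cy, where cy is 0 or 1.
   Returns the carry out of the addition (0 or 1). */
unsigned char add_limb_carry(uint64_t *a, uint64_t b, unsigned char cy)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t sum = *a;
    __asm__("addb $0xff, %1\n\t"  /* CF = cy: 0xff + 1 wraps, 0xff + 0 does not */
            "adcq %2, %0\n\t"     /* sum += b + CF */
            "setc %1"             /* copy the new CF back into cy */
            : "+r"(sum), "+&q"(cy)
            : "r"(b)
            : "cc");
    *a = sum;
    return cy;
#else
    /* Portable fallback: detect each wraparound with an unsigned compare. */
    uint64_t s = *a + b;
    unsigned char c1 = (unsigned char)(s < b);
    uint64_t t = s + cy;
    unsigned char c2 = (unsigned char)(t < s);
    *a = t;
    return (unsigned char)(c1 | c2);
#endif
}

/* Hypothetical loop: a[0..n) += b[0..n), least significant limb first.
   Returns the final carry. */
uint64_t mp_add_asm(uint64_t *a, const uint64_t *b, size_t n)
{
    unsigned char cy = 0;
    for (size_t i = 0; i < n; i++)
        cy = add_limb_carry(&a[i], b[i], cy);
    return cy;
}
```

Because the flag is saved and restored on every iteration, this is unlikely to beat plain C by much. Keeping the whole loop inside a single asm block, as the linked mp_add does, avoids that overhead.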
I'd say it boils down to a binary decision:
0. It's performance-critical: Use inline ASM. [Hey, if I can learn it after age 40, there may be hope for us yet.]
1. It's not performance-critical: Emulate using unsigned integer compare, e.g.
[code]uint *a,*b, cy = 0;
loop over i:
    a[i] += b[i] + cy;
    cy = a[i] < cy;
/loop[/code]
This often proves faster - especially when mixed with other code - even on CPUs supporting a carry flag, due to the easier job it makes in terms of scheduling. |
[QUOTE=ewmayer;335122]I'd say it boils down to a binary decision:
0. It's performance-critical: Use inline ASM. [Hey, if I can learn it after age 40, there may be hope for us yet.] 1. It's not performance-critical: Emulate using unsigned integer compare, e.g. uint *a,*b, cy = 0; loop over i: a[i] += b[i] + cy; cy = a[i] < cy; /loop[/quote]
That won't work. a[i] can only be < cy if a[i] is 0 and cy is 1.
Even if you interpret the < as a signed inequality, it won't work. For example, if a[i] = b[i] = 10101010[sub]2[/sub], adding b[i] to a[i] will generate a carry, but the sum will be 0101010c[sub]2[/sub], a positive number.
A bitwise test for a carry is rather difficult to do. Basically uint a + uint b + carry-in will generate a carry-out if and only if at least one of the following is true:
1. The result is zero
2. The most significant zero bit of a XOR b corresponds to 1 in a (or equivalently in b)
I can't see a way to test for this without multiple tests and branches.
Here's how it probably should be done:
[code]short_uint *a,*b;
long_uint cy = 0;
loop over i:
    cy += a[i];
    cy += b[i];
    a[i] = cy;
    Shift right cy, (size of short_uint);
/loop[/code] |
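The wide-accumulator idea above can be written out in standard C. This is only a sketch: it assumes uint32_t for short_uint and uint64_t for long_uint, and the function name mp_add_wide is made up for the example. The accumulator never exceeds 2^33 - 1, so it cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a[0..n) += b[0..n), least significant limb first.
   Limbs are 32 bits wide; the 64-bit accumulator holds sum plus carry. */
uint32_t mp_add_wide(uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        cy += a[i];            /* at most 2^32 - 1 plus the old carry (0 or 1) */
        cy += b[i];            /* total is at most 2^33 - 1 */
        a[i] = (uint32_t)cy;   /* low 32 bits are the result limb */
        cy >>= 32;             /* what is left is the carry, 0 or 1 */
    }
    return (uint32_t)cy;       /* carry out of the top limb */
}
```

The price is that each limb carries only half a machine word on a 64-bit CPU, so the loop runs twice as many iterations as a version working on full 64-bit limbs.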
[QUOTE=Mr. P-1;335181]That won't work. a[i] can only be < cy if a[i] is 0 and cy is 1[/QUOTE]
I think "cy = a[i] < b[i]" is the correct version.
EDIT: Even this version has a corner case that it can't handle. Hmmm... |
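The corner case is b[i] equal to all ones with a carry coming in: a[i] + b[i] + 1 then wraps right back round to the old a[i], so a single compare against b[i] misses the carry whenever a[i] is also all ones. One way round it, sketched below under the assumption of 64-bit limbs (the name mp_add_cmp is made up for the example), is to do the two additions separately and test each for wraparound. At most one of the two can wrap, so OR-ing the results gives the carry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a[0..n) += b[0..n), least significant limb first.
   Returns the carry out of the top limb (0 or 1). */
uint64_t mp_add_cmp(uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s  = a[i] + b[i];
        uint64_t c1 = s < b[i];   /* 1 if a[i] + b[i] wrapped */
        uint64_t t  = s + cy;
        uint64_t c2 = t < cy;     /* 1 only if s was all ones and cy was 1 */
        a[i] = t;
        cy = c1 | c2;             /* the two wraps are mutually exclusive */
    }
    return cy;
}
```

For example, adding {all ones, all ones, 0} to {1, all ones, 0} (least significant limb first) has to carry out of the middle limb even though that limb's result equals b[i], which is the case the single compare gets wrong.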