mersenneforum.org

-   -   Help! Compiler bug (https://www.mersenneforum.org/showthread.php?t=14004)

R.D. Silverman 2010-10-01 00:36

Help! Compiler bug
 
Here is a code snippet:

[code]
void prepare_bounds(v1, v2, sign)
int v1[2], v2[2], sign;

{ /* start of prepare_bounds */

double m_over_a1, n_over_b1, b0_over_b1, a0_over_a1;
double a0, a1, b0, b1, one_over_a1, one_over_b1;
double l1and4, l2and3, l1and3, determ, inv_determ;
double stime;
double temp;


temp = get_time();
if (TIME_STATS) stime = get_time();
printf("1");
a0 = (double)v1[0];
a1 = (double)v2[0];
b0 = (double)v1[1];
b1 = (double)v2[1];
printf("2");
[/code]

I am trying to produce a release (i.e. non-debug) executable for my siever
using Visual Studio 2010.

As written exactly above the code runs just fine.

If I delete the "printf("1");" statement, the code [b]core dumps[/b] before
hitting printf("2");

If I delete the calls to get_time(), the code runs fine.

This seems to be the same compiler bug that I saw with VS 2008.
Apparently, the code is doing a register mis-allocation involving the call
to get_time. Adding the printf("1") statement seems to make the problem
go bye-bye.....

This is a total mystery.

Comments would be GREATLY appreciated.

R.D. Silverman 2010-10-01 00:39

I forgot to mention: if I produce a debug version, even without the
printf statements, the code runs fine. It also runs fine without the
printf's when compiled with Visual C++ 5.0 and Visual C++ 6.0.

WTF is going on here????

It is only when I produce a RELEASE version under VS 2010 that the code
barfs.

R.D. Silverman 2010-10-01 00:54

One final piece of information. get_time() is called in MANY places throughout
the code. It only barfs within this little piece of code.

alpertron 2010-10-01 01:32

Are you compiling in 32-bit mode? There are some differences in 64-bit mode: for example, if you cast pointers (which are 64 bits long) to 32-bit integers and then try to use them as pointers again, it will fail.

It is possible that before entering that function the heap and/or the stack were corrupted, so the function get_time() just triggers the error that was already present.
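
To make the pointer-truncation hazard above concrete, here is a minimal C sketch. The helper names roundtrip_safe and fits_in_32_bits are made up for illustration and do not come from the siever. The only point is that uintptr_t is wide enough to hold a pointer, while a 32-bit integer may not be on a 64-bit build.

```c
#include <stdint.h>

/* Hypothetical helper: round-trips a pointer through an integer type
   that is wide enough to hold it. */
static int *roundtrip_safe(int *p)
{
    uintptr_t bits = (uintptr_t)p;   /* keeps every bit of the pointer */
    return (int *)bits;
}

/* Hypothetical helper: returns 1 if a 32-bit integer could hold this
   pointer without losing bits, 0 otherwise. */
static int fits_in_32_bits(const void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return bits == (uintptr_t)(uint32_t)bits;
}
```

On a 32-bit build fits_in_32_bits is always true, which is why this kind of bug tends to show up only after moving to 64 bits.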

Robert Holmes 2010-10-01 01:34

That's a weird problem.

I assume this is being compiled in 32-bit mode. Instead of printf("1"), try putting an inline asm block that does nothing, like

[CODE]
__asm { nop }
[/CODE]

The compiler should be forced to separate the function before and after the assembly block, and that might work around the bug.

paulunderwood 2010-10-01 01:44

[code]
void prepare_bounds(v1, v2, sign)
int v1[2], v2[2], sign;
[/code]
[QUOTE=R.D. Silverman]

WTF is going on here????

[/QUOTE]

[url]http://msdn.microsoft.com/en-us/library/efx873ys.aspx[/url]

Looks like your function variable declarations/definitions are old school

R.D. Silverman 2010-10-01 02:44

[QUOTE=alpertron;232150]Are you compiling in 32-bit mode? There are some differences in 64-bit mode: for example, if you cast pointers (which are 64 bits long) to 32-bit integers and then try to use them as pointers again, it will fail.
[/QUOTE]

There are no 64 bit pointers in the code.

[QUOTE]

It is possible that before entering that function the heap and/or the stack were corrupted, so the function get_time() just triggers the error that was already present.[/QUOTE]

This does not explain why adding a printf fixes the problem.

This is a register misallocation. Adding the printf causes the compiler to use
its registers differently.

R.D. Silverman 2010-10-01 02:45

[QUOTE=paulunderwood;232153][code]
void prepare_bounds(v1, v2, sign)
int v1[2], v2[2], sign;
[/code]


[url]http://msdn.microsoft.com/en-us/library/efx873ys.aspx[/url]

Looks like your function variable declarations/definitions are old school[/QUOTE]

Irrelevant.

alpertron 2010-10-01 11:35

[QUOTE=R.D. Silverman;232160]There are no 64 bit pointers in the code.

This does not explain why adding a printf fixes the problem.

This is a register misallocation. Adding the printf causes the compiler to use
its registers differently.[/QUOTE]

Maybe it is a problem in the optimization stage. I suggest you change the project settings to add debugging information to the RELEASE build, and then run your program until the function is reached. Then step through it in the disassembly window. That way you can see whether the function was compiled correctly or not.

R.D. Silverman 2010-10-01 11:45

[QUOTE=alpertron;232203]Maybe it is a problem in the optimization stage. I suggest you change the project settings to add debugging information to the RELEASE build, and then run your program until the function is reached. Then step through it in the disassembly window. That way you can see whether the function was compiled correctly or not.[/QUOTE]

Instead, I am just going to kick out the assembler code and look at it.

R.D. Silverman 2010-10-01 23:22

More weirdness. If I replace printf("1") with do_nothing("1") where
do_nothing is just a dummy routine the code STILL fails.

How can the addition of a printf of a static string cure a core dump
caused by a read access failure?

R.D. Silverman 2010-10-02 00:09

[QUOTE=R.D. Silverman;232204]Instead, I am just going to kick out the assembler code and look at it.[/QUOTE]

I looked at the assembler output. Nothing obvious stands out except
that the get_time() call is in-lined.

[code]
; 5650 : { /* start of prepare_bounds */

push ebp
mov ebp, esp
sub esp, 20 ; 00000014H

; 5651 : int do_nothing(char *x);
; 5652 : double m_over_a1, n_over_b1, b0_over_b1, a0_over_a1;
; 5653 : double a0, a1, b0, b1, one_over_a1, one_over_b1;
; 5654 : double l1and4, l2and3, l1and3, determ, inv_determ;
; 5655 : double stime;
; 5656 : double temp;
; 5657 :
; 5658 :
; 5659 : if (TIME_STATS) stime = get_time();

DB 15 ; 0000000fH
DB 49 ; 00000031H
mov DWORD PTR _a$89577[ebp], eax
mov DWORD PTR _b$89578[ebp], edx

; 5660 :
; 5661 : a0 = (double)v1[0];
; 5662 : a1 = (double)v2[0];
; 5663 : b0 = (double)v1[1];
; 5664 : b1 = (double)v2[1];
; 5665 :
; 5666 : one_over_a1 = 1.0/a1;
; 5667 : a0_over_a1 = a0 * one_over_a1;
; 5668 : one_over_b1 = 1.0/b1;
; 5669 : b0_over_b1 = b0 * one_over_b1;
; 5670 :
; 5671 : /* We have a parallelogram. One point is always (0,0). f is the */
; 5672 : /* vertical axis, e the horizontal. Compute emin & emax */
; 5673 : /* Also, compute intersections and boundary slopes */
; 5674 : /* Note that determ = p (up to sign) so could precompute: but it */
; 5675 : /* would require xtra storage to hold the sign bit */
; 5676 :
; 5677 : determ = (a0 * b1 - a1 * b0);
; 5678 : inv_determ = 1.0/determ;
; 5679 :
; 5680 : if (SHOW_PREP)
; 5681 : {
; 5682 : (void) printf("Prep: a0,a1,b0,b1 = %g %g %g %g\n",a0,a1,b0,b1);
; 5683 : (void) printf("m_over_a1, n_over_b1 = %g %g\n",m_over_a1, n_over_b1);
; 5684 : (void) printf("a0_over_a1 , b0_over_b1 = %g %g\n",a0_over_a1, b0_over_b1);
; 5685 : (void) printf("determ, inv = %g %g\n",determ,inv_determ);
; 5686 : }
; 5687 :
; 5688 : if (sign == 1) {

cmp DWORD PTR _sign$[ebp], 1
mov ecx, DWORD PTR _a$89577[ebp]
movsd xmm2, QWORD PTR __real@3ff0000000000000
movd xmm5, DWORD PTR [edx+4]
movd xmm7, DWORD PTR [edx]
mov DWORD PTR _t$89576[ebp], ecx
mov ecx, DWORD PTR _b$89578[ebp]
mov DWORD PTR _t$89576[ebp+4], ecx
fild QWORD PTR _t$89576[ebp]
mov ecx, DWORD PTR [eax]
mov eax, DWORD PTR [eax+4]
xorps xmm4, xmm4
cvtsi2sd xmm4, eax
movapd xmm1, xmm2
divsd xmm1, xmm4
cvtdq2pd xmm5, xmm5
cvtdq2pd xmm7, xmm7
xorps xmm6, xmm6
cvtsi2sd xmm6, ecx
movapd xmm0, xmm2
mulsd xmm1, xmm5
mulsd xmm4, xmm7
mulsd xmm5, xmm6
subsd xmm4, xmm5
divsd xmm0, xmm6
movapd xmm3, xmm0
divsd xmm2, xmm4

; 5689 :
; 5690 : if (a0 > 0 && a1 > 0)

[/code]

retina 2010-10-02 00:24

[QUOTE=R.D. Silverman;232273][code] DB 15 ; 0000000fH
DB 49 ; 00000031H
mov DWORD PTR _a$89577[ebp], eax
mov DWORD PTR _b$89578[ebp], edx

cmp DWORD PTR _sign$[ebp], 1
mov ecx, DWORD PTR _a$89577[ebp]
movsd xmm2, QWORD PTR __real@3ff0000000000000
movd xmm5, DWORD PTR [edx+4]
movd xmm7, DWORD PTR [edx]
[/code][/QUOTE]

RDTSC (0x0f,0x31) returns the counter in edx:eax.
And then "movd xmm5, DWORD PTR [edx+4]" will randomly crash.

Why is edx never initialised to point to anything proper after reading the TSC? Did you really show all the compiled code for that section? If so then get a new compiler.
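
As a small illustration of why edx cannot serve as a pointer right after RDTSC: the instruction leaves the low half of the counter in eax and the high half in edx. Here is a sketch in C of how the two halves form the 64-bit value. The helper name combine_tsc is hypothetical and is not part of the posted code.

```c
#include <stdint.h>

/* Hypothetical helper: rebuilds the 64-bit counter from the two halves
   that RDTSC leaves in EDX (high 32 bits) and EAX (low 32 bits). */
static uint64_t combine_tsc(uint32_t eax_low, uint32_t edx_high)
{
    return ((uint64_t)edx_high << 32) | eax_low;
}
```

Whatever lands in edx is therefore just the upper counter bits, so a later load through [edx] or [edx+4] reads from an essentially arbitrary address.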

R.D. Silverman 2010-10-02 00:38

[QUOTE=retina;232275]RDTSC (0x0f,0x31) returns the counter in edx:eax
And then "movd xmm5, DWORD PTR [edx+4]" will randomly crash.

Why is edx never initialised to point to anything proper after reading the TSC? Did you really show all the compiled code for that section? If so then get a new compiler.[/QUOTE]

Yes. This is all the code. It grabs the clock counter, then converts it to a double.

The compiler is Microsoft Visual Studio 2010 (and VS 2008)

R.D. Silverman 2010-10-02 00:45

[QUOTE=retina;232275]RDTSC (0x0f,0x31) returns the counter in edx:eax
And then "movd xmm5, DWORD PTR [edx+4]" will randomly crash.

Why is edx never initialised to point to anything proper after reading the TSC? Did you really show all the compiled code for that section? If so then get a new compiler.[/QUOTE]

Nice catch. I failed to see it.

Here is the debug assembler. Note that it does not in-line the get_time()
call and subsequently does not use the edx register:

[code]
; 5650 : { /* start of prepare_bounds */

push ebp
mov ebp, esp
sub esp, 228 ; 000000e4H
push ebx
push esi
push edi

; 5651 : int do_nothing(char *x);
; 5652 : double m_over_a1, n_over_b1, b0_over_b1, a0_over_a1;
; 5653 : double a0, a1, b0, b1, one_over_a1, one_over_b1;
; 5654 : double l1and4, l2and3, l1and3, determ, inv_determ;
; 5655 : double stime;
; 5656 : double temp;
; 5657 :
; 5658 :
; 5659 : if (TIME_STATS) stime = get_time();

mov eax, 1
test eax, eax
je SHORT $LN21@prepare_bo
call _get_time
fstp QWORD PTR _stime$[ebp]
$LN21@prepare_bo:

; 5660 :
; 5661 : a0 = (double)v1[0];

mov eax, DWORD PTR _v1$[ebp]
fild DWORD PTR [eax]
fstp QWORD PTR _a0$[ebp]

; 5662 : a1 = (double)v2[0];

mov eax, DWORD PTR _v2$[ebp]
fild DWORD PTR [eax]
fstp QWORD PTR _a1$[ebp]

; 5663 : b0 = (double)v1[1];

mov eax, DWORD PTR _v1$[ebp]
fild DWORD PTR [eax+4]
fstp QWORD PTR _b0$[ebp]

; 5664 : b1 = (double)v2[1];

mov eax, DWORD PTR _v2$[ebp]

etc.
[/code]

retina 2010-10-02 00:47

[QUOTE=R.D. Silverman;232278]Yes. This is all the code. It grabs the clock counter, then converts it to a double.[/QUOTE]

The double conversion is done further down, after transferring through ecx to another location - "fild QWORD PTR _t$89576[ebp]"

[QUOTE=R.D. Silverman;232278]The compiler is Microsoft Visual Studio 2010 (and VS 2008)[/QUOTE]

Write a nice letter to MS and complain.

R.D. Silverman 2010-10-02 01:04

[QUOTE=retina;232280]The double conversion is done further down, after transferring through ecx to another location - "fild QWORD PTR _t$89576[ebp]"

Write a nice letter to MS and complain.[/QUOTE]

get_time calls an assembler routine that samples the clock and returns
a 64 bit int. It converts the 64-bit int to a double and returns it.

I could reorganize the code inside get_time(), but I doubt it will help.

The problem is the misuse of the edx register AFTER get_time() returns.
I suspect it is a bug in the code optimizer when dealing with (the other) floating point code.

R.D. Silverman 2010-10-02 01:47

I changed the code so that the 64-bit routine that samples the clock
returns a double instead of an int64 and then called it directly.
The call is not inlined. Instead I just get

call _get_time1

However, the emitted code STILL mis-uses the edx register in the
middle of the floating computations that follow.

Three lines later it does:

mov ecx, DWORD PTR [edx]

but without initializing where edx is pointing.

In fact, it is still pointing to whatever was placed in it by the clock
sample code.

axn 2010-10-02 03:31

[QUOTE=R.D. Silverman;232288]I changed the code so that the 64-bit routine that samples the clock
returns a double instead of an int64 and then called it directly.
The call is not inlined. Instead I just get

call _get_time1

However, the emitted code STILL mis-uses the edx register in the
middle of the floating computations that follow.

Three lines later it does:

mov ecx, DWORD PTR [edx]

but without initializing where edx is pointing.

In fact, it is still pointing to whatever was placed in it by the clock
sample code.[/QUOTE]
I believe the first few parameters are passed via registers -- look at the place where it _calls_ your routine. I bet v1 and v2 are passed in eax and edx. These four lines use apparently uninitialized registers.

[CODE] movd xmm5, DWORD PTR [edx+4]
movd xmm7, DWORD PTR [edx]

mov ecx, DWORD PTR [eax]
mov eax, DWORD PTR [eax+4]
[/CODE]

retina 2010-10-02 04:08

[QUOTE=axn;232293]I believe the first few parameters are passed via registers -- look at the place where it _calls_ your routine. I bet v1 and v2 are passed in eax and edx.[/QUOTE]

Unlikely, because the debug code shows the values being loaded from the stack. It would be extremely strange code that places pointers to the data on the stack AND loads eax/edx with pointers to the data and then calls the subroutine. There is no calling standard that defines that behaviour.

Random Poster 2010-10-02 09:56

[QUOTE=R.D. Silverman;232267]More weirdness. If I replace printf("1") with do_nothing("1") where do_nothing is just a dummy routine the code STILL fails.[/QUOTE]

Did you define do_nothing in the same source file? If so, then the optimizer probably replaced the call to it by the contents of the function. Try defining do_nothing in a different source file.

[QUOTE=R.D. Silverman;232267] How can the addition of a printf of a static string cure a core dump caused by a read access failure?[/QUOTE]

Values of general registers aren't expected to survive across function calls, so adding any call (that can't be inlined away) will force the compiler to reassign registers, and (as you noticed) this often works around optimization bugs where registers get garbled.
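
For what it's worth, here is a sketch of a dummy routine that the optimizer should not be able to inline away even when it lives in the same source file. The NOINLINE macro is my own naming; it expands to `__declspec(noinline)` on MSVC and `__attribute__((noinline))` on GCC/Clang. The body is only a stand-in for the do_nothing mentioned above, not the original routine.

```c
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE   /* unknown compiler: no portable way to forbid inlining */
#endif

/* Hypothetical stand-in for the dummy routine discussed above. */
NOINLINE int do_nothing(const char *x)
{
    /* The volatile read keeps the body from being optimized to nothing. */
    volatile const char *p = x;
    return p[0] != '\0';
}
```

A call such as do_nothing("1") then remains a real call in the emitted code. Note that if whole-program optimization is enabled, merely moving the routine to another source file may not be enough to prevent inlining, which is one reason to mark it explicitly.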

R.D. Silverman 2010-10-02 11:20

[QUOTE=Random Poster;232311]Did you define do_nothing in the same source file? If so, then the optimizer probably replaced the call to it by the contents of the function. Try defining do_nothing in a different source file.



Values of general registers aren't expected to survive across function calls, so adding any call (that can't be inlined away) will force the compiler to reassign registers, and (as you noticed) this often works around optimization bugs where registers get garbled.[/QUOTE]

I found the following in an Intel development manual:

"As discussed in section 2.3, some compilers do not implicitly recognize the RDTSC and CPUID function in inline
assembly code. Compilers like Microsoft® Visual C++® 5.0 normally "guarantee" that any register affected by an
inline assembly code section will not affect the C code around it. When overriding the compiler by using the emit
statements, however, the compiler does not know those instructions are overwriting registers (RDTSC overwrites
EAX and EDX, and CPUID overwrites EAX, EBX, ECX, and EDX). Thus, the compiler may not properly store away
the affected registers, so this must be done manually by the programmer by pushing them onto the stack.
There are a few cases where this will not matter. If the code being time measured is a stand-alone section of code,
completely surrounded by the calls to RDTSC, then the register overwriting cannot affect the code around it. If the
measured code section is written in assembly, and the variables are actually used inside of this section, the compiler
will handle the stack allocation itself. Finally, it will not matter if affecting the correctness of the code around the
measured section is not an issue while cycle testing."

Note however, that I did a workaround such that the compiler does NOT
in-line the clock sample code, but instead calls a subroutine. The emitted
code is still mis-using the edx register a few lines later. It seems certain
that the clock sample subroutine is not restoring the edx register when it
returns.

R.D. Silverman 2010-10-02 11:31

One possible (ugly!) workaround might be:

[code]
saved_edx_value = save_edx();
stime = get_time();
restore_edx_value(saved_edx_value);
[/code]

where save_edx and restore_edx_value are simple hand-coded _asm
routines.

But this is [b]such[/b] a kludge!

A better way is to rewrite get_time() in a way such that the compiler
restores the edx register when it exits. I am unsure how to do that.

Here is the current code:

[code]
/************************************************************************/
/*                                                                      */
/* Routine to access the clock. On Pentium prectime returns ticks       */
/*                                                                      */
/************************************************************************/

double get_time()
{ /* start of get_time */
    return((double)prectime());
} /* end of get_time */

__int64 prectime(void)
{
    __int64 t;
    unsigned int a,b;
    unsigned int *c = (unsigned int *)&t;
    _asm
    {
        _emit 0x0f;
        _emit 0x31;
        mov a,eax;
        mov b,edx;
    }

    c[0]=a;
    c[1]=b;
    return t;
}
[/code]

R.D. Silverman 2010-10-02 11:37

I suppose that I could manually insert code into prectime that saves/restores
the edx register via:

[code]
__int64 t;
unsigned int a,b,tmp;
unsigned int *c = (unsigned int *)&t;
_asm
{
    mov tmp, edx
    _emit 0x0f;
    _emit 0x31;
    mov a,eax;
    mov b,edx;
    mov edx,tmp
}
[/code]

Yech!

axn 2010-10-02 14:18

[QUOTE=R.D. Silverman;232319]I suppose that I could manually insert code into prectime that saves/restores
the edx register via:

__int64 t;
unsigned int a,b,tmp;
unsigned int *c = (unsigned int *)&t;
_asm
{
mov tmp, edx
_emit 0x0f;
_emit 0x31;
mov a,eax;
mov b,edx;
mov edx,tmp
}

Yech![/QUOTE]

Please see my earlier post -- it might not be just the edx register that you need to preserve. eax may also need to be preserved.

axn 2010-10-02 14:24

[QUOTE=retina;232294]Unlikely, because the debug code shows the values being loaded from the stack. It would be extremely strange code that places pointers to the data on the stack AND loads eax/edx with pointers to the data and then calls the subroutine. There is no calling standard that defines that behaviour.[/QUOTE]

Debug code and release code need not have any relation to each other. This is a case where the optimizer is not realizing that eax/edx got clobbered. But without seeing the assembly of how the routine is _called_, it is just speculation.

axn 2010-10-02 14:32

Can you post the code of your get_time function? Are you using opcodes using inline assembly or are you using __rdtsc() intrinsic ([url]http://msdn.microsoft.com/en-us/library/twchhe95.aspx[/url])?

R.D. Silverman 2010-10-02 16:06

[QUOTE=axn;232333]Can you post the code of your get_time function? Are you using opcodes using inline assembly or are you using __rdtsc() intrinsic ([url]http://msdn.microsoft.com/en-us/library/twchhe95.aspx[/url])?[/QUOTE]

?? See #23, #24 in this thread!

R.D. Silverman 2010-10-02 16:07

[QUOTE=R.D. Silverman;232342]?? See #23, #24 in this thread![/QUOTE]

BTW, I did an [b]explicit[/b] save/restore of eax, edx, and the code
now works fine.
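
The thread does not show the code that finally worked, so the following is only a sketch of what an explicit save/restore of eax and edx could look like. It is modeled on the prectime() posted earlier, with the edx save extended to eax. The names saved_eax and saved_edx are made up, and the MSVC branch is untested here. The GCC/Clang branch and the clock() fallback are additions for illustration: with GCC-style extended asm the outputs are declared, so the compiler knows both registers are overwritten and no manual save is needed.

```c
#include <time.h>

#if defined(_MSC_VER) && defined(_M_IX86)
/* 32-bit MSVC: the _emit bytes are opaque to the compiler, so save and
   restore eax/edx by hand around the RDTSC opcode (0x0f 0x31). */
long long prectime(void)
{
    unsigned int lo, hi, saved_eax, saved_edx;
    __asm {
        mov saved_eax, eax
        mov saved_edx, edx
        _emit 0x0f
        _emit 0x31
        mov lo, eax
        mov hi, edx
        mov eax, saved_eax
        mov edx, saved_edx
    }
    return (long long)(((unsigned long long)hi << 32) | lo);
}
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* GCC/Clang on x86: "=a" and "=d" declare eax and edx as outputs. */
long long prectime(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return (long long)(((unsigned long long)hi << 32) | lo);
}
#else
/* Anything else: fall back to the C library clock (not cycle-accurate). */
long long prectime(void)
{
    return (long long)clock();
}
#endif
```

get_time() can then stay as it was, returning (double)prectime().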

axn 2010-10-02 18:17

[QUOTE=R.D. Silverman;232343]BTW, I did an [b]explicit[/b] save/restore of eax, edx, and the code
now works fine.[/QUOTE]

Oops. Didn't realize that was the code. Can you use the __rdtsc intrinsic instead? As in:
[CODE]double get_time()
{ /* start of get_time */
return((double)__rdtsc());
} /* end of get_time */
[/CODE]
Does the compiler recognize that eax/edx are getting clobbered in this case and save them?
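
For completeness, here is a sketch of the intrinsic version with the headers it needs. On MSVC `__rdtsc` is declared in `<intrin.h>`; GCC and Clang provide the same name through `<x86intrin.h>`. The HAVE_RDTSC macro and the clock() fallback for non-x86 targets are my own additions, not something from the siever.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_RDTSC 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <x86intrin.h>
#define HAVE_RDTSC 1
#else
#include <time.h>
#define HAVE_RDTSC 0
#endif

/* Returns the time stamp counter as a double, or clock() ticks where
   RDTSC is not available. */
double get_time(void)
{
#if HAVE_RDTSC
    return (double)__rdtsc();
#else
    return (double)clock();
#endif
}
```

Because the intrinsic is known to the compiler, there are no opaque _emit bytes involved, so in principle the register allocator can account for eax/edx itself. Whether a particular compiler version actually does so is best checked in the assembler listing.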

R.D. Silverman 2010-10-02 22:12

[QUOTE=axn;232362]Oops. Didn't realize that was the code. Can you use the __rdtsc intrinsic instead? As in:
[CODE]double get_time()
{ /* start of get_time */
return((double)__rdtsc());
} /* end of get_time */
[/CODE]
Does the compiler recognize that eax/edx are getting clobbered in this case and save them?[/QUOTE]

I don't know. It is moot at this point.

BTW, when I first wrote this code, the intrinsic did not exist......

