mersenneforum.org  

Old 2016-05-31, 12:56   #1
rogue
 
 
"Mark"
Apr 2003
Between here and the

2×3²×353 Posts
x86 asm question

I'm posting this to the group because I cannot figure it out. I am working on a sieve for the Primes in π project. The sieve is generic enough that it can be used by similar projects. The problem I'm running into is that the results returned from the assembler function I wrote are not correct when I use any optimization, and I have no idea why. For example, on my Mac, if I compile with this: gcc -m64 *.S *.c -o pixsieve and run with this: ./pixsieve -L20000 -spi.txt -S10 -P1e6 -m1e9 -iterms.txt -Rterms.pfgw -N, it returns the correct results (pi.txt is a file with the first million decimal digits of pi). If I compile with this: gcc -O1 -m64 *.S *.c -o pixsieve and run with the same command line, then I get the wrong results.

The code makes a call to an asm routine called pixsieve(). One of the parameters to this routine is an array that holds the results of some mulmods. The build without optimization returns the correct values for the mulmods. In the optimized build, I get 0 for all of them. I have no idea why optimizing would break the code. I'm hoping that one of the asm gurus on this forum could help me figure out what I am doing wrong. The code includes some lines that print the results of the mulmods.

BTW, on Windows, this same code always returns 0s for all mulmods regardless of optimization level.

Finally, Ernst did suggest that I inline the code, which would likely solve the problem, but I would like to understand why it fails in the first place before I make such a change.
Attached Files
File Type: 7z pixsieve_1.1.7z (19.8 KB, 59 views)
Old 2016-05-31, 21:02   #2
rogue
 
 
"Mark"
Apr 2003
Between here and the

2·3²·353 Posts

Although I don't have a solution to this specific problem, I have found a different way to continue using the asm without relying on output values from it. If anyone happens to figure out the root cause, I would appreciate it for my own education.
Old 2016-05-31, 22:34   #3
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×3³×109 Posts

-O1 turns on various optimization flags. Which one causes the problem?
-fdefer-pop -fmerge-constants -fthread-jumps -floop-optimize -fcrossjumping -fif-conversion -fif-conversion2 -fdelayed-branch -fguess-branch-probability -fcprop-registers

Does -Wall suggest anything?
Old 2016-11-10, 21:57   #4
Grey4
 
Nov 2016

210 Posts

I think the problem is that the pixsieve function is declared with __attribute__((pure)), although it modifies memory via the pointer passed as its third argument, which pure functions are not allowed to do.
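
To make that concrete, here is a minimal C sketch. It is not the actual pixsieve code; the names mulmod32 and fill_mulmods are made up for illustration. A function marked pure promises the compiler that it has no observable effect other than its return value, so at -O1 and above the compiler may assume an array passed to it is unchanged after the call (or may drop the call entirely if the return value is unused). A routine that writes results through a pointer should therefore be declared without the attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: safe to mark pure, because the result depends
 * only on the arguments and nothing is written to memory. */
__attribute__((pure))
uint32_t mulmod32(uint32_t a, uint32_t b, uint32_t m)
{
    return (uint32_t)(((uint64_t)a * b) % m);
}

/* Hypothetical stand-in for a routine that returns results through an
 * output array. It must NOT be declared pure: it writes to out[], and
 * a pure declaration would let the optimizer assume out[] still holds
 * whatever it held before the call. */
void fill_mulmods(const uint32_t *terms, size_t n,
                  uint32_t mult, uint32_t p, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = mulmod32(terms[i], mult, p);
    }
}
```

If the output array was zero-initialized before the call, a compiler trusting a wrong pure declaration can legitimately keep reading zeros from it, which would look like "all mulmods are 0" only in optimized builds.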
Old 2016-11-16, 03:27   #5
Grey4
 
Nov 2016

2 Posts

Also, the Windows code stores the fourth argument as a 64-bit value with "mov %r9, mult", while the non-Windows code stores it as a 32-bit value with "mov %ecx, mult". Later it is reloaded as a 64-bit value with "mov mult, %r12", which means the non-Windows code will have junk in the upper 32 bits of the reloaded value.
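
The width mismatch can be modeled in plain C. This is only a sketch of the effect, not the asm itself; the function names are hypothetical, and the 8-byte variable stands in for the "mult" memory slot. A 32-bit store touches only the low half of the slot, so a later 64-bit load picks up whatever was already in the upper half.

```c
#include <stdint.h>

/* Models a 32-bit store into an 8-byte slot (like "mov %ecx, mult"):
 * only the low 32 bits change, the upper 32 bits keep their old value. */
void store_low32(uint64_t *slot, uint32_t value)
{
    *slot = (*slot & 0xFFFFFFFF00000000ULL) | (uint64_t)value;
}

/* Models a full 64-bit store (like "mov %r9, mult"). */
void store64(uint64_t *slot, uint64_t value)
{
    *slot = value;
}

/* Models a full 64-bit reload (like "mov mult, %r12"). */
uint64_t load64(const uint64_t *slot)
{
    return *slot;
}

/* One possible fix on the load side: read only the low 32 bits and
 * zero-extend them, so stale upper bits are ignored. */
uint64_t load_low32_zero_extended(const uint64_t *slot)
{
    return (uint64_t)(uint32_t)*slot;
}
```

Either making the store 64 bits wide or making the reload 32 bits wide keeps the two sides consistent; which one is appropriate depends on how wide the argument really is in the caller.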
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.