![]() |
I guess I will use modmul_2() from jasonp's mp.c for a fast workaround.
With gratitude. |
[QUOTE=Batalov;370447]Will this (or similar) hack work on Win64?
Or, in other words, what should go in the Win64 section?[/QUOTE] I only have access to a Win32/MSVC devenv so can;t actually test the 64-bit inline asm, but the only obvious flaw I notice is that you need to swap the operand order in each of your two MOVs due to the opposite-order convention of Intel and AT&T syntax, e.g. mov a,rax writes rax into variable a (store), whereas you obviously intend a load there. |
In 64-bit, nothing will work, people say. "You had me at __asm, says the compiler."
It is not a critical piece of code, though, so I will use either Jason's code or George's tf_validate from mmff. The latest CUDA refuses to install if one doesn't have VS and sends you to the generic microsoft download page. You get the latest and greatest 2013, and CUDA refuses to install pointing you to its link of system requirements, where it says: "Supported are VS 2008, 2010, 2012." These bastards know how to get you to curse. ;-) The flowers in my daughter's garden outside of my window wilted from what they've recently heard... Oh fun fun... |
[QUOTE=Batalov;370452]In 64-bit, nothing will work, people say. "You had me at __asm, says the compiler."[/QUOTE]
FYI, here is a similar 32-bit Win/MSVC [erm, I mean Visual Studio] inline asm macro from my codebase - if you can get that to build in 32-bit mode, next try an r**-register-named version of same in 64-bit mode. If that also compiles ok, try the one-asm-per-line syntax in your own modmul macro. FYI, I ditched any semblance of Microsoft/Win build support for my Mlucas-related codes several years ago, mainly due to MS charging full price for what I consider a crippled compiler product, namely VS having no 64-bit inline asm support even several years after x86_64 became widespread. No excuse for them to not support such stuff in VS as soon as they added x86_64 support of any kind. Allegedly they have since remedied their oversight, but far too late for me to give a crap. AFAIC the sooner that dinosaur turd of a company fades into irrelevancy the better, and anything, however small, I can personally do to hasten the process, I will. :) While I would never refuse to offer help to a friend in need such as yourself, may I suggest you similarly think long and hard about whether you want to pollute your own code with WinCrap, when you could simply use GCC inside a linux emulator? [code]/* Multiplies two 32-bit unsigned integers _x and _y, and generates a two-word (64-bit) product in _lo and _hi. */ #define MUL_LOHI32(_x,_y,_lo,_hi)\ {\ __asm mov eax, _x \ __asm mul _y /* Result of eax*_y stored in edx:eax as _hi:_lo */ \ __asm mov _lo, eax \ __asm mov _hi, edx \ }[/code] |
[QUOTE=ewmayer;370453]
While I would never refuse to offer help to a friend in need such as yourself, may I suggest you similarly think long and hard about whether you want to pollute your own code with WinCrap, when you could simply use GCC inside a linux emulator?[/QUOTE] Use MinGW...? |
[QUOTE=mfaktc]
############################# # 2.2 Compilation (Windows) # ############################# The following instructions have been tested on Windows 7 64bit using Visual Studio 2008 Professional. A GNU compatible version of make is also required as the Makefile is not compatible with nmake. GNU Make for Win32 can be downloaded from [URL]http://gnuwin32.sourceforge.net/packages/make.htm[/URL] Run the Visual Studio 2008 x64 Win64 Command Prompt and change into the "src/" subdirectory. Run 'make -f Makefile.win' for a 64bit built (recommended on 64bit systems) or 'make -f Makefile.win32' for a 32bit built. Perhaps you have to adjust the paths to your CUDA installation and the Microsoft Visual Studio binaries in the makefiles. The binaries "mfaktc-win-64.exe" or "mfaktc-win-32.exe" are placed in the parent directory.[/QUOTE] Use MinGW...? and then tell us all about it? I am simply retracing the steps for rebuilding a [URL="http://mersenneforum.org/showthread.php?p=370282#post370282"]modified version[/URL] (for a new purpose). I am actually using linux. I wanted to spend an evening helping others (they want a Windows binary), but there's just so much wincrap I can take. I will just post the patched source where I will rid of asm code (it will become much uglier though). Thought I'd build it, but not for the price of a full studio version, no. Even uninstalling the mscrap will take the rest of my evening, and I prefer to catch a movie instead. |
[QUOTE=Batalov;370455]Use MinGW...? and then tell us all about it?
I am simply retracing the steps for rebuilding a [URL="http://mersenneforum.org/showthread.php?p=370282#post370282"]modified version[/URL] (for a new purpose). I am actually using linux. I wanted to spend an evening helping others (they want a Windows binary), but there's just so much wincrap I can take. I will just post the patched source where I will rid of asm code (it will become much uglier though). Thought I'd build it, but not for the price of a full studio version, no. Even uninstalling the mscrap will take the rest of my evening, and I prefer to catch a movie instead.[/QUOTE] ? MinGW is basically a gcc for windows. I use it almost the same as I would compiling Linux. EDIT: for fun, I can compile clLucas and mfakto with it. |
[QUOTE=Batalov;370452]In 64-bit, nothing will work, people say. "You had me at __asm, says the compiler."
It is not a critical piece of code, though, so I will use either Jason's code or George's tf_validate from mmff. The latest CUDA refuses to install if one doesn't have VS and sends you to the generic microsoft download page. You get the latest and greatest 2013, and CUDA refuses to install pointing you to its link of system requirements, where it says: "Supported are VS 2008, 2010, 2012." These bastards know how to get you to curse. ;-) The flowers in my daughter's garden outside of my window wilted from what they've recently heard... Oh fun fun...[/QUOTE] Rather than supporting 64-bit inline ASM, they offer an intrinsic function for most (all?) 64 bit operations (including SIMD stuff... there is scads of it in yafu). So you can accomplish this using, e.g., [URL="http://msdn.microsoft.com/en-us/library/vstudio/67xf5cy5(v=vs.100).aspx"]__umulh[/URL] [edit] Presumably if you put the hi-word producing __umulh(x,y) near the lo-word producing x*y, the compiler would be smart enough to only emit one multiply. But you should probably check the disassembly to be sure. |
Erm, cool (I think)... I can see that we can start with
[CODE]// umul128.c // processor: IPF, x64 #include <stdio.h> #include <intrin.h> #pragma intrinsic(_umul128) unsigned __int64 mulmod(unsigned __int64 a, unsigned __int64 b, unsigned __int64 m) { unsigned __int64 c, d; d = _umul128(a, b, &c); // ...but where is _udiv64 or something? There's none. }[/CODE] I'll have a look in yafu... I've already used Jason's modmul_2(). My interest is purely academic at this point. |
| All times are UTC. The time now is 13:03. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.