mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Mlucas (https://www.mersenneforum.org/forumdisplay.php?f=118)
-   -   multithread Mlucas attempts in MSYS2 for Windows (https://www.mersenneforum.org/showthread.php?t=25362)

kriesel 2020-03-14 20:48

multithread Mlucas attempts in MSYS2 for Windows
 
Ernst and I have been batting this back and forth for a while by email and PM (after dealing with some issues with signals and msys2 for single-threaded Mlucas v18 and v19 builds). If multithreading can be worked out for Mlucas on Windows, whether with msys2 & gcc or some other build method, it can probably also be worked out for MFactor.

We seem to be momentarily out of ideas for getting the multithreaded Windows build to work. So, unsure of where to go from here, I'm looking for constructive input from someone with multithreaded programming experience in C on Windows preferably with msys2 & gcc, and on linux. (George? Anyone?)

Now for some of the gory detail.
The pthread header files are present and current.[CODE]c:\msys64\mingw64\x86_64-w64-mingw32\include>dir pthread*
Volume in drive C has no label.
Volume Serial Number is 3E40-A384

Directory of c:\msys64\mingw64\x86_64-w64-mingw32\include

11/14/2019 01:30 AM 34,696 pthread.h
11/14/2019 01:30 AM 3,449 pthread_compat.h
11/14/2019 01:30 AM 1,304 pthread_signal.h
11/14/2019 01:30 AM 2,979 pthread_time.h
11/14/2019 01:30 AM 5,012 pthread_unistd.h
5 File(s) 47,440 bytes[/CODE]Lines 85-91 of platform.h seem to intend to treat msys2 as linux, although as I understand it, msys2/mingw64 is just a compatibility layer for bash syntax and gnu tools on Windows:
[CODE]// Jun 2017: For Win-builds under msys/mingw, allow defined(__MINGW32__) to override normal windos|linux preprocessing logic:
#if !defined(__MINGW32__) && (defined(WINDOWS) || defined(_WINDOWS) || defined(WIN32) || defined(_WIN32) || defined(_WIN64))
#define OS_TYPE
#define OS_TYPE_WINDOWS
#elif defined(__MINGW32__) || (defined(linux) || defined(__linux__) || defined(__linux))
#define OS_TYPE
#define OS_TYPE_LINUX[/CODE]In the multithreading support section of platform.h, which begins at line 1253, there's an #include of sys/sysctl.h around line 1306. No such file exists on msys2/mingw64/Win, so the include fails to compile and I commented it out:
[CODE][B]/* [/B] #ifndef OS_TYPE_GNU_HURD
#include <sys/sysctl.h>
#endif [B] */[/B][/CODE]I also commented out the following two includes in platform.h to get rid of the fatal compile errors they cause (unistd.h is not present in those locations on msys2/mingw64/Win7):
[CODE]// These additional Linux-only includes make sure __NR_gettid, used in our syscall-based get-thread-ID, is defined:
[B]// line 1312[/B] #include <linux/unistd.h>
[B]// line 1313[/B] #include <asm/unistd.h>
[/CODE]I note that "__NR_gettid" is not present in any of the following *unistd.h files on the system, as determined by a Notepad string search. Nor is "gettid". So __NR_gettid remains undeclared.
[CODE]C:\msys64>dir/s *unistd.h
Volume in drive C has no label.
Volume Serial Number is 3E40-A384

Directory of C:\msys64\mingw64\lib\gcc\x86_64-w64-mingw32\9.2.0\include\ssp

08/29/2019 03:03 AM 2,815 unistd.h
1 File(s) 2,815 bytes

Directory of C:\msys64\mingw64\x86_64-w64-mingw32\include

11/14/2019 01:30 AM 5,012 pthread_unistd.h
02/18/2020 12:39 PM 2,625 unistd.h
2 File(s) 7,637 bytes

Directory of C:\msys64\mingw64\x86_64-w64-mingw32\include\sys

02/18/2020 12:39 PM 323 unistd.h
1 File(s) 323 bytes[/CODE]Since platform.h line 1336 (#endif // OS_TYPE_WINDOWS?) seems to indicate that multithreading on Windows had not been coded, and since msys2 is being identified as a flavor of linux, I altered platform.h line 1327 like so
[CODE] #if(defined(_PTHREAD_H) || defined(_PTHREAD_H_)[B] || defined(WIN_PTHREADS_H)[/B] ) // Apr 2018: Thanks to Elias Mariani for the OpenBSD mods //added || defined(WIN_PTHREADS_H)
#define MULTITHREAD
#define USE_PTHREAD
#else[/CODE]In threadpool.c line 301, there's an #if 0 which can be toggled to 1. With it at the usual 0,
$ gcc -c -O3 -DUSE_THREADS ../src/*.c >& build.log
$ grep error build.log
yields:
[CODE]../src/threadpool.c:323:3: error: unknown type name 'cpu_set_t'
../src/threadpool.c:325:30: error: '__NR_gettid' undeclared (first use in this function)[/CODE]While with it at 1,
$ gcc -c -O3 -DUSE_THREADS ../src/*.c >& build.log
$ grep error build.log
yields:
[CODE]../src/threadpool.c:304:3: error: unknown type name 'cpuset_t'
../src/threadpool.c:312:15: error: 'cpuid_t' undeclared (first use in this function); did you mean 'pid_t'?
../src/threadpool.c:312:23: error: expected ')' before 'i'[/CODE]The code block switched in or out by that #if 0|1 in threadpool.c is
[CODE] #if 1 // was 0
	// This is the affinity API tied to pthread library ... interestingly, it's less portable than the Linux system-centric one below
	int i,errcode;
	cpuset_t *cset;
	pthread_t pth;

	cset = cpuset_create();
	if (cset == NULL) {
		err(EXIT_FAILURE, "cpuset_create");
	}
	i = my_id % pool->num_of_cores; // get cpu mask using sequential thread ID modulo #available cores
	cpuset_set((cpuid_t)i, cset);

	pth = pthread_self();
	errcode = pthread_setaffinity_np(pth, cpuset_size(cset), cset);
	if (errcode) {
		perror("pthread_setaffinity_np");
	}
	cpuset_destroy(cset);

 #else

	cpu_set_t cpu_set;
	int i,errcode;
	pid_t thread_id = syscall (__NR_gettid);
  #if THREAD_POOL_DEBUG
	printf("executing worker thread id %u, syscall_id = %u\n", my_id, thread_id);
  #endif
	CPU_ZERO (&cpu_set); i = my_id % pool->num_of_cores;
	i = mi64_ith_set_bit(CORE_SET, i+1, MAX_CORES>>6); // Remember, [i]th-bit index in arglist is *unit* offset, i.e. must be in [1,MAX_CORES]
	if(i < 0) {
		fprintf(stderr,"Affinity CORE_SET does not have a [%u]th set bit!",my_id % pool->num_of_cores);
		ASSERT(HERE, 0, "Aborting.");
	}
	// get cpu mask using sequential thread ID modulo #available cores in runtime-specified affinity set
	CPU_SET(i, &cpu_set);
	errcode = sched_setaffinity(thread_id, sizeof(cpu_set), &cpu_set);
  #if THREAD_POOL_DEBUG
	printf("syscall_id = %u, tid = %d, setaffinity[%d] = %d, ISSET[%d] = %d\n", thread_id,my_id,i,errcode,i,CPU_ISSET(i, &cpu_set));
  #endif
	if (errcode) {
		perror("sched_setaffinity");
	}

 #endif

[/CODE]
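
One possible way around the undeclared __NR_gettid, offered only as a sketch and not as anything in the Mlucas sources: pick the thread-ID call per platform instead of commenting the Linux includes out. The helper name os_thread_id is hypothetical. It assumes the mingw64 gcc predefines _WIN32 (which it does, alongside __MINGW32__), so the Windows branch is chosen there even though platform.h classifies msys2 as Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* so glibc's unistd.h declares syscall() */
#endif
#include <assert.h>
#include <stdio.h>

#if defined(_WIN32)              /* predefined by MinGW-w64 gcc and by MSVC */
#include <windows.h>
#elif defined(__linux__)
#include <unistd.h>
#include <sys/syscall.h>         /* SYS_gettid */
#endif

/* Hypothetical helper (not part of Mlucas): return an OS-level ID for the
 * calling thread, or 0 on platforms this sketch does not cover. */
static unsigned long os_thread_id(void)
{
#if defined(_WIN32)
    return (unsigned long)GetCurrentThreadId();
#elif defined(__linux__)
    return (unsigned long)syscall(SYS_gettid);
#else
    return 0UL;
#endif
}
```

This only removes the need for linux/unistd.h and asm/unistd.h on Windows. The cpu_set_t error is a separate problem, because that type belongs to the Linux affinity API and has no counterpart in the headers listed above.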

ewmayer 2020-03-14 21:32

Thanks, Ken!

To anyone who wants to have a go at this, note that a complete MSYS2 build of Mlucas v19 needs some #if-hackery of 3 source files (Mlucas.c, mers_mod_square.c, fermat_mod_square.c) to disable the signal-catching code I added starting in v18. Simpler for the present purpose is to make the h-file hacks Ken details in one's copy of the v19 source files and then try to compile threadpool.c using the multithreaded-build instructions detailed in the Mlucas online readme, with whichever -DUSE_[SSE2|AVX|AVX2|AVX512] flag is appropriate to one's CPU.

I suspect (hope) it's simply a matter of adding a Win-pthreads-appropriate bit of C code which does the thread/core affinity-setting ... that is what the __NR_gettid code Ken detailed is doing under Linux.
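
A minimal sketch of what such a Windows affinity-setting piece might look like, using the Win32 call SetThreadAffinityMask rather than anything from pthreads. The function names core_for_thread and pin_current_thread and the SKETCH_MAX_CORES limit are hypothetical, and the sketch deliberately leaves out the CORE_SET / mi64_ith_set_bit lookup that threadpool.c does. On non-Windows systems the pinning function is a no-op, so the file compiles anywhere.

```c
#include <assert.h>
#include <stdio.h>

#if defined(_WIN32)              /* predefined by MinGW-w64 gcc and by MSVC */
#include <windows.h>
#endif

/* Hypothetical limit for this sketch: SetThreadAffinityMask addresses a single
 * Windows processor group, which holds at most 64 logical processors. */
#define SKETCH_MAX_CORES 64

/* Map a sequential worker ID onto a core index, mirroring the
 * "my_id % pool->num_of_cores" idiom. Returns -1 on bad input. */
static int core_for_thread(unsigned my_id, unsigned num_cores)
{
    if (num_cores == 0 || num_cores > SKETCH_MAX_CORES)
        return -1;
    return (int)(my_id % num_cores);
}

/* Pin the calling thread to one core. Returns 0 on success, -1 on failure. */
static int pin_current_thread(int core)
{
    if (core < 0 || core >= SKETCH_MAX_CORES)
        return -1;
#if defined(_WIN32)
    if (core >= (int)(8 * sizeof(DWORD_PTR)))   /* mask is 32 bits on Win32 */
        return -1;
    DWORD_PTR mask = (DWORD_PTR)1 << core;
    /* Returns the previous mask, or 0 on failure. */
    if (SetThreadAffinityMask(GetCurrentThread(), mask) == 0) {
        fprintf(stderr, "SetThreadAffinityMask failed, error %lu\n",
                (unsigned long)GetLastError());
        return -1;
    }
    return 0;
#else
    return 0;   /* no-op outside Windows in this sketch */
#endif
}
```

A worker thread would call pin_current_thread(core_for_thread(my_id, num_cores)) once at startup. Whether this fits cleanly under the MULTITHREAD / USE_PTHREAD logic in platform.h is untested here.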

kriesel 2020-03-15 13:24

signals and rt need to be handled for msys2 first
 
Mlucas V18 msys2/mingw64/Win build attempt
[CODE]$ gcc -c -O3 ../src/*.c >& build.log
$ grep error build.log
../src/fermat_mod_square.c:1869:18: error: 'SIGHUP' undeclared (first use in this function)
../src/mers_mod_square.c:2382:18: error: 'SIGHUP' undeclared (first use in this function)
../src/Mlucas.c:182:21: error: 'SIGHUP' undeclared (first use in this function)[/CODE]Mlucas V19 msys2/mingw64/Win build attempt[CODE]$ gcc -c -O3 ../src/*.c >& build.log
$ grep error build.log
../src/fermat_mod_square.c:1842:18: error: 'SIGHUP' undeclared (first use in this function)
../src/fermat_mod_square.c:1844:18: error: 'SIGALRM' undeclared (first use in this function); did you mean 'SIGABRT'?
../src/fermat_mod_square.c:1846:18: error: 'SIGUSR1' undeclared (first use in this function)
../src/fermat_mod_square.c:1848:18: error: 'SIGUSR2' undeclared (first use in this function)
../src/mers_mod_square.c:2533:18: error: 'SIGHUP' undeclared (first use in this function)
../src/mers_mod_square.c:2535:18: error: 'SIGALRM' undeclared (first use in this function); did you mean 'SIGABRT'?
../src/mers_mod_square.c:2537:18: error: 'SIGUSR1' undeclared (first use in this function)
../src/mers_mod_square.c:2539:18: error: 'SIGUSR2' undeclared (first use in this function)
../src/Mlucas.c:187:21: error: 'SIGHUP' undeclared (first use in this function)
../src/Mlucas.c:189:21: error: 'SIGALRM' undeclared (first use in this function); did you mean 'SIGABRT'?
../src/Mlucas.c:191:21: error: 'SIGUSR1' undeclared (first use in this function)
../src/Mlucas.c:193:21: error: 'SIGUSR2' undeclared (first use in this function)[/CODE]The link step then fails on -lrt, and pacman finds no package by that name:[CODE]$ gcc -o Mlucas-x86 *.o -lm -lrt
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lrt
collect2.exe: error: ld returned 1 exit status

$ pacman -Su librt
error: target not found: librt

$ pacman -Su rt
error: target not found: rt[/CODE]
See also [URL]https://mersenneforum.org/showpost.php?p=539118&postcount=20[/URL] and the next few posts after it.

Simplest is to get the 3 current files for Mlucas V19 from Ernst (March 8 or later).
But, if I recall/interpret the clues correctly, the change is to wrap the signal code in #ifdef USE_SIGNALS:
Mlucas.c lines 181-200 after the change[CODE]#ifdef USE_SIGNALS
void sig_handler(int signo)
{
if (signo == SIGINT) {
fprintf(stderr,"received SIGINT signal.\n"); sprintf(cbuf,"received SIGINT signal.\n");
} else if(signo == SIGTERM) {
fprintf(stderr,"received SIGTERM signal.\n"); sprintf(cbuf,"received SIGTERM signal.\n");
} else if(signo == SIGHUP) {
fprintf(stderr,"received SIGHUP signal.\n"); sprintf(cbuf,"received SIGHUP signal.\n");
} else if(signo == SIGALRM) {
fprintf(stderr,"received SIGALRM signal.\n"); sprintf(cbuf,"received SIGALRM signal.\n");
} else if(signo == SIGUSR1) {
fprintf(stderr,"received SIGUSR1 signal.\n"); sprintf(cbuf,"received SIGUSR1 signal.\n");
} else if(signo == SIGUSR2) {
fprintf(stderr,"received SIGUSR2 signal.\n"); sprintf(cbuf,"received SIGUSR2 signal.\n");
}
// Toggle a global to allow desired code sections to detect signal-received and take appropriate action:
MLUCAS_KEEP_RUNNING = 0;
}
#endif[/CODE]fermat_mod_square.c lines 1837-1851[CODE]#ifdef USE_SIGNALS
// Listen for interrupts:
if (signal(SIGINT, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGINT.\n");
else if (signal(SIGTERM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGTERM.\n");
else if (signal(SIGHUP, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGHUP.\n");
else if (signal(SIGALRM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGALRM.\n");
else if (signal(SIGUSR1, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGUSR1.\n");
else if (signal(SIGUSR2, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGUSR2.\n");
#endif[/CODE]mers_mod_square.c lines 2528-2542[CODE]#ifdef USE_SIGNALS
// Listen for interrupts:
if (signal(SIGINT, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGINT.\n");
else if (signal(SIGTERM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGTERM.\n");
else if (signal(SIGHUP, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGHUP.\n");
else if (signal(SIGALRM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGALRM.\n");
else if (signal(SIGUSR1, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGUSR1.\n");
else if (signal(SIGUSR2, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGUSR2.\n");
#endif[/CODE]Then the single-threaded build on msys2 was OK, as long as -DUSE_SIGNALS was not used. Ctrl-C still worked.[CODE]$ gcc -c -O3 ../src/*.c >& build.log
$ grep error build.log
$ gcc -o Mlucas-x86 *.o -lm[/CODE]Extrapolated to Mlucas V18
Mlucas.c lines 176-189[CODE]#ifdef USE_SIGNALS
void sig_handler(int signo)
{
if (signo == SIGINT) {
fprintf(stderr,"received SIGINT signal.\n"); sprintf(cbuf,"received SIGINT signal.\n");
} else if(signo == SIGTERM) {
fprintf(stderr,"received SIGTERM signal.\n"); sprintf(cbuf,"received SIGTERM signal.\n");
} else if(signo == SIGHUP) {
fprintf(stderr,"received SIGHUP signal.\n"); sprintf(cbuf,"received SIGHUP signal.\n");
}
// Toggle a global to allow desired code sections to detect signal-received and take appropriate action:
MLUCAS_KEEP_RUNNING = 0;
}
#endif[/CODE]fermat_mod_square.c lines 1864-1872[CODE]#ifdef USE_SIGNALS
// Listen for interrupts:
if (signal(SIGINT, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGINT.\n");
else if (signal(SIGTERM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGTERM.\n");
else if (signal(SIGHUP, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGHUP.\n");
#endif[/CODE]mers_mod_square.c lines 2377-2385[CODE]#ifdef USE_SIGNALS
// Listen for interrupts:
if (signal(SIGINT, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGINT.\n");
else if (signal(SIGTERM, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGTERM.\n");
else if (signal(SIGHUP, sig_handler) == SIG_ERR)
fprintf(stderr,"Can't catch SIGHUP.\n");
#endif[/CODE]Then various flavors of single-threaded mlucas v18 could be built cleanly on msys2 as follows.[CODE]gcc -c -O3 ../src/*.c >& build.log
grep error build.log
gcc -o mlucas-x86 *.o -lm

gcc -c -O3 -DUSE_SSE2 ../src/*.c >& build.log
grep error build.log
gcc -o mlucas-sse2 *.o -lm

gcc -c -O3 -DUSE_AVX2 -mavx2 ../src/*.c >& build.log
grep error build.log
gcc -o mlucas-fma3 *.o -lm[/CODE]
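
An alternative to the all-or-nothing USE_SIGNALS switch, sketched here as an assumption rather than as what the Mlucas sources do: guard each signal individually with #ifdef, so that SIGINT and SIGTERM (which the mingw64 signal.h does define) are still caught while SIGHUP, SIGALRM, SIGUSR1 and SIGUSR2 are skipped where they are undeclared. The names keep_running, on_signal and install_handlers are hypothetical stand-ins for MLUCAS_KEEP_RUNNING and sig_handler.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Stand-in for the MLUCAS_KEEP_RUNNING global; sig_atomic_t is the type the
 * C standard guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t keep_running = 1;

/* Only clears the flag. Unlike the handler quoted above it does no fprintf,
 * since stdio calls are not async-signal-safe. */
static void on_signal(int signo)
{
    (void)signo;
    keep_running = 0;
}

/* Register the handler for every signal this platform declares.
 * Returns the number of signals successfully registered. */
static int install_handlers(void)
{
    int n = 0;
    if (signal(SIGINT,  on_signal) != SIG_ERR) n++;
    if (signal(SIGTERM, on_signal) != SIG_ERR) n++;
#ifdef SIGHUP
    if (signal(SIGHUP,  on_signal) != SIG_ERR) n++;
#endif
#ifdef SIGALRM
    if (signal(SIGALRM, on_signal) != SIG_ERR) n++;
#endif
#ifdef SIGUSR1
    if (signal(SIGUSR1, on_signal) != SIG_ERR) n++;
#endif
#ifdef SIGUSR2
    if (signal(SIGUSR2, on_signal) != SIG_ERR) n++;
#endif
    return n;
}
```

Each registration is attempted independently here, whereas the else-if chain in the quoted code stops at the first one that fails.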

ewmayer 2020-03-15 18:40

Yes, only diff between sig-handling code in v18 and v19 is expanded list of sigtypes-listened-for in the latter.

Ken, confirm - the workaround for the missing runtime-lib was to simply not use -lrt in the link step under msys2, correct?

kriesel 2020-03-15 19:37

[QUOTE=ewmayer;539790]Yes, only diff between sig-handling code in v18 and v19 is expanded list of sigtypes-listened-for in the latter.

Ken, confirm - the workaround for the missing runtime-lib was to simply not use -lrt in the link step under msys2, correct?[/QUOTE]Right. I think that and the signals-related edits described in my previous post were all it took to get Mlucas v18 to build correctly on msys2, because I checked the file dates before those were applied, and the v18 files affected predated the beginning of sorting it out on v19, by months if I recall correctly. The signal handling was not an issue for v17.0 or v17.1 builds, because it was not present in those versions. There's still the multithread challenge, of course, for both v18 and v19.

