MLucas on IBM Mainframe
Hello! Is it possible to compile Mlucas on an IBM mainframe? I tried to compile it (mlucas-14.1.tar.gz and Mlucas_12.11.2014.tgz), but unfortunately I ran into some problems.
[CODE]Architecture:          s390x
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s) per book:    1
Book(s):               2
Vendor ID:             IBM/S390
BogoMIPS:              20325.00
Hypervisor:            z/VM 6.3.0
Hypervisor vendor:     IBM
Virtualization type:   full
Dispatching mode:      horizontal
L1d cache:             128K
L1i cache:             96K
L2d cache:             2048K
L2i cache:             2048K
[/CODE]
By the way, everyone can try this. Just register and get access for 90 days. [url]https://developer.ibm.com/linuxone/?source=web&ca=linuxone&ovcode=ov44223&tactic=C47300NW[/url] I chose Red Hat Linux. |
[QUOTE=Lorenzo;425319]By the way. Everyone can try. Just register and get access for 90 days. [url]https://developer.ibm.com/linuxone/?source=web&ca=linuxone&ovcode=ov44223&tactic=C47300NW[/url][/QUOTE]
I will let others speak to building MLucas in this environment, but thank you for pointing this out to us. Interesting.... |
[QUOTE=chalsall;425323]Interesting....[/QUOTE]
Or, perhaps not so much... DNS is broken within the virtual machine as initially instanced (at least for RedHat Linux), and Mlucas doesn't even start to compile successfully, even though "../configure" completes successfully. Ernst? I know this is an experiment by IBM, but it's not going to compete with EC2 nor Google Compute et al any time soon. |
What's the error? Two people saying "it doesn't work" is about as useful as me saying that Prime95 won't compile either.
|
[QUOTE=Dubslow;425334]What's the error? Two people saying "it doesn't work" is about as useful as me saying that Prime95 won't compile either.[/QUOTE]
Fair enough... [CODE][linux1@i4l-2 build]$ cat /proc/cpuinfo vendor_id : IBM/S390 # processors : 2 bogomips per cpu: 20325.00 features : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs processor 0: version = FF, identification = 016A77, machine = 2964 processor 1: version = FF, identification = 016A77, machine = 2964 [linux1@i4l-2 build]$ uname -a Linux i4l-2 2.6.32-573.12.1.el6.s390x #1 SMP Mon Nov 23 12:58:30 EST 2015 s390x s390x s390x GNU/Linux [linux1@i4l-2 build]$ pwd /home/linux1/mlucas/mlucas-14.1/build [linux1@i4l-2 build]$ ../configure checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /bin/mkdir -p checking for gawk... gawk checking whether make sets $(MAKE)... yes checking whether make supports nested variables... yes checking whether make supports nested variables... (cached) yes checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking whether gcc understands -c and -o together... yes checking for style of include used by make... GNU checking dependency style of gcc... none checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... 
yes checking fenv.h usability... yes checking fenv.h presence... yes checking for fenv.h... yes checking limits.h usability... yes checking limits.h presence... yes checking for limits.h... yes checking mach/mach.h usability... no checking mach/mach.h presence... no checking for mach/mach.h... no checking stddef.h usability... yes checking stddef.h presence... yes checking for stddef.h... yes checking for stdlib.h... (cached) yes checking for string.h... (cached) yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for unistd.h... (cached) yes checking for stdbool.h that conforms to C99... yes checking for _Bool... yes checking for inline... inline checking for pid_t... yes checking for size_t... yes checking for uint64_t... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible malloc... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible realloc... yes checking for clock_gettime... no checking for gethrtime... no checking for gettimeofday... yes checking for memset... yes checking for pow... yes checking for sqrt... yes checking for strerror... yes checking for strstr... yes checking for strtoul... yes checking whether _LARGEFILE_SOURCE is declared... no checking build system type... s390x-ibm-linux-gnu checking host system type... s390x-ibm-linux-gnu checking that generated files are newer than configure... done configure: creating ./config.status config.status: creating Makefile config.status: creating config.h config.status: config.h is unchanged config.status: executing depfiles commands [linux1@i4l-2 build]$ make make all-am make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build' CC $NORMAL_O $THREADS_O make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1 make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build' make: *** [all] Error 2[/CODE] |
[QUOTE=chalsall;425331]
I know this is an experiment by IBM, but it's not going to compete with EC2 nor Google Compute et al any time soon.[/QUOTE] Would you mind elaborating? Luigi |
[QUOTE=chalsall;425336]Fair enough... [CODE]
[linux1@i4l-2 build]$ make make all-am make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build' CC $NORMAL_O $THREADS_O make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1 make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build' make: *** [all] Error 2[/CODE][/QUOTE] What a perfectly useless error message. I see now why you didn't immediately include it. |
[QUOTE=ET_;425337]Would you mind elaborating?[/QUOTE]
Sure... As much as I hate to say it, Intel has mostly won the assembly race. As in, most code is targeted to x86 (and this is coming from someone who hates x86; I much prefer 680x0 assembly (may it rest in peace)). Even source code in C, C++ etc might not work under a different platform. This might be because of the build environment not being truly cross-platform, or because there are subtle bugs in the code which don't manifest under the most commonly used CPUs. Yes, this does mean the code is buggy, but most don't care -- they just want the code to work for them without hassle. At the end of the day, my argument is that cloud computing providers need to provide x86 based virtual machines to (almost all of) their customers, or they're just not going to get traction. |
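To make the "subtle bugs that don't show up on the common CPUs" point concrete: the lscpu output at the top of the thread reports Big Endian, while x86 is little-endian. The sketch below is a generic illustration, not code taken from Mlucas; the function names are made up for this example. It shows a byte load whose result depends on the host byte order next to one that does not.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host (e.g. x86), 0 on a big-endian one (e.g. s390x). */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);   /* look at the lowest-addressed byte */
    return first_byte == 1;
}

/* Non-portable: copies the bytes as they sit in memory, so the
 * numeric result depends on the host byte order. */
uint32_t load_u32_native(const unsigned char *bytes)
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Portable: always interprets the four bytes as little-endian,
 * whatever the host byte order is. */
uint32_t load_u32_le(const unsigned char *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Code that uses something like load_u32_native on file or network data works by accident on x86 and silently gives different numbers on a big-endian machine, which is exactly the sort of bug nobody notices until someone tries a platform like this one.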
[QUOTE=chalsall;425339]...
At the end of the day, my argument is that cloud computing providers need to provide x86 based virtual machines to (almost all of) their customers, or they're just not going to get traction.[/QUOTE] I work with servers in the cloud, and I have noticed that even on x86 architectures, one "virtual CPU" is worth 40%-50% of a real one. If the IBM fellows deliver a virtual CPU worth 60%-65% of a real CPU, they would win the race. That "virtual CPU" thingie is going to take the place of the "minimum guaranteed upload bandwidth" for ADSL... |
[QUOTE=ET_;425340]I am working with servers in cloud, and notice that even on x86 architectures, one "virtual CPU" equals 40%-50% of a real one. If the IBM fellows deliver a virtual CPU worth 60%-65% of a real CPU, they would win the race.[/QUOTE]
I also mostly work in "the cloud". For my serious work I lease dedicated servers, where I get 100% of the machine. For my "on demand" work I sometimes go "virtual". And, if you know what you're doing, you can get ~99% of a machine for pennies on the dollar. IBM is not going to win this race unless and until they offer x86 instances. And, so you know, I ran a simple benchmark from the command line, and their instance was half the speed of my desktop (one CPU used on each). |
[QUOTE=chalsall;425336]Fair enough... [CODE]
[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O $THREADS_O
make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE][/QUOTE]
Hi people, could you try appending --enable-verbose-compiler and --disable-silent-rule when running configure and paste the output? (I don't want to give away my phone number to register...) Currently, Mlucas only compiles and passes the self-test on 4 targets when [URL="https://buildd.debian.org/status/package.php?p=mlucas"]building[/URL] it on [URL="https://packages.debian.org/sid/math/mlucas"]Debian[/URL]. It would be great if it compiled on targets other than x86 and powerpc as well. Besides, do you think compiler warnings and verbose rules should be enabled by default? I used to think that was too verbose, but it seems the short error message is pretty useless. |
[QUOTE=alexvong1995;425343]Hi people, could you try appending --enable-verbose-compiler and --disable-silent-rule when running configure and paste the output? (I don't want to give away my phone number to register...)[/QUOTE]
Sure. But you can always use a "burner phone" in such cases. I know guys who have a half dozen or so for just such instances... :wink: But here are the results of what you asked for: [CODE]linux1@i4l-2 build]$ ../configure --enable-verbose-compiler --disable-silent-rule configure: WARNING: unrecognized options: --disable-silent-rule checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /bin/mkdir -p checking for gawk... gawk checking whether make sets $(MAKE)... yes checking whether make supports nested variables... yes checking whether make supports nested variables... (cached) yes checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking whether gcc understands -c and -o together... yes checking for style of include used by make... GNU checking dependency style of gcc... none checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking fenv.h usability... yes checking fenv.h presence... yes checking for fenv.h... yes checking limits.h usability... yes checking limits.h presence... yes checking for limits.h... 
yes checking mach/mach.h usability... no checking mach/mach.h presence... no checking for mach/mach.h... no checking stddef.h usability... yes checking stddef.h presence... yes checking for stddef.h... yes checking for stdlib.h... (cached) yes checking for string.h... (cached) yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for unistd.h... (cached) yes checking for stdbool.h that conforms to C99... yes checking for _Bool... yes checking for inline... inline checking for pid_t... yes checking for size_t... yes checking for uint64_t... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible malloc... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible realloc... yes checking for clock_gettime... no checking for gethrtime... no checking for gettimeofday... yes checking for memset... yes checking for pow... yes checking for sqrt... yes checking for strerror... yes checking for strstr... yes checking for strtoul... yes checking whether _LARGEFILE_SOURCE is declared... no checking build system type... s390x-ibm-linux-gnu checking host system type... s390x-ibm-linux-gnu checking that generated files are newer than configure... done configure: creating ./config.status config.status: creating Makefile config.status: creating config.h config.status: config.h is unchanged config.status: executing depfiles commands configure: WARNING: unrecognized options: --disable-silent-rule [linux1@i4l-2 build]$ make make all-am make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build' CC $NORMAL_O $THREADS_O In file included from ../src/types.h:30, from ../src/align.h:29, from ../src/Mlucas.h:29, from ../src/br.c:23: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! 
In file included from ../src/types.h:30, from ../src/Mdata.h:30, from ../src/dft_macro.c:24: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! In file included from ../src/types.h:30, from ../src/util.h:30, from ../src/factor.h:29, from ../src/factor.c:27: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! In file included from ../src/types.h:30, from ../src/align.h:29, from ../src/Mlucas.h:29, from ../src/fermat_mod_square.c:23: ... In file included from ../src/types.h:30, from ../src/util.h:30, from ../src/factor.h:29, from ../src/twopmodq80.c:23: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE’: ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: left shift count is negative ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: left shift count is negative ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q2’: ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: left shift count is negative ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: left shift count is negative ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: right shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= width of type ../src/twopmodq80.c:812: warning: left shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= width of type ../src/twopmodq80.c:812: warning: left shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= 
width of type ../src/twopmodq80.c:812: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4’: ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: left shift count is negative ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: left shift count is negative ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: right shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: left shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: left shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: right shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: left shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: left shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: right shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: left shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: left shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4_REF’: ../src/twopmodq80.c:2954: warning: right shift count >= width of type ../src/twopmodq80.c:2954: warning: left shift count is negative ../src/twopmodq80.c:2954: warning: right 
shift count >= width of type ../src/twopmodq80.c:2954: warning: left shift count is negative ../src/twopmodq80.c:2954: warning: right shift count >= width of type ../src/twopmodq80.c:2954: warning: right shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: left shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: left shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: right shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: left shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: left shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: right shift count is negative ../src/twopmodq80.c:2957: warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: left shift count is negative ../src/twopmodq80.c:2957: warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: left shift count is negative ../src/twopmodq80.c:2957: warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: right shift count is negative In file included from ../src/types.h:30, from ../src/util.h:30, from ../src/factor.h:29, from ../src/twopmodq96.c:23: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! In file included from ../src/types.h:30, from ../src/util.h:30, from ../src/factor.h:29, from ../src/twopmodq.c:23: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! 
In file included from ../src/types.h:30, from ../src/types.c:23: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! In file included from ../src/types.h:30, from ../src/threadpool.h:69, from ../src/threadpool.c:65: ../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds! ../src/threadpool.c:265: warning: ‘force_align_arg_pointer’ attribute directive ignored ../src/threadpool.c: In function ‘worker_thr_routine’: ../src/threadpool.c:312: error: ‘__NR_gettid’ undeclared (first use in this function) ../src/threadpool.c:312: error: (Each undeclared identifier is reported only once ../src/threadpool.c:312: error: for each function it appears in.) make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1 make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build' make: *** [all] Error 2[/CODE] |
[CODE]configure: WARNING: unrecognized options: --disable-silent-rule[/CODE]Oh, I forgot an "s": the option should be --disable-silent-rules. :rolleyes:
[CODE]../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds![/CODE]Judging from this line, maybe we should try to build without multithreading. Maybe we can try ./configure --enable-verbose-compiler --disable-threads and see what happens. By the way, --help should show all the options of configure (only the Custom Options part is useful). |
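The thread does not show which condition at platform.h:1072 triggers that #error, so before guessing it may help to see what the compiler itself reports. The sketch below is only a diagnostic aid and an assumption about what is worth checking: it tests a handful of well-known predefined macros (`__linux__`, `__GNUC__`, `__s390x__` and friends), which may or may not be the ones Mlucas looks at. The function names are hypothetical.

```c
#include <stdio.h>

/* Operating system as seen through common predefined macros. */
const char *detected_os(void)
{
#if defined(__linux__)
    return "linux";
#elif defined(__APPLE__)
    return "macos";
#elif defined(_WIN32)
    return "windows";
#else
    return "unknown";
#endif
}

/* Compiler family; clang is tested first because it also defines __GNUC__. */
const char *detected_compiler(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}

/* CPU family; only a few common 64-bit targets are listed here. */
const char *detected_cpu(void)
{
#if defined(__s390x__)
    return "s390x";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__powerpc64__)
    return "powerpc64";
#elif defined(__aarch64__)
    return "aarch64";
#else
    return "unknown";
#endif
}

/* Writes a one-line summary into buf; returns what snprintf returns. */
int platform_summary(char *buf, size_t size)
{
    return snprintf(buf, size, "os=%s compiler=%s cpu=%s",
                    detected_os(), detected_compiler(), detected_cpu());
}
```

If the summary on the s390x guest does say Linux and GCC, then the check in platform.h is presumably keyed on something else (the CPU family, for instance), and that is the place to look. That part is a guess until someone reads the header.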
[QUOTE=alexvong1995;425358]Maybe we can try ./configure --enable-verbose-compiler --disable-threads and see what happens.[/QUOTE]
This is the last help I'm going to provide on this matter. Create your own account to debug further on the IBM 390 mainframe.... [CODE][linux1@i4l-2 build]$ ../configure --enable-verbose-compiler --disable-threads checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /bin/mkdir -p checking for gawk... gawk checking whether make sets $(MAKE)... yes checking whether make supports nested variables... yes checking whether make supports nested variables... (cached) yes checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking whether gcc understands -c and -o together... yes checking for style of include used by make... GNU checking dependency style of gcc... none checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking fenv.h usability... yes checking fenv.h presence... yes checking for fenv.h... yes checking limits.h usability... yes checking limits.h presence... yes checking for limits.h... yes checking mach/mach.h usability... no checking mach/mach.h presence... no checking for mach/mach.h... no checking stddef.h usability... 
yes checking stddef.h presence... yes checking for stddef.h... yes checking for stdlib.h... (cached) yes checking for string.h... (cached) yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for unistd.h... (cached) yes checking for stdbool.h that conforms to C99... yes checking for _Bool... yes checking for inline... inline checking for pid_t... yes checking for size_t... yes checking for uint64_t... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible malloc... yes checking for stdlib.h... (cached) yes checking for GNU libc compatible realloc... yes checking for clock_gettime... no checking for gethrtime... no checking for gettimeofday... yes checking for memset... yes checking for pow... yes checking for sqrt... yes checking for strerror... yes checking for strstr... yes checking for strtoul... yes checking whether _LARGEFILE_SOURCE is declared... no checking build system type... s390x-ibm-linux-gnu checking host system type... s390x-ibm-linux-gnu checking that generated files are newer than configure... 
done configure: creating ./config.status config.status: creating Makefile config.status: creating config.h config.status: config.h is unchanged config.status: executing depfiles commands [linux1@i4l-2 build]$ make make all-am make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build' CC $NORMAL_O ../src/get_fft_radices.c: In function ‘get_fft_radices’: ../src/get_fft_radices.c:1446: error: duplicate case value ../src/get_fft_radices.c:1443: error: previously used here ../src/Mlucas.c: In function ‘ernstMain’: ../src/Mlucas.c:1170: warning: cast from pointer to integer of different size In file included from ../src/radix1024_ditN_cy_dif1.c:1324: ../src/radix1024_main_carry_loop.h: In function ‘radix1024_ditN_cy_dif1’: ../src/radix1024_main_carry_loop.h:137: warning: assignment discards qualifiers from pointer target type In file included from ../src/radix1024_ditN_cy_dif1.c:1324: ../src/radix1024_main_carry_loop.h:529: warning: assignment discards qualifiers from pointer target type In file included from ../src/radix128_ditN_cy_dif1.c:1443: ../src/radix128_main_carry_loop.h: In function ‘radix128_ditN_cy_dif1’: ../src/radix128_main_carry_loop.h:325: warning: assignment discards qualifiers from pointer target type ../src/radix128_main_carry_loop.h:868: warning: assignment discards qualifiers from pointer target type In file included from ../src/radix144_ditN_cy_dif1.c:1044: ../src/radix144_main_carry_loop.h: In function ‘radix144_ditN_cy_dif1’: ../src/radix144_main_carry_loop.h:194: warning: assignment discards qualifiers from pointer target type ../src/radix144_main_carry_loop.h:503: warning: assignment discards qualifiers from pointer target type ../src/radix144_ditN_cy_dif1.c: In function ‘radix144_dif_pass1’: ../src/radix144_ditN_cy_dif1.c:1348: warning: assignment discards qualifiers from pointer target type ../src/radix144_ditN_cy_dif1.c: In function ‘radix144_dit_pass1’: ../src/radix144_ditN_cy_dif1.c:1609: warning: assignment discards qualifiers from 
pointer target type In file included from ../src/radix208_ditN_cy_dif1.c:1049: ../src/radix208_main_carry_loop.h: In function ‘radix208_ditN_cy_dif1’: ../src/radix208_main_carry_loop.h:144: warning: assignment discards qualifiers from pointer target type ../src/radix208_main_carry_loop.h:401: warning: assignment discards qualifiers from pointer target type ../src/radix208_ditN_cy_dif1.c: In function ‘radix208_dif_pass1’: ../src/radix208_ditN_cy_dif1.c:1322: warning: assignment discards qualifiers from pointer target type ../src/radix208_ditN_cy_dif1.c: In function ‘radix208_dit_pass1’: ../src/radix208_ditN_cy_dif1.c:1600: warning: assignment discards qualifiers from pointer target type In file included from ../src/radix256_ditN_cy_dif1.c:1675: ../src/radix256_main_carry_loop.h: In function ‘radix256_ditN_cy_dif1’: ../src/radix256_main_carry_loop.h:307: warning: assignment discards qualifiers from pointer target type ../src/radix256_main_carry_loop.h:876: warning: assignment discards qualifiers from pointer target type ../src/radix512_ditN_cy_dif1.c: In function ‘radix512_dif_pass1’: ../src/radix512_ditN_cy_dif1.c:322: warning: assignment discards qualifiers from pointer target type ../src/radix512_ditN_cy_dif1.c: In function ‘radix512_dit_pass1’: ../src/radix512_ditN_cy_dif1.c:491: warning: assignment discards qualifiers from pointer target type In file included from ../src/radix64_ditN_cy_dif1.c:1306: ../src/radix64_main_carry_loop.h: In function ‘radix64_ditN_cy_dif1’: ../src/radix64_main_carry_loop.h:218: warning: assignment discards qualifiers from pointer target type ../src/radix64_main_carry_loop.h:735: warning: assignment discards qualifiers from pointer target type ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE’: ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: left shift count is negative ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: 
left shift count is negative ../src/twopmodq80.c:342: warning: right shift count >= width of type ../src/twopmodq80.c:342: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q2’: ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: left shift count is negative ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: left shift count is negative ../src/twopmodq80.c:811: warning: right shift count >= width of type ../src/twopmodq80.c:811: warning: right shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= width of type ../src/twopmodq80.c:812: warning: left shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= width of type ../src/twopmodq80.c:812: warning: left shift count is negative ../src/twopmodq80.c:812: warning: right shift count >= width of type ../src/twopmodq80.c:812: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4’: ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: left shift count is negative ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: left shift count is negative ../src/twopmodq80.c:2210: warning: right shift count >= width of type ../src/twopmodq80.c:2210: warning: right shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: left shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: left shift count is negative ../src/twopmodq80.c:2211: warning: right shift count >= width of type ../src/twopmodq80.c:2211: warning: right shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: left 
shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: left shift count is negative ../src/twopmodq80.c:2212: warning: right shift count >= width of type ../src/twopmodq80.c:2212: warning: right shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: left shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: left shift count is negative ../src/twopmodq80.c:2213: warning: right shift count >= width of type ../src/twopmodq80.c:2213: warning: right shift count is negative ../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4_REF’: ../src/twopmodq80.c:2954: warning: right shift count >= width of type ../src/twopmodq80.c:2954: warning: left shift count is negative ../src/twopmodq80.c:2954: warning: right shift count >= width of type ../src/twopmodq80.c:2954: warning: left shift count is negative ../src/twopmodq80.c:2954: warning: right shift count >= width of type ../src/twopmodq80.c:2954: warning: right shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: left shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: left shift count is negative ../src/twopmodq80.c:2955: warning: right shift count >= width of type ../src/twopmodq80.c:2955: warning: right shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: left shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: left shift count is negative ../src/twopmodq80.c:2956: warning: right shift count >= width of type ../src/twopmodq80.c:2956: warning: right shift count is negative ../src/twopmodq80.c:2957: 
warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: left shift count is negative ../src/twopmodq80.c:2957: warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: left shift count is negative ../src/twopmodq80.c:2957: warning: right shift count >= width of type ../src/twopmodq80.c:2957: warning: right shift count is negative make[1]: *** [NORMAL_O.stamp] Error 1 make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build' make: *** [all] Error 2[/CODE] |
I can help with the build )) But I'm not familiar with C/C++ :blush:
|
[QUOTE=Lorenzo;425371]I can help to build )) But i'm not familiar with C/C++ :blush:[/QUOTE]
We are still awaiting Ernst to speak. Usually he is rather loud. His silence is notable. Edit: Casts got your tongue? |
There are two distinct issues here - [1] the basic code-build and [2] the automated Linux/x86-oriented build scripts - which should be tackled in turn. I'll let Alex focus on [2] since he developed the scripts.
As to [1], the OP mentions having 'tried' the pre-build-script-wrapped 14.1 release - what precisely was tried? The manual build instructions on the README page? Note that due to the non-x86 CPUs here we are stuck with the non-SIMD basic scalar-double C build. But it's still always useful to work through build and code issues on non-primary-target platforms, to make/keep the code as portable as reasonably possible. So to the OP (or anyone else with access to this platform - I will look into a guest account over the weekend), try the most-basic manual build first: straight C, unthreaded - i.e. no USE_ flags of any kind in the compile command. As for the threading-related preprocessor #error, if this platform supports pthreads it should be a simple matter of tweaking the platform.h header to look in the right place for the pthreads-related header files. Looking at the #error line Alex quoted in my platform.h file, it's clear the [i] #include <pthread.h> [/i] is not finding the pthread.h file. Is there one in the /include tree on this platform? |
[QUOTE=ewmayer;425381]As to [1], the OP mentions having 'tried' the pre-build-script-wrapped 14.1 release - what precisely was tried?[/QUOTE]
What part of "../configure; make" isn't clear? |
[QUOTE=chalsall;425386]What part of "../configure; make" isn't clear?[/QUOTE]
The part about the Mlucas_12.11.2014.tgz release not supporting that kind of auto-build, possibly. |
[QUOTE=ewmayer;425387]The part about the Mlucas_12.11.2014.tgz release not supporting that kind of auto-build, possibly.[/QUOTE]
You missed this: [CODE]linux1@i4l-2 build]$ pwd /home/linux1/mlucas/mlucas-14.1/build[/CODE] |
[QUOTE=chalsall;425388]You missed this:
[CODE]linux1@i4l-2 build]$ pwd /home/linux1/mlucas/mlucas-14.1/build[/CODE][/QUOTE] I was referring to the 'and' part of the OP's second sentence: [i]Tried compile MLucas (mlucas-14.1.tar.gz and Mlucas_12.11.2014.tgz)[/i] |
How about this (in get_fft_radices.c, lines 1443-1446):
[CODE]case 7 :
    numrad = 6; rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16; break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
case 7 :
    numrad = 5; rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;

[linux1@rhel7-3 build]$ gcc -Os ../src/get_fft_radices.c
../src/get_fft_radices.c: In function ‘get_fft_radices’:
../src/get_fft_radices.c:1446:3: error: duplicate case value
   case 7 :
   ^
../src/get_fft_radices.c:1443:3: error: previously used here
   case 7 :
   ^
[/CODE]Their gcc compiler is a bit stricter than on x86_64 (I've built it before -- no problems were reported, only warnings).
A bit later (in TRICKY section)
[CODE]../src/util.c: In function ‘print_host_info’:
../src/util.c:1071:106: error: ‘OS_BITS’ undeclared (first use in this function)[/CODE]so add -DOS_BITS=64 to the compiler line. After polishing these two little blemishes, runs like a clock. |
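A minimal C sketch of the duplicate-label problem and one way a local copy could be patched. The function names `pick_radices_sketch` and `radix_product`, and the replacement label 8, are assumptions made up for illustration; the actual fix in the Mlucas dev branch is not shown in this thread and may differ. Both radix sets quoted above multiply to 2^20, so they look like alternative decompositions of the same FFT length that simply need distinct case labels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the switch in get_fft_radices.c.
 * Two identical "case 7" labels in one switch are rejected by any
 * conforming C compiler, so the second alternative gets its own label. */
int pick_radices_sketch(int radix_set, int rvec[], size_t cap)
{
    int numrad;

    assert(rvec != NULL);
    if (cap < 6)
        return -1;                 /* need room for up to 6 radices */

    switch (radix_set) {
    case 7:
        numrad = 6;
        rvec[0] = 16; rvec[1] = 8; rvec[2] = 8;
        rvec[3] = 8;  rvec[4] = 8; rvec[5] = 16;
        break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
    case 8:                        /* was a second "case 7"; 8 is an assumed label */
        numrad = 5;
        rvec[0] = 8;  rvec[1] = 16; rvec[2] = 16;
        rvec[3] = 32; rvec[4] = 16;
        break;
#endif
    default:
        return -1;                 /* unknown radix set */
    }
    return numrad;
}

/* Product of the first numrad radices, i.e. the complex FFT length. */
long radix_product(const int rvec[], int numrad)
{
    long prod = 1;
    for (int i = 0; i < numrad; i++)
        prod *= rvec[i];
    return prod;
}
```

Calling `pick_radices_sketch(7, r, 6)` and then `radix_product(r, 6)` is a quick way to confirm that a patched table entry still describes the intended FFT length.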
[QUOTE=Batalov;425409]How about this (in get_fft_radices.c, lines 1443-1446):
[CODE]case 7 :
    numrad = 6; rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16; break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
case 7 :
    numrad = 5; rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;

[linux1@rhel7-3 build]$ gcc -Os ../src/get_fft_radices.c
../src/get_fft_radices.c: In function ‘get_fft_radices’:
../src/get_fft_radices.c:1446:3: error: duplicate case value
   case 7 :
   ^
../src/get_fft_radices.c:1443:3: error: previously used here
   case 7 :
   ^
[/CODE]Their gcc compiler is a bit stricter than on x86_64 (I've built it before -- no problems were reported, only warnings).[/QUOTE]
Yes, that bug in the unthreaded scalar-C builds has been fixed in my dev-branch code; users who run into it should just 'do the needful' in their local copy of the file.
[QUOTE]A bit later (in TRICKY section)
[CODE]../src/util.c: In function ‘print_host_info’:
../src/util.c:1071:106: error: ‘OS_BITS’ undeclared (first use in this function)[/CODE]so add -DOS_BITS=64 to the compiler line. After polishing these two little blemishes, runs like a clock.[/QUOTE]
As long as it's a 64-bit OS - that would be 'yes' - your hack is fine. Note OS_BITS is set (or better, attempted-to-set) in platform.h - you can turn on a bunch of diagnostics for that file by adding -DPLATFORM_DEBUG to your compile command, say for just a single tiny source file like br.c. |
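For readers who would rather not pass -DOS_BITS=64 by hand, here is a hedged sketch of an in-source fallback. It is not how platform.h sets OS_BITS; it is an assumed alternative that infers the value from the pointer width of the compile target via `<stdint.h>`, which matches the OS word size only when the binary is built for the native data model. The helper `os_bits()` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed fallback: used only when neither platform.h nor a
 * -DOS_BITS=... compile flag has already defined OS_BITS. */
#ifndef OS_BITS
  #if UINTPTR_MAX == 0xFFFFFFFFu
    #define OS_BITS 32
  #elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
    #define OS_BITS 64
  #else
    #error "Cannot infer OS_BITS; pass -DOS_BITS=32 or -DOS_BITS=64"
  #endif
#endif

/* Returns whatever value the preprocessor settled on. */
int os_bits(void)
{
    return OS_BITS;
}
```

Because the `#ifndef` guard comes first, an explicit -DOS_BITS=64 on the command line still takes precedence over the inferred value.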
I managed to set up an s390x chroot! So I don't need to rely on QEMU images anymore (they are pretty hard to find and make). I can now build a "fresh" debian rootfs from scratch.
For curious people, this is how it is done (the following commands need root privileges):[CODE]# apt-get install binfmt-support qemu qemu-user-static[/CODE]Then, manually download [URL="https://packages.debian.org/jessie-backports/all/debootstrap/download"]debootstrap[/URL] and install it with [CODE]# dpkg -i debootstrap_1.0.73~bpo8+1_all.deb[/CODE]Note we cannot install the latest version because it has a [URL="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813232"]bug[/URL] which prevents the two-stage bootstrap from working. (It took me a looong time to figure this out; the error message isn't very obvious.) [CODE]qemu-debootstrap --arch=s390x --include=emacs-nox,less,build-essential,autogen --variant=buildd sid debian-s390x[/CODE] and we are done! :smile: You can now chroot into the new rootfs. This should work for any architecture that is supported by QEMU user mode emulation. In particular, armel (ARM, as used in many phones) is supported. |
I dropped my compilation yesterday, but today I tried some more...
#define COMPILER_TYPE_GCC 1 in ../src/platform.h And Bob's your mother's closest living relative or something like that... |
1 Attachment(s)
OK, I got both un-and-multithreaded builds (all manual compile and link) to work, via the following hacks to platform.h:
o Got rid of these undefs at top - these (duh!) make it impossible for use of PLATFORM_DEBUG as intended, no clue WTF I was thinking with these undefs:
#undef PLATFORM_DEBUG
#undef OS_DEBUG
#undef OS_BITS_DEBUG
#undef CPU_DEBUG
#undef CMPLR_DEBUG
o Instead of initially undefing the following 3 things I now init them to 'unknown' (the not-defined CPU_NAME was hosing the util.c compile):
#define CPU_TYPE "Unknown CPU type"
#define CPU_NAME "Unknown CPU name"
#define CPU_SUBTYPE_NAME "Unknown CPU subtype"
o Changed
#ifndef CPU_TYPE
    #error platform.h : CPU_TYPE not defined!
#endif
to a #warning. (OS_TYPE and COMPILER_TYPE are still required to be defined during preprocessing of the platform.h file, though.)
o Added a #elif for the case of Unknown hardware platform, but under Linux/GCC.

Fiddled version of the header attached.

I only did a pair of quick spot-checks of the 2 binaries @FFT length 128K via './Mlucas -fftlen 128 -iters 100' in each of my 2 build dirs (one for unthreaded, one with pthreading). All the normal Linux thread-affinity stuff seems to be working, though no clue how much || scaling one can expect in these puny guest-accounts. (My virtual setup shows just 2 cores.) |
1 Attachment(s)
Hi Ernst,
I have added 2 new CPU_TYPE values, CPU_IS_S390 and CPU_IS_S390X, to platform.h. Does it pass the spot-check and self-test on your VM? The platform.h I used is the old one in my repo, not the one you have just posted. I will rebase my change onto the latest platform.h if it works. By the way, could you post the latest version of get_fft_radices.c from your dev branch? I think I ran into the duplicate case problem reported by Batalov when building single-threaded. Thanks, people! |
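As a rough illustration of how such a CPU check can be keyed off the compiler, the sketch below uses the `__s390__` and `__s390x__` macros that GCC predefines on those targets (a full list for any target can be dumped with `gcc -dM -E - < /dev/null`). The `SKETCH_*` macro and the two function names are hypothetical and are not the identifiers used in the attached platform.h.

```c
#include <string.h>

/* Test __s390x__ first: on 64-bit z/Architecture targets GCC defines
 * both __s390__ and __s390x__. */
#if defined(__s390x__)
  #define SKETCH_CPU_NAME "S390x"
#elif defined(__s390__)
  #define SKETCH_CPU_NAME "S390"
#elif defined(__x86_64__) || defined(__amd64__)
  #define SKETCH_CPU_NAME "x86_64"
#else
  #define SKETCH_CPU_NAME "Unknown CPU name"
#endif

/* Name of the CPU family this translation unit was compiled for. */
const char *sketch_cpu_name(void)
{
    return SKETCH_CPU_NAME;
}

/* 1 if compiled for either S390 variant, 0 otherwise. */
int sketch_cpu_is_s390_family(void)
{
    return strncmp(SKETCH_CPU_NAME, "S390", 4) == 0;
}
```

Note that this only identifies the instruction-set family the code was compiled for, not the specific machine model underneath a virtualized guest.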
1 Attachment(s)
Hi, Alex - I will diff your platform.h and try it on the Cloud when I'm home later today and have a secure connection.
Being paranoid I manage my dev-branch code privately, so here is the current version of the file you asked for - should be a drop-in replacement (md5 of the *gz = db5d2504d58897229d0366f4749b4131): |
Also, do we have any easy way of determining what underlying compute hardware is being used, or is the whole point of the 'generic' cloud setup to obfuscate that?
My gcc predefines-dump shows no obvious clues to the CPU type. |
1 Attachment(s)
[QUOTE=alexvong1995;425532]Hi Ernst,
I have added 2 new CPU_TYPE CPU_IS_S390 and CPU_IS_S390X to platform.h, does it passes the spot-check and self-test on your VM? The platform.h I use is the old one on my repo, not the one you have just posted . I will rebase my change on the latest platform.h if it works. By the way, could you post the latest version of get_fft_radices.c on your dev branch? I think I run into the duplicate case problem reported by Batalov when building with singlethread. Thanks people![/QUOTE]
Alex, I integrated your additions into my latest, and also made a few tweaks to yesterday's work - the 3 predefines to "Unknown" (in place of the previous #undef) I used led to preprocessor warnings on platforms where the variables do end up being set to a specific platform-associated value:
[code]../platform.h:1222:10: warning: 'CPU_NAME' macro redefined
#define CPU_NAME "x86_64"
        ^
../platform.h:130:9: note: previous definition is here
#define CPU_NAME "Unknown CPU name"
[/code]
so I fiddled those to set to the "unknown" values only if they are still undef'd by the time we reach the end of the platform.h file.
But note I fubared my key-creation on the cloud, so to save time was using Serge's already-setup image with his unzip of the 14.1 packaged code. He has since blown his stuff away. If your instance is still up, could you try auto-build with the merged platform.h file attached below?
Thanks, and happy lunar new year! I guess the celebrations have already started in Asia. (Or will within the next few hours.) |
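The "default only if still undefined at the end of the header" pattern described above can be sketched as follows. This is a toy stand-in, not the attached platform.h: the detection part is reduced to two branches, and the three accessor functions are hypothetical helpers added so the result can be inspected.

```c
#include <string.h>

/* --- stand-in for the platform-detection part of the header --- */
#if defined(__x86_64__)
  #define CPU_NAME "x86_64"
#elif defined(__s390x__)
  #define CPU_NAME "S390x"
#endif
/* CPU_TYPE and CPU_SUBTYPE_NAME are deliberately left unset here. */

/* --- end of header: fill in only what is still missing, so a name
 *     set above is never redefined and no warning is triggered --- */
#ifndef CPU_TYPE
  #define CPU_TYPE "Unknown CPU type"
#endif
#ifndef CPU_NAME
  #define CPU_NAME "Unknown CPU name"
#endif
#ifndef CPU_SUBTYPE_NAME
  #define CPU_SUBTYPE_NAME "Unknown CPU subtype"
#endif

/* Hypothetical accessors for the three strings. */
const char *cpu_type_string(void)    { return CPU_TYPE; }
const char *cpu_name_string(void)    { return CPU_NAME; }
const char *cpu_subtype_string(void) { return CPU_SUBTYPE_NAME; }
```

Putting the defaults last means code such as util.c can always rely on the three macros existing, while platforms that do set a specific value never see a second `#define`.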
After i replaced platform.h:
[CODE][linux1@lorenzoibm mlucas-14.1]$ ./mlucas

    Mlucas 14.1

    http://hogranch.com/mayer/README.html

INFO: testing qfloat routines...
CPU Family = S390x, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 4.8.5 20150623 (Red Hat 4.8.5-4).
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 53-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
INFO: testing IMUL routines...
INFO: System has 2 available processor cores.
INFO: testing FFT radix tables...

looking for number of threads to use in nthreads.ini file...
Using NTHREADS = #CPUs = 2.
looking for worktodo.ini file...
worktodo.ini file found...checking next exponent in range...
ERROR: at line 245 of file ./src/get_preferred_fft_radix.c
Assertion failed: CONFIGFILE = mlucas.cfg: open failed![/CODE] |
Ohhh. Sorry. It's working!!! Amazing!!!
P.S. I forgot about the performance tuning (mlucas -s m). |
[QUOTE=Lorenzo;425605]Ohhh.Sorry. It's working!!! Amazing!!!
P.S. Forgot about perfomance tunes (mlucas -s m).[/QUOTE]
Can you tell us anything about its performance? I guess it's running with 2 threads...
Luigi |
[QUOTE=ET_;425606]Can you tell us anything about its performances? I guess it's running with 2 threads...
Luigi[/QUOTE]
Not fast :brian-e:
[CODE]
  448  msec/iter =   12.80  ROE[avg,max] = [0.224609375, 0.250000000]  radices = 56 16 16
  480  msec/iter =   14.04  ROE[avg,max] = [0.210880824, 0.250000000]  radices = 60 16 16
  512  msec/iter =   14.23  ROE[avg,max] = [0.281250000, 0.281250000]  radices = 128 8 16
  576  msec/iter =   16.14  ROE[avg,max] = [0.208354841, 0.250000000]  radices = 144 8 16
  640  msec/iter =   19.46  ROE[avg,max] = [0.257421875, 0.312500000]  radices = 160 8 16
  704  msec/iter =   21.52  ROE[avg,max] = [0.274654715, 0.343750000]  radices = 176 8 16
  768  msec/iter =   21.99  ROE[avg,max] = [0.209895543, 0.250000000]  radices = 48 16 16
  832  msec/iter =   24.75  ROE[avg,max] = [0.239439174, 0.312500000]  radices = 208 8 16
  896  msec/iter =   25.59  ROE[avg,max] = [0.227832031, 0.312500000]  radices = 56 16 16
  960  msec/iter =   28.33  ROE[avg,max] = [0.212360491, 0.250000000]  radices = 60 16 16
 1024  msec/iter =   28.24  ROE[avg,max] = [0.312500000, 0.312500000]  radices = 128 16 16
 1152  msec/iter =   32.89  ROE[avg,max] = [0.208562687, 0.253906250]  radices = 144 16 16
 1280  msec/iter =   40.32  ROE[avg,max] = [0.235714286, 0.312500000]  radices = 20 8 16
 1408  msec/iter =   42.28  ROE[avg,max] = [0.273688616, 0.343750000]  radices = 176 16 16
 1536  msec/iter =   44.80  ROE[avg,max] = [0.223493304, 0.281250000]  radices = 192 16 16
 1664  msec/iter =   48.48  ROE[avg,max] = [0.246149554, 0.312500000]  radices = 208 16 16
 1792  msec/iter =   51.94  ROE[avg,max] = [0.220703125, 0.281250000]  radices = 224 16 16
 1920  msec/iter =   61.24  ROE[avg,max] = [0.212430246, 0.257812500]  radices = 60 16 32
 2048  msec/iter =   56.56  ROE[avg,max] = [0.312500000, 0.312500000]  radices = 128 16 16
 2304  msec/iter =   65.73  ROE[avg,max] = [0.208895438, 0.250000000]  radices = 144 16 16
 2560  msec/iter =   79.33  ROE[avg,max] = [0.245312500, 0.281250000]  radices = 20 16 16
 2816  msec/iter =   85.93  ROE[avg,max] = [0.272896903, 0.343750000]  radices = 176 16 16
 3072  msec/iter =   91.91  ROE[avg,max] = [0.225892857, 0.281250000]  radices = 192 16 16
 3328  msec/iter =   97.41  ROE[avg,max] = [0.241322545, 0.281250000]  radices = 208 16 16
 3584  msec/iter =  105.64  ROE[avg,max] = [0.220870536, 0.250000000]  radices = 224 16 16
 3840  msec/iter =  132.28  ROE[avg,max] = [0.213867188, 0.242187500]  radices = 60 32 32
 4096  msec/iter =  116.38  ROE[avg,max] = [0.224023438, 0.250000000]  radices = 16 16 16
 4608  msec/iter =  141.80  ROE[avg,max] = [0.201425498, 0.250000000]  radices = 144 16 32
 5120  msec/iter =  162.11  ROE[avg,max] = [0.236607143, 0.281250000]  radices = 20 16 16
 5632  msec/iter =  186.77  ROE[avg,max] = [0.277120536, 0.312500000]  radices = 44 16 16
 6144  msec/iter =  192.85  ROE[avg,max] = [0.214425223, 0.250000000]  radices = 48 16 16
 6656  msec/iter =  223.12  ROE[avg,max] = [0.242299107, 0.281250000]  radices = 208 16 32
 7168  msec/iter =  230.10  ROE[avg,max] = [0.223437500, 0.281250000]  radices = 56 16 16
 7680  msec/iter =  253.42  ROE[avg,max] = [0.219891357, 0.250000000]  radices = 60 16 16
 8192  msec/iter =  252.43  ROE[avg,max] = [0.282589286, 0.312500000]  radices = 1024 16 16
 9216  msec/iter =  306.68  ROE[avg,max] = [0.208818163, 0.265625000]  radices = 144 32 32
10240  msec/iter =  371.75  ROE[avg,max] = [0.248660714, 0.312500000]  radices = 160 32 32
11264  msec/iter =  409.54  ROE[avg,max] = [0.275306920, 0.328125000]  radices = 176 32 32
12288  msec/iter =  423.42  ROE[avg,max] = [0.209234401, 0.234375000]  radices = 48 16 16
13312  msec/iter =  493.18  ROE[avg,max] = [0.236830357, 0.281250000]  radices = 208 32 32
14336  msec/iter =  476.82  ROE[avg,max] = [0.218526786, 0.250000000]  radices = 56 16 16
15360  msec/iter =  535.51  ROE[avg,max] = [0.217006138, 0.250000000]  radices = 60 16 16
16384  msec/iter =  530.52  ROE[avg,max] = [0.276339286, 0.281250000]  radices = 1024 16 16
18432  msec/iter =  606.73  ROE[avg,max] = [0.212458147, 0.250000000]  radices = 144 16 16
20480  msec/iter =  745.91  ROE[avg,max] = [0.251116071, 0.281250000]  radices = 160 16 16
22528  msec/iter =  822.14  ROE[avg,max] = [0.283984375, 0.328125000]  radices = 176 16 16
24576  msec/iter =  833.16  ROE[avg,max] = [0.225502232, 0.250000000]  radices = 192 16 16
26624  msec/iter =  975.42  ROE[avg,max] = [0.251785714, 0.281250000]  radices = 208 16 16
28672  msec/iter =  971.73  ROE[avg,max] = [0.219098772, 0.250000000]  radices = 224 16 16
30720  msec/iter = 1162.44  ROE[avg,max] = [0.242522321, 0.281250000]  radices = 960 16 32
32768  msec/iter = 1075.11  ROE[avg,max] = [0.281250000, 0.281250000]  radices = 1024 16 32
[/CODE] |
CPU Load
[CODE]top - 07:33:43 up 4 days,  3:39,  2 users,  load average: 0,80, 0,58, 1,27
Tasks:  97 total,   1 running,  96 sleeping,   0 stopped,   0 zombie
%Cpu(s): 98,8 us,  0,2 sy,  0,0 ni,  0,8 id,  0,0 wa,  0,0 hi,  0,0 si,  0,2 st
KiB Mem :  2042848 total,  1076884 free,   111976 used,   853988 buff/cache
KiB Swap:   501740 total,   501740 free,        0 used.  1853956 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5580 linux1    20   0   89624  36140   1120 S 197,3  1,8   0:59.77 mlucas
[/CODE] |
But it's great that Mlucas is working on a mainframe!!! :smile:
:primenet: |
[QUOTE=Lorenzo;425611]Not fast :brian-e:[/QUOTE]
Thanks, Lorenzo - it seems you truncated the rightmost columns of radices in posting your excerpt from the mlucas.cfg file (e.g. in the first line 448 means 448Kdoubles => complex FFT of length 224K = 56*16^3, i.e. there is a trailing 16 missing) - but those are easily inferred.
Just as a point of 'slow' reference, the 32768K timing is roughly what I get on my aged Core2Duo running 2-threaded (1 thread per core) using the SSE2 version of the x86_64 build. My Haswell quad (4-threaded AVX2 build) is 10x faster.
Aside from the overall slowness, the various non-powers-of-2 perform decently well with the notable exception of FFT lengths of form 15*2^n, which are uniformly dismal - the compiler really doesn't like my scalar-double radix-15 DFT macros, it seems.
I guess the only positive thing I can say (as with politics and economics it's all about the optimistic PR spin, you know) is that the scaling to larger runlengths is quite good - compare the 32768K and 1024K timings, for instance, with what one expects based on the asymptotic O(n log n) FFT opcount scaling.
-------------------
Also, to repeat my earlier question: Do we have any way of seeing what kind of hardware is running underneath things? IBM's version of PowerPC? It would be silly if it were actually x86_64 and the cloud setup were masking that from users. |
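The scaling comparison suggested above can be worked through with a few lines of C. This is only a back-of-the-envelope sketch: the function names are made up, and taking n to be the complex FFT length (half the number of doubles) is an assumption; using the real length instead gives a model ratio of 40 rather than about 40.4. With the posted timings, 1075.11 / 28.24 is about 38.1, slightly better than the pure n log n prediction.

```c
#include <assert.h>

/* Complex FFT length for a real-vector length given in Kdoubles
 * (1K = 1024): two doubles are packed into each complex element. */
unsigned long complex_fft_length(unsigned long kdoubles)
{
    return kdoubles * 1024UL / 2UL;
}

/* Exact log2 of a power of two. */
int log2_pow2(unsigned long n)
{
    int bits = 0;
    assert(n != 0 && (n & (n - 1)) == 0);   /* power-of-two lengths only */
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Ratio of n*log2(n) operation counts for two power-of-two FFT
 * lengths, both given in Kdoubles. */
double nlogn_ratio(unsigned long kdoubles_big, unsigned long kdoubles_small)
{
    unsigned long nb = complex_fft_length(kdoubles_big);
    unsigned long ns = complex_fft_length(kdoubles_small);
    return ((double)nb * log2_pow2(nb)) / ((double)ns * log2_pow2(ns));
}
```

For example, `nlogn_ratio(32768, 1024)` evaluates 32 * 24 / 19, which can then be compared against the measured msec/iter ratio for the same two rows. `complex_fft_length(448)` likewise reproduces the 224K = 56*16^3 figure quoted above.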
cat /proc/cpuinfo should reveal some details.
|
[QUOTE=Mark Rose;425680]cat /proc/cpuinfo should reveal some details.[/QUOTE]
Not much info ...
[CODE][linux1@lorenzoibm ~]$ cat /proc/cpuinfo
vendor_id       : IBM/S390
# processors    : 2
bogomips per cpu: 20325.00
features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
cache0          : level=1 type=Data scope=Private size=128K line_size=256 associativity=8
cache1          : level=1 type=Instruction scope=Private size=96K line_size=256 associativity=6
cache2          : level=2 type=Data scope=Private size=2048K line_size=256 associativity=8
cache3          : level=2 type=Instruction scope=Private size=2048K line_size=256 associativity=8
cache4          : level=3 type=Unified scope=Shared size=65536K line_size=256 associativity=16
cache5          : level=4 type=Unified scope=Shared size=491520K line_size=256 associativity=30
processor 0: version = FF, identification = 016A77, machine = 2964
processor 1: version = FF, identification = 016A77, machine = 2964[/CODE]
[QUOTE]LinuxOne is a specialised Z13 IBM mainframe for Linux. You can run up to 8000 VM simultaneously on it. It is a powerfull beast like IBM does, the top stuff.[/QUOTE]
So it's the [URL="https://en.wikipedia.org/wiki/IBM_z13_(microprocessor)"]IBM Z13 CPU[/URL]. Many more details can be found in the [URL="http://www.redbooks.ibm.com/redbooks/pdfs/sg248251.pdf"]Technical Guide[/URL]. I'm not an expert, but I think it's not the Power architecture. It's something special ... |
[QUOTE=Mark Rose;425680]cat /proc/cpuinfo should reveal some details.[/QUOTE]
Yes, this usually works very well, but the proc filesystem is Linux-specific, so it may not work on other kernels. There is also the lscpu command, which I believe simply reads /proc/cpuinfo and displays it in a nicer way (using your locale settings). For FreeBSD, I found this [URL="https://stackoverflow.com/questions/4083848/what-is-the-equivalent-of-proc-cpuinfo-on-freebsd-v8-1?rq=1"]post[/URL]. |
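When only the core count is needed, a program can ask the C library instead of parsing /proc/cpuinfo. The sketch below uses `sysconf` with `_SC_NPROCESSORS_ONLN`, which is a widely supported extension (glibc, the BSDs, macOS) rather than a strict POSIX requirement; the wrapper name `online_cpu_count` is hypothetical, and nothing here is meant to describe how Mlucas itself counts cores.

```c
#include <unistd.h>

/* Number of processors currently online, or -1 if the system
 * cannot report it. */
long online_cpu_count(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

This reports how many processors the guest sees, which on a virtual machine says nothing about the physical core count underneath.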
[QUOTE=Lorenzo;425707]Not much info ...
[CODE][linux1@lorenzoibm ~]$ cat /proc/cpuinfo
vendor_id       : IBM/S390
# processors    : 2
bogomips per cpu: 20325.00
features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
cache0          : level=1 type=Data scope=Private size=128K line_size=256 associativity=8
cache1          : level=1 type=Instruction scope=Private size=96K line_size=256 associativity=6
cache2          : level=2 type=Data scope=Private size=2048K line_size=256 associativity=8
cache3          : level=2 type=Instruction scope=Private size=2048K line_size=256 associativity=8
cache4          : level=3 type=Unified scope=Shared size=65536K line_size=256 associativity=16
cache5          : level=4 type=Unified scope=Shared size=491520K line_size=256 associativity=30
processor 0: version = FF, identification = 016A77, machine = 2964
processor 1: version = FF, identification = 016A77, machine = 2964[/CODE]So it's [URL="https://en.wikipedia.org/wiki/IBM_z13_(microprocessor)"]IBM Z13 CPU[/URL]. Much more details you can find in [URL="http://www.redbooks.ibm.com/redbooks/pdfs/sg248251.pdf"]Technical Guide[/URL]. And i'm not expert but i think it's not Power architecture. It's something special ...[/QUOTE]
Wow! That is a lot of cache!
L1 (per core)
- 96 KB instruction
- 128 KB data
L2 (per core)
- 2 MB instruction
- 2 MB data
L3 (shared)
64 MB eDRAM
L4 (off die, on storage controller chip)
480 MB
[quote]The processor chip has an eight-core design, with either six, seven, or eight active cores, and operates at 5.0 GHz. Depending on the CPC drawer version (39 PU or 42 PU), 39 - 168 PUs are available on 1 - 4 CPC drawers.[/quote]
IBM names it a PU; we would call it a CPU core. |
[QUOTE=VictordeHolland;425720]Wauw! That is a lot of cache!
L1 (per core) -96 KB instruction -128 KB Data L2 (per core) -2 MB instruction -2 MB Data L3 (shared) 64 MB eDRAM L4 (off die, on storage controller chip) 480 MB IBM names it a PU, we would call it a CPUcore.[/QUOTE] That explains the excellent timing-scaling in going to larger FFT lengths which we see in Lorenzo's cfg-file results. If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today. |
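A rough way to see why the large caches matter is to compare the size of the main FFT array with each cache level. The sketch below hard-codes the data-side capacities from the /proc/cpuinfo output earlier in the thread; the function names are hypothetical, and it ignores auxiliary tables as well as the fact that L3 and L4 are shared with other guests, so it is at best a plausibility check.

```c
#include <stddef.h>

/* Bytes occupied by an FFT array of the given length in Kdoubles
 * (1K = 1024 doubles). */
unsigned long long fft_array_bytes(unsigned long kdoubles)
{
    return (unsigned long long)kdoubles * 1024ULL * sizeof(double);
}

/* Smallest cache level (1..4) whose capacity holds `bytes`,
 * or 0 if the array is larger than every level. */
int smallest_fitting_cache_level(unsigned long long bytes)
{
    /* Capacities in KiB from /proc/cpuinfo: L1d, L2d, L3, L4. */
    static const unsigned long long size_kib[] = { 128, 2048, 65536, 491520 };
    const size_t levels = sizeof size_kib / sizeof size_kib[0];

    for (size_t i = 0; i < levels; i++) {
        if (bytes <= size_kib[i] * 1024ULL)
            return (int)i + 1;
    }
    return 0;
}
```

Under these assumptions the 1024K array (8 MiB) fits in L3 and even the 32768K array (256 MiB) still fits in the 480 MB L4, which is consistent with the mild slowdown at large FFT lengths.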
[QUOTE=ewmayer;425751]If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today.[/QUOTE]
Had a look - see nothing actually resembling an instruction set reference in there. Could someone point me to one? With just 139 SIMD instructions it wouldn't have taken up more than a decent-sized chapter or appendix in such a document.
I did note this, however (Chapter 3. Central processor complex system design, p91), which mentions no floating-point among the SIMD - that would be a curious omission if indeed such are supported:
[i]Here are some examples of SIMD instructions:
o Integer byte to quadword add, sub, and compare
o Integer byte to doubleword min, max, and average
o Integer byte to word multiply
o String find 8-bits, 16-bits, and 32-bits
o String range compare
o String find any equal
o String load to block boundaries and load/store with length[/i] |
[QUOTE=ewmayer;425791]Had a look - see nothing actually resembling an instruction set reference in there. Could someone point me to one? With just 139 SIMD instructions it wouldn't have taken up more than a decent-sized chapter or appendix in such a document.
I did note this, however (Chapter 3. Central processor complex system design, p91), which mentions no floating-point among the SIMD - that would be a curious omission if indeed such are supported: [I] Here are some examples of SIMD instructions: o Integer byte to quadword add, sub, and compare o Integer byte to doubleword min, max, and average o Integer byte to word multiply o String find 8-bits, 16-bits, and 32-bits o String range compare o String find any equal o String load to block boundaries and load/store with length[/I][/QUOTE]
I just found this documentation, the [URL="https://www-304.ibm.com/support/docview.wss?uid=isg29c69415c1e82603c852576700058075a&aid=1"]z/architecture reference summary[/URL], on the internet. Pages 22 to 25 show the 139 vector instructions (of course I did not really try to count!), for example VMAH (vector multiply and add high)...
Also, I have created an s390x testing branch (there are only 2 commits). People who are interested are encouraged to test whether it builds and passes the tests. The instructions are as follows:
$ git clone [URL]https://gitlab.com/mlucas-ll/mlucas.git[/URL]
$ cd mlucas && touch * && git checkout s390x
$ mkdir build && cd build && ../configure && make -j && make -j check
(of course you must have git, gcc and make installed!) |
[QUOTE=alexvong1995;425805]Just find this documentation [URL="https://www-304.ibm.com/support/docview.wss?uid=isg29c69415c1e82603c852576700058075a&aid=1"]z/architecture reference summary[/URL] on the internet. Page 22 to page 25 shows the 139 vector instructions (of course I do not really try to count!), something like VMAH (vector multiple and add high)...
Also I have created a s390x testing branch (there are only 2 commits), people interested are encouraged to test if it builds and passes the test, the instruction is as followed: $ git clone [URL]https://gitlab.com/mlucas-ll/mlucas.git[/URL] $ cd mlucas && touch * && git checkout s390x $ mkdir build && cd build && ../configure && make -j && make -j check (of course you must have git, gcc and make installed!)[/QUOTE] If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub? |
Alright, I found out it actually does not build on amd64, see [URL]https://gitlab.com/mlucas-ll/mlucas/builds[/URL]. There are several error messages:
In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:603:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, k);
In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: incompatible type for argument 2 of 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);
In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);
In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1179:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE_q2'
res64 = twopmodq100_2WORD_DOUBLE_q2(p64, k,k); |
Thanks for the instruction reference, Alex - that looks like what I need, and there does appear to be a full complement of vector-float functionality.
[QUOTE=alexvong1995;425811]Alright, I find out it actually does not build on amd64, see [URL]https://gitlab.com/mlucas-ll/mlucas/builds[/URL], there are serveral error messages: In file included from .././src/factor.c:3973:0: .././src/factor_test.h:603:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE' res64 = twopmodq100_2WORD_DOUBLE(p64, k); In file included from .././src/factor.c:3973:0: .././src/factor_test.h:1156:11: error: incompatible type for argument 2 of 'twopmodq100_2WORD_DOUBLE' res64 = twopmodq100_2WORD_DOUBLE(p64, q128); In file included from .././src/factor.c:3973:0: .././src/factor_test.h:1156:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE' res64 = twopmodq100_2WORD_DOUBLE(p64, q128); In file included from .././src/factor.c:3973:0: .././src/factor_test.h:1179:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE_q2' res64 = twopmodq100_2WORD_DOUBLE_q2(p64, k,k);[/QUOTE]
OK, two things here:
1. I clearly need to update the API for this subset of TF calls, but you should not need factor.o to link currently, because TF functionality is not supported in the default Mlucas build;
2. These calls are getting included because USE_FMADD is getting def'd, which implies you are trying an AVX2/FMA3 build (-DUSE_AVX2 compile flag). Unless AMD has radically upgraded their AVX-and-beyond capabilities in the past year -- according to Wikipedia their very first CPU with AVX2 was 'Carrizo' (formerly codenamed 'Excavator') last year -- AMD builds probably should not go above SSE2. But if you can get AVX and/or AVX2 builds tested on something close to the latest AMD processor, let's see if they still suffer from the 'AVX slower than SSE2' handicap George noted based on his Prime95 tests and decide what to do config-wise based on that. |
[QUOTE=Dubslow;425810]If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub?[/QUOTE]
It is a hard question; there isn't a clear winner. :smile: I think having both accounts makes it easier to submit pull requests to projects hosted on either site. The Wikipedia [URL="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities"]page[/URL] is a good start on it.
For github, it says "Gratis for public, paid for private." For gitlab, it says "Unlimited public and private repos, unlimited public and private collaborators".
For web interface, it seems github is more responsive than gitlab. The UI of both sites looks comparable to me. I am using git most of the time anyway, so I think it is fine as long as the diff is shown nicely.
For CI, projects on github commonly use travis-ci, while gitlab has its own gitlab-ci.
For js, gitlab releases all js under mit/expat, so you don't need to worry about what the js code is doing. gitlab also releases its core as free-sw under mit/expat, known as gitlab-ce; github doesn't.
For popularity, github is clearly more popular, but both are used by big organizations.
There is one site getting increasingly popular, [URL]https://notabug.org/[/URL]; it is a community effort. |
[QUOTE=alexvong1995;425820]It is a hard question, there isn't a clear winner. :smile:
I think having accounts on both makes it easier to submit pull requests to projects hosted on either site. The Wikipedia [URL="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities"]page[/URL] is a good start on it. For GitHub, it says "Gratis for public, paid for private." For GitLab, it says "Unlimited public and private repos, unlimited public and private collaborators". For the web interface, GitHub seems more responsive than GitLab. The UIs of both sites look comparable to me; I am using git most of the time anyway, so I think it is fine as long as the diff is shown nicely. For CI, GitHub uses travis-ci and GitLab uses gitlab-ci. For JS, GitLab releases all its JS under MIT/Expat, so you don't need to worry about what the JS code is doing. GitLab also releases its core as free software under MIT/Expat, known as gitlab-ce; GitHub doesn't. For popularity, GitHub is clearly more popular, but both are used by big organizations. There is one site getting increasingly popular, [URL]https://notabug.org/[/URL]; it is a community effort.[/QUOTE] Thanks, that's the sort of excellent summary I was looking for. I've discussed before the idea of moving a lot of the software here away from SourceForge/Subversion, and GitHub is of course the obvious alternative, though it's also good to track alternatives like GitLab and notabug.org. |
[QUOTE=Dubslow;425825]Thanks, that's the sort of excellent summary I was looking for. I've discussed before the idea of moving a lot of the software here away from SourceForge/Subversion, and GitHub is of course the obvious alternative, though it's also good to track alternatives like GitLab and notabug.org.[/QUOTE]
[QUOTE=alexvong1995;425820]It is a hard question, there isn't a clear winner. :smile: I think having accounts on both makes it easier to submit pull requests to projects hosted on either site. The Wikipedia [URL="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities"]page[/URL] is a good start on it. For GitHub, it says "Gratis for public, paid for private." For GitLab, it says "Unlimited public and private repos, unlimited public and private collaborators". For the web interface, GitHub seems more responsive than GitLab. The UIs of both sites look comparable to me; I am using git most of the time anyway, so I think it is fine as long as the diff is shown nicely. For CI, GitHub uses travis-ci and GitLab uses gitlab-ci. For JS, GitLab releases all its JS under MIT/Expat, so you don't need to worry about what the JS code is doing. GitLab also releases its core as free software under MIT/Expat, known as gitlab-ce; GitHub doesn't. For popularity, GitHub is clearly more popular, but both are used by big organizations. There is one site getting increasingly popular, [URL]https://notabug.org/[/URL]; it is a community effort.[/QUOTE] [QUOTE=Dubslow;425810]If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub?[/QUOTE] One last post on the matter (maybe a mod might split this off for me): [url]https://www.b.agilob.net/choose-gitlab-for-your-next-project/[/url] ^ An essay explaining why GitLab is better than GitHub. (Incidentally, I know the guy who runs that site, online only, the same way I know everyone here, and I've worked with him on some projects.) |
[QUOTE=ewmayer;425751]That explains the excellent timing-scaling in going to larger FFT lengths which we see in Lorenzo's cfg-file results.
If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today.[/QUOTE] Hello! How about SIMD? Did you try changing the code to support SIMD instructions for the s390x arch? :unsure: |
[QUOTE=Lorenzo;428798]Hello! How about SIMD? Did you try changing the code to support SIMD instructions for the s390x arch? :unsure:[/QUOTE]
Excuse me if I didn't make it clear -- that's more in the way of a long-term 'maybe' project, not something one can make happen in a few weeks. And from a number-of-potential-users standpoint the coding effort is likely not justified. I am busy with another round of Intel SIMD optimizations and preparing for their next-gen AVX512 chips ... both of which have/will-have a very large user base. As my code is open-source, anyone with PPC-and-beyond assembler expertise and time to spare is welcome to have at it! Take a modest-size x86_64 SSE2 inline-asm macro - there are many to choose from - stick it into a suitable C test harness, and use the results to guide translation to IBM assembler. |
[QUOTE=ewmayer;428805]Excuse me if I didn't make it clear -- that's more in the way of a long-term 'maybe' project, not something one can make happen in a few weeks. And from a number-of-potential-users standpoint the coding effort is likely not justified. I am busy with another round of Intel SIMD optimizations and preparing for their next-gen AVX512 chips ... both of which have/will-have a very large user base.
As my code is open-source, anyone with PPC-and-beyond assembler expertise and time to spare is welcome to have at it! Take a modest-size x86_64 SSE2 inline-asm macro - there are many to choose from - stick it into a suitable C test harness, and use the results to guide translation to IBM assembler.[/QUOTE] OK, sure. Thank you for the explanation :smile: Just double-checked an exponent on IBM S390 :tu: [url]http://www.mersenne.org/report_exponent/?exp_lo=41523593&full=1[/url] |