mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Mlucas (https://www.mersenneforum.org/forumdisplay.php?f=118)
-   -   MLucas on IBM Mainframe (https://www.mersenneforum.org/showthread.php?t=20962)

Lorenzo 2016-02-05 14:20

MLucas on IBM Mainframe
 
Hello! Is it possible to compile Mlucas on an IBM mainframe? I tried to compile MLucas (mlucas-14.1.tar.gz and Mlucas_12.11.2014.tgz), but unfortunately ran into some problems.

[CODE]Architecture: s390x
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Big Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s) per book: 1
Book(s): 2
Vendor ID: IBM/S390
BogoMIPS: 20325.00
Hypervisor: z/VM 6.3.0
Hypervisor vendor: IBM
Virtualization type: full
Dispatching mode: horizontal
L1d cache: 128K
L1i cache: 96K
L2d cache: 2048K
L2i cache: 2048K
[/CODE]

By the way, anyone can try it. Just register and get access for 90 days. [url]https://developer.ibm.com/linuxone/?source=web&ca=linuxone&ovcode=ov44223&tactic=C47300NW[/url]

I chose Red Hat Linux.

chalsall 2016-02-05 15:33

[QUOTE=Lorenzo;425319]By the way, anyone can try it. Just register and get access for 90 days. [url]https://developer.ibm.com/linuxone/?source=web&ca=linuxone&ovcode=ov44223&tactic=C47300NW[/url][/QUOTE]

I will let others speak to building MLucas in this environment, but thank you for pointing this out to us. Interesting....

chalsall 2016-02-05 16:57

[QUOTE=chalsall;425323]Interesting....[/QUOTE]

Or, perhaps not so much... DNS is broken within the virtual machine as initially instanced (at least for RedHat Linux), and Mlucas doesn't even start to compile successfully, even though "../configure" completes successfully.

Ernst?

I know this is an experiment by IBM, but it's not going to compete with EC2 or Google Compute et al any time soon.

Dubslow 2016-02-05 17:14

What's the error? Two people saying "it doesn't work" is about as useful as me saying that Prime95 won't compile either.

chalsall 2016-02-05 17:22

[QUOTE=Dubslow;425334]What's the error? Two people saying "it doesn't work" is about as useful as me saying that Prime95 won't compile either.[/QUOTE]

Fair enough... [CODE][linux1@i4l-2 build]$ cat /proc/cpuinfo
vendor_id : IBM/S390
# processors : 2
bogomips per cpu: 20325.00
features : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
processor 0: version = FF, identification = 016A77, machine = 2964
processor 1: version = FF, identification = 016A77, machine = 2964

[linux1@i4l-2 build]$ uname -a
Linux i4l-2 2.6.32-573.12.1.el6.s390x #1 SMP Mon Nov 23 12:58:30 EST 2015 s390x s390x s390x GNU/Linux

[linux1@i4l-2 build]$ pwd
/home/linux1/mlucas/mlucas-14.1/build

[linux1@i4l-2 build]$ ../configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... none
checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking fenv.h usability... yes
checking fenv.h presence... yes
checking for fenv.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking mach/mach.h usability... no
checking mach/mach.h presence... no
checking for mach/mach.h... no
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
checking for inline... inline
checking for pid_t... yes
checking for size_t... yes
checking for uint64_t... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible realloc... yes
checking for clock_gettime... no
checking for gethrtime... no
checking for gettimeofday... yes
checking for memset... yes
checking for pow... yes
checking for sqrt... yes
checking for strerror... yes
checking for strstr... yes
checking for strtoul... yes
checking whether _LARGEFILE_SOURCE is declared... no
checking build system type... s390x-ibm-linux-gnu
checking host system type... s390x-ibm-linux-gnu
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands

[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O $THREADS_O
make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE]

ET_ 2016-02-05 17:26

[QUOTE=chalsall;425331]
I know this is an experiment by IBM, but it's not going to compete with EC2 or Google Compute et al any time soon.[/QUOTE]

Would you mind elaborating?

Luigi

Dubslow 2016-02-05 17:33

[QUOTE=chalsall;425336]Fair enough... [CODE]
[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O $THREADS_O
make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE][/QUOTE]

What a perfectly useless error message. I see now why you didn't immediately include it.

chalsall 2016-02-05 17:42

[QUOTE=ET_;425337]Would you mind elaborating?[/QUOTE]

Sure...

As much as I hate to say it, Intel has mostly won the assembly race. As in, most code is targeted to x86 (and this is coming from someone who hates x86; I much prefer 680x0 assembly (may it rest in peace)).

Even source code in C, C++ etc might not work under a different platform. This might be because of the build environment not being truly cross-platform, or because there are subtle bugs in the code which don't manifest under the most commonly used CPUs. Yes, this does mean the code is buggy, but most don't care -- they just want the code to work for them without hassle.

At the end of the day, my argument is that cloud computing providers need to provide x86 based virtual machines to (almost all of) their customers, or they're just not going to get traction.

ET_ 2016-02-05 17:50

[QUOTE=chalsall;425339]...
At the end of the day, my argument is that cloud computing providers need to provide x86 based virtual machines to (almost all of) their customers, or they're just not going to get traction.[/QUOTE]

I am working with servers in the cloud, and notice that even on x86 architectures, one "virtual CPU" equals 40%-50% of a real one. If the IBM fellows deliver a virtual CPU worth 60%-65% of a real CPU, they would win the race.

That "virtual CPU" thingie is going to take the place of the "minimum guaranteed upload bandwidth" for ADSL...

chalsall 2016-02-05 18:03

[QUOTE=ET_;425340]I am working with servers in the cloud, and notice that even on x86 architectures, one "virtual CPU" equals 40%-50% of a real one. If the IBM fellows deliver a virtual CPU worth 60%-65% of a real CPU, they would win the race.[/QUOTE]

I also mostly work in "the cloud". For my serious work I lease dedicated servers, where I get 100% of the machine.

For my "on demand" work I sometimes go "virtual". And, if you know what you're doing, you can get ~99% of a machine for pennies on the dollar.

IBM is not going to win this race unless and until they offer x86 instances. And, so you know, I ran a simple benchmark from the command line, and their instance was half the speed of my desktop (one CPU used on each).

alexvong1995 2016-02-05 18:31

[QUOTE=chalsall;425336]Fair enough... [CODE]
[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O $THREADS_O
make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE][/QUOTE]

Hi people, could you try appending --enable-verbose-compiler and --disable-silent-rule when running configure and paste the output? (I don't want to give away my phone number to register...) Currently, Mlucas only compiles and passes the self-test on 4 targets when [URL="https://buildd.debian.org/status/package.php?p=mlucas"]building[/URL] it on [URL="https://packages.debian.org/sid/math/mlucas"]Debian[/URL]. It would be great if it compiled on targets other than x86 and powerpc as well. Besides, do you think compiler warnings and verbose rules should be enabled by default? I used to think that was too verbose, but the short error message seems pretty useless.

chalsall 2016-02-05 19:08

[QUOTE=alexvong1995;425343]Hi people, could you try appending --enable-verbose-compiler and --disable-silent-rule when running configure and paste the output? (I don't want to give away my phone number to register...)[/QUOTE]

Sure. But you can always use a "burner phone" in such cases. I know guys who have a half dozen or so for just such instances... :wink:

But here are the results of what you asked for:

[CODE][linux1@i4l-2 build]$ ../configure --enable-verbose-compiler --disable-silent-rule
configure: WARNING: unrecognized options: --disable-silent-rule
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... none
checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking fenv.h usability... yes
checking fenv.h presence... yes
checking for fenv.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking mach/mach.h usability... no
checking mach/mach.h presence... no
checking for mach/mach.h... no
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
checking for inline... inline
checking for pid_t... yes
checking for size_t... yes
checking for uint64_t... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible realloc... yes
checking for clock_gettime... no
checking for gethrtime... no
checking for gettimeofday... yes
checking for memset... yes
checking for pow... yes
checking for sqrt... yes
checking for strerror... yes
checking for strstr... yes
checking for strtoul... yes
checking whether _LARGEFILE_SOURCE is declared... no
checking build system type... s390x-ibm-linux-gnu
checking host system type... s390x-ibm-linux-gnu
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
configure: WARNING: unrecognized options: --disable-silent-rule

[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O $THREADS_O
In file included from ../src/types.h:30,
from ../src/align.h:29,
from ../src/Mlucas.h:29,
from ../src/br.c:23:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/Mdata.h:30,
from ../src/dft_macro.c:24:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/util.h:30,
from ../src/factor.h:29,
from ../src/factor.c:27:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/align.h:29,
from ../src/Mlucas.h:29,
from ../src/fermat_mod_square.c:23:



...




In file included from ../src/types.h:30,
from ../src/util.h:30,
from ../src/factor.h:29,
from ../src/twopmodq80.c:23:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE’:
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: left shift count is negative
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: left shift count is negative
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q2’:
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: left shift count is negative
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: left shift count is negative
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: right shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: left shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: left shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4’:
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: left shift count is negative
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: left shift count is negative
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: right shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: left shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: left shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: right shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: left shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: left shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: right shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: left shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: left shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4_REF’:
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: left shift count is negative
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: left shift count is negative
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: right shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: left shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: left shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: right shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: left shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: left shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: right shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: left shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: left shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: right shift count is negative
In file included from ../src/types.h:30,
from ../src/util.h:30,
from ../src/factor.h:29,
from ../src/twopmodq96.c:23:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/util.h:30,
from ../src/factor.h:29,
from ../src/twopmodq.c:23:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/types.c:23:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
In file included from ../src/types.h:30,
from ../src/threadpool.h:69,
from ../src/threadpool.c:65:
../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds!
../src/threadpool.c:265: warning: ‘force_align_arg_pointer’ attribute directive ignored
../src/threadpool.c: In function ‘worker_thr_routine’:
../src/threadpool.c:312: error: ‘__NR_gettid’ undeclared (first use in this function)
../src/threadpool.c:312: error: (Each undeclared identifier is reported only once
../src/threadpool.c:312: error: for each function it appears in.)
make[1]: *** [NORMAL_O-THREADS_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE]

alexvong1995 2016-02-05 20:03

[CODE]configure: WARNING: unrecognized options: --disable-silent-rule[/CODE]Oh, I forgot an "s". :rolleyes:
[CODE]../src/platform.h:1072:4: error: #error Multithreading currently only supported for Linux/GCC builds![/CODE]Judging from this line, maybe we should try to build without multithreading. Maybe we can try ./configure --enable-verbose-compiler --disable-threads and see what happens.
By the way, --help should show all of configure's options (only the Custom Options part is useful).

chalsall 2016-02-05 20:26

[QUOTE=alexvong1995;425358]Maybe we can try ./configure --enable-verbose-compiler --disable-threads and see what happens.[/QUOTE]

This is the last help I'm going to provide on this matter.

Create your own account to debug further on the IBM 390 mainframe....

[CODE][linux1@i4l-2 build]$ ../configure --enable-verbose-compiler --disable-threads
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... none
checking for library containing ceil, log, pow, sqrt, sincos, floor, lrint, atan... -lm
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking fenv.h usability... yes
checking fenv.h presence... yes
checking for fenv.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking mach/mach.h usability... no
checking mach/mach.h presence... no
checking for mach/mach.h... no
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
checking for inline... inline
checking for pid_t... yes
checking for size_t... yes
checking for uint64_t... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible realloc... yes
checking for clock_gettime... no
checking for gethrtime... no
checking for gettimeofday... yes
checking for memset... yes
checking for pow... yes
checking for sqrt... yes
checking for strerror... yes
checking for strstr... yes
checking for strtoul... yes
checking whether _LARGEFILE_SOURCE is declared... no
checking build system type... s390x-ibm-linux-gnu
checking host system type... s390x-ibm-linux-gnu
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands

[linux1@i4l-2 build]$ make
make all-am
make[1]: Entering directory `/home/linux1/mlucas/mlucas-14.1/build'
CC $NORMAL_O
../src/get_fft_radices.c: In function ‘get_fft_radices’:
../src/get_fft_radices.c:1446: error: duplicate case value
../src/get_fft_radices.c:1443: error: previously used here
../src/Mlucas.c: In function ‘ernstMain’:
../src/Mlucas.c:1170: warning: cast from pointer to integer of different size
In file included from ../src/radix1024_ditN_cy_dif1.c:1324:
../src/radix1024_main_carry_loop.h: In function ‘radix1024_ditN_cy_dif1’:
../src/radix1024_main_carry_loop.h:137: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix1024_ditN_cy_dif1.c:1324:
../src/radix1024_main_carry_loop.h:529: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix128_ditN_cy_dif1.c:1443:
../src/radix128_main_carry_loop.h: In function ‘radix128_ditN_cy_dif1’:
../src/radix128_main_carry_loop.h:325: warning: assignment discards qualifiers from pointer target type
../src/radix128_main_carry_loop.h:868: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix144_ditN_cy_dif1.c:1044:
../src/radix144_main_carry_loop.h: In function ‘radix144_ditN_cy_dif1’:
../src/radix144_main_carry_loop.h:194: warning: assignment discards qualifiers from pointer target type
../src/radix144_main_carry_loop.h:503: warning: assignment discards qualifiers from pointer target type
../src/radix144_ditN_cy_dif1.c: In function ‘radix144_dif_pass1’:
../src/radix144_ditN_cy_dif1.c:1348: warning: assignment discards qualifiers from pointer target type
../src/radix144_ditN_cy_dif1.c: In function ‘radix144_dit_pass1’:
../src/radix144_ditN_cy_dif1.c:1609: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix208_ditN_cy_dif1.c:1049:
../src/radix208_main_carry_loop.h: In function ‘radix208_ditN_cy_dif1’:
../src/radix208_main_carry_loop.h:144: warning: assignment discards qualifiers from pointer target type
../src/radix208_main_carry_loop.h:401: warning: assignment discards qualifiers from pointer target type
../src/radix208_ditN_cy_dif1.c: In function ‘radix208_dif_pass1’:
../src/radix208_ditN_cy_dif1.c:1322: warning: assignment discards qualifiers from pointer target type
../src/radix208_ditN_cy_dif1.c: In function ‘radix208_dit_pass1’:
../src/radix208_ditN_cy_dif1.c:1600: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix256_ditN_cy_dif1.c:1675:
../src/radix256_main_carry_loop.h: In function ‘radix256_ditN_cy_dif1’:
../src/radix256_main_carry_loop.h:307: warning: assignment discards qualifiers from pointer target type
../src/radix256_main_carry_loop.h:876: warning: assignment discards qualifiers from pointer target type
../src/radix512_ditN_cy_dif1.c: In function ‘radix512_dif_pass1’:
../src/radix512_ditN_cy_dif1.c:322: warning: assignment discards qualifiers from pointer target type
../src/radix512_ditN_cy_dif1.c: In function ‘radix512_dit_pass1’:
../src/radix512_ditN_cy_dif1.c:491: warning: assignment discards qualifiers from pointer target type
In file included from ../src/radix64_ditN_cy_dif1.c:1306:
../src/radix64_main_carry_loop.h: In function ‘radix64_ditN_cy_dif1’:
../src/radix64_main_carry_loop.h:218: warning: assignment discards qualifiers from pointer target type
../src/radix64_main_carry_loop.h:735: warning: assignment discards qualifiers from pointer target type
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE’:
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: left shift count is negative
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: left shift count is negative
../src/twopmodq80.c:342: warning: right shift count >= width of type
../src/twopmodq80.c:342: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q2’:
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: left shift count is negative
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: left shift count is negative
../src/twopmodq80.c:811: warning: right shift count >= width of type
../src/twopmodq80.c:811: warning: right shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: left shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: left shift count is negative
../src/twopmodq80.c:812: warning: right shift count >= width of type
../src/twopmodq80.c:812: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4’:
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: left shift count is negative
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: left shift count is negative
../src/twopmodq80.c:2210: warning: right shift count >= width of type
../src/twopmodq80.c:2210: warning: right shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: left shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: left shift count is negative
../src/twopmodq80.c:2211: warning: right shift count >= width of type
../src/twopmodq80.c:2211: warning: right shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: left shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: left shift count is negative
../src/twopmodq80.c:2212: warning: right shift count >= width of type
../src/twopmodq80.c:2212: warning: right shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: left shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: left shift count is negative
../src/twopmodq80.c:2213: warning: right shift count >= width of type
../src/twopmodq80.c:2213: warning: right shift count is negative
../src/twopmodq80.c: In function ‘twopmodq78_3WORD_DOUBLE_q4_REF’:
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: left shift count is negative
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: left shift count is negative
../src/twopmodq80.c:2954: warning: right shift count >= width of type
../src/twopmodq80.c:2954: warning: right shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: left shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: left shift count is negative
../src/twopmodq80.c:2955: warning: right shift count >= width of type
../src/twopmodq80.c:2955: warning: right shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: left shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: left shift count is negative
../src/twopmodq80.c:2956: warning: right shift count >= width of type
../src/twopmodq80.c:2956: warning: right shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: left shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: left shift count is negative
../src/twopmodq80.c:2957: warning: right shift count >= width of type
../src/twopmodq80.c:2957: warning: right shift count is negative
make[1]: *** [NORMAL_O.stamp] Error 1
make[1]: Leaving directory `/home/linux1/mlucas/mlucas-14.1/build'
make: *** [all] Error 2[/CODE]

Lorenzo 2016-02-05 21:41

I can help with the build )) But I'm not familiar with C/C++ :blush:

chalsall 2016-02-05 22:31

[QUOTE=Lorenzo;425371]I can help with the build )) But I'm not familiar with C/C++ :blush:[/QUOTE]

We are still awaiting Ernst to speak.

Usually he is rather loud. His silence is notable.

Edit: Casts got your tongue?

ewmayer 2016-02-05 23:11

There are two distinct issues here - [1] the basic code-build and [2] the automated Linux/x86-oriented build scripts - which should be tackled in turn. I'll let Alex focus on [2] since he developed the scripts.

As to [1], the OP mentions having 'tried' the pre-build-script-wrapped 14.1 release - what precisely was tried? The manual build instructions on the README page? Note that due to the non-x86 CPUs here we are stuck with the non-SIMD basic scalar-double C build. But it's still always useful to work through build and code issues on non-primary-target platforms, to make/keep the code as portable as reasonably possible.

So to the OP (or anyone else with access to this platform - I will look into a guest account over the weekend), try the most-basic manual build first: straight C, unthreaded - i.e. no USE_ flags of any kind in the compile command.

As for the threading-related preprocessor #error, if this platform supports pthreads it should be a simple matter of tweaking the platform.h header to look in the right place for the pthreads-related header files. Looking at the #error line Alex quoted in my platform.h file, it's clear the
[i]
#include <pthread.h>
[/i]
is not finding the pthread.h file. Is there one in the /include tree on this platform?
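One quick way to answer the "is the header visible at all?" question, independent of platform.h, is to ask the preprocessor directly. The following is a minimal sketch and not part of Mlucas: the macro HAVE_PTHREAD_H and the function pthread_header_status() are made-up names, and the check relies on __has_include, which newer GCC and Clang releases provide but older ones (such as the GCC 4.8 series) do not, in which case the answer is simply "unknown".

```c
#include <assert.h>
#include <stddef.h>

/* HAVE_PTHREAD_H: 1 = header found, 0 = not found, -1 = compiler cannot tell.
   The nested #if keeps compilers without __has_include from choking on it. */
#if defined(__has_include)
#  if __has_include(<pthread.h>)
#    define HAVE_PTHREAD_H 1
#  else
#    define HAVE_PTHREAD_H 0
#  endif
#else
#  define HAVE_PTHREAD_H (-1)
#endif

/* Hypothetical helper: turn the compile-time result into a printable string. */
const char *pthread_header_status(void)
{
    const char *msg;

    if (HAVE_PTHREAD_H == 1)
        msg = "pthread.h found on the include path";
    else if (HAVE_PTHREAD_H == 0)
        msg = "pthread.h NOT found on the include path";
    else
        msg = "unknown: this compiler lacks __has_include";

    assert(msg != NULL);
    return msg;
}
```

Printing pthread_header_status() from a tiny main() built with the same flags as the failing file should show whether the include path is the culprit.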

chalsall 2016-02-05 23:35

[QUOTE=ewmayer;425381]As to [1], the OP mentions having 'tried' the pre-build-script-wrapped 14.1 release - what precisely was tried?[/QUOTE]

What part of "../configure; make" isn't clear?

ewmayer 2016-02-05 23:41

[QUOTE=chalsall;425386]What part of "../configure; make" isn't clear?[/QUOTE]

The part about the Mlucas_12.11.2014.tgz release not supporting that kind of auto-build, possibly.

chalsall 2016-02-05 23:46

[QUOTE=ewmayer;425387]The part about the Mlucas_12.11.2014.tgz release not supporting that kind of auto-build, possibly.[/QUOTE]

You missed this:

[CODE]linux1@i4l-2 build]$ pwd
/home/linux1/mlucas/mlucas-14.1/build[/CODE]

ewmayer 2016-02-06 01:18

[QUOTE=chalsall;425388]You missed this:

[CODE]linux1@i4l-2 build]$ pwd
/home/linux1/mlucas/mlucas-14.1/build[/CODE][/QUOTE]

I was referring to the 'and' part of the OP's second sentence:
[i]Tried compile MLucas (mlucas-14.1.tar.gz and Mlucas_12.11.2014.tgz)[/i]

Batalov 2016-02-06 03:12

How about this (in get_fft_radices.c, lines 1443-1446):
[CODE] case 7 :
numrad = 6; rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16; break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
case 7 :
numrad = 5; rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;

[linux1@rhel7-3 build]$ gcc -Os ../src/get_fft_radices.c
../src/get_fft_radices.c: In function ‘get_fft_radices’:
../src/get_fft_radices.c:1446:3: error: duplicate case value
case 7 :
^
../src/get_fft_radices.c:1443:3: error: previously used here
case 7 :
^
[/CODE]Their gcc compiler is a bit stricter than on x86_64 (I've built it before -- no problems were reported, only warnings).

A bit later (in TRICKY section)
[CODE]../src/util.c: In function ‘print_host_info’:
../src/util.c:1071:106: error: ‘OS_BITS’ undeclared (first use in this function)[/CODE]so add -DOS_BITS=64 to the compiler line.

After polishing these two little blemishes, runs like a clock.
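For background: two identical case labels in one switch are a constraint violation in standard C, so a conforming compiler has to diagnose them whenever both survive preprocessing; builds that pass are presumably ones where a macro removes one of the two. The sketch below only illustrates keeping the two radix sets mutually exclusive with #ifdef/#else. It is not the fix from the dev branch, the function names are hypothetical, and which set belongs under USE_ONLY_LARGE_LEAD_RADICES is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for one entry of the radix table: fills rvec[] and
   returns the number of radices, or 0 for an index it does not know. */
int demo_radix_set(int index, int rvec[6])
{
    int numrad = 0;

    assert(rvec != 0);
    switch (index) {
#ifdef USE_ONLY_LARGE_LEAD_RADICES
    case 7:   /* assumption: the 6-radix set goes with the large-lead-radix build */
        numrad = 6;
        rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16;
        break;
#else
    case 7:   /* otherwise the 5-radix set; only one 'case 7' is ever compiled */
        numrad = 5;
        rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16;
        break;
#endif
    default:
        break;
    }
    return numrad;
}

/* Product of the radices; both sets quoted above multiply out to 2^20. */
long radix_product(const int rvec[], int numrad)
{
    long prod = 1;
    int i;

    for (i = 0; i < numrad; ++i)
        prod *= rvec[i];
    return prod;
}
```

Since both quoted sets have the same product, checking radix_product() against the intended FFT length is a cheap sanity test after any local edit of the table.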

ewmayer 2016-02-06 04:16

[QUOTE=Batalov;425409]How about this (in get_fft_radices.c, lines 1443-1446):
[CODE] case 7 :
numrad = 6; rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16; break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
case 7 :
numrad = 5; rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;

[linux1@rhel7-3 build]$ gcc -Os ../src/get_fft_radices.c
../src/get_fft_radices.c: In function ‘get_fft_radices’:
../src/get_fft_radices.c:1446:3: error: duplicate case value
case 7 :
^
../src/get_fft_radices.c:1443:3: error: previously used here
case 7 :
^
[/CODE]Their gcc compiler is a bit stricter than on x86_64 (I've built it before -- no problems were reported, only warnings).[/QUOTE]
Yes, that bug in the unthreaded scalar-C builds has been fixed in my dev-branch code, users who run into it should just 'do the needful' in their local copy of the file.

[QUOTE]A bit later (in TRICKY section)
[CODE]../src/util.c: In function ‘print_host_info’:
../src/util.c:1071:106: error: ‘OS_BITS’ undeclared (first use in this function)[/CODE]so add -DOS_BITS=64 to the compiler line.

After polishing these two little blemishes, runs like a clock.[/QUOTE]
As long as it's a 64-bit OS - that would be 'yes' - your hack is fine. Note OS_BITS is set (or better, attempted-to-set) in platform.h - you can turn on a bunch of diagnostics for that file by adding -DPLATFORM_DEBUG to your compile command, say for just a single tiny source file like br.c.
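If one prefers a fallback in the source over an extra -D flag, something along these lines could work. This is a sketch under stated assumptions and not what platform.h actually does: it infers the value from the pointer width via UINTPTR_MAX, which reflects the ABI of the build rather than the kernel (a 32-bit build on a 64-bit OS would report 32), and os_bits() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Only supply a value when neither the header logic nor -DOS_BITS=... did. */
#ifndef OS_BITS
#  if UINTPTR_MAX == 0xFFFFFFFFu
#    define OS_BITS 32
#  elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#    define OS_BITS 64
#  else
#    error "cannot infer OS_BITS from the pointer width on this platform"
#  endif
#endif

/* Returns OS_BITS after checking it against the actual pointer size. */
int os_bits(void)
{
    assert((int)(sizeof(void *) * CHAR_BIT) == OS_BITS);
    return OS_BITS;
}
```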

alexvong1995 2016-02-06 19:03

I managed to set up an s390x chroot! So I don't need to rely on QEMU images anymore (they are pretty hard to find and make). I can now build a "fresh" Debian rootfs from scratch.

For curious people, this is how it is done (the following commands need root privilege):[CODE]# apt-get install binfmt-support qemu qemu-user-static[/CODE]Then, manually download [URL="https://packages.debian.org/jessie-backports/all/debootstrap/download"]debootstrap[/URL] and install it with [CODE]# dpkg -i debootstrap_1.0.73~bpo8+1_all.deb[/CODE]Note that we cannot install the latest version because it has a [URL="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813232"]bug[/URL] which prevents the two-stage bootstrap from working. (It took me a long time to figure this out; the error message isn't very obvious.) [CODE]qemu-debootstrap --arch=s390x --include=emacs-nox,less,build-essential,autogen --variant=buildd sid debian-s390x[/CODE] and we are done! :smile: You can now chroot into the new rootfs. This should work for any architecture that is supported by QEMU user mode emulation. In particular, armel (used by your phone) is supported.

Batalov 2016-02-07 02:41

I dropped my compilation yesterday, but today I tried some more...

#define COMPILER_TYPE_GCC 1

in ../src/platform.h

And Bob's your mother's closest living relative or something like that...

ewmayer 2016-02-07 04:46

1 Attachment(s)
OK, I got both un-and-multithreaded builds (all manual compile and link) to work, via the following hacks to platform.h:

o Got rid of these undefs at top - these (duh!) make it impossible for use of PLATFORM_DEBUG as intended, no clue WTF I was thinking with these undefs:

#undef PLATFORM_DEBUG
#undef OS_DEBUG
#undef OS_BITS_DEBUG
#undef CPU_DEBUG
#undef CMPLR_DEBUG

o Instead of initially undefing the following 3 things I now init them to 'unknown' (the not-defined CPU_NAME was hosing the util.c compile):

#define CPU_TYPE "Unknown CPU type"
#define CPU_NAME "Unknown CPU name"
#define CPU_SUBTYPE_NAME "Unknown CPU subtype"

o Changed

#ifndef CPU_TYPE
#error platform.h : CPU_TYPE not defined!
#endif

to a #warning. (OS_TYPE and COMPILER_TYPE are still required to be defined during preprocessing of the platform.h file, though.)

o Added a #elif for the case of Unknown hardware platform, but under Linux/GCC.

Fiddled version of the header attached. I only did a pair of quick spot-checks of the 2 binaries @FFT length 128K via './Mlucas -fftlen 128 -iters 100' in each of my 2 build dirs (one for unthreaded, one with pthreading). All the normal Linux thread-affinity stuff seems to be working, though no clue how much || scaling one can expect in these puny guest-accounts. (My virtual setup shows just 2 cores.)

alexvong1995 2016-02-07 12:29

1 Attachment(s)
Hi Ernst,

I have added 2 new CPU_TYPE values, CPU_IS_S390 and CPU_IS_S390X, to platform.h. Does it pass the spot-check and self-test on your VM? The platform.h I used is the old one in my repo, not the one you have just posted. I will rebase my change on the latest platform.h if it works.

By the way, could you post the latest version of get_fft_radices.c from your dev branch? I think I ran into the duplicate case problem reported by Batalov when building single-threaded.

Thanks people!

ewmayer 2016-02-07 21:27

1 Attachment(s)
Hi, Alex - I will diff your platform.h and try it on the Cloud when I'm home later today and have a secure connection.

Being paranoid I manage my dev-branch code privately, so here is the current version of the file you asked for - should be a drop-in replacement (md5 of the *gz = db5d2504d58897229d0366f4749b4131):

ewmayer 2016-02-07 21:34

Also, do we have any easy way of determining what underlying compute hardware is being used, or is the whole point of the 'generic' cloud setup to obfuscate that?

My gcc predefines-dump shows no obvious clues to the CPU type.

ewmayer 2016-02-08 07:45

1 Attachment(s)
[QUOTE=alexvong1995;425532]Hi Ernst,

I have added 2 new CPU_TYPE values, CPU_IS_S390 and CPU_IS_S390X, to platform.h. Does it pass the spot-check and self-test on your VM? The platform.h I used is the old one in my repo, not the one you have just posted. I will rebase my change on the latest platform.h if it works.

By the way, could you post the latest version of get_fft_radices.c from your dev branch? I think I ran into the duplicate case problem reported by Batalov when building single-threaded.

Thanks people![/QUOTE]

Alex, I integrated your additions into my latest, and also made a few tweaks to yesterday's work - the 3 predefines to "Unknown" (in place of the previous #undef) I used lead to preprocessor warnings on platforms where the variables do end up being set to a specific platform-associated value:
[code]
../platform.h:1222:10: warning: 'CPU_NAME' macro redefined
#define CPU_NAME "x86_64"
^
../platform.h:130:9: note: previous definition is here
#define CPU_NAME "Unknown CPU name"
[/code]
so I fiddled those to set to the "unknown" values only if they are still undef'd by the time we reach the end of the platform.h file.
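The "define it only if nothing else did" pattern described here can be sketched as follows. The layout is heavily simplified and is not the attached header: the detection section uses GCC's predefined __s390x__, __s390__ and __x86_64__ macros as an assumption about how the CPU would be recognized, and cpu_name() is a hypothetical accessor.

```c
#include <assert.h>

/* --- detection section (top of the header) ------------------------------ */
/* __s390x__ is tested first because 64-bit s390 builds also define __s390__. */
#if defined(__s390x__)
#  define CPU_NAME "s390x"
#elif defined(__s390__)
#  define CPU_NAME "s390"
#elif defined(__x86_64__)
#  define CPU_NAME "x86_64"
#endif

/* --- fallback section (very end of the header) -------------------------- */
/* Reached with CPU_NAME still undefined only on unrecognized hardware, so no
   "macro redefined" warning can occur on the platforms handled above. */
#ifndef CPU_NAME
#  define CPU_NAME "Unknown CPU name"
#endif

/* Hypothetical accessor so other files never see an undefined CPU_NAME. */
const char *cpu_name(void)
{
    const char *name = CPU_NAME;

    assert(name[0] != '\0');
    return name;
}
```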

But note I fubared my key-creation on the cloud, so to save time was using Serge's already-setup image with his unzip of the 14.1 packaged code. He has since blown his stuff away. If your instance is still up, could you try auto-build with the merged platform.h file attached below?

Thanks, and happy lunar new year! I guess the celebrations have already started in Asia. (Or will within the next few hours.)

Lorenzo 2016-02-08 09:18

After I replaced platform.h:

[CODE][linux1@lorenzoibm mlucas-14.1]$ ./mlucas

Mlucas 14.1

http://hogranch.com/mayer/README.html

INFO: testing qfloat routines...
CPU Family = S390x, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 4.8.5 20150623 (Red Hat 4.8.5-4).
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 53-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
INFO: testing IMUL routines...
INFO: System has 2 available processor cores.
INFO: testing FFT radix tables...
looking for number of threads to use in nthreads.ini file...
Using NTHREADS = #CPUs = 2.
looking for worktodo.ini file...
worktodo.ini file found...checking next exponent in range...
ERROR: at line 245 of file ./src/get_preferred_fft_radix.c
Assertion failed: CONFIGFILE = mlucas.cfg: open failed![/CODE]

Lorenzo 2016-02-08 10:53

Ohhh. Sorry. It's working!!! Amazing!!!
P.S. I forgot about the performance tuning (mlucas -s m).

ET_ 2016-02-08 11:07

[QUOTE=Lorenzo;425605]Ohhh. Sorry. It's working!!! Amazing!!!
P.S. I forgot about the performance tuning (mlucas -s m).[/QUOTE]

Can you tell us anything about its performance? I guess it's running with 2 threads...

Luigi

Lorenzo 2016-02-08 12:30

[QUOTE=ET_;425606]Can you tell us anything about its performance? I guess it's running with 2 threads...

Luigi[/QUOTE]
Not fast :brian-e:

[CODE] 448 msec/iter = 12.80 ROE[avg,max] = [0.224609375, 0.250000000] radices = 56 16 16
480 msec/iter = 14.04 ROE[avg,max] = [0.210880824, 0.250000000] radices = 60 16 16
512 msec/iter = 14.23 ROE[avg,max] = [0.281250000, 0.281250000] radices = 128 8 16
576 msec/iter = 16.14 ROE[avg,max] = [0.208354841, 0.250000000] radices = 144 8 16
640 msec/iter = 19.46 ROE[avg,max] = [0.257421875, 0.312500000] radices = 160 8 16
704 msec/iter = 21.52 ROE[avg,max] = [0.274654715, 0.343750000] radices = 176 8 16
768 msec/iter = 21.99 ROE[avg,max] = [0.209895543, 0.250000000] radices = 48 16 16
832 msec/iter = 24.75 ROE[avg,max] = [0.239439174, 0.312500000] radices = 208 8 16
896 msec/iter = 25.59 ROE[avg,max] = [0.227832031, 0.312500000] radices = 56 16 16
960 msec/iter = 28.33 ROE[avg,max] = [0.212360491, 0.250000000] radices = 60 16 16
1024 msec/iter = 28.24 ROE[avg,max] = [0.312500000, 0.312500000] radices = 128 16 16
1152 msec/iter = 32.89 ROE[avg,max] = [0.208562687, 0.253906250] radices = 144 16 16
1280 msec/iter = 40.32 ROE[avg,max] = [0.235714286, 0.312500000] radices = 20 8 16
1408 msec/iter = 42.28 ROE[avg,max] = [0.273688616, 0.343750000] radices = 176 16 16
1536 msec/iter = 44.80 ROE[avg,max] = [0.223493304, 0.281250000] radices = 192 16 16
1664 msec/iter = 48.48 ROE[avg,max] = [0.246149554, 0.312500000] radices = 208 16 16
1792 msec/iter = 51.94 ROE[avg,max] = [0.220703125, 0.281250000] radices = 224 16 16
1920 msec/iter = 61.24 ROE[avg,max] = [0.212430246, 0.257812500] radices = 60 16 32
2048 msec/iter = 56.56 ROE[avg,max] = [0.312500000, 0.312500000] radices = 128 16 16
2304 msec/iter = 65.73 ROE[avg,max] = [0.208895438, 0.250000000] radices = 144 16 16
2560 msec/iter = 79.33 ROE[avg,max] = [0.245312500, 0.281250000] radices = 20 16 16
2816 msec/iter = 85.93 ROE[avg,max] = [0.272896903, 0.343750000] radices = 176 16 16
3072 msec/iter = 91.91 ROE[avg,max] = [0.225892857, 0.281250000] radices = 192 16 16
3328 msec/iter = 97.41 ROE[avg,max] = [0.241322545, 0.281250000] radices = 208 16 16
3584 msec/iter = 105.64 ROE[avg,max] = [0.220870536, 0.250000000] radices = 224 16 16
3840 msec/iter = 132.28 ROE[avg,max] = [0.213867188, 0.242187500] radices = 60 32 32
4096 msec/iter = 116.38 ROE[avg,max] = [0.224023438, 0.250000000] radices = 16 16 16
4608 msec/iter = 141.80 ROE[avg,max] = [0.201425498, 0.250000000] radices = 144 16 32
5120 msec/iter = 162.11 ROE[avg,max] = [0.236607143, 0.281250000] radices = 20 16 16
5632 msec/iter = 186.77 ROE[avg,max] = [0.277120536, 0.312500000] radices = 44 16 16
6144 msec/iter = 192.85 ROE[avg,max] = [0.214425223, 0.250000000] radices = 48 16 16
6656 msec/iter = 223.12 ROE[avg,max] = [0.242299107, 0.281250000] radices = 208 16 32
7168 msec/iter = 230.10 ROE[avg,max] = [0.223437500, 0.281250000] radices = 56 16 16
7680 msec/iter = 253.42 ROE[avg,max] = [0.219891357, 0.250000000] radices = 60 16 16
8192 msec/iter = 252.43 ROE[avg,max] = [0.282589286, 0.312500000] radices = 1024 16 16
9216 msec/iter = 306.68 ROE[avg,max] = [0.208818163, 0.265625000] radices = 144 32 32
10240 msec/iter = 371.75 ROE[avg,max] = [0.248660714, 0.312500000] radices = 160 32 32
11264 msec/iter = 409.54 ROE[avg,max] = [0.275306920, 0.328125000] radices = 176 32 32
12288 msec/iter = 423.42 ROE[avg,max] = [0.209234401, 0.234375000] radices = 48 16 16
13312 msec/iter = 493.18 ROE[avg,max] = [0.236830357, 0.281250000] radices = 208 32 32
14336 msec/iter = 476.82 ROE[avg,max] = [0.218526786, 0.250000000] radices = 56 16 16
15360 msec/iter = 535.51 ROE[avg,max] = [0.217006138, 0.250000000] radices = 60 16 16
16384 msec/iter = 530.52 ROE[avg,max] = [0.276339286, 0.281250000] radices = 1024 16 16
18432 msec/iter = 606.73 ROE[avg,max] = [0.212458147, 0.250000000] radices = 144 16 16
20480 msec/iter = 745.91 ROE[avg,max] = [0.251116071, 0.281250000] radices = 160 16 16
22528 msec/iter = 822.14 ROE[avg,max] = [0.283984375, 0.328125000] radices = 176 16 16
24576 msec/iter = 833.16 ROE[avg,max] = [0.225502232, 0.250000000] radices = 192 16 16
26624 msec/iter = 975.42 ROE[avg,max] = [0.251785714, 0.281250000] radices = 208 16 16
28672 msec/iter = 971.73 ROE[avg,max] = [0.219098772, 0.250000000] radices = 224 16 16
30720 msec/iter = 1162.44 ROE[avg,max] = [0.242522321, 0.281250000] radices = 960 16 32
32768 msec/iter = 1075.11 ROE[avg,max] = [0.281250000, 0.281250000] radices = 1024 16 32
[/CODE]

Lorenzo 2016-02-08 12:35

CPU Load
[CODE]top - 07:33:43 up 4 days, 3:39, 2 users, load average: 0,80, 0,58, 1,27
Tasks: 97 total, 1 running, 96 sleeping, 0 stopped, 0 zombie
%Cpu(s): 98,8 us, 0,2 sy, 0,0 ni, 0,8 id, 0,0 wa, 0,0 hi, 0,0 si, 0,2 st
KiB Mem : 2042848 total, 1076884 free, 111976 used, 853988 buff/cache
KiB Swap: 501740 total, 501740 free, 0 used. 1853956 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5580 linux1 20 0 89624 36140 1120 S 197,3 1,8 0:59.77 mlucas
[/CODE]

Lorenzo 2016-02-08 12:39

But it's great that Mlucas is working on a mainframe!!! :smile:

:primenet:

ewmayer 2016-02-08 22:22

[QUOTE=Lorenzo;425611]Not fast :brian-e:[/QUOTE]

Thanks, Lorenzo - it seems you truncated the rightmost column of radices in posting your excerpt from the mlucas.cfg file (e.g. in the first line, 448 means 448 Kdoubles => complex FFT of length 224K = 56*16^3, i.e. there is a trailing 16 missing) - but those are easily inferred.
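The inference can be checked mechanically: the radices on a cfg line should multiply out to half the real length, since N doubles are packed into a complex FFT of length N/2. A small sketch of that check follows; radices_match_fft_length() is a hypothetical helper, not an Mlucas function.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the radices multiply out to the complex FFT length implied by
   a cfg-file entry given in Kdoubles (real length = kdoubles * 1024 doubles,
   complex length = half of that), 0 otherwise. */
int radices_match_fft_length(unsigned long kdoubles,
                             const unsigned radices[], size_t count)
{
    unsigned long complex_len = kdoubles * 1024UL / 2UL;
    unsigned long product = 1;
    size_t i;

    assert(radices != NULL && count > 0);
    for (i = 0; i < count; ++i)
        product *= radices[i];
    return product == complex_len;
}
```

For the 448K line, the posted radices 56 16 16 fail this check, while 56 16 16 16 pass it.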

Just as a point of 'slow' reference, the 32768K timing is roughly what I get on my aged Core2Duo running 2-threaded (1 thread per core) using the SSE2 version of the x86_64 build. My Haswell quad (4-threaded AVX2 build) is 10x faster.

Aside from the overall slowness, the various non-powers-of-2 perform decently well with the notable exception of FFT lengths of form 15*2^n, which are uniformly dismal - the compiler really doesn't like my scalar-double radix-15 DFT macros, it seems. I guess the only positive thing I can say (as with politics and economics it's all about the optimistic PR spin, you know) is that the scaling to larger runlengths is quite good - compare the 32768K and 1024K timings, for instance, with what one expects based on the asymptotic O(n log n) FFT opcount scaling.
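To put a number on that comparison: 1024K doubles is 2^20 and 32768K is 2^25, so pure n*log2(n) scaling predicts a time ratio of 32 * 25/20 = 40, while the posted timings give 1075.11 / 28.24, or about 38. The sketch below just reproduces the predicted figure; the function names are made up for illustration, and it counts n in doubles (using the complex length n/2 instead changes the prediction only slightly).

```c
#include <assert.h>

/* log2 of an exact power of two, done with shifts so no libm is needed. */
static unsigned ilog2_pow2(unsigned long n)
{
    unsigned k = 0;

    assert(n != 0 && (n & (n - 1)) == 0);
    while (n >>= 1)
        ++k;
    return k;
}

/* Expected time ratio t(n_big)/t(n_small) if cost grows exactly as n*log2(n). */
double nlogn_time_ratio(unsigned long n_small, unsigned long n_big)
{
    assert(n_small > 1);
    return ((double)n_big * ilog2_pow2(n_big)) /
           ((double)n_small * ilog2_pow2(n_small));
}
```

Calling nlogn_time_ratio(1UL << 20, 1UL << 25) gives the predicted ratio to set against the measured one.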

-------------------

Also, to repeat my earlier question: Do we have any way of seeing what kind of hardware is running underneath things? IBM's version of PowerPC? It would be silly if it were actually x86_64 and the cloud setup were masking that from users.

Mark Rose 2016-02-08 22:37

cat /proc/cpuinfo should reveal some details.

Lorenzo 2016-02-09 07:57

[QUOTE=Mark Rose;425680]cat /proc/cpuinfo should reveal some details.[/QUOTE]
Not much info ...
[CODE][linux1@lorenzoibm ~]$ cat /proc/cpuinfo
vendor_id : IBM/S390
# processors : 2
bogomips per cpu: 20325.00
features : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
cache0 : level=1 type=Data scope=Private size=128K line_size=256 associativity=8
cache1 : level=1 type=Instruction scope=Private size=96K line_size=256 associativity=6
cache2 : level=2 type=Data scope=Private size=2048K line_size=256 associativity=8
cache3 : level=2 type=Instruction scope=Private size=2048K line_size=256 associativity=8
cache4 : level=3 type=Unified scope=Shared size=65536K line_size=256 associativity=16
cache5 : level=4 type=Unified scope=Shared size=491520K line_size=256 associativity=30
processor 0: version = FF, identification = 016A77, machine = 2964
processor 1: version = FF, identification = 016A77, machine = 2964[/CODE]

[QUOTE]LinuxOne is a specialised Z13 IBM mainframe for Linux. You can run up to 8000 VM simultaneously on it. It is a powerfull beast like IBM does, the top stuff.[/QUOTE]
So it's an [URL="https://en.wikipedia.org/wiki/IBM_z13_(microprocessor)"]IBM z13 CPU[/URL]. You can find many more details in the [URL="http://www.redbooks.ibm.com/redbooks/pdfs/sg248251.pdf"]Technical Guide[/URL]. I'm no expert, but I think it's not the Power architecture. It's something special ...
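The "machine = 2964" field is what identifies the hardware here. A minimal sketch of pulling it out of a cpuinfo line programmatically, assuming the field is written as plain digits exactly as in the output above; both function names are hypothetical and only the one machine type confirmed in this thread is mapped to a name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extracts the number after "machine = " from one /proc/cpuinfo line in the
   format shown above; returns -1 if the line has no such field. */
int parse_machine_type(const char *line)
{
    static const char key[] = "machine = ";
    const char *p;
    int type;

    assert(line != NULL);
    p = strstr(line, key);
    if (p == NULL)
        return -1;
    if (sscanf(p + strlen(key), "%d", &type) != 1)
        return -1;
    return type;
}

/* Only the pairing confirmed in this thread is listed; anything else is unknown. */
const char *machine_type_name(int type)
{
    return (type == 2964) ? "IBM z13" : "unknown";
}
```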

alexvong1995 2016-02-09 10:02

[QUOTE=Mark Rose;425680]cat /proc/cpuinfo should reveal some details.[/QUOTE]
Yes, this usually works very well, but the proc filesystem is Linux-specific, so it may not work on other kernels.
There is also the lscpu command, which I believe simply reads /proc/cpuinfo and displays it in a nicer way (using your locale settings).
For FreeBSD, I found this [URL="https://stackoverflow.com/questions/4083848/what-is-the-equivalent-of-proc-cpuinfo-on-freebsd-v8-1?rq=1"]post[/URL].

VictordeHolland 2016-02-09 13:00

[QUOTE=Lorenzo;425707]Not much info ...
[CODE][linux1@lorenzoibm ~]$ cat /proc/cpuinfo
vendor_id : IBM/S390
# processors : 2
bogomips per cpu: 20325.00
features : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
cache0 : level=1 type=Data scope=Private size=128K line_size=256 associativity=8
cache1 : level=1 type=Instruction scope=Private size=96K line_size=256 associativity=6
cache2 : level=2 type=Data scope=Private size=2048K line_size=256 associativity=8
cache3 : level=2 type=Instruction scope=Private size=2048K line_size=256 associativity=8
cache4 : level=3 type=Unified scope=Shared size=65536K line_size=256 associativity=16
cache5 : level=4 type=Unified scope=Shared size=491520K line_size=256 associativity=30
processor 0: version = FF, identification = 016A77, machine = 2964
processor 1: version = FF, identification = 016A77, machine = 2964[/CODE]So it's an [URL="https://en.wikipedia.org/wiki/IBM_z13_(microprocessor)"]IBM z13 CPU[/URL]. You can find many more details in the [URL="http://www.redbooks.ibm.com/redbooks/pdfs/sg248251.pdf"]Technical Guide[/URL]. I'm no expert, but I think it's not the Power architecture. It's something special ...[/QUOTE]
Wauw! That is a lot of cache!

L1 (per core)
-96 KB instruction
-128 KB Data
L2 (per core)
-2 MB instruction
-2 MB Data
L3 (shared)
64 MB eDRAM
L4 (off die, on storage controller chip)
480 MB
[quote]
The processor chip has an eight-core design, with either six, seven, or eight active cores, and operates at 5.0 GHz. Depending on the CPC drawer version (39 PU or 42 PU), 39 - 168 PUs are available on 1 - 4 CPC drawers.[/quote]IBM names it a PU; we would call it a CPU core.

ewmayer 2016-02-09 20:17

[QUOTE=VictordeHolland;425720]Wauw! That is a lot of cache!

L1 (per core)
-96 KB instruction
-128 KB Data
L2 (per core)
-2 MB instruction
-2 MB Data
L3 (shared)
64 MB eDRAM
L4 (off die, on storage controller chip)
480 MB
IBM names it a PU; we would call it a CPU core.[/QUOTE]

That explains the excellent timing-scaling in going to larger FFT lengths which we see in Lorenzo's cfg-file results.

If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today.

ewmayer 2016-02-10 03:51

[QUOTE=ewmayer;425751]If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today.[/QUOTE]

Had a look - see nothing actually resembling an instruction set reference in there. Could someone point me to one? With just 139 SIMD instructions it wouldn't have taken up more than a decent-sized chapter or appendix in such a document.

I did note this, however (Chapter 3. Central processor complex system design, p91), which mentions no floating-point among the SIMD - that would be a curious omission if indeed such are supported:
[i]
Here are some examples of SIMD instructions:
o Integer byte to quadword add, sub, and compare
o Integer byte to doubleword min, max, and average
o Integer byte to word multiply
o String find 8-bits, 16-bits, and 32-bits
o String range compare
o String find any equal
o String load to block boundaries and load/store with length[/i]

alexvong1995 2016-02-10 05:24

[QUOTE=ewmayer;425791]Had a look - see nothing actually resembling an instruction set reference in there. Could someone point me to one? With just 139 SIMD instructions it wouldn't have taken up more than a decent-sized chapter or appendix in such a document.

I did note this, however (Chapter 3. Central processor complex system design, p91), which mentions no floating-point among the SIMD - that would be a curious omission if indeed such are supported:
[I]
Here are some examples of SIMD instructions:
o Integer byte to quadword add, sub, and compare
o Integer byte to doubleword min, max, and average
o Integer byte to word multiply
o String find 8-bits, 16-bits, and 32-bits
o String range compare
o String find any equal
o String load to block boundaries and load/store with length[/I][/QUOTE]
I just found this documentation, the [URL="https://www-304.ibm.com/support/docview.wss?uid=isg29c69415c1e82603c852576700058075a&aid=1"]z/Architecture reference summary[/URL], on the internet. Pages 22 to 25 show the 139 vector instructions (of course I did not really try to count!), for example VMAH (vector multiply and add high)...

Also, I have created an s390x testing branch (there are only 2 commits). Anyone interested is encouraged to test whether it builds and passes the tests. The instructions are as follows:
$ git clone [URL]https://gitlab.com/mlucas-ll/mlucas.git[/URL]
$ cd mlucas && touch * && git checkout s390x
$ mkdir build && cd build && ../configure && make -j && make -j check
(of course you must have git, gcc and make installed!)

Dubslow 2016-02-10 06:09

[QUOTE=alexvong1995;425805]I just found this documentation, the [URL="https://www-304.ibm.com/support/docview.wss?uid=isg29c69415c1e82603c852576700058075a&aid=1"]z/Architecture reference summary[/URL], on the internet. Pages 22 to 25 show the 139 vector instructions (of course I did not really try to count!), for example VMAH (vector multiply and add high)...

Also, I have created an s390x testing branch (there are only 2 commits). Anyone interested is encouraged to test whether it builds and passes the tests. The instructions are as follows:
$ git clone [URL]https://gitlab.com/mlucas-ll/mlucas.git[/URL]
$ cd mlucas && touch * && git checkout s390x
$ mkdir build && cd build && ../configure && make -j && make -j check
(of course you must have git, gcc and make installed!)[/QUOTE]

If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub?

alexvong1995 2016-02-10 06:17

Alright, I found out it actually does not build on amd64, see [URL]https://gitlab.com/mlucas-ll/mlucas/builds[/URL]. There are several error messages:

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:603:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, k);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: incompatible type for argument 2 of 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1179:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE_q2'
res64 = twopmodq100_2WORD_DOUBLE_q2(p64, k,k);

ewmayer 2016-02-10 07:46

Thanks for the instruction reference, Alex - that looks like what I need, and there does appear to be a full complement of vector-float functionality.

[QUOTE=alexvong1995;425811]Alright, I found out it actually does not build on amd64; see [URL]https://gitlab.com/mlucas-ll/mlucas/builds[/URL]. There are several error messages:

[CODE]In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:603:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, k);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: incompatible type for argument 2 of 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1156:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE'
res64 = twopmodq100_2WORD_DOUBLE(p64, q128);

In file included from .././src/factor.c:3973:0:
.././src/factor_test.h:1179:11: error: too few arguments to function 'twopmodq100_2WORD_DOUBLE_q2'
res64 = twopmodq100_2WORD_DOUBLE_q2(p64, k,k);
[/CODE][/QUOTE]

OK, two things here:

1. I clearly need to update the API for this subset of TF calls, but you should not need factor.o to link currently, because TF functionality is not supported in the default Mlucas build;

2. These calls are getting included because USE_FMADD is getting def'd, which implies you are trying an AVX2/FMA3 build (-DUSE_AVX2 compile flag). Unless AMD has radically upgraded their AVX-and-beyond capabilities in the past year -- according to Wikipedia their very first CPU with AVX2 was last year's 'Carrizo' (based on the 'Excavator' microarchitecture) -- AMD builds probably should not go above SSE2. But if you can get AVX and/or AVX2 builds tested on something close to the latest AMD processor, let's see if they still suffer from the 'AVX slower than SSE2' handicap George noted based on his Prime95 tests, and decide what to do config-wise based on that.
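
For anyone who wants to check which SIMD level a given build is actually targeting before picking -DUSE_SSE2-style flags, here is a minimal C sketch. It only relies on the macros GCC and Clang predefine (__SSE2__, __AVX__, __AVX2__, __FMA__) and, on x86, on the __builtin_cpu_supports builtin. The function names compiled_simd_level and cpu_has_avx2 are made up for illustration and are not part of Mlucas; how these compiler macros relate to the Mlucas USE_* flags is not something this sketch tries to settle.

```c
#include <stdio.h>

/* Hypothetical helper (not from Mlucas): report the widest x86 SIMD level
   the compiler was told to target. GCC and Clang predefine these macros
   when the matching -m flags (or a suitable -march=...) are in effect. */
const char *compiled_simd_level(void)
{
#if defined(__AVX2__) && defined(__FMA__)
    return "AVX2+FMA3";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "scalar";   /* e.g. s390x, or x86 built without SIMD flags */
#endif
}

/* Hypothetical helper: ask the CPU we are running on whether it has AVX2.
   Returns 0 on non-x86 targets, where the builtin is not available. */
int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Print both answers so a mismatch (AVX2 build on a non-AVX2 CPU) is obvious. */
void report_simd(void)
{
    printf("compiled for: %s, CPU has AVX2: %s\n",
           compiled_simd_level(), cpu_has_avx2() ? "yes" : "no");
}
```

Calling report_simd() from a small main() on the build machine should be enough to tell whether an AVX2 build is even worth attempting there.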

alexvong1995 2016-02-10 07:51

[QUOTE=Dubslow;425810]If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub?[/QUOTE]

It is a hard question; there isn't a clear winner. :smile:
I think having accounts on both makes it easier to submit pull requests to projects hosted on either site.
The Wikipedia [URL="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities"]page[/URL] is a good place to start.
For GitHub, it says "Gratis for public, paid for private."
For GitLab, it says "Unlimited public and private repos, unlimited public and private collaborators".
For the web interface, GitHub seems more responsive than GitLab.
The UIs of both sites look comparable to me. I am using git most of the time anyway, so I think either is fine as long as the diff is shown nicely.
For CI, GitHub projects typically use Travis CI (a third-party service), while GitLab has its own GitLab CI.
For JS, GitLab releases all of its JS under MIT/Expat, so you don't need to worry about what the JS code is doing.
GitLab also releases its core as free software under MIT/Expat, known as GitLab CE; GitHub doesn't.
For popularity, GitHub is clearly more popular, but both are used by big organizations.

There is also one site getting increasingly popular, [URL]https://notabug.org/[/URL]; it is a community effort.

Dubslow 2016-02-10 09:40

[QUOTE=alexvong1995;425820]It is a hard question; there isn't a clear winner. :smile:
I think having accounts on both makes it easier to submit pull requests to projects hosted on either site.
The Wikipedia [URL="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities"]page[/URL] is a good place to start.
For GitHub, it says "Gratis for public, paid for private."
For GitLab, it says "Unlimited public and private repos, unlimited public and private collaborators".
For the web interface, GitHub seems more responsive than GitLab.
The UIs of both sites look comparable to me. I am using git most of the time anyway, so I think either is fine as long as the diff is shown nicely.
For CI, GitHub projects typically use Travis CI (a third-party service), while GitLab has its own GitLab CI.
For JS, GitLab releases all of its JS under MIT/Expat, so you don't need to worry about what the JS code is doing.
GitLab also releases its core as free software under MIT/Expat, known as GitLab CE; GitHub doesn't.
For popularity, GitHub is clearly more popular, but both are used by big organizations.

There is also one site getting increasingly popular, [URL]https://notabug.org/[/URL]; it is a community effort.[/QUOTE]
Thanks, that's the sort of excellent summary I was looking for. I've discussed before the idea of moving a lot of the software here away from SourceForge/Subversion, and GitHub is of course the obvious alternative, though it's also good to track alternatives like GitLab and notabug.org.

Dubslow 2016-02-13 07:02

[QUOTE=Dubslow;425825]Thanks, that's the sort of excellent summary I was looking for. I've discussed before the idea of moving a lot of the software here away from SourceForge/Subversion, and GitHub is of course the obvious alternative, though it's also good to track alternatives like GitLab and notabug.org.[/QUOTE]

[QUOTE=Dubslow;425810]If I may go somewhat off topic here, how would you compare GitLab hosting and web interface to GitHub?[/QUOTE]

One last post on the matter (maybe a mod might split this off for me):

[url]https://www.b.agilob.net/choose-gitlab-for-your-next-project/[/url]

^ An essay explaining why GitLab is better than GitHub. (Incidentally, I know the guy who runs that site, in the same online manner as I know everyone here, and I've worked with him on some projects.)

Lorenzo 2016-03-11 22:02

[QUOTE=ewmayer;425751]That explains the excellent timing-scaling in going to larger FFT lengths which we see in Lorenzo's cfg-file results.

If we had some relatively efficient way to map x86_64 SIMD code to this arch's SIMD, things could get rather interesting. I shall have a look at the PDF Lorenzo linked later today.[/QUOTE]

Hello! How about SIMD? Did you try changing the code to support SIMD instructions for the s390x arch? :unsure:

ewmayer 2016-03-11 22:18

[QUOTE=Lorenzo;428798]Hello! How about SIMD? Did you try changing the code to support SIMD instructions for the s390x arch? :unsure:[/QUOTE]

Excuse me if I didn't make it clear -- that's more in the way of a long-term 'maybe' project, not something one can make happen in a few weeks. And from a number-of-potential-users standpoint the coding effort is likely not justified. I am busy with another round of Intel SIMD optimizations and preparing for their next-gen AVX512 chips ... both of which have/will-have a very large user base.

As my code is open-source, anyone with PPC-and-beyond assembler expertise and time to spare is welcome to have at it! Take a modest-size x86_64 SSE2 inline-asm macro - there are many to choose from - stick it into a suitable C test harness, and use the results to guide translation to IBM assembler.
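
To make the test-harness idea a bit more concrete, here is a rough C sketch of one way it could look. Everything in it is an assumption for illustration: the kernel is a simple radix-2 butterfly on two complex doubles rather than an actual Mlucas macro, and the names butterfly_fn, butterfly_ref, butterfly_sign_bug and count_mismatches are hypothetical. The idea is just to run a trusted scalar reference and a candidate (which would be the translated asm wrapped in a C function) on the same pseudo-random inputs and count disagreements.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical kernel signature: in-place radix-2 butterfly on two complex
   doubles stored as {re, im}: (a, b) -> (a + b, a - b). */
typedef void (*butterfly_fn)(double a[2], double b[2]);

/* Portable scalar reference: the "known good" answer. */
void butterfly_ref(double a[2], double b[2])
{
    double ar = a[0], ai = a[1];
    a[0] = ar + b[0];  a[1] = ai + b[1];
    b[0] = ar - b[0];  b[1] = ai - b[1];
}

/* Stand-in for a mistranslated kernel: the difference has the wrong sign. */
void butterfly_sign_bug(double a[2], double b[2])
{
    double ar = a[0], ai = a[1];
    a[0] = ar + b[0];  a[1] = ai + b[1];
    b[0] = b[0] - ar;  b[1] = b[1] - ai;
}

static double abs_diff(double x, double y)
{
    return x > y ? x - y : y - x;
}

/* Run candidate and reference on identical pseudo-random inputs and return
   the number of trials in which any output component differs by more than tol. */
size_t count_mismatches(butterfly_fn candidate, size_t trials, double tol)
{
    size_t bad = 0;
    srand(12345);  /* fixed seed so a failure can be reproduced */
    for (size_t t = 0; t < trials; ++t) {
        double a0[2], b0[2], a1[2], b1[2];
        for (int i = 0; i < 2; ++i) {
            a0[i] = a1[i] = (double)rand() / RAND_MAX - 0.5;
            b0[i] = b1[i] = (double)rand() / RAND_MAX - 0.5;
        }
        butterfly_ref(a0, b0);
        candidate(a1, b1);
        for (int i = 0; i < 2; ++i) {
            if (abs_diff(a0[i], a1[i]) > tol || abs_diff(b0[i], b1[i]) > tol) {
                ++bad;
                break;
            }
        }
    }
    return bad;
}
```

A translated kernel would be passed in place of butterfly_sign_bug; a nonzero return from count_mismatches then points at a translation error, and the fixed seed means the failing inputs can be replayed while debugging.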

Lorenzo 2016-03-13 08:45

[QUOTE=ewmayer;428805]Excuse me if I didn't make it clear -- that's more in the way of a long-term 'maybe' project, not something one can make happen in a few weeks. And from a number-of-potential-users standpoint the coding effort is likely not justified. I am busy with another round of Intel SIMD optimizations and preparing for their next-gen AVX512 chips ... both of which have/will-have a very large user base.

As my code is open-source, anyone with PPC-and-beyond assembler expertise and time to spare is welcome to have at it! Take a modest-size x86_64 SSE2 inline-asm macro - there are many to choose from - stick it into a suitable C test harness, and use the results to guide translation to IBM assembler.[/QUOTE]

OK, sure. Thank you for the explanation :smile:

Just completed a double-check on the IBM S390 :tu: [url]http://www.mersenne.org/report_exponent/?exp_lo=41523593&full=1[/url]

