Mlucas currently supports SIMD only up to AVX2. The main point of getting the KNL dev system was to give those of us keen to add AVX-512 support to our codes a place to do that.

If you just auto-build the summer 2015 release using the simple instructions, you should end up with a working AVX2 binary. At that point, see my previous note above for the cmd-line flag used to control/limit the threadcount. I suggest trying -nthread values 1,2,4,8,16,32,64, all at just the 4096K FFT length for now.
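
For anyone who would rather script the thread sweep than type each run by hand, here is a minimal sketch. The -nthread flag is the one mentioned above; the ./Mlucas binary path and the -fftlen / -iters flags are my assumptions about how the self-test is invoked, so check them against the README for your build before running anything. By default it only prints the command lines.

```python
import shlex
import subprocess

# Thread counts suggested in the post above.
THREAD_COUNTS = [1, 2, 4, 8, 16, 32, 64]


def build_command(nthread, binary="./Mlucas", fftlen_k=4096, iters=100):
    """Build one self-test command line.

    Assumptions: the binary path, and the -fftlen / -iters flag names.
    Only -nthread is taken from the thread itself.
    """
    return [binary, "-fftlen", str(fftlen_k), "-iters", str(iters),
            "-nthread", str(nthread)]


def sweep(run=False):
    """Print (and optionally execute) one command per thread count."""
    commands = [build_command(n) for n in THREAD_COUNTS]
    for cmd in commands:
        print(shlex.join(cmd))
        if run:
            # Stops at the first failing run instead of silently continuing.
            subprocess.run(cmd, check=True)
    return commands


sweep()  # dry run: prints the seven command lines without executing them
```

Call sweep(run=True) once the printed commands look right for your build.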

I'm still trying to work out an ssh-access issue ... David says I should not need a password on initial login, but I keep getting prompted for one. I asked him to simply create a temp-password for me to log in and reset, but unlike us crazies he appears to keep sane 'no internet after ***pm' hours. :)

I'm doing that right now. I'm not primarily a programmer; I'm the lead systems admin for a couple of HPC clusters. I purchased a KNL development box as soon as they were available to see if it would be a good upgrade to the cluster, but it neither scaled nor performed anywhere near as well as Intel claimed. The KNL system just sits idle now and I mess around with it occasionally.

As for key issues, use ssh -i ~/.ssh/id_rsa or whatever key you created. Your public key needs to be in your login user's home directory on the remote machine, in the ~/.ssh/authorized_keys file. If you want to run 'ssh -vvvvv' and PM me the output, I can take a look at it for you.
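
If it helps to rule out a plain mismatch, here is a rough sketch of a check that a given public key really appears in an authorized_keys file. The helper names are made up for illustration, and it ignores the trailing comment field, since that does not have to match. It does not handle option strings containing quoted spaces, and it says nothing about file permissions or server-side settings, which can also cause a password prompt.

```python
def key_blob(public_key_line):
    """Return (key type, base64 blob) from a line like 'ssh-rsa AAAA... comment'.

    Hypothetical helper: scans for the first field that looks like a key
    type, so a simple options prefix in authorized_keys is skipped.
    """
    fields = public_key_line.split()
    for i, field in enumerate(fields[:-1]):
        if field.startswith(("ssh-", "ecdsa-", "sk-")):
            return field, fields[i + 1]
    raise ValueError("not a public key line")


def is_authorized(public_key_line, authorized_keys_text):
    """True if the key's type and blob appear in the authorized_keys text."""
    wanted = key_blob(public_key_line)
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no key
        try:
            if key_blob(line) == wanted:
                return True
        except ValueError:
            continue  # malformed line, ignore it
    return False
```

To use it, read the contents of your local .pub file (for example ~/.ssh/id_rsa.pub) and of the remote ~/.ssh/authorized_keys, then pass both strings to is_authorized.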

100 iterations of M77597293 with FFT length 4194304 = 4096 K
Res64: 8CC30E314BF3E556. AvgMaxErr = 0.293024554. MaxErr = 0.328125000. Program: E14.1
Res mod 2^36 = 5569242454
Res mod 2^35 - 1 = 22305398329
Res mod 2^36 - 1 = 64001568053
Clocks = 00:00:02.610

/ **************************************************************************** /

Done ...

Edit: Missed a 0

1000 iterations of M77597293 with FFT length 4194304 = 4096 K
Res64: 5F87421FA9DD8F1F. AvgMaxErr = 0.292703043. MaxErr = 0.343750000. Program: E14.1
Res mod 2^36 = 67274379039
Res mod 2^35 - 1 = 26302807323
Res mod 2^36 - 1 = 54919604018
Clocks = 00:00:23.097
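
To compare the two runs it is easier to look at time per iteration than at the raw Clocks totals. A small sketch of that arithmetic, using only the iteration counts and Clocks values pasted above (the function names are mine, and the thread count for these runs is not stated here):

```python
def clocks_to_seconds(clocks):
    """Convert an 'HH:MM:SS.mmm' Clocks string to seconds."""
    hours, minutes, seconds = clocks.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def ms_per_iteration(clocks, iterations):
    """Average wall-clock milliseconds per iteration for one self-test run."""
    return 1000.0 * clocks_to_seconds(clocks) / iterations


# (iterations, Clocks) pairs copied from the two results above.
runs = [(100, "00:00:02.610"), (1000, "00:00:23.097")]

for iterations, clocks in runs:
    print(f"{iterations} iterations: "
          f"{ms_per_iteration(clocks, iterations):.2f} ms/iter")
```

That works out to roughly 26.1 ms/iter for the 100-iteration run and 23.1 ms/iter for the 1000-iteration run.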

