mersenneforum.org

Jean Penné 2019-03-28 20:56

LLR Version 3.8.22 released
 
Hi All,

Today I uploaded version 3.8.22 of the LLR program.
You can find it now on my personal site:

[url]http://jpenne.free.fr/[/url]

The 32-bit Windows and Linux compressed binaries are available as usual.
The 64-bit Linux binaries are released on the same site, along with the 32-bit and 64-bit Mac OS binaries, which I can now build on a Mac mini running OS X 10.9.1.
I also uploaded the complete source as a compressed file; it can be used to build the 64-bit Windows binaries.
To build the Linux binaries, I had to add -DSQLITE_OMIT_LOAD_EXTENSION to the CFLAGS in gwnum/Makefile and gwnum/make64.
I also had to remove irrelevant source and binary library references from makemac and makemac64. The makefiles with the .orig suffix are the unmodified ones.

There are no new features in this LLR version, but it is linked with version 29.6 of George Woltman's gwnum library, which includes AVX-512 code.

As usual, I need help building the 64-bit Windows binaries.
Please let me know if you encounter any problems while using this new version.
Best Regards,
Jean

rogue 2019-03-28 21:24

I suspect that nobody is using the 32-bit Mac OS binaries. The only Macs built with a 32-bit Intel CPU were built between 2005 and 2007. Prior to that they were using PowerPC and after that they had 64-bit CPUs.

Even with pfgw, nobody has requested a 32-bit build in years so I haven't even bothered with releasing one.

rebirther 2019-03-28 21:55

@Jean, files sent for 64bit windows.


@all, files are available [URL="http://www.bc-team.org/downloads.php?cat=7"]here[/URL]

AG5BPilot 2019-03-29 17:30

Good news and bad news.

Bad news first: The new LLR is about 4% slower than 3.8.21 on my Haswell:

[code]C:\Temp\LLR3.8.21>cllr64.3.8.21 -d -q"55459*2^30071686+1" -t4
Starting Proth prime test of 55459*2^30071686+1
Using all-complex FMA3 FFT length 2880K, Pass1=384, Pass2=7680, 4 threads, a = 3

55459*2^30071686+1, bit: 30000 / 30071701 [0.09%]. Time per bit: 4.509 ms.

C:\Temp\LLR3.8.22>cllr64.3.8.22 -d -q"55459*2^30071686+1" -t4
Starting Proth prime test of 55459*2^30071686+1
Using all-complex FMA3 FFT length 2880K, Pass1=640, Pass2=4608, clm=1, 4 threads, a = 3
55459*2^30071686+1, bit: 30000 / 30071701 [0.09%]. Time per bit: 4.715 ms.[/code]

Good news is it's about 30% faster on a Skylake-X:

[code]$ cllr64 -d -q"55459*2^30071686+1" -t10
Starting Proth prime test of 55459*2^30071686+1
Using all-complex FMA3 FFT length 2880K, Pass1=384, Pass2=7680, 10 threads, a = 3
55459*2^30071686+1, bit: 20000 / 30071701 [0.06%]. Time per bit: 1.532 ms.
FYI, I stopped here:
> $ llr64 -d -q"55459*2^30071686+1" -t10
> Starting Proth prime test of 55459*2^30071686+1
> Using all-complex AVX-512 FFT length 2880K, Pass1=192, Pass2=15K, clm=4, 10 threads, a = 3
> 55459*2^30071686+1, bit: 1120000 / 30071701 [3.72%]. Time per bit: 1.061 ms.
> Caught signal. Terminating.[/code]

The Haswell is an i5-4670K and the Skylake-X is an i9-9820X.
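
For anyone who wants to check the percentages quoted above, here is a minimal Python sketch that recomputes them from the "Time per bit" values in the two logs. The function names are made up for this illustration and are not part of LLR. The projected run time assumes the per-bit time stays constant over the whole test, which the logs do not confirm.

```python
def percent_slower(old_ms, new_ms):
    """Relative increase in time per bit, in percent."""
    return (new_ms - old_ms) / old_ms * 100.0


def percent_time_saved(old_ms, new_ms):
    """Relative decrease in time per bit, in percent."""
    return (old_ms - new_ms) / old_ms * 100.0


def projected_hours(total_bits, ms_per_bit):
    """Rough total run time, assuming a constant time per bit."""
    return total_bits * ms_per_bit / 1000.0 / 3600.0


# Timings copied from the logs above (milliseconds per bit).
haswell = percent_slower(4.509, 4.715)        # 3.8.21 vs 3.8.22 on the i5-4670K
skylake_x = percent_time_saved(1.532, 1.061)  # FMA3 vs AVX-512 on the i9-9820X

print(f"Haswell: {haswell:.1f}% more time per bit")
print(f"Skylake-X: {skylake_x:.1f}% less time per bit")
print(f"Skylake-X projected run: {projected_hours(30071701, 1.061):.1f} hours")
```

Note that "30% faster" here means roughly 30% less time per bit; expressed as a throughput ratio (1.532 / 1.061) the gain is larger.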

Jean Penné 2019-03-30 09:51

Win64 binaries
 
[QUOTE=rebirther;512081]@Jean, files sent for 64bit windows.


@all, files are available [URL="http://www.bc-team.org/downloads.php?cat=7"]here[/URL][/QUOTE]

Many thanks to you, Rebirther,

The GUI and console Windows64 binaries are now available!

Best Regards,
Jean

Jean Penné 2019-03-30 10:11

[QUOTE=rogue;512079]I suspect that nobody is using the 32-bit Mac OS binaries. The only Macs built with a 32-bit Intel CPU were built between 2005 and 2007. Prior to that they were using PowerPC and after that they had 64-bit CPUs.

Even with pfgw, nobody has requested a 32-bit build in years so I haven't even bothered with releasing one.[/QUOTE]

You are right, Mark, but I built the so-called "32-bit" llr on a 64-bit Mac OS X machine (that is possible on Mac OS X, but not on Linux). It runs on the same machine almost as fast as the llr64 binary, and it makes it possible to use the 32-bit prefactoring code I linked into it.
It may be of interest to Gaussian Mersenne norm or Wagstaff hunters...

Best Regards,
Jean

AG5BPilot 2019-03-30 13:48

Bug?
 
There may be a bug.

[url=https://www.primegrid.com/show_host_detail.php?hostid=910168]This computer[/url] is running the AVX-512 transform just fine on numerous Proth numbers at PrimeGrid, specifically, PPS-MEGA, PPS, and PPSE. (100% good results.) But when running SGS tasks, it's producing incorrect residues 100% of the time. Normally with hardware errors on SGS we would see some false primes, but there were none. This makes it less likely that the bad residues are a result of a hardware error.

The CPU is an i7-7820X.

Example:
Tested number: 4344392810277*2^1290000-1

Result from a computer (i7-4770K) with AVX transform:
LLR command line: primegrid_llr -d -oDiskWriteTime=1 llr.in
Using zero-padded AVX FFT length 128K, Pass1=512, Pass2=256
4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 1162.344 sec.

Result from a computer (Xeon E5-2670) with FMA3 transform:
LLR command line: primegrid_cllr.exe -d -oDiskWriteTime=1 llr.in
Using zero-padded FMA3 FFT length 128K, Pass1=512, Pass2=256
4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 2594.600 sec

Bad result with AVX-512 transform:
LLR command line: primegrid_cllr.exe -d -oDiskWriteTime=1 -oThreadsPerTest=1 llr.in
Using zero-padded AVX-512 FFT length 128K, Pass1=128, Pass2=1K, clm=1
4344392810277*2^1290000-1 is not prime. LLR Res64: B878873BD88188FB Time : 313.633 sec.
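
The comparison above can be automated. The following is a minimal Python sketch, assuming result lines in the same format as the ones quoted here; the regular expression and the function names are hypothetical and are not part of LLR or PrimeGrid's validator. It only flags candidates whose Res64 values disagree; it cannot tell which residue is the correct one.

```python
import re
from collections import defaultdict

# Matches result lines shaped like the ones quoted above (assumed format).
RESULT_RE = re.compile(
    r"^(?P<number>\S+) is not prime\.\s+LLR Res64: (?P<res64>[0-9A-F]{16})"
)


def group_residues(lines):
    """Map each tested number to the set of Res64 values reported for it."""
    residues = defaultdict(set)
    for line in lines:
        match = RESULT_RE.match(line.strip())
        if match:
            residues[match.group("number")].add(match.group("res64"))
    return residues


def mismatches(lines):
    """Return only the numbers with more than one distinct residue."""
    return {
        number: sorted(res)
        for number, res in group_residues(lines).items()
        if len(res) > 1
    }


# The three result lines from this post.
log = [
    "4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 1162.344 sec.",
    "4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 2594.600 sec",
    "4344392810277*2^1290000-1 is not prime. LLR Res64: B878873BD88188FB Time : 313.633 sec.",
]

bad = mismatches(log)
for number, res in bad.items():
    print(number, "has conflicting residues:", ", ".join(res))
```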

AG5BPilot 2019-03-30 14:56

We also have an [url=http://www.primegrid.com/show_host_detail.php?hostid=929708]i9-7900X[/url] that is running the SGS tests [U]correctly[/U] with AVX-512.

EDIT: There's also an i9-7980XE that's running correctly.

ryanp 2019-03-30 16:06

[QUOTE=AG5BPilot;512217]We also have an [url=http://www.primegrid.com/show_host_detail.php?hostid=929708]i9-7900X[/url] that is running the SGS tests [U]correctly[/U] with AVX-512.

EDIT: There's also an i9-7980XE that's running correctly.[/QUOTE]

Just as a data point, I got the correct residue:

[code]4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 1147.827 sec.[/code]

with the new binary, on 2 separate Skylake machines.

[code]$ ./sllr64 -v
LLR Program - Version 3.8.22, using Gwnum Library Version 29.6[/code]

Prime95 2019-03-30 17:35

[QUOTE=AG5BPilot;512208]There may be a bug.[/QUOTE]

I suggest getting prime95 and running an AVX-512 torture test on the machine.

I also suggest building LLR with gwnum 29.7.

AG5BPilot 2019-03-30 17:36

[QUOTE=ryanp;512229]Just as a data point, I got the correct residue:

[code]4344392810277*2^1290000-1 is not prime. LLR Res64: 7D325B0469A1226E Time : 1147.827 sec.[/code]

with the new binary, on 2 separate Skylake machines.

[code]$ ./sllr64 -v
LLR Program - Version 3.8.22, using Gwnum Library Version 29.6[/code][/QUOTE]

Which specific CPUs? All the ones I have data for, good or bad, are Skylakes.


All times are UTC.