LLR error checking overhead of 66% - is this normal?
Hi all,
I'm considering upgrading my machines from a somewhat older version of LLR (3.8.20) to the latest version (4.0.3) so that they can benefit from the new Gerbicz error checking to guard against hardware errors, cosmic-ray bit flips, etc. However, the overhead I'm seeing in my benchmarks is quite extreme, so I'm wondering if this is "normal", or if I'm doing something wrong. :redface:

Specifically, I benchmarked iteration times for the candidate 755857*2^5500077-1, using two independent LLR instances running side-by-side with -t4 (four compute threads) each on an 8-core/8-thread Intel i7-9700. I got about [b]1.18 ms/iter using LLR 4.0.3 in the default mode (Fermat PRP test with error checking enabled)[/b], but [b]0.71 ms/iter when doing an "old style" deterministic LLR test with no error checking[/b] (-oErrorChecking=0, which produced iteration times similar to running the older LLR 3.8.20 with default settings).

This represents a [b]66% slowdown for the error-checking mode[/b], which surprised me, since I had gotten the impression (from reading around these forums) that PRP with Gerbicz checks is "supposed to" incur about a 15% overhead. 15% sounded like a pretty good deal in exchange for high confidence that the results are good (likely eliminating the need for future double-checking); but 66% is a more painful tradeoff, since the extra cost is a full 2/3 of what a separate double-check would cost.

I think I must be doing something wrong here, because when I tried running the same test with mprime (Prime95) v30.8 (build 18), using the same threading setup (two single-worker clients using CoresPerTest=4), I got [b]0.72 ms/iter[/b] - nearly the same (within margin of error) as I got using non-error-checking LLR tests under LLR v3.8.20! Error checking was definitely enabled for these runs, because mprime announced when it started the test that it was doing a "Gerbicz error-checking PRP test".
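For anyone who wants to redo the arithmetic, here is a minimal sketch of how the percentages fall out of the measured ms/iter figures. The function and variable names are made up for illustration; the timings are the ones quoted above.

```python
def overhead_percent(checked_ms: float, baseline_ms: float) -> float:
    """Slowdown of a run relative to a baseline run, in percent."""
    return (checked_ms / baseline_ms - 1.0) * 100.0


# Timings (ms/iter) as measured in the benchmarks above.
llr403_prp = 1.18     # LLR 4.0.3, Fermat PRP with error checking
llr_unchecked = 0.71  # LLR test with -oErrorChecking=0
mprime_prp = 0.72     # mprime 30.8, Gerbicz-checked PRP

print(f"LLR 4.0.3 PRP vs unchecked LLR: {overhead_percent(llr403_prp, llr_unchecked):.0f}%")
print(f"mprime PRP vs unchecked LLR:    {overhead_percent(mprime_prp, llr_unchecked):.1f}%")
```

The first figure is the 66% slowdown in question; the second shows that the mprime run sits within a couple of percent of the unchecked baseline.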
For context: all of these runs are on a 64-bit Linux system (i7-9700 with 16 GB of RAM, running Fedora 36). I'm using the 64-bit Linux dynamically-linked LLR 4.0.3 binary downloaded from Jean Penné's official site ([url]http://jpenne.free.fr/[/url]). mprime is the latest 64-bit Linux build downloaded from mersenne.org (the website says it's 30.8 build 17, but "./mprime -v" reports build 18).

Am I doing something wrong, or is LLR 4.0.3 really that much slower (for the same Fermat PRP test on the same number) compared to a similarly-modern Prime95/mprime? :confused2:

Thanks,
Max |
OK, this is interesting. I tried downloading the latest version (1.3.3) of Pavel Atnashev's LLR2 ([url]https://github.com/patnashev/llr2[/url]) and running the same test on it. It is much faster and can run PRP tests at the same speed as Prime95/mprime (which, as far as I can tell, is essentially just as fast as an unchecked LLR test).
Unlike recent LLR versions, LLR2 still defaults to an unchecked LLR test for k*2^n-1 candidates, but I was able to force a PRP test with -oGerbicz=1 (or -oForcePRP=1). In this mode, the output confirmed that it was indeed performing Gerbicz checks periodically throughout the test, just like Prime95/mprime. Why, then, is LLR 4.0.3 so much slower than LLR2 and Prime95/mprime? :huh: |
You'll have to ask Jean Penné what is happening here. The overhead of GEC is about 1-3% (it varies with the check frequency): iteration times should be exactly the same as for a regular Proth test, with the extra time taken only during the intermediate error checks.
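To show why the check is cheap, here is a toy sketch of a Gerbicz-style check on a plain squaring chain x -> x^2 mod N. It is a simplification under stated assumptions: it ignores the k multiplier of k*2^n-1 candidates, uses ordinary Python integers instead of gwnum FFT arithmetic, and the function name, the block size L and the fault injection are all made up for illustration. It is not how LLR, LLR2 or Prime95 actually implement it.

```python
def checked_squarings(x0, modulus, n_iters, L, fault_at=None):
    """Square x0 n_iters times mod modulus with a Gerbicz-style check.

    Every L iterations the current value is multiplied into a running
    product d. Every L*L iterations the identity
        d_new == u0 * d_old^(2^L)  (mod modulus)
    is verified. Returns (value, ok). `fault_at` (hypothetical) corrupts
    the value after that iteration to simulate a hardware error.
    """
    u0 = x0 % modulus
    x = u0
    d = u0  # running product u(0) * u(L) * u(2L) * ...
    for i in range(1, n_iters + 1):
        x = x * x % modulus          # the normal iteration, unchanged
        if i == fault_at:
            x = (x + 1) % modulus    # simulated bit flip
        if i % L == 0:
            d_old = d
            d = d_old * x % modulus  # one extra multiplication per L iterations
            if i % (L * L) == 0:
                # L extra squarings of d_old, done only once per L*L iterations
                expected = u0 * pow(d_old, 1 << L, modulus) % modulus
                if d != expected:
                    return x, False  # error detected: roll back and redo
    return x, True


M61 = 2**61 - 1  # a small Mersenne prime, used here only as a toy modulus
print(checked_squarings(3, M61, 64, 4))               # clean run
print(checked_squarings(3, M61, 64, 4, fault_at=10))  # corrupted run
```

In this sketch the per-iteration squaring is untouched; the only extra work is one multiplication every L iterations plus L squarings every L*L iterations, so the relative cost shrinks as L grows.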
|
Are the two programs using the same FFT size? I recommend running initially with a tight FFT size and increasing it on the fly if a Gerbicz error occurs.
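As a rough sketch of that strategy (start tight, step up only when a check fails), something like the following control loop could be used. Everything here is hypothetical: `block_passes` is a stand-in for running one error-checked block at a given FFT size, and the sizes in the usage example are illustrative only. A real program would also retry at the same size first, since a failed check can come from a hardware error rather than a too-small FFT.

```python
def run_with_escalating_fft(n_blocks, fft_sizes, block_passes):
    """Run n_blocks checked blocks, moving to a larger FFT size on failure.

    block_passes(block_index, fft_size) -> bool is a hypothetical callback
    that runs one block and reports whether its error check passed.
    Returns the list of (block_index, fft_size) pairs that were accepted.
    """
    size_idx = 0
    accepted = []
    block = 0
    while block < n_blocks:
        size = fft_sizes[size_idx]
        if block_passes(block, size):
            accepted.append((block, size))
            block += 1                 # block verified, move on
        elif size_idx + 1 < len(fft_sizes):
            size_idx += 1              # redo the same block with a larger FFT
        else:
            raise RuntimeError("check failed at the largest FFT size")
    return accepted


# Illustrative only: pretend block 2 fails its check at the smaller size.
history = run_with_escalating_fft(
    4, [560, 640], lambda block, size: not (block == 2 and size == 560)
)
print(history)
```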
|
[QUOTE=paulunderwood;626483]Are the two programs using the same FFT size? I recommend running initially with a tight FFT size and increasing it on the fly if a Gerbicz error occurs.[/QUOTE]
Indeed they are (560K in both cases):
[code]$ ./llr64_4.0.3 -d -q"755857*2^5500077-1" -t4
Starting Fermat PRP test of 755857*2^5500077-1
Using zero-padded FMA3 FFT length 560K, Pass1=448, Pass2=1280, clm=4, 4 threads, a = 3
Iteration: 50000 / 5500096 [0.41%], ms/iter: 1.170, ETA: 01:46:18[/code]
[code]$ ./llr2_1.3.3_30bb1_linux64 -d -q"755857*2^5500077-1" -t4 -oGerbicz=1
Gerbicz check is requested, switching to PRP.
Starting probable prime test of 755857*2^5500077-1
Using zero-padded FMA3 FFT length 560K, Pass1=448, Pass2=1280, clm=4, 4 threads, a = 3, L2 = 625*550
755857*2^5500077-1, bit: 40001 / 5500077 [0.72%], 0 checked. Time per bit: 0.710 ms.[/code] |
[QUOTE=mdettweiler;626484]Indeed they are (560K in both cases):
[code]$ ./llr64_4.0.3 -d -q"755857*2^5500077-1" -t4
Starting Fermat PRP test of 755857*2^5500077-1
Using zero-padded FMA3 FFT length 560K, Pass1=448, Pass2=1280, clm=4, 4 threads, a = 3
Iteration: 50000 / 5500096 [0.41%], ms/iter: 1.170, ETA: 01:46:18[/code]
[code]$ ./llr2_1.3.3_30bb1_linux64 -d -q"755857*2^5500077-1" -t4 -oGerbicz=1
Gerbicz check is requested, switching to PRP.
Starting probable prime test of 755857*2^5500077-1
Using zero-padded FMA3 FFT length 560K, Pass1=448, Pass2=1280, clm=4, 4 threads, a = 3, L2 = 625*550
755857*2^5500077-1, bit: 40001 / 5500077 [0.72%], 0 checked. Time per bit: 0.710 ms.[/code][/QUOTE]
The programs are built against different gwnum versions, but you are safe with LLR2 and the Gerbicz check. I'm still testing PRST; it is 15-20% faster in the cases where LLR2 has a penalty, but it can only run single tests or input files with one line.
[URL="https://github.com/patnashev/prst"]https://github.com/patnashev/prst[/URL]
Command line:
prst.exe -d -fermat -check strong -q"755857*2^5500077-1" -t4 // -fermat -check strong is the Gerbicz check
prst.exe -d -fermat -check strong -ini inputfile -t4 // with an input file
If you are running LLR2 with PRPNet, be sure that if a prime is found, the log says "is PRP", not "is prime". |