mersenneforum.org why is prime95 much faster than mathematica?

 2019-03-02, 05:39 #12 danaj   "Dana Jacobsen" Feb 2011 Bangkok, TH 2·11·41 Posts Mathematica is a nice general-purpose program, but it really isn't the fastest at a lot of things. Good GMP code, or programs specialized for particular operations, can often be quite a bit faster. As others have stated, prime95 and Woltman's library take it one step further for this specific operation.
2019-03-02, 17:54   #13
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·13·157 Posts

Quote:
 Originally Posted by bbb120 Is this algorithm much better and faster than the Lucas-Lehmer algorithm? Testing Mersenne Primes with Elliptic Curves https://link.springer.com/chapter/10.1007/11870814_27
I doubt it. If it were, someone like Gerbicz, Woltman, Mayer, Preda, Crandall, etc. probably would have mentioned it or capitalized on it. Still, it's a good idea to check now and then for anything related that might be a new development.

The accepted order of the LL test is O(p^2 log p log log p).
How the referenced authors arrive at O(p^3) bit operations is unclear.
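For context, that figure is the standard count: p - 2 squarings, each a multiplication of p-bit integers at the Schönhage-Strassen cost of O(n log n log log n). A textbook derivation, added here for reference:

```latex
T_{\mathrm{LL}}(p) \;=\; (p-2)\cdot O\!\bigl(p\,\log p\,\log\log p\bigr)
                  \;=\; O\!\bigl(p^{2}\,\log p\,\log\log p\bigr)
```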

The paper's text is hidden behind Springer's paywall.
The references are shown and are from 2005 or older, so I suspect this paper is not new.

Perhaps Robert Gerbicz or other qualified readers have a subscription to the Springer site, or have time to review the most recent references, and can access the full paper and give an opinion.

However, I couldn't resist a quick look myself (though I didn't try very hard).

Morain's 2005 paper http://www.lix.polytechnique.fr/~mor...ecpp-final.pdf indicates in its abstract an asymptotically fast ECPP of O((log N)^4). For Mersenne numbers of interest to GIMPS, N = 2^p - 1, log N ~ p, so O(p^4), much worse than LL. In fact it scales worse than grammar-school-multiplication-based LL testing, O(p^3), which I had coded in C integer arithmetic and run to p > 26,000 (~8000 digits) over 25 years ago. Morain's implementation handled up to "several thousands of decimal digits", with benchmark tables to 2000 digits, while the current GIMPS search space is roughly p > 83 million, i.e. more than ~25 million decimal digits. Perhaps I missed it, but I did not find any statement of the time units in Morain's benchmarks. The paper also states that it uses GMP, known to be less efficient than Woltman's gwnum routines: "Our current implementation uses GMP [26] for the basic arithmetic".

Gross's paper is available at https://www.sciencedirect.com/scienc...22314X0400143X and https://ac.els-cdn.com/S0022314X0400143X/1-s2.0-S0022314X0400143X-main.pdf

There's also a preprint available at http://www.math.harvard.edu/~gross/p...s/Mersenne.pdf
The elliptic curve math looks more involved (slower per iteration) than the simple LL iteration.
LL: square, subtract 2, and a mod (which is achieved essentially free in the IBDWT)
ECPP: square, +12 and -12, another squaring, a divide by a three-term product (and a mod?)
In addition to being at least 4 times slower by count of squarings, multiplies, and divides, the ECPP approach seems likely to have lower cache efficiency for the same size exponent: so much more must happen per iteration that the memory footprint is bigger.
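For reference, the reason the mod is so cheap for Mersenne moduli is a standard identity (general number theory, not specific to prime95): writing x = a·2^p + b with 0 ≤ b < 2^p,

```latex
x \;=\; a\cdot 2^{p} + b \;\equiv\; a + b \pmod{2^{p}-1},
\qquad\text{since } 2^{p}\equiv 1 \pmod{2^{p}-1},
```

so reduction is a shift and an add rather than a division.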

Last fiddled with by kriesel on 2019-03-02 at 18:07

 2019-03-03, 00:37 #14 Mark Rose     "/X\(‘-‘)/X\" Jan 2013 Ͳօɾօղէօ 2,789 Posts
2019-03-03, 13:19   #15
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

29×47 Posts

Quote:
 Originally Posted by bbb120 prime95 is much, much faster than Mathematica 11.3! prime95 takes no more than 2 seconds, but Mathematica took 49.8735 seconds; prime95 is 25 times faster than Mathematica! My Mathematica code is as follows Code: LucasLehmer[n_]:=Module[{Mp,s,k},Mp=2^n-1;s=4;Do[s=Mod[s^2-2,Mp],{k,1,n-2}];If[s==0,Return[True],Return[False]]] Timing[LucasLehmer[86243]]
You can make it faster quite easily, but this is all well known. Without the Jacobi error check (timings are on Mathematica 6.0; LucasLehmer2 is the new, faster function):

Code:
In[12]:= LucasLehmer[n_] :=
Module[{Mp, s, k}, Mp = 2^n - 1; s = 4;
Do[s = Mod[s^2 - 2, Mp], {k, 1, n - 2}];
If[s == 0, Return[True], Return[False]]]

LucasLehmer2[p_] :=
Module[{Mp, s, k}, Mp = 2^p - 1; s = 4;
Do[s = s^2 - 2; If[s < 0, s += Mp];
s = BitAnd[s, Mp] + BitShiftRight[s, p];
If[s >= Mp, s -= Mp], {k, 1, p - 2}];
If[s == 0, Return[True], Return[False]]]

In[14]:= Timing[LucasLehmer2[86243]]

Out[14]= {16.469, True}

In[15]:= Timing[LucasLehmer[86243]]

Out[15]= {61.562, True}
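As a cross-check, here is a direct Python translation of LucasLehmer2 (an illustrative sketch, not from the thread; Python's arbitrary-precision integers make the same shift/mask reduction a one-liner):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for Mp = 2^p - 1, p an odd prime.

    Mirrors LucasLehmer2 above: instead of a general Mod, it folds
    the high bits of s onto the low bits, using 2^p = 1 (mod Mp).
    """
    Mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = s * s - 2
        if s < 0:                   # s was 0 or 1 before squaring
            s += Mp
        s = (s & Mp) + (s >> p)     # fold; result is < 2*Mp
        if s >= Mp:
            s -= Mp
    return s == 0
```

On small exponents this agrees with the known Mersenne prime exponents (13, 17, 19, ... are in; 11, 23 are out).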
Quote:
 Originally Posted by kriesel The accepted order of the LL test is O(p^2 log p log log p).
At least in theory there is a faster multiplication algorithm, which would be enough for us, because reduction mod Mp is easy:
https://en.wikipedia.org/wiki/F%C3%BCrer%27s_algorithm

With that you need "only" O(p^2 · log(p) · 2^(2·log*(p))) time for the LLT(p), where log* is the iterated logarithm.

(In practice it would be slower, even at our wavefront.)
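For completeness, that bound is just p - 2 squarings at a Fürer-style multiplication cost M(n) = n · log n · 2^(O(log* n)) (refined variants achieve the 2^(2·log* n) form quoted above):

```latex
T_{\mathrm{LL}}(p) \;\approx\; (p-2)\cdot M(p)
                  \;=\; O\!\bigl(p^{2}\,\log p\cdot 2^{\,2\log^{*} p}\bigr)
```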

 2019-03-04, 03:55 #16 bbb120   Feb 2019 29 Posts Very good! Great! You made me realize how ignorant I am! You are really very clever! But what is the Jacobi error check? I only know the Jacobi symbol from number theory! Last fiddled with by bbb120 on 2019-03-04 at 03:57
2019-03-04, 15:49   #17
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×13×157 Posts

Quote:
 Originally Posted by bbb120 Very good! ... but what is the Jacobi error check? I only know the Jacobi symbol from number theory!
See the original post https://mersenneforum.org/showpost.p...3&postcount=30
There's a nutshell description in https://www.mersenneforum.org/showpo...46&postcount=6

 2019-03-04, 21:10 #18 ewmayer ∂2ω=0     Sep 2002 República de California 3×3,769 Posts I think e.g. the Jacobi and Gerbicz residue-integrity checks are way beyond what a n00b needs to learn - focus on the basics:

1. Get acquainted with the various fast - as in subquadratic - multiply algorithms, especially FFT-based mul for large integers. Implementing a basic power-of-2 Numerical Recipes-style FFT and using it for bigint mul by adding a normalize/round/propagate-carries step is very helpful in gaining hands-on understanding;

2. Speed things up by specializing to Mersenne moduli, using their special form to construct a fast bitwise "bit-folded" mod for the output of a standard zero-padded bigint product;

3. Print out a copy of the original Crandall/Fagin DWT paper and work the numerical examples contained therein yourself. Then implement the IBDWT-based "implicit mod" in the context of your basic FFT-mul code, and see what kind of speedup you get from no longer needing to zero-pad your multiply inputs.

Lots of people have done [1], a fair fraction of those have gone on to do [2], but very few have proceeded further and mastered [3]. Those who have understand why modern optimized codes can do LL tests as fast as they do.
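The FFT-based bigint mul of step [1] can be sketched as follows - a toy power-of-2 FFT with a round/propagate-carries step (illustrative Python, added for this writeup; double-precision floats limit it to modest operand sizes, and real codes use far more careful arithmetic):

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def bigint_mul(x, y, base=256):
    """Multiply nonnegative ints by FFT convolution of their digit vectors."""
    def digits(v):                      # little-endian base-`base` digits
        d = []
        while v:
            d.append(v % base)
            v //= base
        return d or [0]
    dx, dy = digits(x), digits(y)
    n = 1
    while n < len(dx) + len(dy):        # zero-pad to a power of 2
        n *= 2
    fa = fft([complex(d) for d in dx] + [0j] * (n - len(dx)))
    fb = fft([complex(d) for d in dy] + [0j] * (n - len(dy)))
    conv = [v / n for v in fft([a * b for a, b in zip(fa, fb)], invert=True)]
    out, carry = [], 0                  # normalize/round, propagate carries
    for c in conv:
        t = int(round(c.real)) + carry
        out.append(t % base)
        carry = t // base
    while carry:
        out.append(carry % base)
        carry //= base
    result = 0
    for d in reversed(out):             # reassemble the integer
        result = result * base + d
    return result
```

The zero-padding to length len(dx) + len(dy) is exactly what step [2]'s bit-folded Mersenne mod, and ultimately step [3]'s IBDWT, let an optimized code avoid.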
2019-04-21, 14:27   #19
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×13×157 Posts

Quote:
 Originally Posted by bbb120 I tested M86243 = 2^86243-1 with prime95 and Mathematica, using the same algorithm - the Lucas-Lehmer algorithm - but prime95 is much, much faster than Mathematica 11.3! prime95 takes no more than 2 seconds, but Mathematica took 49.8735 seconds; prime95 is 25 times faster than Mathematica! ... My Mathematica code is as follows Code: LucasLehmer[n_]:=Module[{Mp,s,k},Mp=2^n-1;s=4;Do[s=Mod[s^2-2,Mp],{k,1,n-2}];If[s==0,Return[True],Return[False]]] Timing[LucasLehmer[86243]]
"The kernel interprets expressions (Wolfram Language code) and returns result expressions." https://en.wikipedia.org/wiki/Wolfram_Mathematica If it is in fact acting as an interpreter, rather than a compiler, there's a performance hit for Mathematica right there. I don't recall the details, but in the 1980s, there was a large difference in performance between interpreted BASIC and compiled BASIC on a PC, for the same source code and hardware. https://en.wikipedia.org/wiki/Interp...ng)#Efficiency

"Wolfram Language code can be converted to C code or to an automatically generated DLL." I wonder what sort of performance compiling the resulting C code in a good optimizing compiler would provide for the very short Mathematica program. (I'm confident that it would not match the speed of prime95/mprime or Mlucas on the same hardware. But it may be closer than a 25:1 disadvantage.) 25:1 is not an unusually large magnitude of performance difference. Different methods of multiword multiplication differ by thousands, or more, at current GIMPS exponent levels. https://www.mersenneforum.org/showpo...21&postcount=7

Interpreter-based environments offer quick turnaround during development, sacrificing run-time performance for user convenience while coding and testing. I have a vague recollection of a development environment long ago that offered both an interpreter and a compiler, and of alternating between them.

There's good reason for Mathematica's performance to differ from prime95's. Mathematica is probably designed more for user convenience in developing and running scripts, and for doing many things well enough (fast enough for ordinary usage) on numbers of a great many forms. Prime95 is designed for doing relatively few things, on numbers of a very small set of specific forms, EXTREMELY efficiently, on a wide variety of CPUs, with very little if any user scripting.

Last fiddled with by kriesel on 2019-04-21 at 15:07

2019-04-21, 16:20   #20
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

29·47 Posts

Quote:
 Originally Posted by danaj Mathematica is a nice general purpose program. But it really isn't the fastest at a lot of things. Good GMP code or specific programs for operations can often be quite a bit faster.
The costly Mathematica (and Maple too) uses the free GMP:
http://www.wolfram.com/legal/third-p...enses/gmp.html
https://www.maplesoft.com/support/he....aspx?path=GMP

