20190302, 05:39  #12 
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2·11·41 Posts 
Mathematica is a nice general-purpose program. But it really isn't the fastest at a lot of things. Good GMP code, or programs specialized for particular operations, can often be quite a bit faster. As others have stated, prime95 and Woltman's library take it one step further for this specific operation.

20190302, 17:54  #13  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·13·157 Posts 
Quote:
The accepted order of the LL test is O(p^2 log p log log p). How the referenced authors arrive at O(p^3) bit operations is unclear. The paper's text is hidden behind Springer's paywall. The references shown are from 2005 or older, so I suspect this paper is not new. Perhaps Robert Gerbicz or other qualified readers have a subscription to the Springer site, or have time to review it and the most recent references, and can access the full paper and give an opinion.

However, I couldn't resist a quick look myself (didn't really try hard, either). The abstract of Morain's 2005 paper http://www.lix.polytechnique.fr/~mor...ecppfinal.pdf indicates asymptotically fast ECPP is O((log N)^4). For Mersenne numbers of interest to GIMPS, N = 2^p - 1 and log N ~ p, so O(p^4), much worse than LL. In fact that scales worse than grammar-school-based LL testing, O(p^3), which I had coded (integer arithmetic, in C) and ran to p > 26,000 (~8000 digits) over 25 years ago. Morain's implementation handled up to "several thousands of decimal digits", with benchmark tables to 2000 digits, while the current GIMPS search space is p > ~83 million, i.e. > ~25 million decimal digits. Perhaps I missed it, but I did not find any statement of what the time units were in Morain's benchmarks. The paper also states they are using GMP, known to be less efficient than Woltman's gwnum routines: "Our current implementation uses GMP [26] for the basic arithmetic."

Gross's paper is available at https://www.sciencedirect.com/scienc...22314X0400143X and https://ac.els-cdn.com/S0022314X0400143X/1-s2.0-S0022314X0400143X-main.pdf. There's also a preprint available at http://www.math.harvard.edu/~gross/p...s/Mersenne.pdf. The elliptic curve math looks more involved (slower per iteration) than the simple LL iteration.
LL: a squaring, a -2, and a mod (which is achieved essentially free in the IBDWT).
ECPP: a squaring, a +12 and a -12, another squaring, and a divide by a three-term product (and a mod?).
In addition to being at least 4 times slower by count of squarings, multiplies, and divides, the ECPP approach seems likely also to have lower cache efficiency for the same size exponent, because so much more must happen per iteration, implying a bigger memory footprint. Last fiddled with by kriesel on 20190302 at 18:07
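For concreteness, the simple LL iteration being compared above (square, subtract 2, reduce mod Mp) fits in a few lines. A minimal Python sketch, illustrative only and nothing like GIMPS-optimized code (function name is mine):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: Mp = 2^p - 1 is prime iff s_(p-2) == 0,
    where s_0 = 4 and s_(i+1) = s_i^2 - 2 (mod Mp).
    Assumes p is an odd prime."""
    Mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % Mp   # one squaring, one subtraction, one mod
    return s == 0
```

Python's generic bignum arithmetic does each squaring and mod the slow way; the whole point of the IBDWT discussed above is to make the squaring subquadratic and the mod essentially free.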

20190303, 00:37  #14 
"/X\(‘‘)/X\"
Jan 2013
Ͳօɾօղէօ
2,789 Posts 

20190303, 13:19  #15  
"Robert Gerbicz"
Oct 2005
Hungary
29×47 Posts 
Quote:
(this is achieved on Mathematica 6.0); LucasLehmer2 is the new, faster function Code:
In[12]:= LucasLehmer[n_] := Module[{Mp, s, k},
   Mp = 2^n - 1; s = 4;
   Do[s = Mod[s^2 - 2, Mp], {k, 1, n - 2}];
   If[s == 0, Return[True], Return[False]]]

LucasLehmer2[p_] := Module[{Mp, s, k},
   Mp = 2^p - 1; s = 4;
   Do[s = s^2 - 2;
    If[s < 0, s += Mp];
    s = BitAnd[s, Mp] + BitShiftRight[s, p];
    If[s >= Mp, s -= Mp], {k, 1, p - 2}];
   If[s == 0, Return[True], Return[False]]]

In[14]:= Timing[LucasLehmer2[86243]]
Out[14]= {16.469, True}

In[15]:= Timing[LucasLehmer[86243]]
Out[15]= {61.562, True}

With Fürer's algorithm, https://en.wikipedia.org/wiki/F%C3%BCrer%27s_algorithm, you need "only" O(p^2 * log(p) * 2^(2*log*(p))) time for the LLT(p) (in practice it would be slower, even at our wavefront).
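The shift-and-mask reduction inside LucasLehmer2 carries over directly to other languages. A sketch in Python of the same trick (function name is mine), which exploits 2^p ≡ 1 (mod Mp) to replace division by folding:

```python
def mod_mersenne(s, p):
    """Reduce s (>= 0) modulo Mp = 2^p - 1 without division.

    Since 2^p ≡ 1 (mod Mp), the bits of s above position p can be
    'folded' back onto the low bits by addition."""
    Mp = (1 << p) - 1
    while s >> p:                    # anything left above bit p-1?
        s = (s & Mp) + (s >> p)      # fold high part onto low part
    return 0 if s == Mp else s       # Mp itself is ≡ 0
```

Replacing `Mod[..., Mp]` with this folding gives identical residues. In Python the win is modest because bignum division is already well optimized, but in C with fixed-size limbs it avoids division entirely, which is a large part of why LucasLehmer2 beats LucasLehmer above.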

20190304, 03:55  #16 
Feb 2019
29 Posts 
Very good!
Great! You made me realize how ignorant I am; you are really very clever! But what is the Jacobi error check? I only know the Jacobi symbol from number theory! Last fiddled with by bbb120 on 20190304 at 03:57
20190304, 15:49  #17  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×13×157 Posts 
Quote:
There's a brief nutshell description in https://www.mersenneforum.org/showpo...46&postcount=6 

20190304, 21:10  #18 
∂^{2}ω=0
Sep 2002
República de California
3×3,769 Posts 
I think e.g. Jacobi and Gerbicz residue-integrity checks are way beyond what a n00b needs to learn; focus on the basics:

1. Get acquainted with the various fast (as in subquadratic) multiply algorithms, especially FFT-based multiplication for large integers. Implementing a basic power-of-2 Numerical Recipes-style FFT and using it for bigint multiplication by adding a normalize/round/propagate-carries step is very helpful in gaining hands-on understanding.

2. Speed things up by specializing to Mersenne moduli, using their special form to construct a fast bitwise "bit-folded" mod for the output of a standard zero-padded bigint product.

3. Print out a copy of the original Crandall/Fagin DWT paper and work the numerical examples contained therein yourself. Then implement the IBDWT-based "implicit mod" in the context of your basic FFT-mul code, and see what kind of speedup you get from no longer needing to zero-pad your multiply inputs.

Lots of people have done [1], a fair percentage of those have gone on to do [2], but very few have proceeded further and mastered [3]. Those who have understand why modern optimized codes can do LL tests as fast as they do.
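The first step above can be prototyped in under fifty lines. A rough Python sketch (all names mine): a recursive power-of-2 complex FFT used for bigint multiply, with the round-and-carry step, using a small digit base so double-precision rounding stays exact at modest sizes:

```python
import cmath

def fft(a, invert=False):
    """Recursive power-of-2 Cooley-Tukey FFT on a list of complex numbers.
    invert=True runs the opposite twiddle direction (caller scales by 1/n)."""
    n = len(a)
    if n == 1:
        return a
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def bigmul(x, y, base_bits=8):
    """Multiply nonnegative ints via FFT convolution of their base-2^8 digits,
    then round to integers and propagate carries."""
    base = 1 << base_bits
    def digits(v):
        d = []
        while v:
            d.append(v & (base - 1))
            v >>= base_bits
        return d or [0]
    dx, dy = digits(x), digits(y)
    n = 1
    while n < len(dx) + len(dy):      # zero-pad to a power of 2
        n <<= 1
    fx = fft([complex(d) for d in dx + [0] * (n - len(dx))])
    fy = fft([complex(d) for d in dy + [0] * (n - len(dy))])
    z = fft([u * v for u, v in zip(fx, fy)], invert=True)
    vals = [round(c.real / n) for c in z]  # normalize and round
    out, carry = 0, 0
    for i, v in enumerate(vals):           # propagate carries
        carry, d = divmod(v + carry, base)
        out |= d << (i * base_bits)
    return out | (carry << (len(vals) * base_bits))
```

With base 2^8 the convolution values stay far below double precision's exact-integer limit, so `round` recovers the exact digit sums; production codes use much larger bases plus careful error analysis, which is exactly where the balanced-digit and IBDWT refinements of steps [2] and [3] come in.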
20190421, 14:27  #19  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×13×157 Posts 
Quote:
"Wolfram Language code can be converted to C code or to an automatically generated DLL." I wonder what sort of performance compiling the resulting C code in a good optimizing compiler would provide for the very short Mathematica program. (I'm confident that it would not match the speed of prime95/mprime or Mlucas on the same hardware. But it may be closer than a 25:1 disadvantage.) 25:1 is not an unusually large magnitude of performance difference. Different methods of multiword multiplication differ by thousands, or more, at current GIMPS exponent levels. https://www.mersenneforum.org/showpo...21&postcount=7 Interpreter based environments offer quick turnaround time during development changes, sacrificing runtime performance for turnaround time and user convenience during coding and testing. I have a vague recollection of a development experience long ago that offered both interpreter and compiler, and alternating between them. There's good reason for Mathematica performance to differ from prime95. Mathematica is probably designed more for user convenience in developing and using scripts, and doing many things well enough (fast enough for ordinary usage), on numbers of a great many forms. Prime95 is designed for doing a relatively few things, with numbers of a very small set of specific forms, EXTREMELY efficiently on a wide variety of cpus, and very little if any user scripting. Last fiddled with by kriesel on 20190421 at 15:07 

20190421, 16:20  #20  
"Robert Gerbicz"
Oct 2005
Hungary
29·47 Posts 
Quote:
http://www.wolfram.com/legal/thirdp...enses/gmp.html https://www.maplesoft.com/support/he....aspx?path=GMP 
