#34 | "TF79LL86GIMPS96gpu17" | Mar 2017 | US midwest | 2×3²×7×43 Posts
Quote:
I want to reiterate here that the efficient sequence seems to me to be: learning about the available algorithms and their strengths and applicabilities first; selecting an efficient algorithm which scales well to exponents of interest next; coding it efficiently in a high-level language next; and optimizing (key portions of) it, such as implementing in tuned assembly code, last. Given the meaning of optimization, and the actual and extrapolated relative run times of accepted, commonly used GIMPS code and your ASM LL code, your reference to optimization occurs to me as a misuse of the word. Your ASM LL code extrapolates to literally millions of times too slow to be useful in the exponent range of interest now in GIMPS, which is p ~ 44,000,000 and up for double checks, and ~81 million and up, to 10⁹. (Years or decades of projected run time for exponents of 3 to 7 million, which are commonly used as software and hardware quick reliability tests of minutes or hours duration; thousands of years of projected run time per exponent for current LL double checks, up to ~25 million years for the maximal GIMPS mersenne.org exponent, p ~ 10⁹.) It extrapolates to slower than a simple Perl & BigInt library implementation beginning around p = 10⁵. I have not made any adjustments for hardware speed differences, just noted the hardware involved. I used a few high-precision timings from your zip file, along with your posted timings for exponents between 10k and 20k, to regression-fit a power function to extrapolate run time, and did similarly for a couple of other programs and a hypothetical fft implementation. See the attachment. Code:
# perlll3.pl (C) 2018 Kriesel - an adaptation to LL without Jacobi check, from
# perlpm1.pl with Jacobi check testing, with perhaps some lingering unneeded code
sub gettimevals { # reuse my standard routine, some of which is not necessary here
    our $localtimestring = localtime; # global so it can be used elsewhere
    our $gmtimestring    = gmtime;    # likewise
    # pattern matches are presumed to succeed, not tested for success
    my ( $dow, $mon, $dom, $time, $yr ) = $localtimestring =~ /^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)$/;
    my ( $gmdow, $gmmon, $gmdom, $gmtime, $gmyr ) = $gmtimestring =~ /^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)$/;
    $localtimestring = "$dow $yr $mon $dom $time"; # reassemble the pieces in my preferred order
    $gmtimestring    = "$gmdow $gmyr $gmmon $gmdom $gmtime";
    time(); # return value of the subroutine: integer seconds since the start of the epoch
}

sub zerothecounts {
    $nprimeexponents = 0;
    $mersennecount   = 0;
}

use bigint;
use Math::Prime::Util qw/is_prime/;
use Math::BigInt lib => 'GMP'; # http://perldoc.perl.org/Math/BigInt.html

print "Perlll3.pl v 0.10 Aug 3 2018 (C) 2018 Kriesel\n";
$verbosity = 0;
$plo = 132048; # start with an odd prime; it makes no sense to start so small there's no odd prime < $p
$phi = 132050; # upper limit of primes p to be tested; recommend < 100K, ~1 hour/exponent on an i3 M370; scales as ~ p^2.558
if ( $plo % 2 == 0 ) { $plo++; } # ensure we start on an odd candidate
my $r   = Math::BigInt->new(1);
my $t64 = Math::BigInt->new(1);
$t64 = 2**64;
zerothecounts;
$time0 = gettimevals;
for ( $p = $plo; $p < $phi; $p += 2 ) {
    if ( is_prime($p) ) { # try all prime exponents >= $plo, < $phi (no TF or P-1 here, no skipping of already-known nonprimes)
        $nprimeexponents++;
        $mp = 2**$p - 1;
        $r->bone();
        $r->bmul(4); # LL seed s0 = 4
        $time1 = gettimevals;
        for ( $j = 1; $j <= $p - 2; $j++ ) {
            $r->bmodpow(2, $mp); # square mod Mp
            $r->bsub(2);         # subtract 2
        }
        $time2 = gettimevals;
        $dt = $time2 - $time1;
        if ( $dt == 1 ) { $pls = ''; } else { $pls = 's'; }
        $r64 = ( $r->bmod($t64) ); # low 64 bits of the final residue
        if ( $r64 > 0 ) {
            print "p=$p composite, res64 =";
            printf " %016x %d second%s\n", $r64, $dt, $pls;
        } else {
            print "p=$p Mersenne prime in $dt second$pls.\n";
            $mersennecount++;
        }
    }
}
$dt = $time2 - $time0;
print "Number of prime exponents $nprimeexponents, number of Mersenne primes $mersennecount, total run time $dt seconds.\n";
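The power-function regression described in the quoted post can be sketched as a log-log least-squares fit. The sketch below is in Python rather than Perl for brevity; the sample timings are the p/seconds pairs reported later in this thread for this Perl program, and the fitted exponent comes out near 2.36 on these four points, in the same ballpark as the ~p^2.558 scaling noted in the code comments (which was presumably fit over different data):

```python
import math

# (p, seconds) pairs reported later in this thread for perlll3.pl
samples = [(11213, 6), (19937, 23), (44497, 156), (86243, 732)]

# Fit t = c * p^k, i.e. ln t = ln c + k * ln p, by ordinary least squares.
xs = [math.log(p) for p, _ in samples]
ys = [math.log(t) for _, t in samples]
n = len(samples)
xbar, ybar = sum(xs) / n, sum(ys) / n
k = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
c = math.exp(ybar - k * xbar)

def predicted_seconds(p):
    """Extrapolated run time for exponent p under the fitted power law."""
    return c * p ** k

print(f"fitted exponent k = {k:.2f}")
print(f"extrapolated time at p = 44,000,000: {predicted_seconds(44_000_000):.2e} s")
```

Extrapolating a fit like this far beyond the measured range is exactly the caveat the post makes: it gives orders of magnitude, not precise run times.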
Last fiddled with by kriesel on 2018-08-04 at 16:50
#35 | Aug 2018 | GEORGIA Republic | 28₁₀ Posts
kriesel, trust me, I fully understand and never argued about FFT vs simple math.

You don't need to calculate how long it will take:

Start: 18:52:42.906
Prime_86243 RES64: 0000000000000000
19:30:44.578

Also, I just did this with my code: Square 12500000 -> 3005 sec. I just think that maybe my code is better than the compiler's compiled output?
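A zero final LL residue (the all-zero RES64 above) is what signals a Mersenne prime. As a quick cross-check of that convention, here is a minimal Lucas-Lehmer sketch in Python (the function name is mine; small exponents only, since naive big-int squaring is far too slow at GIMPS scale):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: M_p = 2^p - 1 is prime iff s_{p-2} == 0,
    where s_0 = 4 and s_{k+1} = s_k^2 - 2 (mod M_p). Valid for odd prime p."""
    mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % mp
    return s == 0

# Known Mersenne prime exponents should pass; a few composite M_p should fail.
for p in (3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127):
    assert lucas_lehmer(p)
for p in (11, 23, 29, 37):
    assert not lucas_lehmer(p)
print("LL sanity checks passed")
```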
#36 | "Forget I exist" | Jul 2009 | Dumbassville | 2⁶×131 Posts
You're either missing prefixes, or a decimal point, or are orders of magnitude off. PARIdroid [2,11,0,22860,"5579deb0b"] can complete that 10 million times in roughly 4.581 seconds. ASM can be slower if it's poorly written, the wrong algorithm is used, etc.
#37 | Aug 2018 | GEORGIA Republic | 34₈ Posts
I used a simple squaring algorithm, as an enhanced multiplication case.
What did you not get?
#38 | "Forget I exist" | Jul 2009 | Dumbassville | 10000011000000₂ Posts
#39 | Aug 2018 | GEORGIA Republic | 2²·7 Posts
Ah, sorry!
I meant 10^12500000. Edit: a 1 with 12499968 nulls.
Last fiddled with by ktpn2011 on 2018-08-04 at 20:38
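For a sense of scale, squaring an integer of that shape can be tried directly with Python's arbitrary-precision integers. This sketch uses a much smaller digit count than the ~12.5 million digits being discussed, so it finishes quickly; absolute timings vary by interpreter and hardware and are not comparable to the 3005 sec figure above:

```python
import time

# Far smaller than the 12,500,000-digit number discussed in the thread.
digits = 100_000
n = 10 ** digits
t0 = time.perf_counter()
square = n * n
elapsed = time.perf_counter() - t0

assert square == 10 ** (2 * digits)  # squaring a power of ten doubles the exponent
print(f"squared 10**{digits} in {elapsed:.4f} s")
```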
#41 | "TF79LL86GIMPS96gpu17" | Mar 2017 | US midwest | 5418₁₀ Posts
Quote:
In http://www.mersenneforum.org/showpos...5&postcount=23 you wrote: Quote:
Anyway, I hope you had fun doing it. And I always appreciate when someone located & perhaps raised in another country posts in English, because my Japanese, Romanian, Russian, French, etc. is nonexistent.
#42 | Aug 2018 | GEORGIA Republic | 2²×7 Posts
No, what a misunderstanding... I am still talking about simple algorithms. BTW, here are the results of your Perl on my PC: Code:
Perlll3.pl v 0.10 Aug 3 2018 (C) 2018 Kriesel
p=11213 Mersenne prime in 6 seconds.
p=19937 Mersenne prime in 23 seconds.
p=21701 Mersenne prime in 29 seconds.
p=23209 Mersenne prime in 33 seconds.
p=44497 Mersenne prime in 156 seconds.
p=86243 Mersenne prime in 732 seconds.

$r->bmodpow(2,$mp);
$r->bsub(2);
while I do square / sub 2 / mod_for2p1
#43 | Aug 2018 | GEORGIA Republic | 2²×7 Posts
I changed your code to:
Code:
$r->bpow(2);
$r->bsub(2);
$r->bmod($mp);
p=86243 Mersenne prime in 204 seconds!! against 732.
Last fiddled with by ktpn2011 on 2018-08-04 at 21:10
#44 | "TF79LL86GIMPS96gpu17" | Mar 2017 | US midwest | 2·3²·7·43 Posts
Quote:
But you've now demonstrated your assembler code is >11 times slower, on the same hardware and exponent. (At p=86243, 2281.672 sec vs 204 = 11.18...) On my i3-M370, for p=86243, I get not the 3.59:1 speedup you indicated, but more like 1.95:1. It's nonintuitive why there's any advantage; it's dealing with longer operands on average. Code:
$r->bpow(2); # p bit input, square, 2p bit result
$r->bsub(2); # subtract scalar from 2p bit operand, 2p bit result
$r->bmod($mp); # 2p bit input, p bit result
Code:
$r->bmodpow(2,$mp); # p bit input, square, mod, p bit result
$r->bsub(2); # subtract scalar from p bit operand, p bit result
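The two orderings annotated above are mathematically equivalent mod Mp, which can be sanity-checked with a small sketch. This is Python rather than Perl (function names are mine); Python's three-argument pow(s, 2, mp) plays the role of Math::BigInt's bmodpow(2, $mp):

```python
def ll_modpow_first(p):
    # analogous to: $r->bmodpow(2, $mp); $r->bsub(2);
    mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = pow(s, 2, mp) - 2   # reduce while squaring, then subtract
    return s % mp               # normalize the final residue

def ll_square_then_mod(p):
    # analogous to: $r->bpow(2); $r->bsub(2); $r->bmod($mp);
    mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % mp    # full 2p-bit square, then one reduction
    return s

for p in (13, 17, 19, 23, 31):
    assert ll_modpow_first(p) == ll_square_then_mod(p)
print("both orderings give identical final residues")
```

Which ordering is faster in practice depends on the bignum backend: bmodpow may pay modular-exponentiation overhead for a single squaring, while square-then-mod does one plain squaring plus one reduction per iteration, which can favor the GMP-backed multiply.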