mersenneforum.org  

Go Back   mersenneforum.org > Math Stuff > Computer Science & Computational Number Theory

Reply
 
Thread Tools
Old 2015-09-13, 15:46   #1
ldesnogu
 
ldesnogu's Avatar
 
Jan 2008
France

24×3×11 Posts
Default APRCL implementations comparison

A post from Dana pushed me to time both mpz_aprcl and Pari/GP more thoroughly.

Quote:
Originally Posted by danaj View Post
It was testing on an otherwise idle 4.3GHz 4770K running Linux, using Pari 2.7.0 and mpz_aprcl 1.1, At those sizes averaging 5 random n-digit primes. At 500 digits Pari was 20.8s, mpz_aprcl 13.3s. At 600 digits, Pari 35.4s, mpz_aprcl 27.8s. My spreadsheet shows no size where Pari came out faster, so it wasn't just a lucky few numbers. At 2000 digits it was 5039s for Pari, 4374s for mpz_aprcl.

GMP 6.0.0a made a big difference, but you have recent versions. I did the timing in May 2014.
Attached is a spreadsheet showing timing from 100 to 1000 digits. Starting at 600 digits Pari/GP is faster (except for 700 digits). I also added the chosen t parameter which seems to be the main difference between both implementations.

The primes were generated using Pari/GP:
Code:
setrand(2)
for (i=1, 10, for (j=1, 10, a=0; until(a > 10^(i*100-1)-1, a=randomprime(10^(i*100))); print(a)))
The environment is as follows:
  • 4770K with HT enabled and no OC except RAM @2400.
  • Pari git abee4fd1a75
  • gmp hg 16822
  • mpz_aprcl 1.2
Attached Files
File Type: gz mpz-aprcl-vs-pari.ods.gz (17.4 KB, 138 views)
ldesnogu is offline   Reply With Quote
Old 2015-09-14, 02:39   #2
danaj
 
"Dana Jacobsen"
Feb 2011
Bangkok, TH

2·11·41 Posts
Default

I just reran everything on my machine. The inputs are a bit different, but that shouldn't have a really big impact. I get nearly identical times for mpz_aprcl. I get substantially similar ties for dj-ecpp. I get much faster times for Pari than I did back in early 2014 (right after 2.7.0 was released). I'm not sure why, as there is nothing substantially faster about today's HEAD vs. 2.7.0 in APR-CL.

Attaching a similar spreadsheet. Sheet 1 is Idesnogu, Sheet 2 is mine.

Pari has some odd timing behavior. The times for 500 digit are more than double that for 400 digit, but 600 digit is just barely more than 500. 700 is a huge jump, then 800 a small one and 900 almost no difference.

I'd like to see this extended a little father. My results are mpz_aprcl faster to 500 digits, slower at 600, faster at 700 and 800, then slower at 900 and substantially slower at 1000. I'd also like to improve the speed of my ECPP implementation at the larger sizes, but that's another topic.
Attached Files
File Type: gz mpz-aprcl-vs-pari.ods.gz (60.5 KB, 79 views)
danaj is offline   Reply With Quote
Old 2015-09-14, 17:45   #3
ldesnogu
 
ldesnogu's Avatar
 
Jan 2008
France

24×3×11 Posts
Default

Thanks for checking. Your mpz_aprcl results are about ~20% faster than mine, while the Pari/GP ones vary from 10% slower to 60% faster, that's strange.

I need to cleanup the changes I made to mpz_aprcl, they bring me a 10%-20% over mpz_aprcl-1.2. What's blocking me is getting a good way to build the list of t to use (this probably requires estimating the cost of computing the Jacobi sums), which would bring mpz_aprcl closer to Pari/GP (I get the same time for one of 1000-digit primes for mpz_aprcl and Pari/GP, while mpz_aprcl-1.2 is 60% slower).
ldesnogu is offline   Reply With Quote
Old 2015-09-15, 00:26   #4
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

586310 Posts
Default

Did you tune your PARI? Sounds like some cutoffs are misplaced.
CRGreathouse is offline   Reply With Quote
Old 2015-09-15, 00:39   #5
danaj
 
"Dana Jacobsen"
Feb 2011
Bangkok, TH

11100001102 Posts
Default

Quote:
Originally Posted by CRGreathouse View Post
Did you tune your PARI? Sounds like some cutoffs are misplaced.
I did not -- just ./Configure, make gp. I'm running ./Configure --tune now and will see what happens.
danaj is offline   Reply With Quote
Old 2015-09-15, 03:36   #6
danaj
 
"Dana Jacobsen"
Feb 2011
Bangkok, TH

16068 Posts
Default

make clean; ./Configure --tune churned away for an hour and made a tune.h file. The resulting gp produces essentially identical times.

My CPU is faster, but memory maybe a little slower (unless you don't have dual channel). That it is faster doesn't surprise me, but the differences in Pari times aren't clear.

Nice to hear about your mpz_aprcl speedups.
danaj is offline   Reply With Quote
Old 2015-09-16, 07:58   #7
ldesnogu
 
ldesnogu's Avatar
 
Jan 2008
France

24×3×11 Posts
Default

Quote:
Originally Posted by danaj View Post
make clean; ./Configure --tune churned away for an hour and made a tune.h file. The resulting gp produces essentially identical times.
I re-tuned too.

Quote:
My CPU is faster, but memory maybe a little slower (unless you don't have dual channel). That it is faster doesn't surprise me, but the differences in Pari times aren't clear.
I have dual channel. The difference in Pari/GP times comes from a stupid mistake on my side: my benchmark script used the default Pari/GP of my system... Now I'm 10% behind you which is what I expect given your OC. That leaves the 20% to explain on mpz_aprcl.

Quote:
Nice to hear about your mpz_aprcl speedups.
There's still a lot of work to make it available. I could perhaps put it on Github (David gave me his approval, provided I change the name).
ldesnogu is offline   Reply With Quote
Old 2015-09-20, 23:56   #8
WraithX
 
WraithX's Avatar
 
Mar 2006

1110110002 Posts
Default

Quote:
Originally Posted by ldesnogu View Post
I re-tuned too.

I have dual channel. The difference in Pari/GP times comes from a stupid mistake on my side: my benchmark script used the default Pari/GP of my system... Now I'm 10% behind you which is what I expect given your OC. That leaves the 20% to explain on mpz_aprcl.

There's still a lot of work to make it available. I could perhaps put it on Github (David gave me his approval, provided I change the name).
I'd be happy to update mpz_aprcl to include any more speed-ups you have developed. I could even update the copyright notice to indicate that you are a developer of the code, if you like. I would prefer to go this route rather than having multiple/similar projects available online. I'd like this so that everyone would have just one place to go to get the latest/greatest version of mpz_aprcl that is available. Would you be interested in this idea, or would you still like to go the Github route? Either way is fine with me.

Also, I once had some code to generate t values. However, once I incorporated Jason Moxam's Jacobi sums, I was locked in to that particular set of t values. I started writing code to generate new Jacobi sums based on a different set of t values, but was getting decent performance with the original set, so never really finished it. If you're interested, I can pick that back up and try to finish it so that we can test different sets of t values.
WraithX is offline   Reply With Quote
Old 2015-09-23, 13:02   #9
ldesnogu
 
ldesnogu's Avatar
 
Jan 2008
France

24×3×11 Posts
Default

Quote:
Originally Posted by WraithX View Post
I'd be happy to update mpz_aprcl to include any more speed-ups you have developed. I could even update the copyright notice to indicate that you are a developer of the code, if you like. I would prefer to go this route rather than having multiple/similar projects available online. I'd like this so that everyone would have just one place to go to get the latest/greatest version of mpz_aprcl that is available. Would you be interested in this idea, or would you still like to go the Github route? Either way is fine with me.
My main interest is in learning how to efficiently implement APRCL, it's not mpz_aprcl in itself (and don't take this as a criticism or whatever, I really like the code ). So I think the best thing to do for the community would be that I give you all what I have already done. The problem is that the code has been heavily re-indented and "extended", so the work to get it back into mpz_aprcl.c might be a real pain. I can send you a copy of my git repository if you want to take a look before we decide how to proceed.

Quote:
Also, I once had some code to generate t values. However, once I incorporated Jason Moxam's Jacobi sums, I was locked in to that particular set of t values. I started writing code to generate new Jacobi sums based on a different set of t values, but was getting decent performance with the original set, so never really finished it. If you're interested, I can pick that back up and try to finish it so that we can test different sets of t values.
I have such an implementation that I pushed up to 18401055938125660800. I also have everything in place to generate the tables needed by mpz_aprcl.

What I miss is a way to select a good value. This is dependent on the underlying hardware, libraries and APRCL implementation performance. Pari/GP is slightly better here, but not perfect (which explains why it loses sometimes against mpz_aprcl). It looks like FLINT APRCL is even faster for some of the ranges (I'll post results later). I also guess these various implementations offer various optimizations missing in the others.
ldesnogu is offline   Reply With Quote
Old 2015-09-23, 13:53   #10
ldesnogu
 
ldesnogu's Avatar
 
Jan 2008
France

52810 Posts
Default

Here are corrected results plus new FLINT APRCL results.

The FLINT implementation is always faster than both mpz_aprcl and Pari/GP.
Attached Files
File Type: gz mpz-aprcl-vs-pari.ods.gz (47.8 KB, 73 views)
ldesnogu is offline   Reply With Quote
Old 2015-10-28, 00:36   #11
vladgl
 
"Vladimir Glazachev"
Oct 2015
Russia, Saint-Petersburg

1 Posts
Default

Hi!

I worked on the implementation of APRCL in FLINT this summer during GSoC2015. I wrote some blogposts, so maybe someone find it useful - link.

My code is still not merged with FLINT, so that it is not checked very carefully. Also I have ideas what can be done to speed up. Thanks for your timings, thats interesting.
vladgl is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Comparison of NFS tools CRGreathouse Factoring 3 2018-02-05 14:55
Any CIDE Primality Test Implementations? Stargate38 Computer Science & Computational Number Theory 4 2014-11-02 19:25
Comparison Page Not Working wblipp Operation Billion Digits 0 2012-11-24 06:33
PFGW vs LLR comparison discussion henryzz Conjectures 'R Us 37 2010-02-19 07:42
Pollard's Algorithm & That of Devaraj-a comparison devarajkandadai Miscellaneous Math 22 2005-06-10 11:13

All times are UTC. The time now is 22:28.

Fri Aug 7 22:28:08 UTC 2020 up 21 days, 18:14, 1 user, load averages: 1.73, 1.92, 1.93

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.