#133 |
"Antonio Key"
Sep 2011
UK
3²×59 Posts |
gap11, an update to gap10 - provides a modest 1.5-2% speed boost on my ivybridge using 4 threads.
I was unable to speed test on my skylake laptop due to thermal limiting.
Code:
// version 1.05   Remove always false tests from prec_prime       Antonio Key
//                and next_prime, alternative would be to replace
//                with asserts as the program failed elsewhere if
//                the tests were ever true.
//                Removed always true test in gap search
//                Removed bound test on P2 in gap search
//                Replaced result sort routine
//                On screen display of process state (sieve or gap search)
//                Let compiler decide if AVX2 is available to use
//                Update default_unknowngap
//                Moved the #define version so that it was easier to check
//                against the version comments and keep them consistent.
#134 |
Jun 2003
Oxford, UK
2²×13×37 Posts |
Is there any reason for a drop-off in speed as we moved from 6e18 to 7e18? My speed has dropped from 32e9 to 29.6e9 n/sec. The only change in parameters compared to 6e18 is that I now have -unknowngap 1382.
Presumably the next_prime and prev_prime calculations take marginally longer because the n surviving the sieve are larger, but this does seem like a large drop-off. Also, gap11 is marginally slower on my machine than gap10: at the end of the test gap11 was running at 29.25e9, compared to 29.35e9 for gap10.
#135 |
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
1001100000001₂ Posts |
Mine has increased, in line with Antonio's values. Try rebooting the machine and check whether any svchost.exe service is running in the background.
#136 | |
"Antonio Key"
Sep 2011
UK
3²·59 Posts |
![]() Quote:
For testing I used a batch file containing the following on my i5 ivybridge:
Code:
Rem Reference Code
gap10 -n1 6e18 -n2 625e16 -n 6e18 -res1 0 -res2 15 -res 0 -m1 1190 -m2 8151 -unknowngap 1382 -numcoprime 27 -sb 24 -bs 18 -t 4 -mem 12.25
ren gap_solutions.txt gap10_solutions.txt
ren gap_report.txt gap10_report.txt
Rem Test Code
gap11 -n1 6e18 -n2 625e16 -n 6e18 -res1 0 -res2 15 -res 0 -m1 1190 -m2 8151 -unknowngap 1382 -numcoprime 27 -sb 24 -bs 18 -t 4 -mem 12.25

Are you using the provided exe or compiling it yourself? If compiling yourself, what version of gcc? I'm using version 6.3.0.
#137 |
Jun 2003
11370₈ Posts |
Sieving would take slightly longer since more primes are used in the sieve. Also, the program reports cumulative speed since start of the run (instead of incremental -- feature request?), and initial iterations would be slower, so it takes time for the speed to stabilize.
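To illustrate the difference between the cumulative figure and the requested incremental one, here is a minimal sketch in C. The `rate_tracker` struct and its functions are hypothetical names made up for this example and are not taken from the gap11 source; the numbers in the usage note are invented.

```c
#include <assert.h>

/* Hypothetical progress tracker (not from gap11.c). */
typedef struct {
    double total_n;     /* n values processed since the start of the run */
    double total_secs;  /* elapsed seconds since the start of the run */
    double last_n;      /* n values processed in the most recent interval */
    double last_secs;   /* length of the most recent interval in seconds */
} rate_tracker;

/* Record one reporting interval. */
static void rate_update(rate_tracker *r, double n_done, double secs) {
    r->total_n += n_done;
    r->total_secs += secs;
    r->last_n = n_done;
    r->last_secs = secs;
}

/* Average speed since the start: slow early intervals keep dragging it down. */
static double rate_cumulative(const rate_tracker *r) {
    return r->total_secs > 0.0 ? r->total_n / r->total_secs : 0.0;
}

/* Speed over the latest interval only: shows the current state directly. */
static double rate_incremental(const rate_tracker *r) {
    return r->last_secs > 0.0 ? r->last_n / r->last_secs : 0.0;
}
```

With a slow first interval of 2e10 n in one second followed by 3e10 n in the next second, the cumulative rate is 2.5e10 n/sec while the incremental rate is already 3e10 n/sec, which is why the reported speed takes a while to settle.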
#138 | |
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
1301₁₆ Posts |
![]() Quote:
#139 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²·227 Posts |
Due to a friendly cooperative 64-bit factoring competition / hackathon in the last week, we have somewhat faster mulredc aka mont_mulmod now, using some asm written by Ben Buhrow of yafu fame. I also have a tweak to the addmod asm.
What is the best way to get this applied? It would be nice if the gap finding program was on github so I could do a pull request. |
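For readers unfamiliar with what a mulredc / Montgomery multiply does, here is a portable C sketch. It is not Ben Buhrow's asm and not the code in the gap program: the names `inv64`, `mont_mul`, `to_mont`, `mulmod_mont` and `mulmod_ref` are made up for this example, and it assumes a compiler with `unsigned __int128` (GCC or Clang). It uses the subtractive form of the reduction so that odd moduli all the way up to 2^64 are handled without overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;  /* GCC/Clang extension, assumed available */

/* Inverse of an odd n modulo 2^64 by Newton iteration. */
static uint64_t inv64(uint64_t n) {
    uint64_t x = n;          /* n*n == 1 (mod 8), so 3 bits are correct */
    for (int i = 0; i < 5; i++)
        x *= 2 - n * x;      /* each pass doubles the number of correct bits */
    return x;
}

/* Montgomery product a*b/2^64 mod n. Requires n odd, a < n, b < n,
   and ninv == inv64(n). */
static uint64_t mont_mul(uint64_t a, uint64_t b, uint64_t n, uint64_t ninv) {
    u128 t = (u128)a * b;
    uint64_t m = (uint64_t)t * ninv;              /* m*n has the same low word as t */
    uint64_t mn_hi = (uint64_t)(((u128)m * n) >> 64);
    uint64_t t_hi = (uint64_t)(t >> 64);
    /* (t - m*n) / 2^64 lies in (-n, n); add n back if it went negative. */
    return t_hi >= mn_hi ? t_hi - mn_hi : t_hi - mn_hi + n;
}

/* Convert a (< n) to Montgomery form a*2^64 mod n. */
static uint64_t to_mont(uint64_t a, uint64_t n) {
    return (uint64_t)(((u128)a << 64) % n);
}

/* a*b mod n via Montgomery arithmetic, for odd n > 1. */
static uint64_t mulmod_mont(uint64_t a, uint64_t b, uint64_t n) {
    uint64_t ninv = inv64(n);
    uint64_t am = to_mont(a % n, n);
    uint64_t bm = to_mont(b % n, n);
    uint64_t pm = mont_mul(am, bm, n, ninv);      /* a*b in Montgomery form */
    return mont_mul(pm, 1, n, ninv);              /* leave Montgomery form */
}

/* Slow reference using a 128-bit division. */
static uint64_t mulmod_ref(uint64_t a, uint64_t b, uint64_t n) {
    return (uint64_t)(((u128)a * b) % n);
}
```

In a real inner loop the conversion to and from Montgomery form is done once per modulus rather than once per multiplication; the gain comes from `mont_mul` replacing the 128-bit division in `mulmod_ref` with two multiplications and a conditional add.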
#140 | |
"Robert Gerbicz"
Oct 2005
Hungary
2·7·103 Posts |
![]() Quote:
Since this appears to be a purely arithmetic speedup, a full test run should not strictly be needed, but I don't like untested code, so please test it and compare the results against, say, gap10.c or even gap11.c (Antonio tested these two, so I've accepted them). Your code should also handle 64-bit n values, up to n < 2^64 - 2^32 (with the latest range [9.25e18, 10e18] we are already in that area).
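One reason the 64-bit requirement needs explicit testing: once n exceeds 2^63 (about 9.22e18), the sum of two residues can wrap around in a 64-bit register. The sketch below is only an illustration in portable C; the function names are invented for this example and it is not the asm used in the gap program. It assumes `unsigned __int128` (GCC or Clang) for the reference version.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) mod n for a, b < n, valid for any n < 2^64. */
static uint64_t addmod_safe(uint64_t a, uint64_t b, uint64_t n) {
    uint64_t room = n - b;                 /* in 1..n, cannot underflow */
    return a >= room ? a - room : a + b;   /* a + b < n here, so no wrap */
}

/* Naive version: a + b wraps modulo 2^64 when n is above 2^63. */
static uint64_t addmod_naive(uint64_t a, uint64_t b, uint64_t n) {
    return (a + b) % n;
}

/* Reference using 128-bit arithmetic. */
static uint64_t addmod_ref(uint64_t a, uint64_t b, uint64_t n) {
    return (uint64_t)(((unsigned __int128)a + b) % n);
}
```

Comparing a fast routine against a slow reference like this, for moduli just below 2^64 - 2^32, is the same kind of cross-check as comparing new results against gap10.c output.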
#141 |
"Antonio Key"
Sep 2011
UK
3²·59 Posts |
gap12 code now available:
Code:
// version 1.06   Speed up in assembly routines                   Dana Jacobsen
//                mulmod & addmod by Dana Jacobsen
//                mont_prod64 asm thanks to Ben Buhrow
//                Replace in-code tests for AVX2 use with         Antonio Key
//                conditional compile directives
//                Cosmetic change - Now displays upper and lower bounds
//                of n for the current test, rather than just lower bound.

The usual Windows executables are provided; the required dll is available in post #133.
#142 |
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
1001100000001₂ Posts |
Thank you, Antonio. I upgraded the client this morning, but the speed is still stabilising, so I am waiting to see the claimed improvements.
Nevertheless, good stuff.
#143 | |
"Robert Gerbicz"
Oct 2005
Hungary
2×7×103 Posts |
![]() Quote: