mersenneforum.org  

Old 2020-10-18, 22:55   #23
R. Gerbicz
 
"Robert Gerbicz"
Oct 2005
Hungary

2×709 Posts

Quote:
Originally Posted by paulunderwood View Post
Fermat's Little theorem states b^(N-1) == 1 mod N for prime N (and gcd(b,N)==1).

For N=2^n-c this means for b=3 that 3^(2^n-c-1) == 1 (mod 2^n-c). This can be rewritten as 3^(2^n) == 3^(c+1) (mod 2^n-c). The left-hand side is just squarings; the right-hand side takes ~log(c) iterations. At the moment LLR does ~n multiplications of 3 times an n-bit number, which adds up to a 3-5% overhead.

Can a similar argument hold for k*2^n-c? Yes! (3^k)^(2^n) == 3^(c+1) (mod k*2^n-c).
Basically, the algorithm then consists "almost" only of squarings and very few multiplications (those needed for the error check).

For N=(k*b^n+c)/d using a^d as "base" with Fermat's little theorem if N is prime then:

(a^d)^((k*b^n+c)/d)==a^d mod N but then

a^(k*b^n+c)==a^d mod (d*N) is also true. We do this because reduction mod (d*N) is easier; note that here we're already weaker than a standard Fermat test. From this

(a^k)^(b^n)==a^(d-c) mod (d*N)

Of course b=2 is fixed to enable the fast checks. The case d-c<0 is not a concern, because in general "a" is small, or you can even divide the left side by the right side to avoid any modular inverse computation in this case.

And you can also delay the k-th powering with: (a^(b^n))^k.
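
As a rough illustration of the congruence above, here is a minimal Python sketch for b=2 and d-c>=0. The function name `fermat_prp_by_squaring` and its parameters are made up for this example. The arithmetic is done mod d*N, but the final comparison is reduced mod N, which is a choice of this sketch and not something stated in the post. It is not how LLR implements the test.

```python
def fermat_prp_by_squaring(k, n, c, d=1, a=3):
    """Fermat-style PRP check for N = (k*2^n + c)/d using only squarings.

    Checks (a^k)^(2^n) == a^(d-c), computed mod d*N and compared mod N.
    Hypothetical helper: assumes b = 2 and d - c >= 0 (no inverse needed).
    """
    m = k * 2**n + c                  # m = d*N, the modulus that is easy to reduce by
    if m % d != 0:
        raise ValueError("d must divide k*2^n + c")
    if d - c < 0:
        raise ValueError("this sketch only handles d - c >= 0")
    big_n = m // d                    # the number N actually being tested
    x = pow(a, k, m)                  # the "base" a^k
    for _ in range(n):                # n squarings give (a^k)^(2^n) mod d*N
        x = x * x % m
    return (x - pow(a, d - c, m)) % big_n == 0


print(fermat_prp_by_squaring(1, 7, -1))          # N = 2^7 - 1 = 127
print(fermat_prp_by_squaring(1, 11, -1))         # N = 2^11 - 1 = 2047 = 23 * 89
print(fermat_prp_by_squaring(1, 7, 1, d=3, a=5)) # N = (2^7 + 1)/3 = 43
```

The loop performs exactly n modular squarings; the only other work is the small powers a^k and a^(d-c).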

Last fiddled with by R. Gerbicz on 2020-10-18 at 22:59 Reason: typos
Old 2020-10-28, 19:54   #24
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

2^2×103 Posts

Quote:
Originally Posted by henryzz View Post
Still only 32-bit for windows.
Still waiting on a 64-bit Windows binary. I want to run timing tests on a laptop I'm trying to bring online, but they'll be off with 32-bit or an old version (more with the latter I assume).
Old 2020-10-29, 11:48   #25
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

3^4·71 Posts

Quote:
Originally Posted by Happy5214 View Post
Still waiting on a 64-bit Windows binary. I want to run timing tests on a laptop I'm trying to bring online, but they'll be off with 32-bit or an old version (more with the latter I assume).
Is there a possibility of running WSL?
Old 2020-10-29, 19:39   #26
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

2^2·103 Posts

Quote:
Originally Posted by henryzz View Post
Is there a possibility of running WSL?
Not at the moment. It has Windows 8.1 on it, and I haven't gotten around to upgrading it to 10.
Old 2020-11-02, 20:39   #27
Jean Penné
 
May 2004
FRANCE

2×281 Posts
Switching to 3.8.24 build 3

Hi,

Jeff Gilchrist warned me about an issue in LLR 3.8.24 build 2:

The default options for the Fermat PRP testing are:
Gerbicz error checking (-oErrorChecking=1) and
Random shift on "a" at the beginning (-oShifting=1)

Gerbicz error checking is OK for these tests, but the random shift can work only for
k*2^n+c numbers if k == 1 AND c == +1 or -1.
The bug is that my code required k == 1 but did not test c before setting
this random shift...
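
The condition described above can be written down in a few lines of Python. This is only a sketch of the logic as stated in this post; the function names are hypothetical and this is not LLR's source code.

```python
def random_shift_allowed(k, c):
    """Condition as described in the post: k == 1 AND c == +1 or -1."""
    return k == 1 and c in (1, -1)


def random_shift_allowed_build2(k, c):
    """Behaviour of build 2 as described in the post: only k was tested."""
    return k == 1


# A number of the form 2^n + 15 shows the difference between the two checks.
print(random_shift_allowed(1, 15), random_shift_allowed_build2(1, 15))
```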

To avoid any misunderstanding, I renamed all the links in the index.html files,
but you do have to download all the files again to be up to date!
Please excuse me for the inconvenience. Best Regards,

Jean
Old 2020-11-02, 21:07   #28
paulunderwood
 
Sep 2002
Database er0rr

DB0_16 Posts

Thanks Jean. The latest LLR with GEC seems much slower compared to older versions. I understand there is computational overhead for GEC, and I welcome better error detection. This is what I am seeing:

Code:
^C3322955+15, bit: 540000 / 3322956 [16.25%].  Time per bit: 0.272 ms.
Code:
^Ceration: 1050000 / 3322955 [31.10%], ms/iter:  0.334, ETA: 00:12:39

Last fiddled with by paulunderwood on 2020-11-02 at 21:09
Old 2020-11-02, 21:32   #29
R. Gerbicz
 
"Robert Gerbicz"
Oct 2005
Hungary

2×709 Posts

Quote:
Originally Posted by paulunderwood View Post
Thanks Jean. The latest LLR with GEC seems much slower compared to older versions. I understand there is computational overhead for GEC, and I welcome better error detection. This is what I am seeing:

Code:
^C3322955+15, bit: 540000 / 3322956 [16.25%].  Time per bit: 0.272 ms.
Code:
^Ceration: 1050000 / 3322955 [31.10%], ms/iter:  0.334, ETA: 00:12:39
You shouldn't see such a slowdown from the checking. For comparison, gpuowl uses L=400 or L=1000, and with that the slowdown is "only" a 2/L fraction of the total running time. p95 does not use a fixed L but changes it dynamically: if you see an error you decrease L; if you don't see errors for a while you increase it, up to a maximum of L=1000 [as far as I remember].

The theoretical limit for my check is a slowdown of only a 2/sqrt(n) fraction for a given N=(k*2^n+c)/d, but that is not recommended, because in that case you would do only one error check in the whole run [so even a fixed L=400 or L=1000 is much better, giving a pretty small slowdown but more checks]. For a smallish n, say n<1e6, you could use a smaller L though, because L^2>n is suboptimal.
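
To make the 2/L figure concrete, here is a minimal Python sketch of the check for a plain chain of squarings u(i) = u(0)^(2^i). It keeps a running product d(t) = u(0)*u(L)*...*u(t*L), which costs one multiplication per L squarings, and every L^2 squarings it verifies d(t) == u(0) * d(t-1)^(2^L), which costs L more squarings. The function name, the `fault_at` hook, and the choice to raise an exception instead of rolling back to a saved state are assumptions of this sketch; it is not the code used in gpuowl, p95, or LLR.

```python
def squarings_with_check(u0, n, modulus, L, fault_at=None):
    """Return u0^(2^n) mod modulus, verifying the run every L*L squarings.

    fault_at is a hypothetical test hook: if set, one bit of the residue
    is flipped after that squaring to simulate a hardware error.
    Squarings after the last multiple of L*L are not covered here.
    """
    u0 %= modulus
    u = u0        # u(i) = u0^(2^i)
    d = u0        # d(t) = u(0) * u(L) * u(2L) * ... * u(t*L)
    d_prev = u0   # d(t-1)
    for i in range(1, n + 1):
        u = u * u % modulus
        if i == fault_at:
            u ^= 1                                   # injected error (demo only)
        if i % L == 0:
            d_prev, d = d, d * u % modulus           # 1 extra multiplication per L squarings
            if i % (L * L) == 0:
                # Verification: L extra squarings per L*L iterations.
                expected = u0 * pow(d_prev, 2**L, modulus) % modulus
                if d != expected:
                    raise ArithmeticError(f"check failed at iteration {i}")
    return u


M = 2**61 - 1                                        # a Mersenne prime, used as a toy modulus
print(squarings_with_check(3, 64, M, 4) == pow(3, 2**64, M))
```

Over n squarings this adds about n/L multiplications plus n/L further squarings, which is where the 2/L overhead comes from. A real implementation would restart from the last verified state when the check fails.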
Old 2020-11-05, 23:35   #30
diep
 
Sep 2006
The Netherlands

13×53 Posts

Which LLR version is advised for old AMD Magny-Cours processors?

I have a 48-core box operational with 2.2 GHz Magny-Cours processors.
Very happy about it.

LLR gives a CPU fault.

This is the same LLR version that is on the Intel Xeons here, where it has run fine for years.

diep@thegathering:/home/69/test3$ ./sllr64 -v
LLR Program - Version 3.8.21, using Gwnum Library Version 28.14

Which LLR version is advised?
Many thanks in advance,
Vincent
Old 2020-11-06, 06:39   #31
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

19C_16 Posts

Quote:
Originally Posted by diep View Post
Which LLR version is advised for old AMD Magny-Cours processors?

I have a 48-core box operational with 2.2 GHz Magny-Cours processors.
Very happy about it.

LLR gives a CPU fault.

This is the same LLR version that is on the Intel Xeons here, where it has run fine for years.

diep@thegathering:/home/69/test3$ ./sllr64 -v
LLR Program - Version 3.8.21, using Gwnum Library Version 28.14

Which LLR version is advised?
Many thanks in advance,
Vincent
I have an Intel Core 2 Quad from 2009 (which is even older than Magny-Cours, according to Wikipedia), and the latest LLR versions work fine on it. Have you tried building it from source or using a more recent version? I don't know if the 3.8.21 build had any particular issues with older CPUs, since I don't think I ever used that particular version. I know for sure that 3.8.23 and 3.8.24 both work on my box.

Last fiddled with by Happy5214 on 2020-11-06 at 06:57 Reason: More detail
Old 2020-11-23, 15:29   #32
diep
 
Sep 2006
The Netherlands

2B1_16 Posts

Good Morning!

On a NUMA system I notice a huge performance issue with LLR.

Let me first explain the simple problem.

We probably all know what NUMA systems are. A core that needs memory attached to another socket (or, in the case of Threadripper, to another CPU module, as it has 8 modules of course) has to get it via a crossbar. Not only is that slower, it also kind of burns up the crossbar, which has limited bandwidth and not an infinite lifetime.

I have a 4-socket board with one broken socket (and 2 with junk in them), which after much work I managed to get working. So I have a 3-socket system, with very noticeable differences in latency between local and remote memory.

Now in a perfect world, if the kernel schedules LLR on the correct socket (that would be 12 cores in the case of this Magny-Cours), then everything goes fine.

Regrettably the Linux kernel is not so clever there, and it has no knowledge of LLR, so it's logical that this problem happens.

So I get huge timing differences.

Now with taskset you can set your program to execute on different cores.

Regrettably that's a useless command here because it doesn't migrate the memory.

So for example, if I start LLR with 4 threads and let the system place it, odds are very high that it starts on the wrong socket. Say that's socket 0. Now it allocates memory on socket 0 and starts executing.

Then I want to run it on a different CPU.
Say socket number 2. As there are 36 cores in this system, that's

taskset -p 0xfff000000 4000

in case the process ID is 4000.
In total 9 hexadecimal characters, as 9 x 4 = 36 cores.

(In fact I would then probably give it taskset -p 0xfc0000000, as each CPU is in itself again a double CPU of 2x6 cores.)
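
For reference, the two hexadecimal masks above can be derived from core numbers with a few lines of Python. This is just a sketch of the bit arithmetic; the helper name `affinity_mask` is made up, and the core numbering (socket 2 = cores 24-35, its second half = cores 30-35) is an assumption read off the masks in this post.

```python
def affinity_mask(cores):
    """Return the hexadecimal affinity mask with one bit set per core number."""
    mask = 0
    for core in cores:
        mask |= 1 << core      # bit i of the mask stands for core i
    return hex(mask)


print(affinity_mask(range(24, 36)))   # all 12 cores of the third socket
print(affinity_mask(range(30, 36)))   # only the upper 6 of those cores
```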

So memory already allocated on socket number 0 then gets accessed from cores on socket 2.

This is duck slow.

The problem seems to be that the kernel doesn't migrate the memory already allocated by the process to socket number 2. It stays on socket number 0.

That causes huge timing differences. Sometimes nearly a factor of 2 here.

A process that should take no more than 7500 - 9500 seconds really takes 17000 seconds here. And a few of them actually do. Yet the ones scheduled wrong right from the start are screwed forever.

So if LLR had a command-line option specifying which set of CPU cores (like 0-5 or 29-35) to bind to and allocate RAM from, that would improve LLR times significantly and heat up the HT links less.
Old 2020-11-23, 16:27   #33
axn
 
Jun 2003

7·683 Posts

You can also start a command under taskset, not just set the affinity of an existing process.
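
As a small sketch of what that looks like, the Python helper below builds a `taskset -c <cpu-list> <command>` argument list, so the process is pinned before it allocates anything. Under Linux's default local allocation policy its memory should then normally come from the node it runs on. The helper name `pinned_argv` is hypothetical, and the core range 24-35 is only an example taken from the masks discussed earlier in the thread.

```python
def pinned_argv(cores, command):
    """Build an argv list that starts `command` under taskset on the given cores."""
    cpu_list = ",".join(str(core) for core in cores)   # e.g. "24,25,...,35"
    return ["taskset", "-c", cpu_list] + list(command)


# Example: the version query from earlier in the thread, pinned to cores 24-35.
argv = pinned_argv(range(24, 36), ["./sllr64", "-v"])
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a Linux machine where taskset is installed.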

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.