mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Miscellaneous Math

Reply
 
Thread Tools
Old 2022-07-20, 22:46   #23
Batalov
 
Batalov's Avatar
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

89×113 Posts
Lightbulb Re: 32 bits, and later 64 bit limit, and then higher.

Every road begins with a few first steps.

Just to give you, perhaps, a sandwich for the road (because this road is long), there are certain elements that can be used for 1) prototyping, 2) refactoring into something intermediate, 3) deep refactoring, with custom building blocks.

(That is all with the meta-decision to stop at any stage if/when you might find that this was already done, or idea turned to be invalid or unoriginal or simply you might have lost interest. Or every once in a while you might have given it all you've got and the problem has not yielded. I've had a few where you do a lot of invisible work with no result to show... except maybe that final search limit was this or that.

The other meta-consideration is to stop after or in parallel with Step 1 and estimate what, statistically, you expect to get. And at expense of how many core-years. It is a separate skill to acquire. Many people keep banging at problems where their expected yield is 1 hit per 10,000 core years ... and they only have 32 cores. Do the arithmetic.)

For step 1) most of the time one can find that if you start using Pari/GP and get over the few first syntax hurdles, you will never quit it. Once you have a script you can test some hypothesis to a certain limit and with a bit of luck for some problems you will be done.

For step 2) one would take a gp scaffold code and refactor it into GMP-assisted C/C++. This will be faster, but not lightning fast.

For step 3) you can define (or refine based on your results from steps 1 and 2) the definition domain of the final implementation. Let's say this will be a 96-bit limit. (64-bit is even easier to implement; but 96 is quite commonly good; btw lots of folks have 96-bit oriented code.) Then if we take this toy problem as an example, you will realize that you want to implement a class (or a structure) with three 32-bit unsigned int limbs, and implement a) addition of elements a + b from it (sum up, carry over), b) multiplication x * y -- and you can do either "grade school" multiplication or Karatsuba. You can also borrow ad hoc implementations by studying other folks code, for example srsieve or polysieve (there is readily available ~55-bit multiplication implementation there that is cleverly using doubles for an intermediate). You can also find small ASM snippets to mulmod, powmod, invmod etc that can be embedded into C that will make code much faster, and then leave all other logic (if/else/for's) to gcc -O3. Then it is always fun to compare what you will get with simply using GMP despite its 'shoot from large cannons' overhead when all you need is 64 or 96 bits.

Just 2 cents for the road. These hints might be useful for the general process of prototyping anything. Anything that you might think of tomorrow.
Batalov is offline   Reply With Quote
Old 2022-07-20, 23:20   #24
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

2·72·113 Posts
Default

Quote:
Originally Posted by Batalov View Post
Just 2 cents for the road. These hints might be useful for the general process of prototyping anything. Anything that you might think of tomorrow.
Sound counsel.
chalsall is offline   Reply With Quote
Old 2022-07-21, 18:30   #25
User140242
 
Jul 2022

3×52 Posts
Default

Quote:
Originally Posted by Batalov View Post

Just to give you, perhaps, a sandwich for the road (because this road is long), there are certain elements that can be used for 1) prototyping, 2) refactoring into something intermediate, 3) deep refactoring, with custom building blocks.

I will try to follow your advice.



In the meantime I created this function that allows you to find the factor (~ 42 bits) of 13379827 or the factor (~ 46 bits) of 668267879 in less than 0.7 s.


Code:
uint64_t XY_modulo_D_LN(uint64_t X,uint64_t Y,uint64_t D,uint64_t D_max)
{
    // D<2^54 D>D_max
    uint64_t mod_D;
    if (X<=D_max && Y<=D_max )
    {
        mod_D=(X*Y)%D;
    }
    else if (X<=D_max || Y<=D_max )
    {
        uint64_t X1,N_t;
        uint64_t RD=(uint64_t)D%D_max;
        uint64_t KD=(uint64_t)D/D_max;
        if (Y<=D_max)
        {
            X1=Y;
            Y=X;
            X=X1;
        }      
        mod_D=(X*(Y%D_max))%D_max;
        N_t=X*(Y/D_max)+(X*(Y%D_max))/D_max;
        mod_D=(mod_D+(D_max*(N_t%KD))%D)%D;
        mod_D=(mod_D+D-((N_t/KD)*RD)%D)%D;

    }
    else
    {
    uint64_t RD=(uint64_t)D%D_max;
    uint64_t KD=(uint64_t)D/D_max;
    uint64_t Rx=(uint64_t)X%D_max;
    uint64_t Ry=(uint64_t)Y%D_max;
    uint64_t Kx=(uint64_t)X/D_max;
    uint64_t Ky=(uint64_t)Y/D_max;
    uint64_t mod_D_t,N_t,R_t;


    mod_D=(Rx*Ry)%D;

    mod_D_t=((D_max*(D_max%KD))%D)%D;
    mod_D_t=(mod_D_t+D-((D_max/KD)*RD)%D)%D;

    N_t=mod_D_t;
    R_t=Kx;
    if (N_t>=(D_max/R_t))
    {
        mod_D_t=(R_t*(N_t%D_max))%D_max;
        N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max;
        mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D;
    }
    else
        mod_D_t=mod_D_t*R_t;
    N_t=mod_D_t;
    R_t=Ky;
    if (N_t>=(D_max/R_t))
    {
        mod_D_t=(R_t*(N_t%D_max))%D_max;
        N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max;
        mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D;
    }
    else
        mod_D_t=mod_D_t*R_t;
    N_t=mod_D+mod_D_t;
    mod_D=N_t;
    if (N_t>=D_max)
    {
        mod_D=(N_t)%D_max;
        mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D;
        mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D;
    }

    if (Rx>=KD)
    {
        mod_D_t=((D_max*(Rx%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((Rx/KD)*RD)%D)%D;
    }
    else
       mod_D_t=(Rx*D_max)%D;
    N_t=mod_D_t;
    R_t=Ky;
    if (N_t>=(D_max/R_t))
    {
        mod_D_t=(R_t*(N_t%D_max))%D_max;
        N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max;
        mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D;
    }
    else
        mod_D_t=mod_D_t*R_t;
    N_t=mod_D+mod_D_t;
    mod_D=N_t;
    if (N_t>=D_max)
    {
        mod_D=(N_t)%D_max;
        mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D;
        mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D;
    }

    if (Ry>=KD)
    {
        mod_D_t=((D_max*(Ry%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((Ry/KD)*RD)%D)%D;
    }
    else
       mod_D_t=(Ry*D_max)%D;
    N_t=mod_D_t;
    R_t=Kx;
    if (N_t>=(D_max/R_t))
    {
        mod_D_t=(R_t*(N_t%D_max))%D_max;
        N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max;
        mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D;
        mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D;
    }
    else
        mod_D_t=mod_D_t*R_t;
    N_t=mod_D+mod_D_t;
    mod_D=N_t;
    if (N_t>=D_max)
    {
        mod_D=(N_t)%D_max;
        mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D;
        mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D;
    }
    }

    return mod_D;
}

bool is_2pow_onemod(uint64_t P,uint64_t D)
{ 
    // return true if 2^P = 1 (mod D)
    
    uint64_t x=2;
    uint64_t y=1;
    uint64_t D_max=(uint64_t)1<<30;
    if (D<=D_max)
    {
        while (P > (uint64_t)1)
        {
            if (P%2 == 1)
                y = (x * y) %D;
            x = (x * x) %D;
            P >>= 1;
        }
        return (((x*y)%D) == 1);
    }
    else
    {
        while (P > (uint64_t)1)
        {
             if (P%2 == 1)
                 y=XY_modulo_D_LN(x,y,D,D_max);
             x=XY_modulo_D_LN(x,x,D,D_max);  
             P >>= 1;
        }
        return (XY_modulo_D_LN(x,y,D,D_max) == (uint64_t)1);
    }
    
}
User140242 is offline   Reply With Quote
Old 2022-07-21, 19:23   #26
science_man_88
 
science_man_88's Avatar
 
"Forget I exist"
Jul 2009
Dartmouth NS

2×3×23×61 Posts
Default

Quote:
Originally Posted by User140242 View Post
I will try to follow your advice.



In the meantime I created this function that allows you to find the factor (~ 42 bits) of 13379827 or the factor (~ 46 bits) of 668267879 in less than 0.7
I'm not fluent in C but did your code use Mp prime ? If so D=2k(Mp)+1 for some k. You only have to test D with certain k values. Nevermind I see the 1 mod 8 versus 7 mod 8 talk earlier in the thread didn't read it fully.

Last fiddled with by science_man_88 on 2022-07-21 at 20:00
science_man_88 is offline   Reply With Quote
Old 2022-07-21, 21:17   #27
science_man_88
 
science_man_88's Avatar
 
"Forget I exist"
Jul 2009
Dartmouth NS

2×3×23×61 Posts
Default

Quote:
Originally Posted by science_man_88 View Post
I'm not fluent in C but did your code use Mp prime ? If so D=2k(Mp)+1 for some k. You only have to test D with certain k values. Nevermind I see the 1 mod 8 versus 7 mod 8 talk earlier in the thread didn't read it fully.
you can check k\not\equiv p\pmod 3 because that's when 2kp+1 doesn't divide by 3.
science_man_88 is offline   Reply With Quote
Old 2022-07-22, 08:21   #28
User140242
 
Jul 2022

10010112 Posts
Default

Quote:
Originally Posted by science_man_88 View Post
you can check k\not\equiv p\pmod 3 because that's when 2kp+1 doesn't divide by 3.

Here you find the final version https://gist.github.com/user140242/0...d2738788bb1cae.


A wheel sieve is used which has been set to eliminate all multiples of small primes less than min(bW, 524288) including multiples of 3.
User140242 is offline   Reply With Quote
Old 2022-09-04, 15:44   #29
User140242
 
Jul 2022

3×52 Posts
Default

To get the mind busy.

Several times I have been told in this thread that the best solution to calculate (2 ^ p-1) module D is the one described in the math page.


As described, given a certain number P, a number of steps equal to the number of bits of P converted into binary are required.

Example for p=86000063 you need 27 steps.

In reality, different solutions can be adopted to achieve the same result with the same complexity, for example:

fix b = 6 then 2^b=64

P=P%64 +P/64 *64 = P%64 +b1 *64

b1=b1%64 +b1/64 *64 = P%64 +b2 *64 then b2=b1/64

.....

bm+1 =bm%64 +bm/64 *64

continue until bm/64 <64

the remainder must be stored in a vector [P%64, b1%64, ... , bm%64]

then we start from

mod=2^(bm/64)

the product is repeated for b=6 times

mod=(mod*mod)%D
mod=(mod*mod)%D
mod=(mod*mod)%D
mod=(mod*mod)%D
mod=(mod*mod)%D
mod=(mod*mod)%D

then it is calculated

mod=(mod* 2^(bm%64) )%D

the whole is repeated for all the remainders in the vector.


Practical example

P=86000063 and D=61920045361



Quote:
P = 86000063 = 63 + 1343750 *64
b1 = 1343750 = 6 + 20996 *64
b2 = 20996 = 4 + 328 *64
b3 = 328 = 8 + 5 *64

remainders vector
[63, 6, 4, 8]

b3 / 64 = 5
then

Quote:


mod=2^5=32

---

mod=(mod*mod)%D = 1024
mod=(mod*mod)%D = 1048576
mod=(mod*mod)%D = 46870856639
mod=(mod*mod)%D = 2154143653
mod=(mod*mod)%D = 52612687576
mod=(mod*mod)%D = 22360353543

mod=(mod* 2^8 )%D = 27606333796

---

mod=(mod*mod)%D = 4234002796
mod=(mod*mod)%D = 34887580973
mod=(mod*mod)%D = 39592247589
mod=(mod*mod)%D = 51169879776
mod=(mod*mod)%D = 26016070733
mod=(mod*mod)%D = 29657271891

mod=(mod* 2^4 )%D = 41076032729

---

mod=(mod*mod)%D = 22120162447
mod=(mod*mod)%D = 45609483711
mod=(mod*mod)%D = 49416360515
mod=(mod*mod)%D = 58025736930
mod=(mod*mod)%D = 25169243724
mod=(mod*mod)%D = 29919015099

mod=(mod* 2^6 )%D = 57215605506

---

mod=(mod*mod)%D = 31050105354
mod=(mod*mod)%D = 2960155768
mod=(mod*mod)%D = 17389932407
mod=(mod*mod)%D = 28504056210
mod=(mod*mod)%D = 40422916975
mod=(mod*mod)%D = 9761495609

mod=(mod* 2^63 )%D = 1
in this way 28 steps are carried out but it is not necessary to distinguish the two cases on the basis of the value of the bits of P in each single step.
User140242 is offline   Reply With Quote
Old 2022-09-15, 13:21   #30
User140242
 
Jul 2022

7510 Posts
Default

In reference to post #29 maybe this: Find factors of a Mersenne number might be of interest to someone.
User140242 is offline   Reply With Quote
Old 2023-01-07, 12:26   #31
User140242
 
Jul 2022

3×52 Posts
Default

After realizing an error in the previous version I updated the page with the new version. I hope this is correct I tested to look for the factors of M86000063 up to 2^52 and the execution time is about 9s twice as long as Prime95.

Quote:
Originally Posted by kriesel View Post
...
As another point of comparison, prime95 V30.7b9 on i5-1035G1 (4 core & HT),
Factor=86000063,48,56

[Jul 18 04:36:14] Starting trial factoring of M86000063 to 2^56
[Jul 18 04:36:15] M86000063 has a factor: 61920045361 (TF:48-56)
[Jul 18 04:36:15] Trial factoring M86000063 to 2^50.
[Jul 18 04:36:15] Trial factoring M86000063 to 2^51.
[Jul 18 04:36:18] Trial factoring M86000063 to 2^52.
...
User140242 is offline   Reply With Quote
Old 2023-01-09, 14:47   #32
User140242
 
Jul 2022

3×52 Posts
Default

I am writing this post as a learning exercise.

In the link in the previous post there is a version to find the possible factors of prime numbers below a certain set value, the search is done using the 4 possible classes mod(2*p*12) and performing a sieve for small prime numbers.

Summarizing:

if we use modular arithmetic a possible factor can be written as d=r+bW*k
where r is the remainder associated with the possible class and bW the modulus.

As we know the possible factors are of the type d=1+2*p*k so bW=2*p.

Furthermore d=1(mod 8) or d=7(mod 8) this narrows the field into factors of the type:

d=1+8*p*k and d=1+2*p+8*p*k if p=3(mod4)

or

d=1+8*p*k and d=1+6*p+8*p*k if p=1(mod4)

in this way bW=2*p*4 and we have used only 2 classes (useful classes (r1=1,r2=1+2*p) or (r1=1,r2=1+6*p)) of the 4 possible for d=r+bW*k.

In this post there is a code to find the useful classes considering bW=2*p*PrimesBaseProd where PrimesBaseProd is the maximum number of classes
and nR the number of useful classes that coincides with phi(PrimesBaseProd) where phi() is the Eulers's Totient function.

Although with the sieve all the multiples for small primes are eliminated, the purpose of increasing the number of classes is to decrease the size of the memory used and to have more useful classes to work in parallel with multithreading.

So I made this version where it is possible to vary the n_PB variable to increase classes:
for n_PB=1 max_num_classes=4, n_PB=2 max_num_classes=12, n_PB=3 max_num_classes=60, n_PB=4 max_num_classes=420, (n_PB=5 max_num_classes=4620 not tested)

Code:
#include <iostream>
#include <cmath>
#include <algorithm>
#include <vector>
#include <cstdlib>
#include <stdint.h>

int64_t Euclidean_Diophantine( int64_t coeff_a, int64_t  coeff_b)
{
    // return y in  Diophantine equation  coeff_a x + coeff_b y  = 1
    int64_t k=1;
    std::vector<int64_t> div_t; 
    std::vector<int64_t> rem_t;
    std::vector<int64_t> coeff_t;
    div_t.push_back(coeff_a);
    rem_t.push_back(coeff_b);
    coeff_t.push_back((int64_t)0);
    div_t.push_back((int64_t)div_t[0] / rem_t[0]);
    rem_t.push_back((int64_t)div_t[0] % rem_t[0]);
    coeff_t.push_back((int64_t)0);
    while (rem_t[k] > 1)
    {
        k++;
        div_t.push_back((int64_t)rem_t[k-2] / rem_t[k-1]);
        rem_t.push_back((int64_t)rem_t[k-2] % rem_t[k-1]);
        coeff_t.push_back((int64_t)0);
    }
    k--;
    coeff_t[k] = - div_t[k+1];
    if (k>0)
        coeff_t[k-1] = (int64_t)1;
    while (k > 1)
    {
        k--;
        coeff_t[k-1] = coeff_t[k+1];
        coeff_t[k] += (int64_t)(coeff_t[k+1] * (-div_t[k+1]));
    }
    if (k == 1)
        return (int64_t)(coeff_t[k-1] + coeff_t[k] * (-div_t[k]));
    else
        return (int64_t)(coeff_t[0]);
}

uint64_t XY_modulo_D_LN(uint64_t X, uint64_t Y, uint64_t D, uint64_t D_max)
{
    //D>D_max  tested for D<2^54 
    uint64_t mod_D;
    if (X <= D_max && Y <= D_max )
    {
        mod_D = (X * Y) % D;
    }
    else if (X <= D_max || Y <= D_max )
    {
        uint64_t N_t;
        uint64_t RD = D % D_max;
        uint64_t KD = D / D_max;
        if (Y <= D_max)
        {
            uint64_t X1 = Y;
            Y = X;
            X = X1;
        }      
        mod_D = (X * (Y % D_max)) % D_max;
        N_t = X * (Y / D_max) + (X * (Y % D_max)) / D_max;
        mod_D = (mod_D + (D_max *(N_t % KD)) % D) % D;
        mod_D = (mod_D + D -((N_t / KD) * RD) % D) % D;
    }
    else
    {
        uint64_t RD = D % D_max;
        uint64_t KD = D / D_max;
        uint64_t Rx = X % D_max;
        uint64_t Ry = Y % D_max;
        uint64_t Kx = X / D_max;
        uint64_t Ky = Y / D_max;
        uint64_t mod_D_t,N_t,R_t;

        mod_D =(Rx * Ry) % D;
        mod_D_t = ((D_max * (D_max % KD)) % D) % D;
        mod_D_t = (mod_D_t + D -((D_max / KD) * RD) % D) % D;
        N_t = mod_D_t;
        R_t = Kx;
        if (N_t >= (D_max / R_t))
        {
            mod_D_t = (R_t *(N_t % D_max)) % D_max;
            N_t = R_t * (N_t / D_max)+(R_t * (N_t % D_max)) / D_max;
            mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D;
        }
        else
            mod_D_t = mod_D_t * R_t;
        N_t = mod_D_t;
        R_t = Ky;
        if (N_t >= (D_max / R_t))
        {
            mod_D_t = (R_t * (N_t % D_max)) % D_max;
            N_t = R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max;
            mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D;
        }
        else
            mod_D_t = mod_D_t * R_t;
        N_t = mod_D + mod_D_t;
        mod_D = N_t;
        if (N_t >= D_max)
        {
            mod_D =(N_t) % D_max;
            mod_D =(mod_D + (D_max *((N_t / D_max) % KD)) % D) % D;
            mod_D =(mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D;
        }
        if (Rx >= KD)
        {
            mod_D_t = ((D_max * (Rx % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((Rx / KD) * RD) % D) % D;
        }
        else
           mod_D_t = (Rx * D_max) % D;
        N_t = mod_D_t;
        R_t = Ky;
        if (N_t >= (D_max / R_t))
        {
            mod_D_t = (R_t * (N_t % D_max)) % D_max;
            N_t = R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max;
            mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D;
        }
        else
            mod_D_t = mod_D_t * R_t;
        N_t = mod_D + mod_D_t;
        mod_D = N_t;
        if (N_t >= D_max)
        {
            mod_D = N_t % D_max;
            mod_D=(mod_D + (D_max*((N_t / D_max) % KD)) % D) % D;
            mod_D=(mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D;
        }
        if (Ry >= KD)
        {
            mod_D_t = ((D_max * (Ry % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((Ry/KD) * RD) % D) % D;
        }
        else
           mod_D_t = (Ry * D_max) % D;
        N_t = mod_D_t;
        R_t = Kx;
        if (N_t >= (D_max / R_t))
        {
            mod_D_t = (R_t * (N_t % D_max)) % D_max;
            N_t=R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max;
            mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D;
            mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D;
        }
        else
            mod_D_t = mod_D_t * R_t;
        N_t = mod_D + mod_D_t;
        mod_D = N_t;
        if (N_t >= D_max)
        {
            mod_D = N_t % D_max;
            mod_D = (mod_D + (D_max * ((N_t / D_max) % KD)) % D) % D;
            mod_D = (mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D;
        }
    }
    return mod_D;
}

bool is_2pow_onemod(uint64_t P, uint64_t D, uint64_t D_max)
{ 
    // return true if 2^P = 1 (mod D)
    uint64_t x = 2;
    uint64_t y = 1;
    if (D <= D_max)
    {
        while (P > (uint64_t)1)
        {
            if (P % 2 == 1)
                y = (x * y) % D;
            x = (x * x) % D;
            P >>= 1;
        }
        return (((x * y) % D) == 1);
    }
    else
    {
        while (P > (uint64_t)1)
        {
            if (P % 2 == 1)
                y=XY_modulo_D_LN(x, y, D, D_max);
            x = XY_modulo_D_LN(x, x, D, D_max);  
            P >>= 1;
        }
        return (XY_modulo_D_LN(x, y, D, D_max) == (uint64_t)1);
    }
}

void mersenne_first_factor(uint64_t P, int64_t max_exp_D)
{ 
    // P>61 returns, if it exists, the first factor D < 2^max_exp_D of (2^P - 1)  

    // maximum memory size 
    int64_t max_memory_size = 209715200;
    
    // set the maximum value small primes for sieve
    int64_t  dim_primes = 8192;

    // set the maximum value for modulo function XY_modulo_D_LN
    uint64_t D_max_mod = (uint64_t) 1 << 30;

    //It is possible to vary n_PB to change the maximum number of classes, for example n_PB=1 max_num_classes=4,        
    //n_PB=2 max_num_classes=12, n_PB=3 max_num_classes=60, n_PB=4 max_num_classes=420, (n_PB=5 max_num_classes=4620 not tested)
    int n_PB = 2; 
    int64_t PrimesBase[5] = {2,3,5,7,11};
	int64_t max_num_classes = 2;
    for(int j = 0; j < n_PB; j++)
        max_num_classes *= PrimesBase[j];

    int64_t  bW = (int64_t)(2 * P * max_num_classes);
    dim_primes = std::min(bW, dim_primes);

    std::vector<int64_t> RW;
    int64_t p0_sieve = PrimesBase[n_PB - 1] + 2;        
    if (n_PB>1)
    {    
        int64_t r_t1,r_t7;
        int j;
        r_t1 = 1;        
        r_t7 = 1 + 2 * P;
        if (P % 4 == 1)
            r_t7 += 4 * P;
        for (int64_t k = 1; k <= max_num_classes/4; k++)
        {
            for (j = 1; j < n_PB; j++)
                if (r_t1 % PrimesBase[j] == 0)
                    break;
            if (j == n_PB)
                RW.push_back(r_t1);
            for (j = 1 ; j < n_PB ; j++)
                if (r_t7 % PrimesBase[j] == 0)
                    break;
            if (j == n_PB)
                RW.push_back(r_t7);
            r_t1 += 8 * P;
            r_t7 += 8 * P;
        }
    }
    else
    {
        p0_sieve = 3;  
        RW.push_back(1);
        RW.push_back(1 + 2 * P);
        if (P % 4 == 1)
            RW[1] += 4 * P;    
    }
    int  nR=RW.size();

    int64_t max_D = pow(2,max_exp_D);
    
    max_exp_D = std::min(max_exp_D , (int64_t)(P / 2 + 1));
    int64_t  dim_step = max_D / bW;
    if (dim_step * nR > max_memory_size){
        dim_step = max_memory_size / nR;
        max_D = dim_step * bW;
    }
	
    // vector for find small primes for sieve
    std::vector<char> Primes_s(dim_primes + 1, true);

    // find small primes for sieve    
    for (int64_t p = 3; p * p <= dim_primes ; p += 2)
        if (Primes_s[p])
            for (int64_t m = p * p; m <= dim_primes; m += p)
                Primes_s[m]=false;
    if ((int64_t)P <= dim_primes)
        Primes_s[(int64_t)P] = false;

    for(int j = 1; j < nR; j++)
    {
        if (RW[j] <= dim_primes)
            Primes_s[RW[j]] = false;        
    }

    // vector for find factor primes remainder RWi mod bW  factor D= RWi+bW*k  
    std::vector<char> Factor_Sieve(nR * dim_step, true); 

    int64_t  r_t, r_t1, C1, C2, m, mmin;
    for (int64_t p = p0_sieve; p < dim_primes; p += 2)
    {
        if (Primes_s[p])
        {
            r_t1 = Euclidean_Diophantine(bW, bW - p);
            for(int j = 0; j < nR; j++)
            {
                // delete multiples of small primes in Factor_Sieve
                r_t = (r_t1 * (- RW[j])) % bW;
                if (r_t > 1)
                    r_t -= bW;
                C2 = (int64_t) (( - bW + p) * r_t / bW);
                C1 = r_t - bW + p;
                mmin=(int64_t)(bW + C1 + C2);
                for (m = mmin; m < dim_step && m >= 0; m += p)
                    Factor_Sieve[m + j * dim_step] = false;
            }
        }
    }
    Factor_Sieve[0] = false;		
    int D=0;
    uint64_t  D_RWt;    
    for(int j = 0; j < nR; j++)
    {
        D_RWt = RW[j];

        for (int64_t k = 0; k < dim_step; D_RWt += bW, k++)
        {
            if (Factor_Sieve[k + j * dim_step] == true)
            {
                if (is_2pow_onemod(P, D_RWt, D_max_mod)) 
                {
                    std::cout << "M" << P << " factor: " << D_RWt << std::endl;
                    D++;
                }
            }
        }
    }
    if (D == 0)
    {
        if (max_exp_D == (int64_t) (P / 2 + 1) && max_D == pow(2, max_exp_D))
            std::cout << "2^" << P << " -1 is prime" << std::endl;
        else if (max_D == pow(2, max_exp_D))
            std::cout <<"M" << P << " no factor < 2^" << max_exp_D << std::endl;
        else{
            std::cout <<"exceeded the maximum size of memory"<< std::endl;
            std::cout <<"M" << P << " no factor < " << max_D << std::endl;
        }
    }
}

int main()
{ 
    mersenne_first_factor(86000063 , 52);

    return 0;
}
The problem is that even though the array is smaller for larger lookup values, increasing the number of classes increases the execution time.
I was expecting a reduction in execution time maybe I'm doing something wrong?
User140242 is offline   Reply With Quote
Old 2023-01-11, 20:20   #33
User140242
 
Jul 2022

3×52 Posts
Default

Once the following variables have been set:

Code:
    int b = 5;
    uint64_t P_t = P;
    std::vector<int> Residue_P;
    while (P_t >= (uint64_t)(1 << b)) {
        Residue_P.push_back((int)(P_t & ((1 << b) - 1)));
        P_t >>= b;
    }
    Residue_P.push_back((int)P_t);
    int num_res = Residue_P.size();
in place of is_2pow_onemod() it is possible to alternatively use the algorithm described in post #29

Code:
bool is_2pow_1mod(uint64_t P, uint64_t D, std::vector<int> &Residue_P, int num_res, uint64_t D_max, int b)
{ 
    // return true if 2^P = 1 (mod D)
    uint64_t mod = pow(2, Residue_P[num_res - 1]);
    num_res--;
    if (D <= D_max)
    {
        if(num_res > 0) {
            for (int i = num_res-1; i >= 0; i--) 
            {
                for (int j = 0; j < b; j++) 
                {
                    mod *= mod;
                    mod %= D;
                }
                mod <<= Residue_P[i];
                mod %= D;
            }
        }
        else
            mod %= D;
    }
    else
    {
        if(num_res > 0) {
            for (int i = num_res-1; i >= 0; i--) 
            {
                for (int j = 0; j < b; j++) 
                    mod = XY_modulo_D_LN(mod, mod, D, D_max);
                mod = XY_modulo_D_LN(mod, pow(2,Residue_P[i]), D, D_max);
            }
        }
        else
            mod %= D;        
    }
    return (mod == (uint64_t)1);
}
I've only tested it for a few values ​​but it should be slightly faster because the value of the P bits is not checked.
User140242 is offline   Reply With Quote
Reply

Thread Tools


All times are UTC. The time now is 15:54.


Thu Feb 2 15:54:38 UTC 2023 up 168 days, 13:23, 1 user, load averages: 0.50, 0.73, 0.84

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔