![]() |
![]() |
#23 |
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
2·11·457 Posts |
![]()
Every road begins with a few first steps.
Just to give you, perhaps, a sandwich for the road (because this road is long), there are certain elements that can be used for 1) prototyping, 2) refactoring into something intermediate, 3) deep refactoring, with custom building blocks. (That is all with the meta-decision to stop at any stage if/when you might find that this was already done, or idea turned to be invalid or unoriginal or simply you might have lost interest. Or every once in a while you might have given it all you've got and the problem has not yielded. I've had a few where you do a lot of invisible work with no result to show... except maybe that final search limit was this or that. The other meta-consideration is to stop after or in parallel with Step 1 and estimate what, statistically, you expect to get. And at expense of how many core-years. It is a separate skill to acquire. Many people keep banging at problems where their expected yield is 1 hit per 10,000 core years ... and they only have 32 cores. Do the arithmetic.) For step 1) most of the time one can find that if you start using Pari/GP and get over the few first syntax hurdles, you will never quit it. Once you have a script you can test some hypothesis to a certain limit and with a bit of luck for some problems you will be done. For step 2) one would take a gp scaffold code and refactor it into GMP-assisted C/C++. This will be faster, but not lightning fast. For step 3) you can define (or refine based on your results from steps 1 and 2) the definition domain of the final implementation. Let's say this will be a 96-bit limit. (64-bit is even easier to implement; but 96 is quite commonly good; btw lots of folks have 96-bit oriented code.) Then if we take this toy problem as an example, you will realize that you want to implement a class (or a structure) with three 32-bit unsigned int limbs, and implement a) addition of elements a + b from it (sum up, carry over), b) multiplication x * y -- and you can do either "grade school" multiplication or Karatsuba. You can also borrow ad hoc implementations by studying other folks code, for example srsieve or polysieve (there is readily available ~55-bit multiplication implementation there that is cleverly using doubles for an intermediate). You can also find small ASM snippets to mulmod, powmod, invmod etc that can be embedded into C that will make code much faster, and then leave all other logic (if/else/for's) to gcc -O3. Then it is always fun to compare what you will get with simply using GMP despite its 'shoot from large cannons' overhead when all you need is 64 or 96 bits. Just 2 cents for the road. These hints might be useful for the general process of prototyping anything. Anything that you might think of tomorrow. |
![]() |
![]() |
![]() |
#24 |
If I May
"Chris Halsall"
Sep 2002
Barbados
2·72·113 Posts |
![]() |
![]() |
![]() |
![]() |
#25 | |
Jul 2022
3·52 Posts |
![]() Quote:
I will try to follow your advice. In the meantime I created this function that allows you to find the factor (~ 42 bits) of 13379827 or the factor (~ 46 bits) of 668267879 in less than 0.7 s. Code:
uint64_t XY_modulo_D_LN(uint64_t X,uint64_t Y,uint64_t D,uint64_t D_max) { // D<2^54 D>D_max uint64_t mod_D; if (X<=D_max && Y<=D_max ) { mod_D=(X*Y)%D; } else if (X<=D_max || Y<=D_max ) { uint64_t X1,N_t; uint64_t RD=(uint64_t)D%D_max; uint64_t KD=(uint64_t)D/D_max; if (Y<=D_max) { X1=Y; Y=X; X=X1; } mod_D=(X*(Y%D_max))%D_max; N_t=X*(Y/D_max)+(X*(Y%D_max))/D_max; mod_D=(mod_D+(D_max*(N_t%KD))%D)%D; mod_D=(mod_D+D-((N_t/KD)*RD)%D)%D; } else { uint64_t RD=(uint64_t)D%D_max; uint64_t KD=(uint64_t)D/D_max; uint64_t Rx=(uint64_t)X%D_max; uint64_t Ry=(uint64_t)Y%D_max; uint64_t Kx=(uint64_t)X/D_max; uint64_t Ky=(uint64_t)Y/D_max; uint64_t mod_D_t,N_t,R_t; mod_D=(Rx*Ry)%D; mod_D_t=((D_max*(D_max%KD))%D)%D; mod_D_t=(mod_D_t+D-((D_max/KD)*RD)%D)%D; N_t=mod_D_t; R_t=Kx; if (N_t>=(D_max/R_t)) { mod_D_t=(R_t*(N_t%D_max))%D_max; N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max; mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D; mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D; } else mod_D_t=mod_D_t*R_t; N_t=mod_D_t; R_t=Ky; if (N_t>=(D_max/R_t)) { mod_D_t=(R_t*(N_t%D_max))%D_max; N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max; mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D; mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D; } else mod_D_t=mod_D_t*R_t; N_t=mod_D+mod_D_t; mod_D=N_t; if (N_t>=D_max) { mod_D=(N_t)%D_max; mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D; mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D; } if (Rx>=KD) { mod_D_t=((D_max*(Rx%KD))%D)%D; mod_D_t=(mod_D_t+D-((Rx/KD)*RD)%D)%D; } else mod_D_t=(Rx*D_max)%D; N_t=mod_D_t; R_t=Ky; if (N_t>=(D_max/R_t)) { mod_D_t=(R_t*(N_t%D_max))%D_max; N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max; mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D; mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D; } else mod_D_t=mod_D_t*R_t; N_t=mod_D+mod_D_t; mod_D=N_t; if (N_t>=D_max) { mod_D=(N_t)%D_max; mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D; mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D; } if (Ry>=KD) { mod_D_t=((D_max*(Ry%KD))%D)%D; mod_D_t=(mod_D_t+D-((Ry/KD)*RD)%D)%D; } else mod_D_t=(Ry*D_max)%D; N_t=mod_D_t; R_t=Kx; if (N_t>=(D_max/R_t)) { mod_D_t=(R_t*(N_t%D_max))%D_max; N_t=R_t*(N_t/D_max)+(R_t*(N_t%D_max))/D_max; mod_D_t=(mod_D_t+(D_max*(N_t%KD))%D)%D; mod_D_t=(mod_D_t+D-((N_t/KD)*RD)%D)%D; } else mod_D_t=mod_D_t*R_t; N_t=mod_D+mod_D_t; mod_D=N_t; if (N_t>=D_max) { mod_D=(N_t)%D_max; mod_D=(mod_D+(D_max*((N_t/D_max)%KD))%D)%D; mod_D=(mod_D+D-(((N_t/D_max)/KD)*RD)%D)%D; } } return mod_D; } bool is_2pow_onemod(uint64_t P,uint64_t D) { // return true if 2^P = 1 (mod D) uint64_t x=2; uint64_t y=1; uint64_t D_max=(uint64_t)1<<30; if (D<=D_max) { while (P > (uint64_t)1) { if (P%2 == 1) y = (x * y) %D; x = (x * x) %D; P >>= 1; } return (((x*y)%D) == 1); } else { while (P > (uint64_t)1) { if (P%2 == 1) y=XY_modulo_D_LN(x,y,D,D_max); x=XY_modulo_D_LN(x,x,D,D_max); P >>= 1; } return (XY_modulo_D_LN(x,y,D,D_max) == (uint64_t)1); } } |
|
![]() |
![]() |
![]() |
#26 |
"Forget I exist"
Jul 2009
Dartmouth NS
2·3·23·61 Posts |
![]()
I'm not fluent in C but did your code use Mp prime ? If so D=2k(Mp)+1 for some k. You only have to test D with certain k values. Nevermind I see the 1 mod 8 versus 7 mod 8 talk earlier in the thread didn't read it fully.
Last fiddled with by science_man_88 on 2022-07-21 at 20:00 |
![]() |
![]() |
![]() |
#27 | |
"Forget I exist"
Jul 2009
Dartmouth NS
2×3×23×61 Posts |
![]() Quote:
|
|
![]() |
![]() |
![]() |
#28 | |
Jul 2022
3·52 Posts |
![]() Quote:
Here you find the final version https://gist.github.com/user140242/0...d2738788bb1cae. A wheel sieve is used which has been set to eliminate all multiples of small primes less than min(bW, 524288) including multiples of 3. |
|
![]() |
![]() |
![]() |
#29 | ||
Jul 2022
3×52 Posts |
![]()
To get the mind busy.
Several times I have been told in this thread that the best solution to calculate (2 ^ p-1) module D is the one described in the math page. As described, given a certain number P, a number of steps equal to the number of bits of P converted into binary are required. Example for p=86000063 you need 27 steps. In reality, different solutions can be adopted to achieve the same result with the same complexity, for example: fix b = 6 then 2^b=64 P=P%64 +P/64 *64 = P%64 +b1 *64 b1=b1%64 +b1/64 *64 = P%64 +b2 *64 then b2=b1/64 ..... bm+1 =bm%64 +bm/64 *64 continue until bm/64 <64 the remainder must be stored in a vector [P%64, b1%64, ... , bm%64] then we start from mod=2^(bm/64) the product is repeated for b=6 times mod=(mod*mod)%D mod=(mod*mod)%D mod=(mod*mod)%D mod=(mod*mod)%D mod=(mod*mod)%D mod=(mod*mod)%D then it is calculated mod=(mod* 2^(bm%64) )%D the whole is repeated for all the remainders in the vector. Practical example P=86000063 and D=61920045361 Quote:
Quote:
|
||
![]() |
![]() |
![]() |
#30 |
Jul 2022
4B16 Posts |
![]()
In reference to post #29 maybe this: Find factors of a Mersenne number might be of interest to someone.
|
![]() |
![]() |
![]() |
#31 | |
Jul 2022
3×52 Posts |
![]()
After realizing an error in the previous version I updated the page with the new version. I hope this is correct I tested to look for the factors of M86000063 up to 2^52 and the execution time is about 9s twice as long as Prime95.
Quote:
|
|
![]() |
![]() |
![]() |
#32 |
Jul 2022
3·52 Posts |
![]()
I am writing this post as a learning exercise.
In the link in the previous post there is a version to find the possible factors of prime numbers below a certain set value, the search is done using the 4 possible classes mod(2*p*12) and performing a sieve for small prime numbers. Summarizing: if we use modular arithmetic a possible factor can be written as d=r+bW*k where r is the remainder associated with the possible class and bW the modulus. As we know the possible factors are of the type d=1+2*p*k so bW=2*p. Furthermore d=1(mod 8) or d=7(mod 8) this narrows the field into factors of the type: d=1+8*p*k and d=1+2*p+8*p*k if p=3(mod4) or d=1+8*p*k and d=1+6*p+8*p*k if p=1(mod4) in this way bW=2*p*4 and we have used only 2 classes (useful classes (r1=1,r2=1+2*p) or (r1=1,r2=1+6*p)) of the 4 possible for d=r+bW*k. In this post there is a code to find the useful classes considering bW=2*p*PrimesBaseProd where PrimesBaseProd is the maximum number of classes and nR the number of useful classes that coincides with phi(PrimesBaseProd) where phi() is the Eulers's Totient function. Although with the sieve all the multiples for small primes are eliminated, the purpose of increasing the number of classes is to decrease the size of the memory used and to have more useful classes to work in parallel with multithreading. So I made this version where it is possible to vary the n_PB variable to increase classes: for n_PB=1 max_num_classes=4, n_PB=2 max_num_classes=12, n_PB=3 max_num_classes=60, n_PB=4 max_num_classes=420, (n_PB=5 max_num_classes=4620 not tested) Code:
#include <iostream> #include <cmath> #include <algorithm> #include <vector> #include <cstdlib> #include <stdint.h> int64_t Euclidean_Diophantine( int64_t coeff_a, int64_t coeff_b) { // return y in Diophantine equation coeff_a x + coeff_b y = 1 int64_t k=1; std::vector<int64_t> div_t; std::vector<int64_t> rem_t; std::vector<int64_t> coeff_t; div_t.push_back(coeff_a); rem_t.push_back(coeff_b); coeff_t.push_back((int64_t)0); div_t.push_back((int64_t)div_t[0] / rem_t[0]); rem_t.push_back((int64_t)div_t[0] % rem_t[0]); coeff_t.push_back((int64_t)0); while (rem_t[k] > 1) { k++; div_t.push_back((int64_t)rem_t[k-2] / rem_t[k-1]); rem_t.push_back((int64_t)rem_t[k-2] % rem_t[k-1]); coeff_t.push_back((int64_t)0); } k--; coeff_t[k] = - div_t[k+1]; if (k>0) coeff_t[k-1] = (int64_t)1; while (k > 1) { k--; coeff_t[k-1] = coeff_t[k+1]; coeff_t[k] += (int64_t)(coeff_t[k+1] * (-div_t[k+1])); } if (k == 1) return (int64_t)(coeff_t[k-1] + coeff_t[k] * (-div_t[k])); else return (int64_t)(coeff_t[0]); } uint64_t XY_modulo_D_LN(uint64_t X, uint64_t Y, uint64_t D, uint64_t D_max) { //D>D_max tested for D<2^54 uint64_t mod_D; if (X <= D_max && Y <= D_max ) { mod_D = (X * Y) % D; } else if (X <= D_max || Y <= D_max ) { uint64_t N_t; uint64_t RD = D % D_max; uint64_t KD = D / D_max; if (Y <= D_max) { uint64_t X1 = Y; Y = X; X = X1; } mod_D = (X * (Y % D_max)) % D_max; N_t = X * (Y / D_max) + (X * (Y % D_max)) / D_max; mod_D = (mod_D + (D_max *(N_t % KD)) % D) % D; mod_D = (mod_D + D -((N_t / KD) * RD) % D) % D; } else { uint64_t RD = D % D_max; uint64_t KD = D / D_max; uint64_t Rx = X % D_max; uint64_t Ry = Y % D_max; uint64_t Kx = X / D_max; uint64_t Ky = Y / D_max; uint64_t mod_D_t,N_t,R_t; mod_D =(Rx * Ry) % D; mod_D_t = ((D_max * (D_max % KD)) % D) % D; mod_D_t = (mod_D_t + D -((D_max / KD) * RD) % D) % D; N_t = mod_D_t; R_t = Kx; if (N_t >= (D_max / R_t)) { mod_D_t = (R_t *(N_t % D_max)) % D_max; N_t = R_t * (N_t / D_max)+(R_t * (N_t % D_max)) / D_max; mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D; } else mod_D_t = mod_D_t * R_t; N_t = mod_D_t; R_t = Ky; if (N_t >= (D_max / R_t)) { mod_D_t = (R_t * (N_t % D_max)) % D_max; N_t = R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max; mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D; } else mod_D_t = mod_D_t * R_t; N_t = mod_D + mod_D_t; mod_D = N_t; if (N_t >= D_max) { mod_D =(N_t) % D_max; mod_D =(mod_D + (D_max *((N_t / D_max) % KD)) % D) % D; mod_D =(mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D; } if (Rx >= KD) { mod_D_t = ((D_max * (Rx % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((Rx / KD) * RD) % D) % D; } else mod_D_t = (Rx * D_max) % D; N_t = mod_D_t; R_t = Ky; if (N_t >= (D_max / R_t)) { mod_D_t = (R_t * (N_t % D_max)) % D_max; N_t = R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max; mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D; } else mod_D_t = mod_D_t * R_t; N_t = mod_D + mod_D_t; mod_D = N_t; if (N_t >= D_max) { mod_D = N_t % D_max; mod_D=(mod_D + (D_max*((N_t / D_max) % KD)) % D) % D; mod_D=(mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D; } if (Ry >= KD) { mod_D_t = ((D_max * (Ry % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((Ry/KD) * RD) % D) % D; } else mod_D_t = (Ry * D_max) % D; N_t = mod_D_t; R_t = Kx; if (N_t >= (D_max / R_t)) { mod_D_t = (R_t * (N_t % D_max)) % D_max; N_t=R_t * (N_t / D_max) + (R_t * (N_t % D_max)) / D_max; mod_D_t = (mod_D_t + (D_max * (N_t % KD)) % D) % D; mod_D_t = (mod_D_t + D - ((N_t / KD) * RD) % D) % D; } else mod_D_t = mod_D_t * R_t; N_t = mod_D + mod_D_t; mod_D = N_t; if (N_t >= D_max) { mod_D = N_t % D_max; mod_D = (mod_D + (D_max * ((N_t / D_max) % KD)) % D) % D; mod_D = (mod_D + D - (((N_t / D_max) / KD) * RD) % D) % D; } } return mod_D; } bool is_2pow_onemod(uint64_t P, uint64_t D, uint64_t D_max) { // return true if 2^P = 1 (mod D) uint64_t x = 2; uint64_t y = 1; if (D <= D_max) { while (P > (uint64_t)1) { if (P % 2 == 1) y = (x * y) % D; x = (x * x) % D; P >>= 1; } return (((x * y) % D) == 1); } else { while (P > (uint64_t)1) { if (P % 2 == 1) y=XY_modulo_D_LN(x, y, D, D_max); x = XY_modulo_D_LN(x, x, D, D_max); P >>= 1; } return (XY_modulo_D_LN(x, y, D, D_max) == (uint64_t)1); } } void mersenne_first_factor(uint64_t P, int64_t max_exp_D) { // P>61 returns, if it exists, the first factor D < 2^max_exp_D of (2^P - 1) // maximum memory size int64_t max_memory_size = 209715200; // set the maximum value small primes for sieve int64_t dim_primes = 8192; // set the maximum value for modulo function XY_modulo_D_LN uint64_t D_max_mod = (uint64_t) 1 << 30; //It is possible to vary n_PB to change the maximum number of classes, for example n_PB=1 max_num_classes=4, //n_PB=2 max_num_classes=12, n_PB=3 max_num_classes=60, n_PB=4 max_num_classes=420, (n_PB=5 max_num_classes=4620 not tested) int n_PB = 2; int64_t PrimesBase[5] = {2,3,5,7,11}; int64_t max_num_classes = 2; for(int j = 0; j < n_PB; j++) max_num_classes *= PrimesBase[j]; int64_t bW = (int64_t)(2 * P * max_num_classes); dim_primes = std::min(bW, dim_primes); std::vector<int64_t> RW; int64_t p0_sieve = PrimesBase[n_PB - 1] + 2; if (n_PB>1) { int64_t r_t1,r_t7; int j; r_t1 = 1; r_t7 = 1 + 2 * P; if (P % 4 == 1) r_t7 += 4 * P; for (int64_t k = 1; k <= max_num_classes/4; k++) { for (j = 1; j < n_PB; j++) if (r_t1 % PrimesBase[j] == 0) break; if (j == n_PB) RW.push_back(r_t1); for (j = 1 ; j < n_PB ; j++) if (r_t7 % PrimesBase[j] == 0) break; if (j == n_PB) RW.push_back(r_t7); r_t1 += 8 * P; r_t7 += 8 * P; } } else { p0_sieve = 3; RW.push_back(1); RW.push_back(1 + 2 * P); if (P % 4 == 1) RW[1] += 4 * P; } int nR=RW.size(); int64_t max_D = pow(2,max_exp_D); max_exp_D = std::min(max_exp_D , (int64_t)(P / 2 + 1)); int64_t dim_step = max_D / bW; if (dim_step * nR > max_memory_size){ dim_step = max_memory_size / nR; max_D = dim_step * bW; } // vector for find small primes for sieve std::vector<char> Primes_s(dim_primes + 1, true); // find small primes for sieve for (int64_t p = 3; p * p <= dim_primes ; p += 2) if (Primes_s[p]) for (int64_t m = p * p; m <= dim_primes; m += p) Primes_s[m]=false; if ((int64_t)P <= dim_primes) Primes_s[(int64_t)P] = false; for(int j = 1; j < nR; j++) { if (RW[j] <= dim_primes) Primes_s[RW[j]] = false; } // vector for find factor primes remainder RWi mod bW factor D= RWi+bW*k std::vector<char> Factor_Sieve(nR * dim_step, true); int64_t r_t, r_t1, C1, C2, m, mmin; for (int64_t p = p0_sieve; p < dim_primes; p += 2) { if (Primes_s[p]) { r_t1 = Euclidean_Diophantine(bW, bW - p); for(int j = 0; j < nR; j++) { // delete multiples of small primes in Factor_Sieve r_t = (r_t1 * (- RW[j])) % bW; if (r_t > 1) r_t -= bW; C2 = (int64_t) (( - bW + p) * r_t / bW); C1 = r_t - bW + p; mmin=(int64_t)(bW + C1 + C2); for (m = mmin; m < dim_step && m >= 0; m += p) Factor_Sieve[m + j * dim_step] = false; } } } Factor_Sieve[0] = false; int D=0; uint64_t D_RWt; for(int j = 0; j < nR; j++) { D_RWt = RW[j]; for (int64_t k = 0; k < dim_step; D_RWt += bW, k++) { if (Factor_Sieve[k + j * dim_step] == true) { if (is_2pow_onemod(P, D_RWt, D_max_mod)) { std::cout << "M" << P << " factor: " << D_RWt << std::endl; D++; } } } } if (D == 0) { if (max_exp_D == (int64_t) (P / 2 + 1) && max_D == pow(2, max_exp_D)) std::cout << "2^" << P << " -1 is prime" << std::endl; else if (max_D == pow(2, max_exp_D)) std::cout <<"M" << P << " no factor < 2^" << max_exp_D << std::endl; else{ std::cout <<"exceeded the maximum size of memory"<< std::endl; std::cout <<"M" << P << " no factor < " << max_D << std::endl; } } } int main() { mersenne_first_factor(86000063 , 52); return 0; } I was expecting a reduction in execution time maybe I'm doing something wrong? |
![]() |
![]() |
![]() |
#33 |
Jul 2022
3×52 Posts |
![]()
Once the following variables have been set:
Code:
int b = 5; uint64_t P_t = P; std::vector<int> Residue_P; while (P_t >= (uint64_t)(1 << b)) { Residue_P.push_back((int)(P_t & ((1 << b) - 1))); P_t >>= b; } Residue_P.push_back((int)P_t); int num_res = Residue_P.size(); Code:
bool is_2pow_1mod(uint64_t P, uint64_t D, std::vector<int> &Residue_P, int num_res, uint64_t D_max, int b) { // return true if 2^P = 1 (mod D) uint64_t mod = pow(2, Residue_P[num_res - 1]); num_res--; if (D <= D_max) { if(num_res > 0) { for (int i = num_res-1; i >= 0; i--) { for (int j = 0; j < b; j++) { mod *= mod; mod %= D; } mod <<= Residue_P[i]; mod %= D; } } else mod %= D; } else { if(num_res > 0) { for (int i = num_res-1; i >= 0; i--) { for (int j = 0; j < b; j++) mod = XY_modulo_D_LN(mod, mod, D, D_max); mod = XY_modulo_D_LN(mod, pow(2,Residue_P[i]), D, D_max); } } else mod %= D; } return (mod == (uint64_t)1); } |
![]() |
![]() |