mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Programming

Reply
 
Thread Tools
Old 2022-11-04, 16:18   #1
rdmclure
 
Nov 2022

2 Posts
Default Java modPow slower than pow following by mod

testing how long it would take to raise 3 to a power x mod 2^p - 1.
I tried 2^p - 1 where p = the highest known mersenne prime 2^82589933
because of Java the exponent must be int so less than 2^32 - 1
I chose 2^26
so I try something like
Int a = 26;
int n = 82589933
int j =0;
BigInteger b = new BigInteger(“2”);
BigInteger c = new BigInteger (“3”);
BigInteger d= new BigInteger (“0”);
BigInteger e= new BigInteger (“0”);
BigInteger p= new BigInteger (“0”);
BigInteger large = new BigInteger (“0”);
BigInteger test= new BigInteger(“0”);
d=b.pow(a);
j = d.intVal(); // exponent must be int
e=b.pow(n); //
p=e.subtract(BigInteger.ONE); //prime
large = c.pow(j); // 3^(2^26);
test = large.mod(p);

this takes about a minute and a half to run
then I try
test = c.modPow(d,p) and I give up after an hour
I would feel better if they took roughly the same time.
rdmclure is offline   Reply With Quote
Old 2022-11-04, 18:20   #2
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

156178 Posts
Default

2^p - 1 for your p is a very large number at over 2 million digits. Of course it will take a long time to compute.
rogue is offline   Reply With Quote
Old 2022-11-04, 19:15   #3
rdmclure
 
Nov 2022

2 Posts
Default

but. … it didn’t take a long time .
this code is 2 ^ 82539833
a min and forty seconds
e=b.pow(n); //
p=e.subtract(BigInteger.ONE); //prime

It’s the modPow that is hours+
rdmclure is offline   Reply With Quote
Old 2022-11-04, 19:26   #4
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

165548 Posts
Default

Quote:
Originally Posted by rdmclure View Post
testing how long it would take to raise 3 to a power x mod 2^p - 1.
I tried 2^p - 1 where p = the highest known mersenne prime 2^82589933
because of Java the exponent must be int so less than 2^32 - 1
I chose 2^26
so I try something like
Int a = 26;
int n = 82589933
int j =0;
BigInteger b = new BigInteger(“2”);
BigInteger c = new BigInteger (“3”);
BigInteger d= new BigInteger (“0”);
BigInteger e= new BigInteger (“0”);
BigInteger p= new BigInteger (“0”);
BigInteger large = new BigInteger (“0”);
BigInteger test= new BigInteger(“0”);
d=b.pow(a); //2^26
j = d.intVal(); // exponent must be int
e=b.pow(n); //2^82589933
p=e.subtract(BigInteger.ONE); //prime
large = c.pow(j); // 3^(2^26);
test = large.mod(p);

this takes about a minute and a half to run
then I try
test = c.modPow(d,p) and I give up after an hour
I would feel better if they took roughly the same time.
Testing on how many cores & java streams of what processor?
The highest known mersenne prime is not "2^82589933" nor is its exponent.
226 = 67108864. Reached in 26 squarings.
2^82589933 is hugely bigger. Reached in 82,589,933 squarings.
Try working your way up gradually, rather than in one giant leap, & include accurate timing in the program, to determine run time scaling in acceptable testing time. For example p=
13
127
1279
11213
110503
graph resulting timings log/log & regression fit a power function to the timing data for which run time is a function of exponent. (Very small values may have initialization & other overhead greater than computation time, and should be excluded from the power fit.)

Well written, tuned and optimized compiled GPU code on a fast GPU (gpuowl 6.11-380 on Radeon VII) takes ~10 hours for PRP3 computation on M82589933, and scales as good fft implementations do ~p2 log p log log p ~p2.1. Higher powers are bad. Long multiplication algorithm scales as ~p3.
Naive implementations have been posted, whose run time scalings indicate very long wavefront timings, in one case longer than the estimated age of the universe.

Last fiddled with by kriesel on 2022-11-04 at 19:43
kriesel is offline   Reply With Quote
Old 2022-11-04, 19:31   #5
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

22×7×269 Posts
Default

Quote:
Originally Posted by rdmclure View Post
but. … it didn’t take a long time .
this code is 2 ^ 82539833
The string 82539833 does not appear in post 1.
kriesel is offline   Reply With Quote
Old 2022-11-04, 19:31   #6
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

5×2,237 Posts
Default

Quote:
Originally Posted by kriesel View Post
...in one case longer than the estimated age of the universe.
ROFL... Thank you for that! Seriously.
chalsall is offline   Reply With Quote
Old 2022-11-05, 11:22   #7
Till
 
Till's Avatar
 
"Tilman Neumann"
Jan 2016
Germany

13·41 Posts
Default

Quote:
Originally Posted by rdmclure View Post
testing how long it would take to raise 3 to a power x mod 2^p - 1.
I tried 2^p - 1 where p = the highest known mersenne prime 2^82589933
because of Java the exponent must be int so less than 2^32 - 1
I chose 2^26
so I try something like
Int a = 26;
int n = 82589933
int j =0;
BigInteger b = new BigInteger(“2”);
BigInteger c = new BigInteger (“3”);
BigInteger d= new BigInteger (“0”);
BigInteger e= new BigInteger (“0”);
BigInteger p= new BigInteger (“0”);
BigInteger large = new BigInteger (“0”);
BigInteger test= new BigInteger(“0”);
d=b.pow(a);
j = d.intVal(); // exponent must be int
e=b.pow(n); //
p=e.subtract(BigInteger.ONE); //prime
large = c.pow(j); // 3^(2^26);
test = large.mod(p);

this takes about a minute and a half to run
then I try
test = c.modPow(d,p) and I give up after an hour
I would feel better if they took roughly the same time.

I extended your test program to a broader class of inputs.
The (appended) program computes c^d mod p in both ways, where
* a = 2,3,4,5,...
* c is a random number in [3, 1023]
* d is a random number about the size of 2^a,
* and p is a random number about the size of 2^(2^a)

The results confirm the issue for random numbers of this kind. (see below)

Looking at BigInteger.modPow() source code, it looks quite sophisticated, using Montgomery squaring and multiplication and even tearing the modulus into an "odd part" and power of two,
and finally combining thew results using the chinese remainder theorem. But something the developers must have gotten terribly wrong.

Probably this is not the main cause, but one thing that irritated me for a long time is the following source code comment on BigInteger.montgomeryMultiply():
// Montgomery multiplication. These are wrappers for
// implMontgomeryXX routines which are expected to be replaced by
// virtual machine intrinsics. We don't use the intrinsics for
// very large operands: MONTGOMERY_INTRINSIC_THRESHOLD should be
// larger than any reasonable crypto key.

MONTGOMERY_INTRINSIC_THRESHOLD is 512 and measured in ints; so the intrinsics (optimized assembler code) is not used for modulus > 16384 bit.
This suggests that the method was never supposed to be optimized for large inputs.
Possibly it was only tested for the input sizes required by Java's own crypto libraries.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
test logs (getting interesting at a > 14)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

a = 2, c = 310, d = 7, computed in 0 ms ---------------------------
p (8 bit) computed in 1 ms
r = c^d (58 bit) computed in 0 ms
r % p (7 bit) computed in 0 ms
c.modPow(d,p) (7 bit) computed in 0 ms
results equal? true

a = 3, c = 36, d = 22, computed in 0 ms ---------------------------
p (24 bit) computed in 0 ms
r = c^d (114 bit) computed in 0 ms
r % p (24 bit) computed in 0 ms
c.modPow(d,p) (24 bit) computed in 0 ms
results equal? true

a = 4, c = 388, d = 24, computed in 0 ms ---------------------------
p (26 bit) computed in 0 ms
r = c^d (207 bit) computed in 0 ms
r % p (26 bit) computed in 0 ms
c.modPow(d,p) (26 bit) computed in 1 ms
results equal? true

a = 5, c = 379, d = 61, computed in 0 ms ---------------------------
p (62 bit) computed in 0 ms
r = c^d (523 bit) computed in 0 ms
r % p (60 bit) computed in 0 ms
c.modPow(d,p) (60 bit) computed in 0 ms
results equal? true

a = 6, c = 820, d = 183, computed in 0 ms ---------------------------
p (184 bit) computed in 0 ms
r = c^d (1772 bit) computed in 0 ms
r % p (183 bit) computed in 0 ms
c.modPow(d,p) (183 bit) computed in 0 ms
results equal? true

a = 7, c = 174, d = 201, computed in 0 ms ---------------------------
p (203 bit) computed in 0 ms
r = c^d (1497 bit) computed in 0 ms
r % p (203 bit) computed in 1 ms
c.modPow(d,p) (203 bit) computed in 0 ms
results equal? true

a = 8, c = 581, d = 345, computed in 0 ms ---------------------------
p (347 bit) computed in 0 ms
r = c^d (3168 bit) computed in 0 ms
r % p (347 bit) computed in 0 ms
c.modPow(d,p) (347 bit) computed in 0 ms
results equal? true

a = 9, c = 118, d = 1159, computed in 0 ms ---------------------------
p (1161 bit) computed in 0 ms
r = c^d (7977 bit) computed in 1 ms
r % p (1159 bit) computed in 0 ms
c.modPow(d,p) (1159 bit) computed in 0 ms
results equal? true

a = 10, c = 181, d = 2157, computed in 0 ms ---------------------------
p (2158 bit) computed in 1 ms
r = c^d (16178 bit) computed in 1 ms
r % p (2158 bit) computed in 0 ms
c.modPow(d,p) (2158 bit) computed in 1 ms
results equal? true

a = 11, c = 490, d = 3648, computed in 0 ms ---------------------------
p (3642 bit) computed in 0 ms
r = c^d (32601 bit) computed in 3 ms
r % p (3642 bit) computed in 2 ms
c.modPow(d,p) (3642 bit) computed in 0 ms
results equal? true

a = 12, c = 983, d = 11308, computed in 0 ms ---------------------------
p (11308 bit) computed in 1 ms
r = c^d (112414 bit) computed in 10 ms
r % p (11305 bit) computed in 8 ms
c.modPow(d,p) (11305 bit) computed in 3 ms
results equal? true

a = 13, c = 871, d = 24939, computed in 0 ms ---------------------------
p (24941 bit) computed in 0 ms
r = c^d (243568 bit) computed in 11 ms
r % p (24941 bit) computed in 9 ms
c.modPow(d,p) (24941 bit) computed in 21 ms
results equal? true

a = 14, c = 331, d = 24853, computed in 0 ms ---------------------------
p (24854 bit) computed in 0 ms
r = c^d (208037 bit) computed in 6 ms
r % p (24854 bit) computed in 6 ms
c.modPow(d,p) (24854 bit) computed in 18 ms
results equal? true

a = 15, c = 714, d = 76517, computed in 0 ms ---------------------------
p (76519 bit) computed in 1 ms
r = c^d (725365 bit) computed in 31 ms
r % p (76519 bit) computed in 34 ms
c.modPow(d,p) (76519 bit) computed in 154 ms
results equal? true

a = 16, c = 803, d = 143429, computed in 0 ms ---------------------------
p (143430 bit) computed in 1 ms
r = c^d (1383984 bit) computed in 104 ms
r % p (143429 bit) computed in 87 ms
c.modPow(d,p) (143429 bit) computed in 349 ms
results equal? true

a = 17, c = 520, d = 380768, computed in 0 ms ---------------------------
p (380770 bit) computed in 2 ms
r = c^d (3435429 bit) computed in 78 ms
r % p (380770 bit) computed in 146 ms
c.modPow(d,p) (380770 bit) computed in 2583 ms
results equal? true

a = 18, c = 792, d = 578458, computed in 0 ms ---------------------------
p (578460 bit) computed in 0 ms
r = c^d (5570179 bit) computed in 321 ms
r % p (578460 bit) computed in 293 ms
c.modPow(d,p) (578460 bit) computed in 6322 ms
results equal? true

a = 19, c = 370, d = 708558, computed in 0 ms ---------------------------
p (708559 bit) computed in 0 ms
r = c^d (6044979 bit) computed in 309 ms
r % p (708556 bit) computed in 330 ms
c.modPow(d,p) (708556 bit) computed in 9976 ms
results equal? true

a = 20, c = 888, d = 2737668, computed in 0 ms ---------------------------
p (2737669 bit) computed in 1 ms
r = c^d (26813859 bit) computed in 1849 ms
r % p (2737667 bit) computed in 2863 ms
c.modPow(d,p) (2737667 bit) computed in 148685 ms
results equal? true

a = 21, c = 126, d = 3460325, computed in 0 ms ---------------------------
p (3460327 bit) computed in 1 ms
r = c^d (24143657 bit) computed in 1902 ms
r % p (3460323 bit) computed in 2721 ms
c.modPow(d,p) (3460323 bit) computed in 260967 ms
results equal? true

a = 22, c = 621, d = 10025779, computed in 0 ms ---------------------------
p (10025777 bit) computed in 3 ms
r = c^d (93023684 bit) computed in 19541 ms
r % p (10025777 bit) computed in 19307 ms
c.modPow(d,p) (10025777 bit) computed in 2299139 ms
results equal? true
Attached Files
File Type: txt LargeModPowTest.txt (5.3 KB, 39 views)

Last fiddled with by Till on 2022-11-05 at 11:25 Reason: interesting tests at a>14
Till is offline   Reply With Quote
Old 2022-11-05, 17:24   #8
Till
 
Till's Avatar
 
"Tilman Neumann"
Jan 2016
Germany

13·41 Posts
Default

After some more investigations, it seems to me that Montgomery multiplication is a bad choice for this kind of numbers (small base, big exponent, big modulus).

Let B be the base, E the exponent, N the modulus.

With Montgomery multiplication, modPow() does something like O(log E) multiplications/squares (and possibly modulus computations) of N-size numbers.
If B^E is not much bigger than N, then a small fraction of those required Montgomery mults/squares are enough to match the runtime of the "silly" B^E computation.
Doing "naive" repeated squaring (without Montgomery) might be even better because it starts with very small products/squares, whose size will surpass the modulus in only a few last operations.

Last fiddled with by Till on 2022-11-05 at 17:27 Reason: clean up
Till is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Java applet alternative a1call Programming 19 2019-11-08 22:31
Simple Atkin sieve java example lucaf Computer Science & Computational Number Theory 8 2017-10-06 20:56
Java Quadratic Sieve Ilya Gazman Factoring 3 2016-02-22 11:32
Java based program GordonBM Information & Answers 11 2012-04-24 08:27
Understandable QS java Implementation ThiloHarich Factoring 41 2005-11-28 21:18

All times are UTC. The time now is 10:29.


Fri Mar 24 10:29:06 UTC 2023 up 218 days, 7:57, 0 users, load averages: 1.32, 1.20, 1.10

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔