| hermann on Mon, 16 Feb 2026 15:49:59 +0100 |
[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]
| question on PARI/GP t_INTMOD exponentiation optimization for multiple similar size big exponents with same base |
https://t5k.org/prove/prove3_1.htmlWhile theorem 1 does not require the same a_i for different prime divisors p_i of N-1 to be used, I learned that a primitive root allows to be used for all prime divisors of N-1.
I played with prime Euclid numbers https://oeis.org/A014545, for which the factorization of N-1 is trivial. A simple sequential replacement of znprimroot() is much faster because znprimroot() has to factor N-1 first, and prime 457th Euclid number is smallest example where GP has to spend much time on the factorization:
hermann@7950x:~$ gp -q seq.gp ? # timer = 1 (on) ? euclid_prime_find_root(457) cpu time = 8,368 ms, real time = 8,370 ms. 31 ? lift(znprimroot(prime(457)#+1)) cpu time = 1min, 34,734 ms, real time = 1min, 34,744 ms. 31 ? #digits(prime(457)#+1) 1368 ?I created a parallel version of my gist which is much faster than sequential:
https://gist.github.com/Hermann-SW/a14dc268d533b96afbd52b2a9d3c92e8 hermann@7950x:~$ gp -q par2.gp ? # timer = 1 (on) ? euclid_prime_find_root(457) cpu time = 14,400 ms, real time = 1,216 ms. 31 ?Next I wanted to speedup the parallel computation in passing Mod(g,N)^((N-1)/prime(n)) to check function as a base where the slightly bigger Mod(g,N)^((N-1)/prime(i)) can be computed from quickly for 1<=i<n. This is the small diff:
hermann@7950x:~$ diff par2.gp par3.gp
1,2c1,2
< check(En, n, phi, g, i)={
< parfor(i=1, n, lift(Mod(g, En)^(phi / prime(i))) == 1, r, if(r,
return(0)));
---
check(En, n, phi, g, base, i)={
parfor(i=1, n, lift(base * Mod(g, En)^(phi / prime(i) - phi /
prime(n))) == 1, r, if(r, return(0)));
11c11,12 < if(check(En, n, phi, g, i), return(g)); ---
base=Mod(g, En)^(phi / prime(n));
if(check(En, n, phi, g, base, i), return(g));
hermann@7950x:~$I did let run seq.gp over night and hat to abort execution for prime 2673th Euclid number (with 10,387 decimal digits) after 12.5h. Parallel versions completed in 20 minutes (on 16C/32T AMD 7950X CPU), but to my surprise the runtime gets worse by my "improvement". First runtime for par2.gp, then for par3.gp:
? euclid_prime_find_root(2673) %5 = 139 ? ##*** last result: cpu time 4h, 45min, 45,149 ms, real time 19min, 46,915 ms.
? ? euclid_prime_find_root(2673) %4 = 139 ? ##*** last result: cpu time 4h, 48min, 3,278 ms, real time 22min, 14,287 ms.
?It seems that GP does optimize muliple modular exponentiation with same base better than how I tried to do it.
1) What optimization is happening in GP for par2.gp exponentiation? 2) Where in the code base do I have to look for that optimization? Regards, Hermann.