hermann on Mon, 16 Feb 2026 15:49:59 +0100


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

question on PARI/GP t_INTMOD exponentiation optimization for multiple similar size big exponents with same base


It all started with me trying to play with J. Brillhart, D. H. Lehmer and J. L. Selfridge 1975 paper theorem 1:
https://t5k.org/prove/prove3_1.html

While theorem 1 does not require the same a_i for different prime divisors p_i of N-1 to be used, I learned that a primitive root allows to be used for all prime divisors of N-1.


I played with prime Euclid numbers https://oeis.org/A014545, for which the factorization of N-1 is trivial. A simple sequential replacement of znprimroot() is much faster because znprimroot() has to factor N-1 first, and prime 457th Euclid number is smallest example where GP has to spend much time on the factorization:

hermann@7950x:~$ gp -q seq.gp
? #
   timer = 1 (on)
? euclid_prime_find_root(457)
cpu time = 8,368 ms, real time = 8,370 ms.
31
? lift(znprimroot(prime(457)#+1))
cpu time = 1min, 34,734 ms, real time = 1min, 34,744 ms.
31
? #digits(prime(457)#+1)
1368
?


I created a parallel version of my gist which is much faster than sequential:
https://gist.github.com/Hermann-SW/a14dc268d533b96afbd52b2a9d3c92e8

hermann@7950x:~$ gp -q par2.gp
? #
   timer = 1 (on)
? euclid_prime_find_root(457)
cpu time = 14,400 ms, real time = 1,216 ms.
31
?


Next I wanted to speedup the parallel computation in passing Mod(g,N)^((N-1)/prime(n)) to check function as a base where the slightly bigger Mod(g,N)^((N-1)/prime(i)) can be computed from quickly for 1<=i<n. This is the small diff:

hermann@7950x:~$ diff par2.gp par3.gp
1,2c1,2
< check(En, n, phi, g, i)={
< parfor(i=1, n, lift(Mod(g, En)^(phi / prime(i))) == 1, r, if(r, return(0)));
---
check(En, n, phi, g, base, i)={
parfor(i=1, n, lift(base * Mod(g, En)^(phi / prime(i) - phi / prime(n))) == 1, r, if(r, return(0)));
11c11,12
<       if(check(En, n, phi, g, i), return(g));
---
      base=Mod(g, En)^(phi / prime(n));
      if(check(En, n, phi, g, base, i), return(g));
hermann@7950x:~$


I did let run seq.gp over night and hat to abort execution for prime 2673th Euclid number (with 10,387 decimal digits) after 12.5h. Parallel versions completed in 20 minutes (on 16C/32T AMD 7950X CPU), but to my surprise the runtime gets worse by my "improvement". First runtime for par2.gp, then for par3.gp:

? euclid_prime_find_root(2673)
%5 = 139
? ##
*** last result: cpu time 4h, 45min, 45,149 ms, real time 19min, 46,915 ms.
?

? euclid_prime_find_root(2673)
%4 = 139
? ##
*** last result: cpu time 4h, 48min, 3,278 ms, real time 22min, 14,287 ms.
?


It seems that GP does optimize muliple modular exponentiation with same base better than how I tried to do it.

1) What optimization is happening in GP for par2.gp exponentiation?
2) Where in the code base do I have to look for that optimization?


Regards,

Hermann.