Bill Allombert on Sat, 15 Jul 2023 00:24:38 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Why is "lift(Mod(qnr, n)^(n\4))" 16% slower than C libgmp "powm(r, qnr, n/4, n)" ?


On Fri, Jul 14, 2023 at 11:58:22PM +0200, hermann@stamm-wilbrandt.de wrote:
> On 2023-07-14 19:13, Bill Allombert wrote:
> > > The short answer is that the GNU MP library does not provide a
> > > function mpn_powm
> > > that PARI could use. mpz_powm use a lot of internal mpn functions
> > > for fast modular
> > > reduction which are very efficient but not public.
> > > 
> > > Now, I could add a wrapper for mpz_powm for large entries but 16%
> > > slower is not
> > > that bad and we need fast modular reduction in more general setting.
> > 
> > Could you compare
> > 
> > p=(2^95369 + 1)/3; Mod(2,p)^((p-1)/5)
> > with mpz_powm and PARI ?
> > 
> Sure.
> 
> Performance problem was related to the base, exponent and modulus I used.
> For your base, exponent and modulus gp outperforms C++ with mpz_powm() by
> 27% !
> That is very good to know.
> OS is Ubuntu 22.04 Desktop.
> 
> gp:  19.6s (5.440 GHz)
> g++: 26.9s (5.320 GHz)

Yes, I got the same result... but in fact, this is rather annoying!
If I change PARI to use mpz_powm then I will slowdown this case.

There are two factors that can effect this:
1) the base 2 is handled specially by PARI
2) the bit pattern of the exponent, in particular the number of 1s.

If you could find in which case PARI is faster then GMP, this would be helpful.

(you do not need to try very large numbers, 10000bit should be sufficient.)

Cheers,
Bill.