hermann on Sun, 01 Oct 2023 10:21:54 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Pari/GP "halfgcd()" implementation


I came back to Bill's "myhalfgcd()" while working on functional
demo parity for RSA_numbers_factored repo.

Output on i7-11850H, that is a slowdown of more than 13000x ...

That was true, but for processing (large) 338342-digit numbers.
The factor is much less for smaller numbers.

Here is output of renamed function compared with other options
to compute x^2+y^2 of prime p =1 (mod 4) from sqrt(-1) (mod p):

$ gp -q < 36401.gp
36401-digit prime p
gcd(p, sqrtm1 + I)
  ***   last result computed in 2min, 6,352 ms.
qfbcornacchia(1, p)
  ***   last result computed in 1min, 29,420 ms.
ggcd([p, 0], [sqrtm1, 1])
  ***   last result computed in 14,647 ms.
billshalfgcd(sqrtm1,p)
  ***   last result computed in 4,808 ms.
halfgcd(sqrtm1,p)
  ***   last result computed in 6 ms.
all asserts OK
$

I transpiled the demo to JavaScript/NodeJS.
Complex bigint gcd, qfbcornacchia and halfgcd are missing.
But ggcd is available from RSA_numbers_repo.
And now billshalfgcd as well, and it is the fastest available
JavaScript option:

$ node 36401.js
RSA.validate() OK
36401-digit prime p
  no JS gcd(p, sqrtm1 + I)
  no JS qfbcornacchia(1, p)
ggcd([p, 0], [sqrtm1, 1])
94036 ms
billshalfgcd(sqrtm1,p)
26986 ms
  no JS halfgcd(sqrtm1,p)
all asserts OK
$


function billshalfgcd(a,b) {
    var M=[[1n,0n],[0n,1n]];
    var n = a<b ? b : a, q, r;
    while (b ** 2n >= n){
        [q,r] = [a/b,a%b];
        [a,b] = [b,r];
        // M = [0,1;1,-q]*M;
M = [[M[1][0], M[1][1]], [M[0][0]-q*M[1][0], M[0][1]-q*M[1][1]]];
    }
    return([M,[a,b]]);
}


Interesting, node is slower than Chrome JavaScript browser engine.
This is output on HTML page, same script:

36401-digit prime p
  no JS gcd(p, sqrtm1 + I)
  no JS qfbcornacchia(1, p)
ggcd([p, 0], [sqrtm1, 1])
47926 ms
billshalfgcd(sqrtm1,p)
13836 ms
  no JS halfgcd(sqrtm1,p)
all asserts OK


Not submitted to repo yet, just pushed to my website.
Firefox is superslow, Chrome is good.

Best start with few seconds 10000-digit demo first:
https://stamm-wilbrandt.de/10000/

I have to learn how to update HTML page on each "print()" output.
HTML page shows only after loading, after all computing finished:
https://stamm-wilbrandt.de/36401/

View page source shows the Javascript.


Regards,

Hermann.

On 2023-06-09 01:17, hermann@stamm-wilbrandt.de wrote:
Well, if you want a slow halfgcd, it is not harder than extended GCD:

myhalfgcd(a,b)=
{
  my(M=matid(2));
  my(n = max(a,b));
  while (b^2>=n,
    my([q,r]=divrem(a,b));
    [a,b] = [b,r];
    M = [0,1;1,-q]*M;
  );
  [M,[a,b]~];
}

(the matrix product can be inlined.)

halfgcdii implements both the Lehmer-Jebelean acceleration and the
subquadratic algorithm (following Thull-Yap).

I tried it, and it is really slow!

tail of "git diff 388342.halfgcd.gp", with your "myhalfgcd()" added at top:
...
 ##
-assert(V[2]^2 + M[2,1]^2 == p, "V[2],M[2,1]", " not sum of squares");
+[x,y] = [V[2], M[2,1]];
+assert(x^2 + y^2 == p, "x,y", " not sum of squares");
+
+[M,V]=myhalfgcd(sqrtm1,p);
+##
+[x,y] = [V[2], M[2,1]];
+assert(x^2 + y^2 == p, "x,y", " not sum of squares");
+
 print("done");


Output on i7-11850H, that is a slowdown of more than 13000x ...

$ python -c "print((13*60000+1518)/60)"
13025.3
$

$ gp-2.16 < 388342.halfgcd.gp
...
parisizemax = 2000003072, primelimit = 1048576, nbthreads = 8
foobar
  ***   last result: cpu time 60 ms, real time 60 ms.
*** last result: cpu time 13min, 1,518 ms, real time 13min, 4,658 ms.
done
Goodbye!
$


Regards,

Hermann Stamm-Wilbrandt.