Re: Parallel Processing of subsets

Bill Allombert on Mon, 19 Jun 2023 23:49:51 +0200

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Parallel Processing of subsets

To: pari-users@pari.math.u-bordeaux.fr
Subject: Re: Parallel Processing of subsets
From: Bill Allombert <Bill.Allombert@math.u-bordeaux.fr>
Date: Mon, 19 Jun 2023 23:45:09 +0200
Arc-authentication-results: i=1; smail; arc=none
Arc-message-signature: i=1; a=rsa-sha256; d=math.u-bordeaux.fr; s=openarc; t=1687211100; c=relaxed/relaxed; bh=LAY34boXeoIJhfRnnHxhUE6dwZoZiY0J21PxCyj+C58=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:Mail-Followup-To: References:MIME-Version:Content-Type:Content-Disposition: In-Reply-To; b=UvWpkqig2iYgcOVR5f7E9LfDe1RvW1CmhlBgQyELGtzHJVLxJfKdTXVM+7v33Huo8v9IVnW919mMOSKu2zZYtBXWRMYyLXcc47Qu6VZOz/in8IMqz+ktHbE+qG3aI+TnomXKBjb3nbU8xAFZQ/zuiVv0E0cikPed7AtLuruntX9VQao30BFkug5+ZuQeIJMtuzSmDMzunP4uQ80EqXP2SpnCAUG7IFJLw6Kq2Efo+SPkGXqw6DQFymL64XbUsU6B16zg8rrN3d8RTiraWxzvGXYr8JGCCPIimAUSeYF9mNkPPTsszlhRiyZ5jczhRZ4as4Au5mRce3v3nQXGYs335yMX2sE5PFX4aI/k7HNI7jn+Q7ormIx2S/NCWS8jm9QGaaU82Ok3zneG24oiBIOnPq4azwtDofeo4cin3GHYctMmOuyY1A/BNhPu2iLf758/5zBWZ88h9oBfrfbKWOmobRFMawabGpxIqsKqbusPwSVF6U6alP9F5Ru1U7rqeSGJDyQtlz/E6z0nYMI9fbbeQZam4laT208bSL/IfTYc21SyNACRrtVw0a0yd1i+XlcYr7HD2pzAVAkrXG/4Q+W1R16XU/6vA7Ge6tfE6PYNKnri1sfavVYKog3o4ScwtjQe6s9oXv1R+mUUftDw0Km4w/RJNWluhB3g8JXzMh9T3x0=
Arc-seal: i=1; a=rsa-sha256; d=math.u-bordeaux.fr; s=openarc; t=1687211100; cv=none; b=SkFZr+zmPHCqFDH5YnxKTFT9C+KsP6dzfYS4ENDsHrhvYQdmCzqLxsGDzWUKPvO+z2qbLKMXOvSUq+WZnhC2NiaAzTxj6iYdoXxMU9n4cBUlC1xk+8YNRiXSA9nuLSiRktSCY/fbQD4yC+qEvKNZ4BaCV7kognnmXSoo1f7By5s8RdPpCYRHNpTrE09FONHMKe/dJCKLmPymjcSLjVDV0vOmcO9QHWbzQfxu8KfY13U678c1ahaKg0FQIzj9dZQmnuZeIYE/TUHXz6JizzSac6/3Gg0tLhFAOGOJXM9Bxk2+RuTEIYbyAvo2fJI+SnBB1XtYOdxkcI0xEE9refxQ4VpNtb3wNGbaMd1dbVn9I3Gz1ztXM0kNbR6kJd/xoUufSlARuHjwjXzy35mvN0gsDJP7lInzYYPPyocddbUeR+EhWonnUxn4kd3L/J+ksBpJ4gQtJs03Xs7dOhhnBuw5Qgdfm177dYlbq168HXhbjYPg/h8A+dvqMFEU0R/ZfCg14W5ggMEBDubqphzH6Df47YTaX3jjjaqUjOR1Rep/CaWVlf+KO1jzQU9qNHV3jhA0fBhDtOIBu72jYrLlzYeCFiDhaGcNu1LWRJhg6ZTCKoaP2MmJdgnsv26yQbi0DP5C3Yg92QEK5zMDs6naZ+Xhc0epX/pBMU6jZ7WNFbvJnMc=
Authentication-results: smail; arc=none
Delivery-date: Mon, 19 Jun 2023 23:49:51 +0200
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=math.u-bordeaux.fr; s=2022; t=1687211099; bh=LAY34boXeoIJhfRnnHxhUE6dwZoZiY0J21PxCyj+C58=; h=Date:From:To:Subject:References:In-Reply-To:From; b=Kkb9BlK86ywsewS/+FFem2Hctdv5hkI2hH+jZVEs2CtkX2eWCv5v3kAvAEi3+T7Mq BRemmlohXJ7F10UeZ+GSTBIlP9kk+PC2DASke/acdGMj3tqb6HDQAHrS0aPeD38KS9 06R/hwQWjexisDQEPTFBLMzNNrrs709/SNJ0qfndeoh+wmYko12psSNGm7pkzYctgW nLIoOq25XfI/5eKEzmJaNNe2cAzKt9w/1k8Unvd7FslZ/LTzsMESJyr9ttZo9eUJEu a+IKvHxBxp0yGVSKMAVjkEn1qjU9sh+aXSn+OFbTXAZTPu0poqI9+y5IwNp+PDJ1b1 Y6rJiOJowXvzvgzQqtRN/2RqtlbVdqI7PzyuXINVnp1i38dKmop6Ms23dWY6UqdgAL EvJIatdTgvryPdpaPDfaryCPN/jUdyx0EaEl96Gwwg0uN6sEMAq/W0qDXl3yl0ungR 6aKSV9xpVzXSmF2GZhddRdl0QxTRAD3E3APMeprqpIEw0WOP9J+hciIKMe0yBHNGyY TKZ3B5EWsoyiNLeP55uouRThS8P3/UVW4bfWw/0a8wFFId7U30ltK7sm0NuLHUYBz1 HSLis7r7uxUxeAMdQKH3R2FJdLkQY74Nmr26YORHYDSAlMu6RqqnDOtVitxfXm6AOL bOc5F/2SEI0AS6wzhDD8iBb0=
In-reply-to: <ab62c1d7-f0fe-f435-8b11-cbd4736aca31@hs-rm.de>
Mail-followup-to: pari-users@pari.math.u-bordeaux.fr
References: <ab62c1d7-f0fe-f435-8b11-cbd4736aca31@hs-rm.de>

On Mon, Jun 19, 2023 at 10:09:32PM +0200, Daniel Berger wrote:
> Hello,
> 
> I want to do computations on subsets of a given set, that is quite large
> (around 600+ elements). Since there are quite many subsets, I do not
> necessarily want to process them all, but still as many as possible given my
> computational resources, using parallel processing.
> 
> For smaller sets (~25 elements), i have successfully used parfor, iterating
> over the number of subsets and using vecextract to get to the subsets
> themselves.
> 
> However, I don't think this approach is optimal for two reasons:
> 
>  * Each computation only takes a very short amount of time (few ms), so
>    the added overhead wastes resources (or so I assume after having
>    read the resources on using thread in PARI/GP), I think it would be
>    better to split up the data into chunks and let each core compute a
>    bunch before returning and getting a new batch of data.
>  * It would be better for me to start with smaller subsets (like
>    forsubset works) and the above approach doesn't do this.
> 
> So my questions are: Is my assumption above correct?

I would say yes.

> What is the easiest and/or best way to do this in PARI/GP?

Functions like forvec/forsubset etc are nice but often inefficient because
you need to recompute everything from scratch. One trick to speed up
computation is to reuse part of previous computation, which require considering
subset in a suitable order (think Gray code)

An example: solve the knapsack problem:

V=[2,3,5]
S=7
forsubset(3,s,if(vecsum(vecextract(V,s))==S,print(s)));

This is nice, but this redo the summation for each set.

{
  for(s1=0,1,
    my(t1=if(s1,V[1],0));
    for(s2=0,1,
      my(t2=if(s2,t1+V[2],t1));
      for(s3=0,1,
        my(t3=if(s3,t2+V[3],t2));
        if(t3==S,print([s1,s2,s3])))))
}
this reduce the number of additions to perform from 16 to 6.

Cheers,
Bill

References:
- Parallel Processing of subsets
  - From: Daniel Berger <Daniel.Berger@hs-rm.de>

Prev by Date: Re: Parallel Processing of subsets
Next by Date: efficient transformations between "sum of squares" and "sqrt(-1) (mod p)" for 1M-digit prime
Previous by thread: Re: Parallel Processing of subsets
Next by thread: efficient transformations between "sum of squares" and "sqrt(-1) (mod p)" for 1M-digit prime
Index(es):
- Date
- Thread