
PostgreSQL select closest points in time that match given intervals

I am trying to figure out whether I can solve the following problem in SQL, or whether I am better off selecting the values into my scripting language and doing a bulk update from there.

There are some points in time, and some time intervals, each defined by its center and a maximum duration from the center; let that be 10 minutes for all of them. Centers may be any duration apart from each other, and so may points. I want to select all time intervals, each together with zero or one point, so that each point is either not assigned or assigned to only one interval. If one point matches more than one interval, or vice versa, points shall be chosen so that the total duration between points and interval centers is minimized.

Sample data

interval
id centertime
1 2001-01-01 12.00     # starts at 11.50 ends at 12.10
2 2001-01-01 12.15     # starts at 12.05 ends at 12.25
3 2001-01-01 12.20     # starts at 12.10 ends at 12.30

point
id time
21 2001-01-01 12.00     
22 2001-01-01 12.11
23 2001-01-01 12.17
24 2001-01-01 12.19
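
For anyone who wants to try this out, the sample data above could be loaded roughly as follows. This is only a sketch: the table and column names are the ones used in this post, the column types are an assumption, and the `12.00` notation is read as the time 12:00.

```sql
-- Assumed schema: the table and column names come from the post,
-- the column types are a guess.
CREATE TABLE "interval" (
    id         integer PRIMARY KEY,
    centertime timestamp NOT NULL
);

CREATE TABLE "point" (
    id     integer PRIMARY KEY,
    "time" timestamp NOT NULL
);

-- Interval centers; each interval spans center +/- 10 minutes.
INSERT INTO "interval" (id, centertime) VALUES
    (1, '2001-01-01 12:00'),
    (2, '2001-01-01 12:15'),
    (3, '2001-01-01 12:20');

-- Points in time to be matched against the intervals.
INSERT INTO "point" (id, "time") VALUES
    (21, '2001-01-01 12:00'),
    (22, '2001-01-01 12:11'),
    (23, '2001-01-01 12:17'),
    (24, '2001-01-01 12:19');
```

Both table names are double-quoted because `interval` is also a type name in PostgreSQL, and `"time"` is quoted for the same reason.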

Desired results:

interval_id point_id
1 21
2 23
3 24

Explanation

Point 21 exactly matches the center of interval 1, and nothing else, so it is assigned.

Point 23 is closer to interval 2 than to interval 3, but point 24 is even closer to interval 3, so interval 3 is assigned point 24.

Point 22 is the closest remaining point to interval 2, so it is assigned.

Point 21 is within interval 2, but point 22 is available and closer, so 21 is not assigned to an interval and does not appear in the results.

Point 23 is even closer to 3, so 22 is the closest remaining one.

Okay, I got it.

It uses a lateral join to calculate the duration from each centertime to each point, and wraps that in an additional SELECT that keeps only the closest match per interval using an ordered window function.

SELECT * FROM (
    SELECT
        i.id AS interval_id,
        p.id AS point_id,
        p.duration,
        -- rank the candidate points of each interval, closest first
        ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY p.duration) AS rownumber
    FROM "interval" i
    -- LEFT JOIN keeps intervals that have no point within range
    LEFT JOIN LATERAL (
        SELECT
            p1.id,
            p1."time",
            -- distance from the interval center, in seconds
            ABS(EXTRACT(EPOCH FROM i.centertime - p1."time")) AS duration
        FROM "point" p1
        WHERE p1."time"
            BETWEEN i.centertime - interval '10 minutes'
            AND i.centertime + interval '10 minutes'
    ) p ON true
) AS q
WHERE rownumber = 1;
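
The window function ranks points within each interval only, so nothing in the query stops the same point from coming out as the closest match for two different intervals. A minimal sketch of a check for that case, against the same "interval" and "point" tables and the same 10-minute range: `DISTINCT ON` is used here as a shorter stand-in for the `ROW_NUMBER()` filter, and the names `closest` and `interval_ids` are made up for illustration.

```sql
-- List every point that is the closest match for more than one interval.
SELECT
    point_id,
    array_agg(interval_id ORDER BY interval_id) AS interval_ids
FROM (
    -- One row per interval: its closest point within +/- 10 minutes.
    -- Intervals without any point in range drop out (inner join),
    -- which is fine because they cannot collide with anything.
    SELECT DISTINCT ON (i.id)
        i.id AS interval_id,
        p.id AS point_id
    FROM "interval" i
    JOIN "point" p
        ON p."time" BETWEEN i.centertime - interval '10 minutes'
                        AND i.centertime + interval '10 minutes'
    ORDER BY i.id, ABS(EXTRACT(EPOCH FROM i.centertime - p."time"))
) AS closest
GROUP BY point_id
HAVING COUNT(*) > 1;
```

On the sample data this returns no rows, since the three intervals pick points 21, 23 and 24. If it does return rows for other data, the per-interval closest match alone is not enough to meet the requirement that each point is assigned to only one interval.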

