Find Next Closest Number in PostgreSQL

I am running PostgreSQL 9.1.9 x64 with PostGIS 2.0.3 under Windows Server 2008 R2.

I have a table:

CREATE TABLE field_data.trench_samples (
   pgid SERIAL NOT NULL,
   trench_id TEXT,
   sample_id TEXT,
   from_m INTEGER
);

With some data in it:

INSERT INTO field_data.trench_samples (
   trench_id, sample_id, from_m
)
VALUES
   ('TR01', '1000001', 0),
   ('TR01', '1000002', 5),
   ('TR01', '1000003', 10),
   ('TR01', '1000004', 15),
   ('TR02', '1000005', 0),
   ('TR02', '1000006', 3),
   ('TR02', '1000007', 9),
   ('TR02', '1000008', 14);

What I am interested in is the difference (a distance in metres in this example) between a record's "from_m" and the "next" "from_m" within the same trench_id.

So, based on the data above, I'd like to end up with a query that produces the following table:

pgid, trench_id, sample_id, from_m, to_m, interval
1, 'TR01', '1000001', 0, 5, 5
2, 'TR01', '1000002', 5, 10, 5
3, 'TR01', '1000003', 10, 15, 5
4, 'TR01', '1000004', 15, 20, 5
5, 'TR02', '1000005', 0, 3, 3
6, 'TR02', '1000006', 3, 9, 6
7, 'TR02', '1000007', 9, 14, 5
8, 'TR02', '1000008', 14, 19, 5

Now, you are likely saying "wait, how do we infer an interval length for the last sample in each line, since there is no 'next' from_m to compare to?"

For the "ends" of lines (sample_id 1000004 and 1000008) I would like to reuse the interval length between the last two samples of that trench. For example, 1000008 starts at 14 and the sample before it starts at 9, so its interval is 5 and its to_m is 19.

Of course, I have no idea how to tackle this in my current environment. Your help is very much appreciated.

Here is how to get the difference, reusing the previous row's interval for the last sample in each trench (as the expected output shows, although the text does not explain it clearly).

The logic here is a repeated application of lead() and lag(). First apply lead() to calculate the interval. Then, at the boundary where lead() returns NULL, apply lag() to reuse the previous interval.
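To make the two passes concrete, here is a small plain-Python model of what lead() and lag() return for a single trench. It is only a sketch: lead_of and lag_of are hypothetical helper names invented for this illustration, not PostgreSQL functions, and the list is assumed to be already ordered within the trench.

```python
# Plain-Python model of the two window-function passes for one trench.
# lead_of / lag_of are hypothetical helpers, not PostgreSQL or library APIs.

def lead_of(values):
    """Next value for each position, None on the last row (like lead())."""
    return values[1:] + [None]


def lag_of(values):
    """Previous value for each position, None on the first row (like lag())."""
    return [None] + values[:-1]


from_m = [0, 3, 9, 14]  # trench TR02 from the question, already ordered

# Pass 1: lead() gives to_m; the interval is to_m - from_m (None at the end).
to_m = lead_of(from_m)
interval = [None if t is None else t - f for f, t in zip(from_m, to_m)]

# Pass 2: lag() over the intervals supplies the fallback for the last row.
prev_interval = lag_of(interval)
filled_interval = [p if i is None else i for i, p in zip(interval, prev_interval)]
filled_to_m = [None if i is None else f + i for f, i in zip(from_m, filled_interval)]

print(to_m)             # [3, 9, 14, None]
print(filled_interval)  # [3, 6, 5, 5]
print(filled_to_m)      # [3, 9, 14, 19]
```

A trench with a single sample has neither a next row nor a previous interval, so both values stay None in this model, just as they would stay NULL in SQL.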

The rest is basically just arithmetic:

-- Inner query: lead() looks ahead to the next from_m within each trench.
-- Outer query: lag() reuses the previous row's interval where lead() found none.
select pgid, trench_id, sample_id, from_m,
       coalesce(to_m,
                from_m + lag(interval) over (partition by trench_id order by sample_id)
               ) as to_m,
       coalesce(interval,
                lag(interval) over (partition by trench_id order by sample_id)
               ) as interval
from (select t.*,
             lead(from_m) over (partition by trench_id order by sample_id) as to_m,
             (lead(from_m) over (partition by trench_id order by sample_id) -
              from_m
             ) as interval
      from field_data.trench_samples t
     ) t
order by pgid;

Here is the SQLFiddle showing it working.
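If the fiddle is not available, the result can also be checked locally. The sketch below loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs essentially the same query, with pgid selected and the last column aliased. SQLite standing in for PostgreSQL is an assumption made only for convenience: it needs SQLite 3.25 or newer for window functions, and the field_data schema prefix and the SERIAL type are dropped because SQLite does not use them.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for PostgreSQL; lead()/lag() with
# "over (partition by ... order by ...)" use the same syntax in both.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trench_samples ("
    " pgid INTEGER PRIMARY KEY, trench_id TEXT, sample_id TEXT, from_m INTEGER)"
)
conn.executemany(
    "INSERT INTO trench_samples (trench_id, sample_id, from_m) VALUES (?, ?, ?)",
    [
        ("TR01", "1000001", 0), ("TR01", "1000002", 5),
        ("TR01", "1000003", 10), ("TR01", "1000004", 15),
        ("TR02", "1000005", 0), ("TR02", "1000006", 3),
        ("TR02", "1000007", 9), ("TR02", "1000008", 14),
    ],
)

QUERY = """
select pgid, trench_id, sample_id, from_m,
       coalesce(to_m,
                from_m + lag(interval) over (partition by trench_id order by sample_id)
               ) as to_m,
       coalesce(interval,
                lag(interval) over (partition by trench_id order by sample_id)
               ) as interval
from (select t.*,
             lead(from_m) over (partition by trench_id order by sample_id) as to_m,
             (lead(from_m) over (partition by trench_id order by sample_id) -
              from_m
             ) as interval
      from trench_samples t
     ) t
order by pgid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# The last row of each trench reuses the previous interval:
# (4, 'TR01', '1000004', 15, 20, 5) and (8, 'TR02', '1000008', 14, 19, 5)
```

The window is ordered by sample_id, as in the answer. That works for this data because the IDs are equal-length strings that sort in the same order as from_m; ordering by from_m instead would be a reasonable alternative if that ever stops being true.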
