Oracle query group by consecutive value and get start date and end date

I have a table like this (it is actually the result of a very large query):

id   |  date_measured        |  out_of_range
-----+-----------------------+--------------
3147 |  09/08/2019 20.00:00  |  1
3147 |  09/08/2019 21.00:00  |  0
3147 |  09/08/2019 22.00:00  |  0
3147 |  09/08/2019 23.00:00  |  1
3147 |  10/08/2019 00.00:00  |  1
3147 |  10/08/2019 01.00:00  |  1
3147 |  10/08/2019 02.00:00  |  0
3125 |  09/08/2019 20.00:00  |  0
3125 |  09/08/2019 21.00:00  |  1
3125 |  09/08/2019 22.00:00  |  1
3125 |  09/08/2019 23.00:00  |  0
3125 |  10/08/2019 00.00:00  |  1
3125 |  10/08/2019 01.00:00  |  1
3125 |  10/08/2019 02.00:00  |  1

I need this result:

id   |  date_measured_start  |  date_measured_end    |  consecutive_out_of_range
-----+-----------------------+-----------------------+--------------------------
3147 |  09/08/2019 20.00:00  |  09/08/2019 20.00:00  |  1
3147 |  09/08/2019 23.00:00  |  10/08/2019 01.00:00  |  3
3125 |  09/08/2019 21.00:00  |  09/08/2019 22.00:00  |  2
3125 |  10/08/2019 00.00:00  |  10/08/2019 02.00:00  |  3

That is, each run of consecutive rows with out_of_range = 1, together with its start date, end date and length.

I tried to use this solution, but I can't get it to return only the consecutive 1 values of out_of_range.

Here is a different application of the same method as in MT0's answer. The method is known as the "fixed differences" method (the "fixed differences", in both solutions, are the additional computed value by which we group the data); it is also known as the "tabibitosan" method.

In this solution I subtract a row_number() (multiplied by a one-hour interval) directly from the date, but only after selecting just the rows with the flag equal to 1. This may be important if you have a very large amount of data, but only a relatively small fraction of rows have the flag equal to 1. This is because row_number() needs to order the data, and ordering is an expensive operation. To solve the problem, we don't need to order (by date) the rows where the flag is 0 - only the rows where the flag is 1.
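To make the arithmetic concrete, here is a minimal Python sketch of the same idea, outside the database. It is only an illustration: the function name `fixed_difference_runs` and the `step` parameter are made up for this example, and the timestamps are the flagged rows for id 3147 from the question. It assumes readings are never closer together than `step`.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Rows for id 3147 with out_of_range = 1, taken from the question's sample data.
flagged = [
    datetime(2019, 8, 9, 20),
    datetime(2019, 8, 9, 23),
    datetime(2019, 8, 10, 0),
    datetime(2019, 8, 10, 1),
]

def fixed_difference_runs(timestamps, step=timedelta(hours=1)):
    """Group timestamps that are exactly `step` apart into runs.

    Hypothetical helper: returns (start, end, count) for each run.
    """
    ordered = sorted(timestamps)
    # timestamp - row_number * step is identical for every row of one run,
    # because both terms advance by `step` from one row to the next.
    keyed = [(ts - n * step, ts) for n, ts in enumerate(ordered, start=1)]
    runs = []
    for _, members in groupby(keyed, key=lambda pair: pair[0]):
        run = [ts for _, ts in members]
        runs.append((run[0], run[-1], len(run)))
    return runs

runs = fixed_difference_runs(flagged)
for start, end, count in runs:
    print(start, end, count)
```

The 20:00 row gets the key 19:00, while the 23:00, 00:00 and 01:00 rows all get the key 21:00, so they end up in one group of three.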

EDIT (based on MT0's comment below this answer)

MT0 points out, correctly, that my solution assumes something that is true in the test data posted by the OP, but not stated explicitly. Namely, that the date-times in the date_measured column form continuous sequences, spaced at one-hour intervals.

In fact, what my solution really does is this. Suppose that from the very beginning the data consisted only of the out-of-range rows (with flag equal to 1), and that the date-times in the date_measured column were always rounded to the hour, as they are in the OP's test data. The question, then, would be to identify the sequences of rows where the times are "consecutive" (meaning one hour apart). That's what the query does.
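A small Python sketch may help show what this assumption means in practice. The readings below are hypothetical (they are not from the OP's data): three out-of-range readings taken two hours apart. The helper `count_runs` is an illustrative name, not part of any library.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def count_runs(timestamps, step):
    """Count the groups produced by keying each row on ts - row_number * step."""
    groups = defaultdict(list)
    for n, ts in enumerate(sorted(timestamps), start=1):
        groups[ts - n * step].append(ts)
    return len(groups)

# Hypothetical readings taken every two hours, all of them out of range.
two_hourly = [
    datetime(2019, 8, 9, 20),
    datetime(2019, 8, 9, 22),
    datetime(2019, 8, 10, 0),
]

one_hour_groups = count_runs(two_hourly, timedelta(hours=1))
two_hour_groups = count_runs(two_hourly, timedelta(hours=2))
print(one_hour_groups, two_hour_groups)
```

With a one-hour step every reading lands in its own group, even though no in-range reading separates them; only when the step matches the real spacing do the three readings collapse into a single run.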

END EDIT

I used MT0's table - from his db fiddle test. Thanks MT0!

with
  tabibitosan (id, date_measured, grp) as (
    select id, date_measured,
           date_measured 
           - row_number() over (partition by id order by date_measured) 
             * interval '1' hour
    from   table_name
    where  out_of_range = 1    
  )
select id, min(date_measured) as date_measured_start, 
           max(date_measured) as date_measured_end,
           count(*)           as consecutive_out_of_range
from   tabibitosan
group  by id, grp
order  by id, date_measured_start    --  or whatever
;

  ID DATE_MEASURED_START DATE_MEASURED_END CONSECUTIVE_OUT_OF_RANGE
---- ------------------- ----------------- ------------------------
3125 2019-08-09 21:00    2019-08-09 22:00                         2
3125 2019-08-10 00:00    2019-08-10 02:00                         3
3147 2019-08-09 20:00    2019-08-09 20:00                         1
3147 2019-08-09 23:00    2019-08-10 01:00                         3

Use the ROW_NUMBER analytic function to give each row two incrementing numeric values - one per id and the other per id / out_of_range pair. If you subtract one from the other, the resulting number is constant within a consecutive set of rows with the same id / out_of_range values, and you can use it in the GROUP BY:

Query:

SELECT id,
       MIN( date_measured ) AS date_measured_start,
       MAX( date_measured ) AS date_measured_end,
       COUNT( * ) AS consecutive_out_of_range
FROM   (
  SELECT t.*,
         ROW_NUMBER() OVER ( PARTITION BY id ORDER BY date_measured )
           - ROW_NUMBER() OVER ( PARTITION BY id, out_of_range ORDER BY date_measured )
           AS rn
  FROM   table_name t
)
WHERE out_of_range = 1
GROUP BY id, rn

Output:

ID   |  DATE_MEASURED_START  |  DATE_MEASURED_END    |  CONSECUTIVE_OUT_OF_RANGE
-----+-----------------------+-----------------------+--------------------------
3147 |  2019-08-09 20:00:00  |  2019-08-09 20:00:00  |  1
3147 |  2019-08-09 23:00:00  |  2019-08-10 01:00:00  |  3
3125 |  2019-08-10 00:00:00  |  2019-08-10 02:00:00  |  3
3125 |  2019-08-09 21:00:00  |  2019-08-09 22:00:00  |  2
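As a rough illustration of why the difference of the two row numbers works, the following Python sketch mimics the two ROW_NUMBER counters for a single id. The function name `runs_of_flag` is invented for this example, and the rows are the ones for id 3125 from the question.

```python
from collections import defaultdict
from datetime import datetime

# (date_measured, out_of_range) for id 3125, from the question's sample data.
rows = [
    (datetime(2019, 8, 9, 20), 0),
    (datetime(2019, 8, 9, 21), 1),
    (datetime(2019, 8, 9, 22), 1),
    (datetime(2019, 8, 9, 23), 0),
    (datetime(2019, 8, 10, 0), 1),
    (datetime(2019, 8, 10, 1), 1),
    (datetime(2019, 8, 10, 2), 1),
]

def runs_of_flag(rows, flag=1):
    """Return (start, end, count) for each consecutive run of `flag`."""
    per_flag_counter = defaultdict(int)  # row number within each flag value
    groups = defaultdict(list)
    # overall_rn plays the role of ROW_NUMBER() over all rows of the id.
    for overall_rn, (ts, value) in enumerate(sorted(rows), start=1):
        per_flag_counter[value] += 1
        if value == flag:
            # Both counters advance together inside a run, so their
            # difference stays constant until a row with another flag appears.
            groups[overall_rn - per_flag_counter[value]].append(ts)
    return sorted((min(g), max(g), len(g)) for g in groups.values())

result = runs_of_flag(rows)
for start, end, count in result:
    print(start, end, count)
```

Unlike subtracting the row number from the date, this key depends only on the order of the rows, so it does not rely on the readings being exactly one hour apart.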

db<>fiddle here
