
Selecting the latest values given data with missing records

... where "missing records" are identical to the last recorded value, hence no record.

This may be subjective, but I'm hoping there's a standardised way of doing this.

So, let's say I have a bunch of analytics in a MySQL table. Some records are missing but, as mentioned above, that's because the value at that point is the same as the previously recorded value.

table "table":

id    value      datetime
1     5          1285891200    // Today
1     4          1285804800    // Yesterday
2     18         1285804800    // Yesterday
2     16         1285771094    // The day before yesterday

As you can see, I don't have a value for today for id 2.

If I wanted to pull the "most recent value" from this table (that is, 1's "today" and 2's "yesterday"), how do I do that? I've achieved it by running the following query:

SELECT id, value FROM (SELECT * FROM `table` ORDER BY datetime DESC) as bleh GROUP BY id

This uses a subquery to order the data first, and then relies on "GROUP BY" to pick the first value (which, since the data is ordered, is the most recent) for each id. However, I don't know if shoving a subquery in there is the best way to get the most recent value.

How would you do it?

The desired table:

id    value      datetime
1     5          1285891200    // Today
2     18         1285804800    // Yesterday

Thanks...

Gotta love MySQL for allowing an order by in a subquery. That's not allowed by the SQL standard :)

You could rewrite the query in a standards-compliant way like:

select  *
from    YourTable a
where   not exists
        (
        select  *
        from    YourTable b
        where   a.id = b.id
        and     a.datetime < b.datetime
        )
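As a quick check of the query above, here is a minimal sketch that runs it against the sample data from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (only to keep the example self-contained), and the column types are guesses, since the question doesn't give the schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (id INTEGER, value INTEGER, datetime INTEGER)"
)

# Sample rows from the question; id 2 has no record for "today".
conn.executemany(
    "INSERT INTO YourTable (id, value, datetime) VALUES (?, ?, ?)",
    [
        (1, 5, 1285891200),
        (1, 4, 1285804800),
        (2, 18, 1285804800),
        (2, 16, 1285771094),
    ],
)

# Keep a row only if no other row with the same id has a later datetime.
latest_rows = conn.execute(
    """
    select  a.id, a.value, a.datetime
    from    YourTable a
    where   not exists
            (
            select  *
            from    YourTable b
            where   a.id = b.id
            and     a.datetime < b.datetime
            )
    order by a.id
    """
).fetchall()

print(latest_rows)
# [(1, 5, 1285891200), (2, 18, 1285804800)]
```

The result matches the desired table: today's row for id 1 and yesterday's row for id 2. The order by on the outer query is only there to make the printed order predictable.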

In case there are duplicates that the subquery can't tell apart (several rows for one id sharing the same latest datetime), you can group by id and then pick an arbitrary value:

select  a.id
,       max(a.value)
,       max(a.datetime)
from    YourTable a
where   not exists
        (
        select  *
        from    YourTable b
        where   a.id = b.id
        and     a.datetime < b.datetime
        )
group by
        a.id

This chooses the maximum a.value among the rows sharing the latest datetime. The datetime is the same for all of those duplicate rows, but standard SQL doesn't know that, so you have to specify a way to pick one value from the tied rows. Here I'm using max, but min or even avg would work just as well.
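To see why the group by matters, here is a small sketch of the duplicate case. It again assumes Python's sqlite3 module as a stand-in for MySQL, and it adds one made-up row, (2, 20, 1285804800), which is not in the original data, so that id 2 has two rows tied on the latest datetime.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (id INTEGER, value INTEGER, datetime INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable (id, value, datetime) VALUES (?, ?, ?)",
    [
        (1, 5, 1285891200),
        (1, 4, 1285804800),
        (2, 18, 1285804800),
        (2, 20, 1285804800),  # hypothetical duplicate: same id, same datetime
        (2, 16, 1285771094),
    ],
)

# Without grouping, both tied rows for id 2 survive the not exists filter.
not_exists_only = conn.execute(
    """
    select  a.id, a.value, a.datetime
    from    YourTable a
    where   not exists
            (
            select  *
            from    YourTable b
            where   a.id = b.id
            and     a.datetime < b.datetime
            )
    order by a.id, a.value
    """
).fetchall()

# With grouping, max() collapses the tied rows into one row per id.
deduplicated = conn.execute(
    """
    select  a.id
    ,       max(a.value)
    ,       max(a.datetime)
    from    YourTable a
    where   not exists
            (
            select  *
            from    YourTable b
            where   a.id = b.id
            and     a.datetime < b.datetime
            )
    group by
            a.id
    order by a.id
    """
).fetchall()

print(not_exists_only)
# [(1, 5, 1285891200), (2, 18, 1285804800), (2, 20, 1285804800)]
print(deduplicated)
# [(1, 5, 1285891200), (2, 20, 1285804800)]
```

Swapping max(a.value) for min(a.value) would return 18 for id 2 instead of 20; which one is right depends on what a tie is supposed to mean in the data.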
