Selecting max date of each month

I have a table with many cumulative columns that reset to 0 at the end of each month. If I sum this data, I'll end up double counting. Instead, using Hive, I'm trying to select the max date of each month.

I've tried this:

SELECT
    yyyy_mm_dd,
    id,
    name,
    cumulative_metric1,
    cumulative_metric2
FROM
    mytable

WHERE
    yyyy_mm_dd = last_day(yyyy_mm_dd)

mytable has daily data from the start of the year. In the output of the above, I only see the last date for January but not for February. How can I select the last day of each month?

February is not over yet, so no row in the table satisfies yyyy_mm_dd = last_day(yyyy_mm_dd) for February. Perhaps a window function does what you want:

SELECT yyyy_mm_dd, id, name, cumulative_metric1, cumulative_metric2
FROM (SELECT t.*,
             MAX(yyyy_mm_dd) OVER (PARTITION BY last_day(yyyy_mm_dd)) as last_yyyy_mm_dd
      FROM mytable t
     ) t
WHERE yyyy_mm_dd = last_yyyy_mm_dd;

This picks the latest date that is actually present in the data for each month, so a month that is still in progress is included.
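
To see why the last_day filter drops the current month while the window version keeps it, here is a small Python sketch of the same logic. It is not Hive code, only a model of the two filters; the sample dates and the helper names (last_day, calendar_month_ends, latest_per_month) are made up for illustration.

```python
import calendar
from datetime import date, timedelta


def last_day(d):
    """Calendar last day of d's month (the value Hive's last_day is compared against)."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


def calendar_month_ends(dates):
    """Model of: WHERE yyyy_mm_dd = last_day(yyyy_mm_dd)."""
    return sorted(d for d in dates if d == last_day(d))


def latest_per_month(dates):
    """Model of: MAX(yyyy_mm_dd) OVER (PARTITION BY last_day(yyyy_mm_dd))."""
    latest = {}
    for d in dates:
        key = last_day(d)  # one partition per calendar month
        if key not in latest or d > latest[key]:
            latest[key] = d
    return sorted(latest.values())


# Hypothetical data: daily rows from 1 January to 15 February 2020.
start = date(2020, 1, 1)
sample = [start + timedelta(days=i) for i in range(46)]

print(calendar_month_ends(sample))  # only 31 January
print(latest_per_month(sample))     # 31 January and 15 February
```

The first filter only matches a date that is a true calendar month end, so a partial month has no matching row. The second takes whatever the latest loaded date is in each month.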

Use a correlated subquery together with Hive's month() function:

SELECT
    yyyy_mm_dd,
    id,
    name,
    cumulative_metric1,
    cumulative_metric2
FROM
    mytable t1

WHERE
    yyyy_mm_dd = (SELECT max(t2.yyyy_mm_dd)
                  FROM mytable t2
                  WHERE month(t1.yyyy_mm_dd) = month(t2.yyyy_mm_dd))
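
One caveat with matching on month() alone: the question's table only covers the current year, but once the data spans more than one year, the same month number from different years falls into a single group. A short Python sketch of that effect, using made-up dates (the function name max_by_key is hypothetical):

```python
from datetime import date


def max_by_key(dates, key):
    """Return the latest date within each group defined by key(d)."""
    latest = {}
    for d in dates:
        k = key(d)
        if k not in latest or d > latest[k]:
            latest[k] = d
    return sorted(latest.values())


# Hypothetical data spanning two years.
sample = [date(2019, 1, 30), date(2019, 1, 31), date(2020, 1, 14), date(2020, 1, 15)]

by_month = max_by_key(sample, lambda d: d.month)
by_year_month = max_by_key(sample, lambda d: (d.year, d.month))

print(by_month)       # one date: January 2019 is lost
print(by_year_month)  # two dates: one per calendar month
```

In Hive, comparing both year(...) and month(...), or a single key such as last_day(...), should avoid this.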
