
Select data from another table with max date in Hive

I have a table t1 like this:

A     B
1    2020-05-01
1    2020-05-04
1    2020-05-05
1    2020-05-06
2    2020-04-10

and another table t2:

A     C
1    2020-04-30
5    2020-04-08

and I need output like this:

A     B             c
1    2020-05-01    2020-04-30
1    2020-05-04    2020-04-30
1    2020-05-05    2020-04-30
1    2020-05-06    2020-04-30
2    2020-04-10    2020-04-08

As you can see, c is the latest date in table t2 that is earlier than B. Here 2020-04-30 is the max date less than 2020-05-01, 2020-05-04, 2020-05-05 and 2020-05-06, and for 2020-04-10 the date is 2020-04-08.

I am trying it like this but getting the wrong answer:

select t1.*,t2.C, max(C) over (partition by t2.A ) from t1 inner join t2 on t1.A=t2.A and t2.C<t1.B

You could try this approach. I use a CTE (Common Table Expression) and query the CTE with MAX and GROUP BY.

WITH t AS(
SELECT t1.a, t1.b, t2.c
FROM t1, t2
WHERE t1.b > t2.c)
SELECT a, b, MAX(c) AS c
FROM t
GROUP BY a,b;

Expected output:

+----+-------------+-------------+--+
| a  |      b      |      c      |
+----+-------------+-------------+--+
| 1  | 2020-05-01  | 2020-04-30  |
| 1  | 2020-05-04  | 2020-04-30  |
| 1  | 2020-05-05  | 2020-04-30  |
| 1  | 2020-05-06  | 2020-04-30  |
| 2  | 2020-04-10  | 2020-04-08  |
+----+-------------+-------------+--+
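
The query above can be sanity-checked locally without a Hive cluster. The following is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same CTE. Using SQLite as a stand-in for Hive is an assumption made only for this check, and the dates are stored as ISO-8601 strings so that string comparison matches date order.

```python
import sqlite3

# In-memory SQLite stands in for Hive here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b TEXT);
    CREATE TABLE t2 (a INTEGER, c TEXT);
    INSERT INTO t1 VALUES
        (1, '2020-05-01'), (1, '2020-05-04'), (1, '2020-05-05'),
        (1, '2020-05-06'), (2, '2020-04-10');
    INSERT INTO t2 VALUES (1, '2020-04-30'), (5, '2020-04-08');
""")

# Same CTE as in the answer; ORDER BY is added only to make the output stable.
query = """
    WITH t AS (
        SELECT t1.a, t1.b, t2.c
        FROM t1, t2
        WHERE t1.b > t2.c
    )
    SELECT a, b, MAX(c) AS c
    FROM t
    GROUP BY a, b
    ORDER BY a, b
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that the CTE pairs every row of t1 with every row of t2 and does not match on A, which is why the row with A = 2 picks up 2020-04-08 from the t2 row with A = 5.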

You can try this:

SELECT t1.A, t1.B, MAX(t2.C) AS C FROM t1 JOIN t2 ON t1.A = t2.A GROUP BY t1.A, t1.B;
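
A quick local check shows how a join keyed on A behaves on the sample data. This sketch uses Python's sqlite3 module with an in-memory database as a stand-in for Hive (an assumption), and it aggregates t2.C, since t2 as described in the question has only the columns A and C.

```python
import sqlite3

# In-memory SQLite stands in for Hive here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b TEXT);
    CREATE TABLE t2 (a INTEGER, c TEXT);
    INSERT INTO t1 VALUES
        (1, '2020-05-01'), (1, '2020-05-04'), (1, '2020-05-05'),
        (1, '2020-05-06'), (2, '2020-04-10');
    INSERT INTO t2 VALUES (1, '2020-04-30'), (5, '2020-04-08');
""")

# Join keyed on A, then take the latest C per (A, B) pair.
join_rows = conn.execute("""
    SELECT t1.a, t1.b, MAX(t2.c) AS c
    FROM t1
    JOIN t2 ON t1.a = t2.a
    GROUP BY t1.a, t1.b
    ORDER BY t1.a, t1.b
""").fetchall()
for row in join_rows:
    print(row)
```

On this data only the four rows with A = 1 come back, because A = 2 has no matching row in t2. This variant therefore fits when C should come from a row with the same A, not when any earlier date in t2 is acceptable.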
