How can I get two columns in a subquery using top 1?
Here's the thing. I have these two tables:
table A:
id col1 date_x
A xxxx 2020-02-02
B yyyy 2020-02-02
C zzzz 2020-02-02
table B:
id col2 date_y
A yyyy 2020-01-02
A yyyy 2020-02-02
A yyyy 2020-03-02
I want to bring in col2 from the table B row whose date_y is the highest possible while still being lower than date_x.
This is what I've done:
select *,
(
select top 1 col2
from table_B
where table_B.date_y < a.date_x
and table_B.id = a.id
) as col2
from table_A a
Now I also want to bring in date_y, in order to do some validation.
What is the best way of doing this? I thought about adding another (select top 1 ...) subquery, but that seems very inefficient. Another join would also be inefficient.
You can join the tables on your conditions and use the MAX() and FIRST_VALUE() window functions to get the date_y and col2 values:
select distinct a.*,
first_value(b.col2) over (partition by a.id order by b.date_y desc, b.col2) col2,
max(b.date_y) over (partition by a.id) date_y
from tableA a left join tableB b
on b.id = a.id and b.date_y < a.date_x
You may change the LEFT join to an INNER join if you want only matched rows from the 2 tables.
See the demo.
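As a rough way to try the query locally, the sketch below loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the same statement. SQLite is an assumption here (the question targets Redshift), window functions need SQLite 3.25 or later, dates are stored as ISO text so that string comparison matches date order, and the function name run_window_query is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question; dates kept as ISO-8601 text.
SETUP = """
create table tableA (id text, col1 text, date_x text);
create table tableB (id text, col2 text, date_y text);
insert into tableA values
    ('A', 'xxxx', '2020-02-02'),
    ('B', 'yyyy', '2020-02-02'),
    ('C', 'zzzz', '2020-02-02');
insert into tableB values
    ('A', 'yyyy', '2020-01-02'),
    ('A', 'yyyy', '2020-02-02'),
    ('A', 'yyyy', '2020-03-02');
"""

# The query from the answer, with explicit AS aliases and a final
# ORDER BY added only to make the output order deterministic.
QUERY = """
select distinct a.*,
    first_value(b.col2) over (partition by a.id
                              order by b.date_y desc, b.col2) as col2,
    max(b.date_y) over (partition by a.id) as date_y
from tableA a left join tableB b
    on b.id = a.id and b.date_y < a.date_x
order by a.id
"""


def run_window_query():
    """Build the sample tables in memory and return the query result rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_window_query():
        print(row)
```

With the sample data, only the 2020-01-02 row of table B is earlier than date_x for id A, so that row supplies col2 and date_y; ids B and C have no match and come back with NULLs because of the LEFT join.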
Your approach using a correlated subquery is OK, and Redshift supports top (although I prefer limit, which is more widely supported in other databases).
However, you are missing an order by clause in the subquery. Without it, you get an unpredictable row out of those that satisfy the where clause, which is not what you want.
I would recommend:
select
a.*,
(
select col2
from table_B b
where b.date_y < a.date_x and b.id = a.id
order by b.date_y desc
limit 1
) as col2
from table_A a
For performance, consider an index on table_B(id, date_y, col2).
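If both col2 and date_y are needed from the same table_B row, one alternative to repeating the correlated subquery is to join once and keep the top row per id with row_number(). The sketch below shows that swapped-in technique, not the query recommended above. It assumes id is unique in table_A, runs on SQLite through Python's sqlite3 purely for convenience (window functions need SQLite 3.25 or later), and adds one hypothetical table_B row ('wwww', 2019-12-02) that is not in the question so that the ordering is actually exercised. The function name latest_row_per_id is also made up for illustration.

```python
import sqlite3

SETUP = """
create table table_A (id text, col1 text, date_x text);
create table table_B (id text, col2 text, date_y text);
insert into table_A values
    ('A', 'xxxx', '2020-02-02'),
    ('B', 'yyyy', '2020-02-02'),
    ('C', 'zzzz', '2020-02-02');
insert into table_B values
    ('A', 'wwww', '2019-12-02'),  -- hypothetical row, not in the question
    ('A', 'yyyy', '2020-01-02'),
    ('A', 'yyyy', '2020-02-02'),
    ('A', 'yyyy', '2020-03-02');
"""

# Join once, rank the matching table_B rows per id by date_y descending,
# and keep only the first one. Assumes id is unique in table_A.
QUERY = """
select id, col1, date_x, col2, date_y
from (
    select a.id, a.col1, a.date_x, b.col2, b.date_y,
           row_number() over (partition by a.id
                              order by b.date_y desc) as rn
    from table_A a
    left join table_B b
        on b.id = a.id and b.date_y < a.date_x
) t
where rn = 1
order by id
"""


def latest_row_per_id():
    """Return one row per table_A id with col2 and date_y from the latest earlier table_B row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_row_per_id():
        print(row)
```

Because col2 and date_y are read from the same ranked row, they cannot drift apart the way two independent top 1 subqueries could if their ordering differed.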