Aggregate and concatenate from two tables based on comparison between columns
I have two tables like this:
Table 1
A T
a1 t1
a2 t2
a3 t3
a4 t4
a5 t5
...
...
Table 2
E T
e1 t1
e2 t2
e3 t3
e4 t4
e5 t5
...
...
What I want to achieve is this:
Table 3
E A'
e1 a1,a2,a3
e2 a4,a5,a6
...
...
The aggregation A' is done like this: in Table 2, each e has a value t in column T, and with that t you look up the last 3 values in Table 1 whose T is less than the t in question. So a1, a2, a3 are the values of A whose T values are less than the t1 that belongs to e1 in Table 2.
I know that I could write two queries for this, like this:
ResultSet (rt) -> select t from e
and then iterate over the ResultSet and do something like this:
select A from Table1 where t < rt[i] limit 3 - not sure how to concatenate here :)
but I'm pretty sure this is utterly inefficient. There should be a better way to do this.
I'm working with PostgreSQL.
If it had been a dataframe loaded from a file, I would use Python's pandas. I also know that pandas has read_sql, but the tables are huge and I don't want to load a whole table into memory (I think it won't, but I'm not sure either). Anyway, that's a separate story.
How do we solve this in SQL? Any ideas, please.
In table 2 for each e there is a value in column T: t and with that t you look for the last 3 values in Table 1 that are less than the t in question.
I don't understand how the results follow this logic. But based on your description, you can use a lateral join:
select t2.*, t1.the_as
from t2 left join lateral
     (select array_agg(t1.a) as the_as
      from (select t1.*
            from t1
            where t1.T <= t2.T   -- use < for strictly "less than"
            order by t1.T desc   -- closest rows first
            limit 3
           ) t1
     ) t1
     on 1=1;
Note that this uses arrays rather than strings because I think arrays are a better data structure for storing multiple values. That said, you can just use string_agg() instead, if you really want a string. The syntax would be string_agg(t1.a, ',').
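If you want to sanity-check the query on a small sample, the per-row logic can be mirrored in plain Python. This is only a rough sketch for checking results, not something to run against the full tables: the function name aggregate_last_n, the inclusive flag, and the sample values are made up for illustration, and rows are assumed to be (value, T) tuples with comparable T values.

```python
from bisect import bisect_left, bisect_right

def aggregate_last_n(table1, table2, n=3, inclusive=True):
    """For each (e, t) in table2, collect the A values of the n rows of
    table1 whose T is closest to t from below.

    inclusive=True mirrors `t1.T <= t2.T` in the query above;
    inclusive=False gives the strict "less than" from the question.
    """
    rows = sorted(table1, key=lambda row: row[1])  # sort once by T ascending
    ts = [row[1] for row in rows]
    find = bisect_right if inclusive else bisect_left
    result = []
    for e, t in table2:
        cut = find(ts, t)                    # number of rows that qualify
        picked = rows[max(0, cut - n):cut]   # the last n qualifying rows
        # Reverse so the order matches ORDER BY t1.T DESC in the subquery.
        # An empty list here corresponds to NULL from array_agg in PostgreSQL.
        result.append((e, [a for a, _ in reversed(picked)]))
    return result

# Hypothetical sample data: (A, T) and (E, T) pairs.
table1 = [("a1", 1), ("a2", 2), ("a3", 3), ("a4", 4), ("a5", 5)]
table2 = [("e1", 3), ("e2", 5), ("e3", 0)]

for e, values in aggregate_last_n(table1, table2):
    print(e, ",".join(values))
```

Comparing this output with the query result on the same handful of rows is a quick way to decide whether you want <= or < in the where clause.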