Joining two subqueries in Postgres
I'm trying to run an inner join on two subqueries, but I get this error message:
org.postgresql.util.PSQLException: ERROR: syntax error at or near "JOIN"
Position: 550
GROUP BY year
JOIN temp ON temp.year = MN.ye
^
-- INNER JOIN (
Here is my query:
WITH temp as(
SELECT
SUM(CASE WHEN rain = 'TRUE' THEN 1 END)*1.0/COUNT(date) * 100 as rain,
EXTRACT(YEAR FROM date) as year
FROM sample
GROUP BY year
)
SELECT AVG(mind) as avg_min,
AVG(maxd) as avg_max,
EXTRACT(YEAR FROM date) as year
FROM sample MN
GROUP BY year
JOIN temp ON temp.year = MN.year
And a sample of my data:
date prcp maxd mind rain
1948-01-01 00:00:00 0.47 51 42 TRUE
1948-01-02 00:00:00 0.59 45 36 TRUE
1948-01-03 00:00:00 0.42 45 35 TRUE
1948-01-04 00:00:00 0.31 45 34 TRUE
1948-01-05 00:00:00 0.17 45 32 TRUE
1948-01-06 00:00:00 0.44 48 39 TRUE
1948-01-07 00:00:00 0.41 50 40 TRUE
1948-01-08 00:00:00 0.04 48 35 TRUE
1948-01-09 00:00:00 0.12 50 31 TRUE
1948-01-10 00:00:00 0.74 43 34 TRUE
1948-01-11 00:00:00 0.01 42 32 TRUE
1948-01-12 00:00:00 0 41 26 FALSE
1948-01-13 00:00:00 0 45 29 FALSE
1948-01-14 00:00:00 0 38 26 FALSE
1948-01-15 00:00:00 0 34 31 FALSE
1948-01-16 00:00:00 0 34 28 FALSE
1948-01-17 00:00:00 0 35 29 FALSE
1948-01-18 00:00:00 0 33 28 FALSE
1948-01-19 00:00:00 0 34 27 FALSE
1948-01-20 00:00:00 0 36 29 FALSE
1948-01-21 00:00:00 0 48 32 FALSE
1948-01-22 00:00:00 0.21 47 44 TRUE
1948-01-23 00:00:00 0 47 43 FALSE
1948-01-24 00:00:00 0.1 45 34 TRUE
1948-01-25 00:00:00 0 46 30 FALSE
1948-01-26 00:00:00 0 45 32 FALSE
1948-01-27 00:00:00 0 53 33 FALSE
1948-01-28 00:00:00 0 53 25 FALSE
1948-01-29 00:00:00 0.22 42 34 TRUE
1948-01-30 00:00:00 0.03 47 30 TRUE
1948-01-31 00:00:00 0.21 35 27 TRUE
My ideal result would look something like this:
avg_tmin, avg_tmax, avg_rain, year
x x x 1948
x x x 1949
...
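So, the average mind (tmin), maxd (tmax), and rain for each year in my dataset.
I don't understand the logic your query is trying to implement. But judging from your sample data and expected results, it looks like you just want an aggregation: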
select
avg(mind) avg_mind,
avg(maxd) avg_maxd,
avg( (rain)::int ) avg_rain,
extract(year from date) year
from sample
group by extract(year from date)
I don't see the need for a JOIN to begin with:
SELECT count(*) filter (where rain = 'TRUE') * 1.0 / count(*) as rain,
AVG(mind) as avg_min,
AVG(maxd) as avg_max,
EXTRACT(YEAR FROM date) as year
FROM sample
GROUP BY year
The above is the most efficient way to do what you want.
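To try the single-pass aggregation without the real table, here is a minimal sketch. The table name sample_demo, the rain_pct alias, and the boolean type for rain are assumptions made for illustration; the rows are taken from the sample data in the question.

```sql
-- Hypothetical stand-in for the asker's table; rain is assumed to be boolean.
CREATE TEMP TABLE sample_demo (
    date timestamp,
    prcp numeric,
    maxd integer,
    mind integer,
    rain boolean
);

-- Four rows copied from the sample data in the question.
INSERT INTO sample_demo (date, prcp, maxd, mind, rain) VALUES
    ('1948-01-01', 0.47, 51, 42, TRUE),
    ('1948-01-02', 0.59, 45, 36, TRUE),
    ('1948-01-12', 0,    41, 26, FALSE),
    ('1948-01-13', 0,    45, 29, FALSE);

-- One scan of the table produces all three per-year figures.
SELECT AVG(mind) AS avg_min,
       AVG(maxd) AS avg_max,
       100.0 * COUNT(*) FILTER (WHERE rain) / COUNT(*) AS rain_pct,
       EXTRACT(YEAR FROM date) AS year
FROM sample_demo
GROUP BY EXTRACT(YEAR FROM date);
```

With these four rows the query returns a single row for 1948: an average minimum of 33.25, an average maximum of 45.5, and rain on 50 percent of the days.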
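However, to answer the immediate question of why your code doesn't work: you need to move the GROUP BY after the JOIN, and you cannot use the column alias year on the same level (mn) where it is defined: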
WITH temp as (
SELECT count(*) filter (where rain = 'TRUE') *1.0 / COUNT(date) * 100 as rain,
EXTRACT(YEAR FROM date) as year
FROM sample
GROUP BY year
)
SELECT AVG(mn.mind) as avg_min,
AVG(mn.maxd) as avg_max,
temp.year
FROM sample MN
JOIN temp ON temp.year = EXTRACT(YEAR FROM mn.date)
GROUP BY temp.year
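On the alias point: PostgreSQL lets GROUP BY and ORDER BY refer to an output column name, but WHERE, HAVING, and a join's ON condition are evaluated before the select list, so the expression has to be repeated there. A small self-contained sketch, where the demo CTE and the days alias are made up for illustration:

```sql
-- Hypothetical inline data standing in for the sample table.
WITH demo(date) AS (
    VALUES (timestamp '1948-01-01'), (timestamp '1949-06-15')
)
SELECT EXTRACT(YEAR FROM date) AS year,
       COUNT(*) AS days
FROM demo
-- "WHERE year = 1948" would fail here: the alias is not visible yet,
-- so the expression is repeated instead.
WHERE EXTRACT(YEAR FROM date) = 1948
-- GROUP BY may use the output alias.
GROUP BY year;
```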
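Note that the rewritten join query does not use the rain column from the CTE. If you want to add it, you need to include it in the GROUP BY as well: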
WITH temp as (
SELECT count(*) filter (where rain = 'TRUE') *1.0 / COUNT(date) * 100 as rain,
EXTRACT(YEAR FROM date) as year
FROM sample
GROUP BY year
)
SELECT AVG(mn.mind) as avg_min,
AVG(mn.maxd) as avg_max,
temp.year,
temp.rain
FROM sample MN
JOIN temp ON temp.year = EXTRACT(YEAR FROM mn.date)
GROUP BY temp.year, temp.rain
Or split it into two aggregate queries and join them:
WITH temp1 as (
SELECT count(*) filter (where rain = 'TRUE') *1.0 / COUNT(date) * 100 as rain,
EXTRACT(YEAR FROM date) as year
FROM sample
GROUP BY year
), temp2 as (
SELECT AVG(mind) as avg_min,
AVG(maxd) as avg_max,
EXTRACT(YEAR FROM date) as year
FROM sample MN
GROUP BY year
)
select *
from temp1
join temp2 using (year);
But again: the join is not needed, and it makes the whole thing less efficient.