BigQuery - Scalar subquery produced more than one element
I have the following data:
LastSubmission          | PatientName | Phone     | SubmissionId | HealthCondition
2020-12-17 16:02:56 UTC | a           | 123456789 | abc123       | Good
2020-12-18 14:24:33 UTC | a           | 123456789 | abc123       | Bad
2020-12-18 14:24:51 UTC | b           | 523456789 | def321       | okay
2020-12-18 14:25:09 UTC | b           | 523456789 | def321       | bad
2020-12-21 17:11:40 UTC | c           | 623456789 | hij987       | better
2020-12-21 17:05:30 UTC | c           | 623456789 | hij981       | worse
I want to write a query that returns only the latest row for each SubmissionId.
Currently, I have the following code:
SELECT *
FROM `myproject.dataset.qualtrics`
WHERE LastSubmission =
(
SELECT MAX(LastSubmission),
FROM `myproject.dataset.qualtrics`
GROUP BY SubmissionID, LastSubmission
)
;
But when I run this, I get the error 'Scalar subquery produced more than one element'. Please help me solve this.
You want a correlated subquery:
SELECT q.*
FROM `myproject.dataset.qualtrics` q
WHERE q.LastSubmission = (
    SELECT MAX(q2.LastSubmission)
    FROM `myproject.dataset.qualtrics` q2
    WHERE q2.SubmissionID = q.SubmissionID
);
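For context on the error itself, it can help to run the subquery from the question on its own. The sketch below inlines a few of the sample rows in a CTE named qualtrics (a stand-in for the real table, used here only so the query runs anywhere) and shows that the subquery yields several rows, which is why it cannot be compared with = as a single value.

```sql
-- Sample rows inlined (a subset of the question's data) so this runs without the real table.
WITH qualtrics AS (
  SELECT TIMESTAMP '2020-12-17 16:02:56+00' AS LastSubmission, 'abc123' AS SubmissionID
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:24:33+00', 'abc123'
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:24:51+00', 'def321'
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:25:09+00', 'def321'
)
-- The subquery from the question, run by itself.
-- Grouping by LastSubmission as well makes every distinct row its own group,
-- so this returns four rows here instead of a single value.
SELECT MAX(LastSubmission) AS latest
FROM qualtrics
GROUP BY SubmissionID, LastSubmission;
```

Dropping LastSubmission from the GROUP BY would still return one row per SubmissionID, so the result would still not be scalar. Correlating the subquery to the outer row is what narrows it to one value.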
A more "bigquery"ish way to write the query would use aggregation:
select array_agg(q order by q.LastSubmission desc limit 1)[ordinal(1)].*
from `myproject.dataset.qualtrics` q
group by q.SubmissionID;
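Another option, not part of the answer above, is a window function with QUALIFY. This is a minimal sketch assuming the same columns as in the question. The rows are inlined in a CTE named qualtrics purely for illustration; against the real table the CTE would be replaced by `myproject.dataset.qualtrics`.

```sql
-- Sample rows from the question, inlined so the query runs without the real table.
WITH qualtrics AS (
  SELECT TIMESTAMP '2020-12-17 16:02:56+00' AS LastSubmission,
         'a' AS PatientName, '123456789' AS Phone,
         'abc123' AS SubmissionId, 'Good' AS HealthCondition
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:24:33+00', 'a', '123456789', 'abc123', 'Bad'
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:24:51+00', 'b', '523456789', 'def321', 'okay'
  UNION ALL SELECT TIMESTAMP '2020-12-18 14:25:09+00', 'b', '523456789', 'def321', 'bad'
  UNION ALL SELECT TIMESTAMP '2020-12-21 17:11:40+00', 'c', '623456789', 'hij987', 'better'
  UNION ALL SELECT TIMESTAMP '2020-12-21 17:05:30+00', 'c', '623456789', 'hij981', 'worse'
)
SELECT *
FROM qualtrics
-- WHERE TRUE is a no-op; it is kept because QUALIFY was at one point only
-- accepted alongside a WHERE, GROUP BY, or HAVING clause.
WHERE TRUE
-- Number the rows of each SubmissionId from newest to oldest and keep the newest.
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY SubmissionId
  ORDER BY LastSubmission DESC
) = 1;
```

On the sample data this keeps four rows, one for each of abc123, def321, hij987, and hij981. Unlike the correlated subquery, ROW_NUMBER keeps exactly one row per SubmissionId even when two rows share the same latest timestamp; which of the tied rows survives is then arbitrary unless the ORDER BY is extended.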