
SQL merging result sets on a unique column value

I have two similar queries that both run against the same table, and I essentially want to combine their results so that the second query supplies default values for whatever the first query doesn't return. I've simplified the problem as much as possible here. I'm using Oracle, by the way.

The table holds account information for a number of accounts. There are multiple entries for each account, with a commit_date recording when each entry was inserted. I need to get the account info that was current on a certain date.

The queries take a list of account ids and a date.

Here is the query:

-- Select the row which was current for the accounts for the given date. (won't return anything for an account which didn't exist for the given date)
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
AND actr.commit_date = 
(
    SELECT MAX(actrInner.commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner.account_id = actr.account_id
    AND actrInner.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
) 

This looks a little ugly, but it returns a single row for each account: the one that was current on the given date. The problem is that it returns nothing for an account that didn't exist until after the given date.

Selecting the earliest account info for each account is trivial - I don't need to supply a date for this one:

-- Select the earliest row for the accounts.
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date = 
(
    SELECT MIN(actrInner.commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner.account_id = actr.account_id
)

But I want to merge the result sets in such a way that:

For each account, if there is account info for it in the first result set - use that. Otherwise, use the account info from the second result set.

I've researched all of the joins I can use, without success. Unions almost do it, but they only collapse rows that are identical in every column. I want to merge based on the account id in each row.

Sql Merging two result sets - my case is obviously more complicated than that

SQL to return a merged set of results - I might be able to adapt that technique? I'm a programmer being forced to write SQL, and I can't follow that example well enough to see how I could modify it for what I need.

The standard way to do this is with a left outer join and COALESCE. That is, your overall query will look like this:

SELECT ...
FROM defaultQuery
LEFT OUTER JOIN currentQuery ON ...

If you did a SELECT *, each row would hold your default data plus the current account data wherever a current row exists (and NULLs where it doesn't). With me so far?

Now, instead of SELECT *, for each column you want to return, you do a COALESCE() on matched pairs of columns:

SELECT COALESCE(currentQuery.columnA, defaultQuery.columnA) ...

This will choose the current account data if present, otherwise it will choose the default data.
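Applied to the two queries from the question, the pattern might look like the sketch below. Treat it as an untested illustration under assumptions: balance is a hypothetical column standing in for the real account-info columns (the question only names account_id and commit_date), the aliases d and c are made up here, and the default subquery uses MIN because the question describes the default as the earliest row.

```sql
-- Sketch only: "balance" is a hypothetical column; d = default rows, c = current rows.
SELECT d.account_id,
       COALESCE(c.commit_date, d.commit_date) AS commit_date,
       COALESCE(c.balance, d.balance)         AS balance
FROM (
    -- Default rows: the earliest entry for each account.
    SELECT a.*
    FROM Account_Information a
    WHERE a.account_id IN (30000316, 30000350, 30000351)
    AND a.commit_date = (SELECT MIN(i.commit_date)
                         FROM Account_Information i
                         WHERE i.account_id = a.account_id)
) d
LEFT OUTER JOIN (
    -- Current rows: the latest entry on or before the given date.
    SELECT a.*
    FROM Account_Information a
    WHERE a.account_id IN (30000316, 30000350, 30000351)
    AND a.commit_date = (SELECT MAX(i.commit_date)
                         FROM Account_Information i
                         WHERE i.account_id = a.account_id
                         AND i.commit_date <= DATE '2010-12-30')
) c ON c.account_id = d.account_id
```

One caveat with this approach: COALESCE works column by column, so if a current row exists but holds NULL in some column, that column falls back to the default row's value. If that is not wanted, CASE WHEN c.account_id IS NOT NULL THEN c.balance ELSE d.balance END takes every column from one row or the other.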

You can do this more directly using analytic functions:

select *
from (SELECT actr.*,
             -- earliest commit date overall: the fallback the question describes
             min(commit_date) over (partition by account_id) as MinCommitDate,
             -- latest commit date on or before the given date, NULL if there is none
             max(case when commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') then commit_date end) over
                   (partition by account_id) as MaxCommitDate2
      FROM Account_Information actr 
      WHERE actr.account_id in (30000316, 30000350, 30000351)
     ) t
where (MaxCommitDate2 is not null and Commit_date = MaxCommitDate2) or
      (MaxCommitDate2 is null and Commit_Date = MinCommitDate) 

The subquery calculates two values, the two possibilities of commit dates. The where clause then chooses the appropriate row, using the logic that you want.
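A variation on the same analytic idea, offered as an untested sketch rather than as part of the answer above, is to rank the rows of each account with ROW_NUMBER and keep the first one. The ordering puts rows on or before the date first (latest of those on top), and otherwise falls back to the earliest row. The alias rn and the ANSI date literal are choices made for this illustration.

```sql
-- Sketch only: rank rows per account, then keep the top-ranked one.
SELECT *
FROM (SELECT actr.*,
             ROW_NUMBER() OVER (
                 PARTITION BY account_id
                 ORDER BY
                     -- rows on or before the date sort ahead of later rows
                     CASE WHEN commit_date <= DATE '2010-12-30' THEN 0 ELSE 1 END,
                     -- among rows on or before the date: latest first
                     CASE WHEN commit_date <= DATE '2010-12-30' THEN commit_date END DESC,
                     -- among rows after the date: earliest first
                     commit_date ASC
             ) AS rn
      FROM Account_Information actr
      WHERE actr.account_id IN (30000316, 30000350, 30000351)
     ) t
WHERE rn = 1
```

Unlike the comparison against a computed date, this returns exactly one row per account even when two entries share the same commit_date, and the result carries the extra rn column.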

I've combined the other answers. Tried it out at apex.oracle.com. Here's some explanation.

MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') THEN commit_date END) will give us the latest date not after Dec 30th, or NULL if there isn't one. Combining that with a COALESCE, we get COALESCE(MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') THEN commit_date END), MIN(commit_date)). The fallback is MIN(commit_date), the earliest entry, which is what the question asks for when nothing exists on or before the date.

Now we take the account id and commit date we have and join them with the original table to get all the other fields. Here's the whole query that I came up with:

SELECT *
FROM Account_Information
JOIN (SELECT account_id,
             COALESCE(MAX(CASE WHEN commit_date <=
                                    to_date('2010-DEC-30', 'YYYY-MON-DD')
                               THEN commit_date END),
                      MIN(commit_date)) AS commit_date
      FROM Account_Information
      WHERE account_id in (30000316, 30000350, 30000351)
      GROUP BY account_id)
  USING (account_id, commit_date);

Note that if you do use USING, you have to use * instead of actr.*.
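To check any of the queries above against the rule in the question, a small test table helps. The script below is a hypothetical setup, not data from the original post: only account_id and commit_date come from the question, balance is an invented column that makes the rows distinguishable, and it should be run in a scratch schema since it creates a table with the same name.

```sql
-- Hypothetical test data; the column "balance" is invented for illustration.
CREATE TABLE Account_Information (
    account_id  NUMBER NOT NULL,
    commit_date DATE   NOT NULL,
    balance     NUMBER
);

-- Account with entries both before and after the cutoff of 2010-12-30.
INSERT INTO Account_Information VALUES (30000316, DATE '2010-11-01', 100);
INSERT INTO Account_Information VALUES (30000316, DATE '2010-12-15', 150);
INSERT INTO Account_Information VALUES (30000316, DATE '2011-01-10', 200);

-- Account that did not exist until after the cutoff.
INSERT INTO Account_Information VALUES (30000350, DATE '2011-02-01', 50);
INSERT INTO Account_Information VALUES (30000350, DATE '2011-03-01', 75);

-- Account whose only entry falls exactly on the cutoff.
INSERT INTO Account_Information VALUES (30000351, DATE '2010-12-30', 10);

COMMIT;
```

With a cutoff of 2010-12-30, the rule stated in the question (current row if one exists, otherwise the earliest row) selects the 2010-12-15 entry for 30000316, the 2011-02-01 entry for 30000350, and the 2010-12-30 entry for 30000351.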
