
BigQuery: How to make FULL JOIN with multiple tables ON multiple keys without losing data?

I have been struggling with this issue in BigQuery for a while and can't find a solution. I have three tables of campaign costs that I need to join into a single table. Here are examples of what the tables look like:

Table 1

Date        Country    Costs Campaign A
2021-06-01  Argentina  10
2021-06-01  Brazil     30
2021-06-01  Colombia   10
2021-06-02  Argentina  50
2021-06-02  Brazil     65
2021-06-02  Colombia   40

Table 2

Date        Country    Costs Campaign B
2021-06-01  Argentina  54
2021-06-01  Brazil     38
2021-06-01  Germany    94
2021-06-02  Argentina  51
2021-06-02  Brazil     48
2021-06-02  Germany    88

Table 3

Date        Country    Costs Campaign C
2021-06-01  Argentina  27
2021-06-01  Brazil     55
2021-06-01  Poland     46
2021-06-02  Argentina  86
2021-06-02  Brazil     99
2021-06-02  Poland     47

My output should look like this in the end

Date        Country       Costs Campaign A  Costs Campaign B  Costs Campaign C
2021-06-01  Argentina     10                54                27
2021-06-01  Brazil        30                38                55
2021-06-01  Colombia      10                0                 0
2021-06-01  Germany       0                 94                0
2021-06-01  Poland        0                 0                 46
2021-06-02  and so on...  x                 y                 z

And here is a similar query to the one I've been trying:

SELECT
  t1.Date,
  t1.Country,
  SUM(t1.CostsCampaignA),
  SUM(t2.CostsCampaignB),
  SUM(t3.CostsCampaignC)
FROM `table1` AS t1
FULL JOIN `table2` AS t2 ON t1.Date = t2.Date AND t1.Country = t2.Country
FULL JOIN `table3` AS t3 ON t1.Date = t3.Date AND t1.Country = t3.Country
GROUP BY 1,2

My problem is that countries not present in table 1 (Germany and Poland in the example) do not appear in my output at all. I assumed that the FULL JOIN would ensure that no data gets lost, but this is not the case.

I would appreciate a solution to this issue or any suggestions for a workaround.

Thank you!

Consider the approach below:

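select * from (
  -- stack the three tables into one long table, tagging each row with its campaign
  select date, country, cost_campaign_a as cost, 'A' as campaign from table_1
  union all select date, country, cost_campaign_b, 'B' from table_2
  union all select date, country, cost_campaign_c, 'C' from table_3
)
-- one output column per campaign: cost_campaign_A, cost_campaign_B, cost_campaign_C
pivot (any_value(cost) as cost_campaign for campaign in ('A', 'B', 'C'))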

Applied to the sample data in your question, the output is:

[image of the query output]
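
Note that with PIVOT, a country that is missing from one of the tables gets NULL in that campaign's column, whereas the desired output in the question shows 0. One way to get zeros, as a rough sketch, is to keep the same UNION ALL and swap the pivot for conditional aggregation. This assumes the same table and column names as the query above (table_1, cost_campaign_a, and so on), which may differ from your actual schema.

```sql
-- Sketch: same stacked input as above, aggregated with SUM(IF(...)) so that
-- a country with no row for a campaign gets 0 instead of NULL.
-- Table and column names are assumed to match the query above.
select
  date,
  country,
  sum(if(campaign = 'A', cost, 0)) as cost_campaign_A,
  sum(if(campaign = 'B', cost, 0)) as cost_campaign_B,
  sum(if(campaign = 'C', cost, 0)) as cost_campaign_C
from (
  select date, country, cost_campaign_a as cost, 'A' as campaign from table_1
  union all select date, country, cost_campaign_b, 'B' from table_2
  union all select date, country, cost_campaign_c, 'C' from table_3
)
group by date, country
order by date, country
```

For 2021-06-01 in the sample data this gives Colombia 10, 0, 0 and Germany 0, 94, 0, as in the desired output. Unlike any_value, SUM adds up duplicate rows for the same date, country and campaign, which may or may not be what you want.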

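As for why the original FULL JOIN query seems to lose rows: the Germany and Poland rows are not actually dropped by the join. They have no match in table 1, so t1.Date and t1.Country are NULL for them, and GROUP BY 1,2 puts them all into a single group whose Date and Country are NULL. The second join also compares table 3 only against t1, so a table 3 row can never match a row that came from table 2 alone. If you prefer to keep the joins, a sketch of a fix is to COALESCE the keys, both in the SELECT list and in the second ON clause. It uses the table and column names from the question and assumes each table has at most one row per date and country; otherwise the joins multiply rows and the sums are inflated.

```sql
-- Sketch of a FULL JOIN fix, using the names from the question.
-- Assumes at most one row per (Date, Country) in each table.
SELECT
  -- take the key from whichever table actually has the row
  COALESCE(t1.Date, t2.Date, t3.Date) AS Date,
  COALESCE(t1.Country, t2.Country, t3.Country) AS Country,
  -- turn "no row in this table" into 0
  IFNULL(SUM(t1.CostsCampaignA), 0) AS CostsCampaignA,
  IFNULL(SUM(t2.CostsCampaignB), 0) AS CostsCampaignB,
  IFNULL(SUM(t3.CostsCampaignC), 0) AS CostsCampaignC
FROM `table1` AS t1
FULL JOIN `table2` AS t2
  ON t1.Date = t2.Date AND t1.Country = t2.Country
FULL JOIN `table3` AS t3
  -- match table 3 against the key from table 1 or, failing that, table 2
  ON COALESCE(t1.Date, t2.Date) = t3.Date
  AND COALESCE(t1.Country, t2.Country) = t3.Country
GROUP BY 1, 2
ORDER BY 1, 2
```

With more than three tables the COALESCE lists grow with every join, which is one reason the UNION ALL approach tends to scale better.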