Using COALESCE in Postgres and grouping by the resulting value

Question

I have two tables in a Postgres database:

table a

transaction_id | city   | store_name | amount
-------------------------------
123            | London | McDonalds  | 6.20
999            | NULL   | KFC        | 8.40
etc...

table b

transaction_id | location | store_name | amount
-----------------------------------
123            | NULL     | McDonalds  | 6.20
999            | Sydney   | KFC        | 7.60
etc...

As you can see, the location might be missing in one table but present in another table. For example with transaction 123, the location is present in table a but missing in table b. Apart from that, the rest of the data ( amount , store_name etc.) is the same, row by row, assumed that we join on the transaction_id .

For a given merchant, I need to retrieve a list of locations and the total amount for that location.

An example of the desired result:

KFC sales Report:

suburb   | suburb_total
---------------
London   | 2500
Sydney   | 3500

What I tried:

select 
  coalesce(a.city, b.location) as suburb, 
  sum(a.amount) as suburbTotal
from tablea a
join tableb b on a.transaction_id = b.transaction_id
where a.store_name ilike 'KFC'
group by(suburb);

But I get the error column "a.city" must appear in the GROUP BY clause or be used in an aggregate function

So I tried:

select 
  coalesce(a.city, b.location) as suburb, 
  sum(a.amount) as suburbTotal,
  max(a.city) as city_max,
  max(b.location) as location_max
from tablea a
join tableb b on a.transaction_id = b.transaction_id
where a.store_name ilike 'McDonalds'
group by(suburb);

But, surprisingly, I'm getting the same error, even thought I'm now using that column in an aggregate function.

How could I achieve the desired result?

NB there are reasons why we have de-normalised data across two tables, that are currently outside of my control. I have to deal with it.

EDIT: added FROM and JOIN, sorry I forgot to type those...

Answer 1

Your querires are missing a from clause, which makes it unclear which logic you are trying to implement.

Based on your sample data and expected results, I think that's a full join on the transaction_id , and then aggregation. Using a positional parameter in the group by clause avoids repeating the expression:

select 
    store_name,
    coalesce(a.city, b.location) as suburb, 
    sum(amount) suburb_total
from tablea a
full join tableb b using(transaction_id)
group by 1, 2

Answer 2

I can only imagine getting that error with your query if suburb were a column in one of the tables. One way around this is to define the value in the from clause:

select v.suburb, 
       sum(a.amount) as suburbTotal,
       max(a.city) as city_max,
       max(b.location) as location_max
from tablea a join
     tableb b
     on a.transaction_id = b.transaction_id cross join lateral
     (values (coalesce(a.city, b.location))) as v(suburb)
where a.store_name ilike 'McDonalds'
group by v.suburb;

This is one of the downsides of allowing column aliases in the group by . Sometimes, you might have conflicts with table columns.

Using COALESCE in Postgres and grouping by the resulting value

Question

2 answers

solution1
1 2020-07-30 08:19:52

solution2
1 ACCPTED 2020-07-30 12:02:30

Using COALESCE in Postgres and grouping by the resulting value

Question

2 answers

solution1 1 2020-07-30 08:19:52

solution2 1 ACCPTED 2020-07-30 12:02:30

solution1
1 2020-07-30 08:19:52

solution2
1 ACCPTED 2020-07-30 12:02:30