I'm pivoting in PostgreSQL but when I run the query the output says:
ERROR: return and sql tuple descriptions are incompatible SQL state: 42601
In summary, I want the distribution channel in the rows, the year in the columns, and the operative margin as the value.
Without the pivot the output is:
dist_chann_name | year | operative_margin |
---|---|---|
1 | 2020 | 20783 |
1 | 2021 | 5791 |
2 | 2020 | 30362 |
3 | 2021 | 14501 |
3 | 2020 | 2765 |
3 | 2021 | 4535 |
This is my query:
SELECT *
FROM crosstab(
'SELECT dist_chann_id, year, operative_margin
FROM marginality_by_channel
ORDER BY dist_chann_id, year'
) AS ct ("DC" int, "2020" int, "2021" int);
Using filtered aggregation is typically much easier than the somewhat convoluted crosstab() function (at least in my opinion).
select dist_chann_name as dc,
sum(operative_margin) filter (where year = 2020) as "2020",
sum(operative_margin) filter (where year = 2021) as "2021"
from marginality_by_channel
group by dist_chann_name
order by dist_chann_name;
One of the columns does not have the data type you think it has. It must be operative_margin, which is probably text?
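One way to check that guess is to ask the catalog for the declared column types. This is only a minimal sketch: it assumes marginality_by_channel is a table or view in the current database and that the column names match those used in the question.

```sql
-- List the declared types of the columns that feed crosstab().
-- Table and column names are taken from the question; adjust if yours differ.
SELECT column_name, data_type
FROM   information_schema.columns
WHERE  table_name = 'marginality_by_channel'
AND    column_name IN ('dist_chann_id', 'dist_chann_name', 'year', 'operative_margin')
ORDER  BY ordinal_position;

-- Alternative: inspect the type of an actual value.
SELECT pg_typeof(operative_margin) AS operative_margin_type
FROM   marginality_by_channel
LIMIT  1;
```

If a column reports a type other than the one declared in the AS ct (...) column list, that mismatch is a likely source of the error.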
The 1-parameter form of crosstab() uses the "category" column (year in your example) only for sorting. And the "row_name" column (dist_chann_name - or dist_chann_id?) would produce a different error message.
Either way, unless you can guarantee that every "row_name" has exactly two values, it's safer to use the 2-parameter form of crosstab():
SELECT *
FROM crosstab(
$$
SELECT dist_chann_name, year, operative_margin
FROM marginality_by_channel
ORDER BY 1, 2
$$
, 'VALUES (2020), (2021)'
) AS ct ("DC" int, "2020" int, "2021" int);
This variant also happens to be more tolerant of type mismatches (as everything is passed as text anyway).
crosstab() shines for many resulting value columns (faster, shorter). For just two "value" columns, aggregate FILTER might be the better (simpler) choice. There is not much performance to gain (if any, after adding some overhead).
That aside, your setup is ambiguous to begin with. It includes two rows for the same (dist_chann_name, year) = (3, 2021). The other answer uses sum() in its aggregate FILTER solution. You might also use min() or max(), or whatever fits. A clean solution would use an explicit sort order and document the effect, or work with a query producing distinct values, or use the appropriate aggregate function with the FILTER solution.
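The "query producing distinct values" option could look roughly like the sketch below. It assumes that summing duplicate rows is the intended behavior (min() or max() would slot in the same way), that operative_margin has a numeric type, and that the tablefunc extension providing crosstab() is already installed.

```sql
-- Step 1: find (dist_chann_name, year) pairs that occur more than once.
SELECT dist_chann_name, year, count(*) AS row_count
FROM   marginality_by_channel
GROUP  BY dist_chann_name, year
HAVING count(*) > 1;

-- Step 2: pre-aggregate inside the source query so that crosstab()
-- receives exactly one row per (dist_chann_name, year).
SELECT *
FROM   crosstab(
   $$
   SELECT dist_chann_name, year, sum(operative_margin)::int  -- cast matches the int output columns
   FROM   marginality_by_channel
   GROUP  BY 1, 2
   ORDER  BY 1, 2
   $$
 , 'VALUES (2020), (2021)'
   ) AS ct ("DC" int, "2020" int, "2021" int);
```

With the sample rows from the question, the two (3, 2021) rows would be collapsed into a single value of 19036 (14501 + 4535) before pivoting, so the result no longer depends on which duplicate happens to be read last.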