I'm very new to SQL and I've been struggling with a pretty basic problem. I've tried to come at this a few different ways but nothing so far has worked.
So in my table touristdata I have 4 columns: transportType, month, year and count. Example:
Month  Year  Transport Type  Count
Nov    1992  Car             100
Nov    1992  Plane           250
Dec    1992  Car             200
Dec    1992  Plane           250
Jan    1993  Car             200
Jan    1993  Plane           200
Except I actually have four different transport types and many more months and years.
I want to calculate the percentage of each transport used over the years. So my desired output would be something along the lines of:
Year  Transport Type  Percentage
1992  Car             37.5%
1992  Plane           62.5%
1993  Car             50%
1993  Plane           50%
My current code looks like this:
WITH t1 as(
select transport, SUM(ncount) AS transportTotal
from touristdata
GROUP BY transport)
SELECT years, touristdata.transport, ROUND(100.0 *(transportTotal/SUM(ncount)))
FROM touristdata, t1
GROUP BY years;
In this form I get the error:
ERROR: column "touristdata.transport" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5: SELECT years, touristdata.transport, ROUND(100.0 *(transpor...
But I know that adding touristdata.transport and transportTotal to the GROUP BY won't work either. I tried it to make sure, and I ended up with 4 entries for each transport type per year.
I didn't have the final 'GROUP BY years' before, and I tried to do it with subqueries but I couldn't figure it out.
If anyone could help me get my head around this I'd really appreciate it!
You can do this using window functions:
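SELECT td.years, td.transport, SUM(td.ncount),
       -- inner SUM: total per (years, transport) group;
       -- outer SUM ... OVER: adds those group totals up within each year
       SUM(td.ncount) / SUM(1.0 * SUM(td.ncount)) OVER (PARTITION BY td.years) AS ratio
FROM touristdata td
GROUP BY td.years, td.transport;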
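To get the percentages in the form the question asks for, the ratio can be multiplied by 100 and rounded. Below is a minimal, self-contained sketch for PostgreSQL using the sample rows from the question. The column names years, transport and ncount are taken from the asker's query; the months column name and the column types are assumptions, since the real table definition isn't shown.

```sql
-- Hypothetical table definition: the months column name and all types are assumed.
CREATE TEMP TABLE touristdata (
    months    text,
    years     integer,
    transport text,
    ncount    integer
);

-- Sample rows copied from the question.
INSERT INTO touristdata (months, years, transport, ncount) VALUES
    ('Nov', 1992, 'Car',   100),
    ('Nov', 1992, 'Plane', 250),
    ('Dec', 1992, 'Car',   200),
    ('Dec', 1992, 'Plane', 250),
    ('Jan', 1993, 'Car',   200),
    ('Jan', 1993, 'Plane', 200);

-- One row per (years, transport); the window SUM gives the yearly total,
-- so each group's share of its year comes out as a percentage.
SELECT td.years,
       td.transport,
       ROUND(100.0 * SUM(td.ncount)
             / SUM(SUM(td.ncount)) OVER (PARTITION BY td.years), 1) AS percentage
FROM touristdata td
GROUP BY td.years, td.transport
ORDER BY td.years, td.transport;
```

With the sample rows this gives 37.5 and 62.5 for Car and Plane in 1992, and 50.0 for each in 1993. Multiplying by 100.0 before dividing keeps the arithmetic in numeric rather than integer division, which would otherwise truncate every share to 0.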