I have been given a large dataset that has data arranged like so:
location cost year
1 23.15 1986
1 23.91 1988
1 23.31 1989
1 23.91 1993
1 22.98 1994
1 23.99 1995
1 23.71 1997
1 23.01 1999
2 23.21 2000
2 24.28 2004
2 24.4 2005
I'd like to rearrange this so it is in the form:
location 1985 1986 1987 1988
1 20.00 20.00 20.0 20.0
2 20.00 20.00 20.0 20.0
3 20.00 20.00 20.0 20.0
4 20.00 20.00 20.0 20.0
5 20.00 20.00 20.0 20.0
(note: ignore that the new costs are all 20.0. My goal is to turn the values within the year
column into headers, so that each location
is only listed once, with the cost
for a specific year located in that column.)
Is there a straightforward way to do this? I've looked into groupy
and transpose
but have not been able to produce anything close to what I'd like.
Thank you in advance for any pointers you can provide.
You need to use pivot_table
:
pd.pivot_table(df, index='location', columns='year', values='cost', fill_value=0)
With your sample:
#Out[11]:
#year 1986 1988 1989 1993 1994 1995 1997 1999 2000 \
#location
#1 23.15 23.91 23.31 23.91 22.98 23.99 23.71 23.01 0.00
#2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 23.21
#year 2004 2005
#location
#1 0.00 0.0
#2 24.28 24.4
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.