I am reading a CSV file that has the following structure:
Continent, Country, Year, GDP
All countries have multiple years, but some countries might be missing some years.
My aim is to have Continent and Country as the index, and the GDP for each year as the columns:
Continent Country 2009 2010 2011 2012 2013 2014
I have tried this:
df.pivot(index=["Continent", "Country"], columns="Year", values="GDP")
but it gives me this error:
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
You can try this sample data:
import pandas as pd
df = pd.DataFrame(columns=['Continent', 'Country', 'Year', 'GDP'],
                  data=[['NA', 'US', 2014, 1234], ['NA', 'US', 2013, 2345]])
If you use pivot_table instead of pivot, it works:
In [47]: df.pivot_table(index=["Continent", "Country"], columns="Year", values="GDP")
Out[47]:
Year               2013  2014
Continent Country
NA        US       2345  1234
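Since the question mentions that some countries are missing some years, here is a small sketch of what pivot_table does in that case. The 'CA' row and its GDP figure are made up for illustration and are not part of the original sample data.

```python
import pandas as pd

# Hypothetical data: 'CA' has no 2013 row (figures are made up for illustration).
df = pd.DataFrame(
    columns=["Continent", "Country", "Year", "GDP"],
    data=[
        ["NA", "US", 2014, 1234],
        ["NA", "US", 2013, 2345],
        ["NA", "CA", 2014, 111],
    ],
)

wide = df.pivot_table(index=["Continent", "Country"], columns="Year", values="GDP")

# The missing (CA, 2013) combination shows up as NaN in the wide table.
print(wide)
```

If NaN is not wanted, pivot_table also takes a fill_value argument, although filling missing GDP figures with a placeholder such as 0 may be misleading.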
The problem is that pivot cannot handle a list of columns for the index/columns argument (at least in the pandas version used here; pandas 1.1.0 and later accept a list). The only caveat is that pivot_table aggregates by default, taking the mean if there are multiple values for one continent/country/year combination.
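To make the aggregation caveat concrete, the sketch below uses made-up data containing a duplicate continent/country/year combination. It compares the default mean with an explicit aggfunc. It also shows an alternative that is not part of the answer above, set_index followed by unstack, which refuses to reshape duplicates instead of silently averaging them.

```python
import pandas as pd

# Hypothetical data with two rows for (NA, US, 2014); figures are made up.
df = pd.DataFrame(
    columns=["Continent", "Country", "Year", "GDP"],
    data=[
        ["NA", "US", 2014, 1000],
        ["NA", "US", 2014, 2000],
        ["NA", "US", 2013, 2345],
    ],
)
keys = ["Continent", "Country"]

# Default behaviour: duplicates are averaged.
by_mean = df.pivot_table(index=keys, columns="Year", values="GDP")

# Explicit aggregation: duplicates are summed instead.
by_sum = df.pivot_table(index=keys, columns="Year", values="GDP", aggfunc="sum")

# Alternative (an assumption, not the answer's method): set_index + unstack
# raises a ValueError on duplicate entries rather than aggregating them.
try:
    df.set_index(keys + ["Year"])["GDP"].unstack("Year")
    duplicates_rejected = False
except ValueError:
    duplicates_rejected = True

print(by_mean)
print(by_sum)
print(duplicates_rejected)
```

If duplicates indicate a data problem rather than something to aggregate, the stricter set_index/unstack route may be preferable because it fails loudly.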