Original dataframe
I want to transform it into the following structure:
Area |Ind3_2016|Ind6_2016|...|Ind12_2016|Ind3_2017|Ind6_2017|...| Ind12_2017
-------|---------|---------|---|----------|---------|---------|---|-----------
Alabama| 2306 | 2270 |...| 35621 | 2409 | 3391 |...| 36397
That is, create one column per unique value in the IndCode column for each of 2016 and 2017, and place the values from the 2016 and 2017 columns under these new columns.
You can either perform two separate pivots and then concatenate the results, or do some stacking beforehand and just do one pivot.
import pandas as pd
df = pd.DataFrame({'Area': ['A', 'A', 'A', 'A', 'A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]
                   })
# Approach 1: pivot each year separately, then concatenate the results side by side.
pd.concat([pd.pivot_table(df, index='Area', columns='Ind'+df.IndCode.astype(str)+'_2016', values='2016'),
           pd.pivot_table(df, index='Area', columns='Ind'+df.IndCode.astype(str)+'_2017', values='2017')], axis=1)
Outputs:
IndCode Ind10_2016 Ind11_2016 Ind12_2016 Ind3_2016 Ind6_2016 Ind10_2017 Ind11_2017 Ind12_2017 Ind3_2017 Ind6_2017
Area
A 5513 7730 35621 2306 2270 5438 7890 36397 2409 3391
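# Approach 2: stack the year columns into rows first, then pivot once.
# After reset_index(), the year labels are in 'level_2' and the values in column 0.
df2 = df.set_index(['Area', 'IndCode'])[['2016', '2017']].stack().reset_index()
pd.pivot_table(df2, index='Area',
               columns='Ind'+df2.IndCode.astype(str)+'_'+df2.level_2.astype(str),
               values=0).reset_index()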
Outputs:
Area Ind10_2016 Ind10_2017 Ind11_2016 Ind11_2017 Ind12_2016 Ind12_2017 Ind3_2016 Ind3_2017 Ind6_2016 Ind6_2017
0 A 5513 5438 7730 7890 35621 36397 2306 2409 2270 3391
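Note that both outputs above sort the new column names as strings, so Ind10 comes before Ind3. If the column order from the question matters (codes in numeric order, grouped by year), one possible variation is to pivot on the numeric IndCode first and build the names afterwards. This is only a sketch on the same sample data; the name `wide` is made up for illustration, and it assumes the default mean aggregation is acceptable when an Area/IndCode pair appears more than once.

```python
import pandas as pd

df = pd.DataFrame({'Area': ['A', 'A', 'A', 'A', 'A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]
                   })

# Pivot both year columns at once. The result has two-level columns of the
# form (year, IndCode), sorted by year and then numerically by IndCode.
wide = pd.pivot_table(df, index='Area', columns='IndCode', values=['2016', '2017'])

# Flatten each (year, code) pair into a name such as 'Ind3_2016'.
wide.columns = ['Ind{}_{}'.format(code, year) for year, code in wide.columns]
wide = wide.reset_index()

print(wide.columns.tolist())
```

Because the sort happens while IndCode is still an integer, the flattened names keep the 3, 6, 10, 11, 12 order within each year.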