Creating columns out of categorical variables with values from two other columns in pandas
Original dataframe: one row per Area and IndCode, with columns including Area, IndCode, 2016 and 2017.
I want to transform it into the following structure:
Area |Ind3_2016|Ind6_2016|...|Ind12_2016|Ind3_2017|Ind6_2017|...| Ind12_2017
-------|---------|---------|---|----------|---------|---------|---|-----------
Alabama| 2306 | 2270 |...| 35621 | 2409 | 3391 |...| 36397
That is, create a column for every unique value in the IndCode column, once for 2016 and once for 2017, and place the values from the 2016 and 2017 columns under these new columns.
You can either perform two separate pivots and then concatenate the results, or do some stacking beforehand and do just one pivot.
import pandas as pd

# Sample data with the same layout as the question.
df = pd.DataFrame({'Area': ['A', 'A', 'A', 'A', 'A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]})

# Option 1: pivot each year separately, then concatenate side by side.
pd.concat([pd.pivot_table(df, index='Area', columns='Ind' + df.IndCode.astype(str) + '_2016', values='2016'),
           pd.pivot_table(df, index='Area', columns='Ind' + df.IndCode.astype(str) + '_2017', values='2017')],
          axis=1)
Outputs:
IndCode  Ind10_2016  Ind11_2016  Ind12_2016  Ind3_2016  Ind6_2016  Ind10_2017  Ind11_2017  Ind12_2017  Ind3_2017  Ind6_2017
Area
A              5513        7730       35621       2306       2270        5438        7890       36397       2409       3391
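# Option 2: stack the two year columns into rows first, then pivot once.
# After reset_index(), the year labels are in 'level_2' and the values in column 0.
df2 = df.set_index(['Area', 'IndCode'])[['2016', '2017']].stack().reset_index()
pd.pivot_table(df2, index='Area',
               columns='Ind' + df2.IndCode.astype(str) + '_' + df2.level_2.astype(str),
               values=0).reset_index()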
Outputs:
  Area  Ind10_2016  Ind10_2017  Ind11_2016  Ind11_2017  Ind12_2016  Ind12_2017  Ind3_2016  Ind3_2017  Ind6_2016  Ind6_2017
0    A        5513        5438        7730        7890       35621       36397       2306       2409       2270       3391
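Both outputs above sort the new column names as strings, so Ind10 comes before Ind3. If the column order from the question matters (all 2016 columns first, with IndCode in numeric order), one possible variant is to melt to long format, pivot on the pair (year, IndCode) while IndCode is still an integer, and only then build the names. This is a sketch, not part of the original answer; the names long_df, wide, year and value are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Area': ['A', 'A', 'A', 'A', 'A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]})

# Long format: one row per (Area, IndCode, year).
# 'year' and 'value' are illustrative column names.
long_df = df.melt(id_vars=['Area', 'IndCode'], value_vars=['2016', '2017'],
                  var_name='year', value_name='value')

# Pivot on (year, IndCode). pivot_table sorts the column MultiIndex,
# so columns are ordered by year first, then by IndCode as integers.
wide = long_df.pivot_table(index='Area', columns=['year', 'IndCode'], values='value')

# Flatten the (year, IndCode) pairs into names such as 'Ind3_2016'.
wide.columns = [f'Ind{code}_{year}' for year, code in wide.columns]
wide = wide.reset_index()
print(wide)
```

Note that pivot_table aggregates with the mean by default, so duplicated (Area, IndCode) pairs would be averaged here, just as in the two versions above.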