Pandas PivotTable
I have a Pandas dataframe with the following columns:
SecId Date Sector Country
184149 2019-12-31 Utility USA
184150 2019-12-31 Banking USA
187194 2019-12-31 Aerospace FRA
...............
128502 2020-02-12 CommSvcs UK
...............
The SecId and Date columns are the index. What I want is the following:
SecId Date Aerospace Banking CommSvcs ........ Utility AFG CAN .. FRA .... UK USA ...
184149 2019-12-31 0 0 0 1 0 0 0 0 1
184150 2019-12-31 0 1 0 0 0 0 0 0 1
187194 2019-12-31 1 0 0 0 0 0 1 0 0
................
128502 2020-02-12 0 0 1 0 0 0 0 1 0
................
What is an efficient way to pivot this? The original data is denormalized for each day and can have millions of rows.
You can use get_dummies. Casting the columns to a categorical dtype beforehand lets you define which columns will be created.
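code:
import pandas as pd
# Fix the set of categories up front so the dummy columns are predictable
SECTORS = df.Sector.unique()
df["Sector"] = df.Sector.astype(pd.CategoricalDtype(SECTORS))
COUNTRIES = df.Country.unique()
df["Country"] = df.Country.astype(pd.CategoricalDtype(COUNTRIES))
# Empty prefix and separator keep the bare category names as column names
df2 = pd.get_dummies(data=df, columns=["Sector", "Country"], prefix="", prefix_sep="")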
output:
SecId Date Aerospace Banking Utility FRA USA
0 184149 2019-12-31 0 0 1 0 1
1 184150 2019-12-31 0 1 0 0 1
2 187194 2019-12-31 1 0 0 1 0
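As a minimal, self-contained sketch of the categorical approach: the rows below are the three sample rows from the question, while the category lists SECTORS and COUNTRIES are hypothetical stand-ins for the full set of values (here they include "CommSvcs" and "UK", which do not occur in these three rows). It assumes SecId and Date are the index, as the question states. Passing dtype=int is an addition to the answer's call; it forces 0/1 output, since recent pandas versions return booleans by default.

```python
import pandas as pd

# Sample rows from the question; SecId and Date form the index.
df = pd.DataFrame(
    {
        "SecId": [184149, 184150, 187194],
        "Date": ["2019-12-31", "2019-12-31", "2019-12-31"],
        "Sector": ["Utility", "Banking", "Aerospace"],
        "Country": ["USA", "USA", "FRA"],
    }
).set_index(["SecId", "Date"])

# Hypothetical full category lists (assumption, not from the original data).
SECTORS = ["Aerospace", "Banking", "CommSvcs", "Utility"]
COUNTRIES = ["FRA", "UK", "USA"]

df["Sector"] = df["Sector"].astype(pd.CategoricalDtype(SECTORS))
df["Country"] = df["Country"].astype(pd.CategoricalDtype(COUNTRIES))

# One indicator column is created per declared category, used or not.
df2 = pd.get_dummies(
    df, columns=["Sector", "Country"], prefix="", prefix_sep="", dtype=int
)
print(df2)
```

Because the categories are declared up front, the CommSvcs and UK columns exist (all zeros) even though no row in this slice uses them, which keeps the column layout stable across daily batches.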
Try as @BEN_YO suggests:
pd.get_dummies(df,columns=['Sector', 'Country'], prefix='', prefix_sep='')
Output:
SecId Date Aerospace Banking Utility FRA USA
0 184149 2019-12-31 0 0 1 0 1
1 184150 2019-12-31 0 1 0 0 1
2 187194 2019-12-31 1 0 0 1 0
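Since the question mentions millions of rows, here is a hedged sketch of the same call with a compact dtype. The four rows are the sample rows from the question; using dtype="uint8" is an assumption about what suits the data, chosen so each indicator takes one byte. The variable names wide and row_totals are illustrative only.

```python
import pandas as pd

# Sample rows from the question, indexed by SecId and Date.
df = pd.DataFrame(
    {
        "SecId": [184149, 184150, 187194, 128502],
        "Date": ["2019-12-31", "2019-12-31", "2019-12-31", "2020-02-12"],
        "Sector": ["Utility", "Banking", "Aerospace", "CommSvcs"],
        "Country": ["USA", "USA", "FRA", "UK"],
    }
).set_index(["SecId", "Date"])

# uint8 stores each 0/1 indicator in a single byte.
wide = pd.get_dummies(
    df, columns=["Sector", "Country"], prefix="", prefix_sep="", dtype="uint8"
)

# Sanity check: every row should have one sector flag and one country flag.
row_totals = wide.sum(axis=1)
print(wide)
```

The index is left untouched by get_dummies, so SecId and Date stay as index levels rather than becoming ordinary columns. If the indicator frame is still too large, get_dummies also accepts sparse=True, which may be worth trying on the real data.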