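Convert categories to binary columns (concat the category columns)
I want to convert the categories to binary columns and concatenate them onto df. Each value in the category column should become a new column, holding 1 or 0 for each id depending on whether that id has the value.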
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 1, 3, 3],
                   "value1": ["ryan", "delta", "delta", "delta", "alpha"],
                   "category": ["teacher", "pilot", "engineer", "pilot", "teacher"],
                   "value2": [1, 1, 2, 3, 7]})
df
The resulting df should be:
finaldf = pd.DataFrame({"id": [0, 1, 3],
                        "teacher": [1, 0, 1],
                        "pilot": [0, 1, 1],
                        "engineer": [0, 1, 0]})
Using pd.get_dummies:
# dtype=int keeps the new columns as 0/1; pandas 2.0+ defaults to True/False
finaldf = pd.get_dummies(df, columns=["category"], prefix="", prefix_sep="", dtype=int)
Output:
   id value1  value2  engineer  pilot  teacher
0   0   ryan       1         0      0        1
1   1  delta       1         0      1        0
2   1  delta       2         1      0        0
3   3  delta       3         0      1        0
4   3  alpha       7         0      0        1
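Note that the get_dummies call above keeps one row per original row (along with value1 and value2), while the expected finaldf has one row per id. One way to get there, offered as a minimal sketch rather than the only approach, is to encode just the category column and take the per-id maximum. The names dummies and per_id are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 1, 3, 3],
                   "value1": ["ryan", "delta", "delta", "delta", "alpha"],
                   "category": ["teacher", "pilot", "engineer", "pilot", "teacher"],
                   "value2": [1, 1, 2, 3, 7]})

# Encode only the category column; dtype=int gives 0/1 rather than booleans.
dummies = pd.get_dummies(df["category"], dtype=int)

# Attach the id, then take the per-id maximum: a flag is 1 if the id
# had that category in at least one row. The final selection only
# reorders the columns to match the expected finaldf.
per_id = (pd.concat([df[["id"]], dummies], axis=1)
            .groupby("id", as_index=False)
            .max()[["id", "teacher", "pilot", "engineer"]])
print(per_id)
```

Taking the maximum instead of the sum keeps the result binary even if the same category appears more than once for an id.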
Using pd.crosstab, with some extra cleanup:
finaldf = pd.crosstab(df["id"], df["category"]).reset_index().rename_axis(columns=None)
Output:
   id  engineer  pilot  teacher
0   0         0      0        1
1   1         1      1        0
2   3         0      1        1
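One caveat: pd.crosstab counts occurrences, so it only produces 0/1 here because no id repeats a category. If duplicates are possible, capping the counts at 1 should restore binary flags. The sketch below uses a small hypothetical frame (dup, not part of the question) in which id 1 is listed as "pilot" twice.

```python
import pandas as pd

# Hypothetical data for illustration: id 1 appears as "pilot" twice.
dup = pd.DataFrame({"id": [0, 1, 1, 1],
                    "category": ["teacher", "pilot", "pilot", "engineer"]})

# Raw crosstab: the (id=1, pilot) cell holds a count of 2, not a flag.
counts = pd.crosstab(dup["id"], dup["category"])

# Cap every count at 1, then apply the same cleanup as before.
flags = counts.clip(upper=1).reset_index().rename_axis(columns=None)
print(flags)
```

For the df in the question this extra step changes nothing, since every (id, category) pair occurs at most once.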