How to transpose/pivot and group columns to rows in Pandas Dataframe?
Pivot - Transpose columns by duplicates pandas dataframe
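I have a DataFrame with a column named "ID" that contains duplicate observations. Each "ID" row has one or more "Article" value columns. I want to transpose the whole DataFrame grouped by "ID", adding new columns on the same row for each unique "ID".
What I have: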
ID  Article_1  Article_2
1   Banana     Coconut
2   Apple      Strawberry
1   Apple
3   Tomatoe
1   Pineapple
2   Banana
4   Apple
5   Apple      Strawberry
3   Apple
What I want:
ID    Article_1  Article_2   Article_3  Article_4
0001  Banana     Coconut     Apple      Pineapple
0002  Apple      Strawberry  Banana     NaN
0003  Tomatoe    Apple       NaN        NaN
0004  Apple      NaN         NaN        NaN
0005  Apple      Strawberry  NaN        NaN
EDIT:
I have run into some cases where the order matters.
My DF:
ID  Article    Article_2
1   Banana     NaN
2   Apple      NaN
1   Apple      Coconut
3   Tomatoe    Coconut
1   Pineapple  Tropical
2   Banana     Coconut
4   Apple      Coconut
5   Apple      Coconut
3   Apple      Pineapple
Output of the first solution from @Erfan:
      Article_1  Article_2  Article_3  Article_4  Article_5  Article_6
0001  Banana     Apple      Pineapple  NaN        Coconut    Tropical
0002  Apple      Banana     NaN        Coconut    NaN        NaN
0003  Tomatoe    Apple      Coconut    Pineapple  NaN        NaN
0004  Apple      Coconut    NaN        NaN        NaN        NaN
0005  Apple      Coconut    NaN        NaN        NaN        NaN
What I need:
      Article_1  Article_2  Article_3  Article_4  Article_5  Article_6
0001  Banana     Apple      Pineapple  Coconut    Tropical   NaN
0002  Apple      Banana     Coconut    NaN        NaN        NaN
0003  Tomatoe    Apple      Coconut    Pineapple  NaN        NaN
0004  Apple      Coconut    NaN        NaN        NaN        NaN
0005  Apple      Coconut    NaN        NaN        NaN        NaN
I can't have Article_5 with a NaN value and Article_6 with a value in the same row.
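If the order of the articles is not important, we can use DataFrame.melt to unpivot your articles into rows. Then we use DataFrame.pivot_table to aggregate them back to one row per ID, with GroupBy.cumcount giving each article within an ID a unique number that becomes its column: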
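# Unpivot the article columns into a single long 'value' column
dfn = df.melt(id_vars='ID', value_vars=['Article_1', 'Article_2'])
# Number the values within each ID (1, 2, 3, ...) and pivot on that counter
dfn = dfn.pivot_table(index='ID',
                      columns=dfn.groupby('ID')['value'].cumcount().add(1),
                      values='value',
                      aggfunc='first').add_prefix('Article_').rename_axis(None, axis='index')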
      Article_1  Article_2   Article_3   Article_4
0001  Banana     Apple       Pineapple   Coconut
0002  Apple      Banana      Strawberry  NaN
0003  Tomatoe    Apple       NaN         NaN
0004  Apple      NaN         NaN         NaN
0005  Apple      Strawberry  NaN         NaN
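For the case in the edit, where a row must not have a NaN followed by a value, one possible adjustment (a sketch, not part of the original answer) is to drop the NaN rows after melt and before numbering. melt keeps NaN cells and cumcount counts them as rows, which is what produces the gaps. The DataFrame below is retyped from the question's edit, with ID assumed to be an integer, and the names long, pos and wide are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data retyped from the question's edit (ID assumed to be an integer)
df = pd.DataFrame({
    "ID": [1, 2, 1, 3, 1, 2, 4, 5, 3],
    "Article": ["Banana", "Apple", "Apple", "Tomatoe", "Pineapple",
                "Banana", "Apple", "Apple", "Apple"],
    "Article_2": [np.nan, np.nan, "Coconut", "Coconut", "Tropical",
                  "Coconut", "Coconut", "Coconut", "Pineapple"],
})

# Unpivot to long format, then drop the NaN cells so they are not numbered
long = df.melt(id_vars="ID", value_vars=["Article", "Article_2"])
long = long.dropna(subset=["value"])

# Number the remaining values within each ID: 1, 2, 3, ...
long["pos"] = long.groupby("ID").cumcount().add(1)

# One row per ID, one column per position; (ID, pos) pairs are unique
wide = long.pivot(index="ID", columns="pos", values="value").add_prefix("Article_")
wide.index.name = None
wide.columns.name = None

print(wide)
```

With this sample data, ID 1 comes out as Banana, Apple, Pineapple, Coconut, Tropical, and there are only five columns because no ID has six non-null articles. Values are ordered column by column (everything from Article first, then Article_2), which is the layout asked for in the edit.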