How do I create a pivot table from a dataframe that has a column containing lists?
I have a dataframe that looks like this:
import pandas as pd

data = [
    {
        "userId": 1,
        "binary_vote": 0,
        "genres": [
            "Adventure",
            "Comedy"
        ]
    },
    {
        "userId": 1,
        "binary_vote": 1,
        "genres": [
            "Adventure",
            "Drama"
        ]
    },
    {
        "userId": 2,
        "binary_vote": 0,
        "genres": [
            "Comedy",
            "Drama"
        ]
    },
    {
        "userId": 2,
        "binary_vote": 1,
        "genres": [
            "Adventure",
            "Drama"
        ]
    },
]
df = pd.DataFrame(data)
print(df)
   userId  binary_vote               genres
0       1            0  [Adventure, Comedy]
1       1            1   [Adventure, Drama]
2       2            0      [Comedy, Drama]
3       2            1   [Adventure, Drama]
I want to create columns from the values of binary_vote. This is the expected output:
   userId        binary_vote_0       binary_vote_1
0       1  [Adventure, Comedy]  [Adventure, Drama]
1       2      [Comedy, Drama]  [Adventure, Drama]
I tried something like this, but I get an error:
pd.pivot_table(df, columns=['binary_vote'], values='genres')
This is the error:
DataError: No numeric types to aggregate
Any ideas? Thanks in advance.
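We have to supply our own aggfunc, which in this case is a simple one. Your attempt fails because pivot_table tries to take the mean, which is its default aggregation function. Obviously, that fails on a column of lists.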
piv = (
    df.pivot_table(index='userId', columns='binary_vote', values='genres', aggfunc=lambda x: x)
    .add_prefix('binary_vote_')
    .reset_index()
    .rename_axis(None, axis=1)
)
print(piv)
   userId        binary_vote_0       binary_vote_1
0       1  [Adventure, Comedy]  [Adventure, Drama]
1       2      [Comedy, Drama]  [Adventure, Drama]
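A variation on the same idea, offered as a sketch rather than as part of the original answer: the identity lambda only makes sense when each (userId, binary_vote) pair occurs exactly once, so a named aggregation such as `'first'` states that intent more explicitly. The name `piv_first` is made up for this example, and the frame is rebuilt here so the snippet runs on its own. Note the assumption: if a pair occurred more than once, `'first'` would silently keep only the first list.

```python
import pandas as pd

# Same sample data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "userId": [1, 1, 2, 2],
    "binary_vote": [0, 1, 0, 1],
    "genres": [
        ["Adventure", "Comedy"],
        ["Adventure", "Drama"],
        ["Comedy", "Drama"],
        ["Adventure", "Drama"],
    ],
})

# 'first' keeps the first list in each (userId, binary_vote) group.
# Assumption: each pair appears once, so nothing is actually discarded.
piv_first = (
    df.pivot_table(index="userId", columns="binary_vote",
                   values="genres", aggfunc="first")
    .add_prefix("binary_vote_")
    .reset_index()
    .rename_axis(None, axis=1)
)
print(piv_first)
```

On the sample data this should give the same two-row table as the lambda version.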
Another way, using set_index() and unstack():
m = (
    df.set_index(['userId', 'binary_vote'])
    .unstack()
    .add_prefix('binary_vote_')
    .droplevel(level=0, axis=1)
)
m.reset_index().rename_axis(None, axis=1)
   userId        binary_vote_0       binary_vote_1
0       1  [Adventure, Comedy]  [Adventure, Drama]
1       2      [Comedy, Drama]  [Adventure, Drama]
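Since no aggregation is really needed here, the reshape can also be sketched with `DataFrame.pivot`, which reshapes without aggregating. This is an alternative not given in the answers above; the name `wide` is made up for the example, and the frame is rebuilt so the snippet is self-contained. It assumes each (userId, binary_vote) pair is unique, because `pivot` raises a ValueError on duplicate pairs instead of combining them.

```python
import pandas as pd

# Same sample data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "userId": [1, 1, 2, 2],
    "binary_vote": [0, 1, 0, 1],
    "genres": [
        ["Adventure", "Comedy"],
        ["Adventure", "Drama"],
        ["Comedy", "Drama"],
        ["Adventure", "Drama"],
    ],
})

# pivot() only reshapes; no aggregation function is applied, so list
# values pass through untouched. Duplicate (userId, binary_vote) pairs
# would raise a ValueError rather than be merged.
wide = (
    df.pivot(index="userId", columns="binary_vote", values="genres")
    .add_prefix("binary_vote_")
    .reset_index()
    .rename_axis(None, axis=1)
)
print(wide)
```

The error on duplicates can be a useful safeguard when the one-row-per-pair assumption is supposed to hold.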