pandas pivot table with unique column values

Question

I have a DataFrame with some string values.

import pandas as pd

so = pd.DataFrame({
    "col1": ["row0", "row1", "row2"],
    "col2": ["A", "B", "C"],
    "col3": ["A", "A", "B"],
    "col4": ["B", "A", "B"],
})

I need to create a pivot table where:

the index is the values from column "col1"
the columns are the unique values found in columns "col2" through "col4"
the value at each intersection is the number of times that value appears in the row

For my example, the answer should be:

Please help. Thank you in advance.

Answer 1

melt and crosstab:

df2 = so.melt('col1')
pd.crosstab(df2['col1'], df2['value'])

or melt and groupby.size:

so.melt('col1').groupby(['col1', 'value']).size().unstack(fill_value=0)

output:

value  A  B  C
col1          
row0   2  1  0
row1   2  1  0
row2   0  2  1

NB. For the exact output (col1 as a regular column and no "value" label on the column axis), append .reset_index().rename_axis(columns=None).
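Putting the pieces above together, here is a minimal end-to-end sketch using the question's sample data. The variable names long_df, counts and result are my own and not part of the original answer.

```python
import pandas as pd

so = pd.DataFrame({
    "col1": ["row0", "row1", "row2"],
    "col2": ["A", "B", "C"],
    "col3": ["A", "A", "B"],
    "col4": ["B", "A", "B"],
})

# Reshape to long form: one row per (col1, original column name, cell value).
long_df = so.melt("col1")

# Count how often each value occurs for each col1 entry.
counts = pd.crosstab(long_df["col1"], long_df["value"])

# Turn col1 back into a regular column and drop the "value" axis label.
result = counts.reset_index().rename_axis(columns=None)
print(result)
#    col1  A  B  C
# 0  row0  2  1  0
# 1  row1  2  1  0
# 2  row2  0  2  1
```

crosstab fills missing combinations with 0 by itself, so no fillna step is needed here.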

Answer 2

Here is one way to do it:

so.melt('col1').pivot_table(index='col1', columns='value', aggfunc=lambda x: int(x.size)).fillna(0).reset_index()

    col1    variable
value       A         B     C
0   row0    2.0     1.0     0.0
1   row1    2.0     1.0     0.0
2   row2    0.0     2.0     1.0
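The output above has float counts and a two-level column header, because fillna(0) runs after the missing combinations have already become NaN and no values column was named. A possible variation, offered as a sketch rather than as the answerer's method, names the values column and passes fill_value=0 to pivot_table instead. The names long_df and table are my own.

```python
import pandas as pd

so = pd.DataFrame({
    "col1": ["row0", "row1", "row2"],
    "col2": ["A", "B", "C"],
    "col3": ["A", "A", "B"],
    "col4": ["B", "A", "B"],
})

long_df = so.melt("col1")  # columns: col1, variable, value

# Naming a single values column keeps the column header one level deep;
# fill_value=0 replaces missing combinations inside pivot_table itself.
table = (
    long_df.pivot_table(index="col1", columns="value",
                        values="variable", aggfunc="count", fill_value=0)
    .reset_index()
    .rename_axis(columns=None)
)
print(table)
```

This should give the same counts as the crosstab approach in Answer 1, with col1 as an ordinary column.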