Generating a crosstab-style DataFrame with binary count values in pandas

Question

I have a pandas DataFrame like this:

UIID  ISBN
a      12
b      13

I want to compare each UIID with each ISBN and add a count column to the DataFrame:

UIID ISBN Count
 a     12   1
 a     13   0
 b     12   0
 b     13   1

How can this be done in pandas? I know the crosstab function does the same thing, but I want the data in this long format.

Answer 1

Use crosstab with melt:

# Count every (UIID, ISBN) pair, then melt the wide table into long format
df = pd.crosstab(df['UIID'], df['ISBN']).reset_index().melt('UIID', value_name='count')
print(df)
  UIID ISBN  count
0    a   12      1
1    b   12      0
2    a   13      0
3    b   13      1
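The crosstab and melt approach returns counts, so a pair that occurs more than once would get a value above 1. If strictly binary values are wanted, a minimal sketch could cap the counts before melting. The duplicated row in the sample data and the names `counts`, `flags` and `result` are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample frame from the question, plus one duplicated row (an assumption,
# added only to show the difference between a count and a 0/1 flag).
df = pd.DataFrame({"UIID": ["a", "b", "b"], "ISBN": [12, 13, 13]})

# Wide table of counts; the pair (b, 13) is counted twice here.
counts = pd.crosstab(df["UIID"], df["ISBN"])

# Cap every cell at 1 so the table holds 0/1 flags instead of counts.
flags = counts.clip(upper=1)

# Back to long format, sorted so each UIID is followed by all of its ISBNs.
result = (
    flags.reset_index()
         .melt("UIID", value_name="count")
         .sort_values(["UIID", "ISBN"], ignore_index=True)
)
print(result)
```

The final sort is optional; it only reorders the rows into the UIID-major order shown in the question.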

Alternative solution, starting again from the original df, with GroupBy.size and reindex by MultiIndex.from_product:

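# Count only the (UIID, ISBN) pairs that actually occur
s = df.groupby(['UIID','ISBN']).size()
# Build every possible (UIID, ISBN) combination
mux = pd.MultiIndex.from_product(s.index.levels, names=s.index.names)
# Pairs that never occur get a count of 0
df = s.reindex(mux, fill_value=0).reset_index(name='count')
print(df)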
  UIID  ISBN  count
0    a    12      1
1    a    13      0
2    b    12      0
3    b    13      1
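To see why the reindex step is needed, the sketch below rebuilds the sample frame from the question and compares the number of observed pairs with the size of the full product index. The variable name `full` is an assumption for illustration.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"UIID": ["a", "b"], "ISBN": [12, 13]})

# groupby(...).size() only reports pairs that are present in the data.
s = df.groupby(["UIID", "ISBN"]).size()
print(len(s))    # number of observed pairs

# from_product builds the Cartesian product of the two index levels.
mux = pd.MultiIndex.from_product(s.index.levels, names=s.index.names)
print(len(mux))  # number of possible pairs

# Reindexing onto the full grid fills the unobserved pairs with 0.
full = s.reindex(mux, fill_value=0).reset_index(name="count")
print(full)
```

With two UIIDs and two ISBNs, the grouped Series holds 2 rows while the product index holds 4, which is where the zero rows come from.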

Answer 2

You can also use pd.DataFrame.unstack:

df = pd.crosstab(df.UIID, df.ISBN).unstack().reset_index()
print(df)
   ISBN UIID  0
0    12    a  1
1    12    b  0
2    13    a  0
3    13    b  1
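The unstack result above names the count column 0 and puts ISBN first. A possible variation, sketched here under the assumption that the column order and row order from the question are wanted, passes a name to reset_index and then reorders. The variable name `out` is hypothetical.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"UIID": ["a", "b"], "ISBN": [12, 13]})

out = (
    pd.crosstab(df["UIID"], df["ISBN"])
      .unstack()                        # long Series indexed by (ISBN, UIID)
      .reset_index(name="count")        # name the value column instead of 0
      [["UIID", "ISBN", "count"]]       # column order from the question
      .sort_values(["UIID", "ISBN"], ignore_index=True)
)
print(out)
```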

2 solutions

Solution 1
6 2019-02-12 08:51:07

Solution 2
1 2019-02-12 08:53:00