Python: In a DataFrame, how do I loop through all strings in one column, check whether they appear in another column, and count them?

Question

I've got a DataFrame and want to loop through all cells in column c2 and count how many times each entire string appears in another column, c1 (zero if it never appears). Then print the results.

Example df:

id     c1                c2
0      luke skywalker    han solo
1      leia organa       r2d2
2      darth vader       finn
3      han solo          the emporer
4      han solo          c3po
5      finn              leia organa
6      r2d2              darth vader
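
To try the answers locally, the example frame can be rebuilt along these lines. This is only a sketch: the name `df` matches what the answers use, and treating `id` as the default integer index is an assumption.

```python
import pandas as pd

# Rebuild the example frame; the default RangeIndex stands in for the "id" column.
df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})
print(df)
```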

Example printed result:

han solo      2
r2d2          1
finn          1
the emporer   0
c3po          0
leia organa   1
darth vader   1

I'm using Jupyter Notebook with Python and pandas. Thanks!

Answer 1

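You can use some NumPy magic.
Use count and broadcasting to compare each combination of c2 and c1 values. Note that count matches substrings, so it gives whole-string counts only when no value is contained inside a longer one.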

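import numpy as np
import pandas as pd

c1 = df.c1.values.astype(str)
c2 = df.c2.values.astype(str)

# np.char.count(a, sub) counts occurrences of sub inside each element of a.
# c2[:, None] broadcasts against c1, giving one row per c2 value;
# summing along axis 1 totals the matches across c1.
pd.Series(
    np.char.count(c1, c2[:, None]).sum(1),
    c2
)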

han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1
dtype: int64
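
One caveat: count looks for substrings, so a value such as "finn" would also be counted inside a longer string that contains it. If only whole-cell matches should count, a broadcast equality comparison is one option. The following is a minimal sketch, with the example frame rebuilt so the block runs on its own; the name `exact_counts` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

c1 = df["c1"].to_numpy()
c2 = df["c2"].to_numpy()

# c2[:, None] has shape (7, 1) and c1 has shape (7,), so the comparison
# broadcasts to a (7, 7) boolean matrix: row i marks where c1 equals c2[i].
matches = c2[:, None] == c1

# Summing each row gives the number of exact matches for that c2 value.
exact_counts = pd.Series(matches.sum(axis=1), index=c2)
print(exact_counts)
```

The intermediate matrix has len(c2) x len(c1) entries, so this suits small frames better than large ones.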

Answer 2

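You can convert c1 to a categorical whose categories are the values of c2, then use value_counts:

# CategoricalDtype replaces the removed astype('category', categories=...) form; c2 values must be unique
df.c1.astype(pd.CategoricalDtype(df.c2.tolist())).value_counts(sort=False)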
Out[572]: 
han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1
Name: c1, dtype: int64

Or you can use crosstab and reindex by c2:

pd.crosstab(df.c2,df.c1).sum().reindex(df.c2,fill_value=0)
Out[592]: 
c2
han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1
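
A shorter route to the same numbers, offered as a sketch rather than as part of the original answer: count the values in c1 once, then look up each c2 value and fill in 0 for those that never occur. The example frame is rebuilt here so the block is self-contained, and the name `counts` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

# value_counts tallies whole-cell values in c1; reindex aligns the tallies
# to the c2 values in their original order, using 0 where c1 has no match.
counts = df["c1"].value_counts().reindex(df["c2"], fill_value=0)
print(counts)
```

Unlike the categorical approach, this does not require the c2 values to be unique.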

Answer 3

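# Series.count() only counts non-null entries, so compare each c2 value against c1 and sum the matches
df['c3'] = [df['c1'].eq(n).sum() for n in df['c2']]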
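
Since the question also asks to print the results, a plain-Python variant of this loop might look like the sketch below. It assumes the example frame from the question (rebuilt here) and uses `collections.Counter`, which returns 0 for keys it has never seen; the names `c1_counts` and `c3` are illustrative.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

# Tally c1 once instead of rescanning the column for every c2 value.
c1_counts = Counter(df["c1"])

# Counter returns 0 for missing keys, so unmatched c2 values need no special case.
df["c3"] = [c1_counts[name] for name in df["c2"]]

# Print one "name count" line per row of c2.
for name, n in zip(df["c2"], df["c3"]):
    print(f"{name:<14}{n}")
```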


Solution 1
3, accepted, 2018-02-15 23:35:21

Solution 2
2, 2018-02-15 23:34:43

Solution 3
0, 2018-02-15 23:53:23
