
Python: In a DataFrame, how do I loop through all strings of one column and check to see if they appear in another column and count them?

I've got a dataframe and want to loop through all cells in column c2 and count how many times each entire string appears in another column, c1, if it appears at all. Then print the results.

Example df:

id     c1                c2
0      luke skywalker    han solo
1      leia organa       r2d2
2      darth vader       finn
3      han solo          the emporer
4      han solo          c3po
5      finn              leia organa
6      r2d2              darth vader

Example printed result:

han solo      2
r2d2          1
finn          1
the emporer   0
c3po          0
leia organa   1
darth vader   1

I'm using Jupyter notebook with Python and pandas. Thanks!

You can use some NumPy magic.
Use count and broadcasting to compare each combination.

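import numpy as np
import pandas as pd

# Plain string arrays for both columns
c1 = df.c1.values.astype(str)
c2 = df.c2.values.astype(str)

# c2[:, None] broadcasts against c1, giving one row of counts per c2 value;
# summing each row gives the total number of occurrences in c1.
pd.Series(
    np.char.count(c1, c2[:, None]).sum(1),
    c2
)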

han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1
dtype: int64
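
Note that count here counts substring occurrences, so a c2 value such as "han" would also match "han solo". If only whole-string matches should count, a minimal sketch using plain equality with the same broadcasting idea might look like this (the frame is rebuilt from the question's example so the snippet runs on its own, and the name exact_counts is only illustrative):

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

c1 = df["c1"].to_numpy().astype(str)
c2 = df["c2"].to_numpy().astype(str)

# c2[:, None] has shape (n, 1); comparing it with c1 of shape (n,) broadcasts
# to an (n, n) boolean matrix where row i marks the cells of c1 equal to c2[i].
exact_counts = pd.Series((c2[:, None] == c1).sum(axis=1), index=c2)
print(exact_counts)
```

The comparison matrix is n by n, so this is best suited to columns of modest length.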

You can convert c1 to a categorical whose categories are the values of c2, then use value_counts:

df.c1.astype(pd.CategoricalDtype(categories=df.c2.tolist())).value_counts(sort=False)
Out[572]: 
han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1
Name: c1, dtype: int64
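
The categorical route needs the c2 values to be unique, since categories cannot repeat. A related sketch that avoids that restriction is to count c1 once and then look up each c2 value with reindex. The frame below is rebuilt from the question's example, and the name counts is only illustrative:

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

# Count every distinct value in c1 once, then look up each c2 value;
# values that never occur in c1 get fill_value=0.
counts = df["c1"].value_counts().reindex(df["c2"], fill_value=0)
print(counts)
```

Because the index produced by value_counts is unique, reindex also works when c2 contains repeated values.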

Or you can do:

pd.crosstab(df.c2,df.c1).sum().reindex(df.c2,fill_value=0)
Out[592]: 
c2
han solo       2
r2d2           1
finn           1
the emporer    0
c3po           0
leia organa    1
darth vader    1

# Whole-string matches of each c2 value in c1, stored as a new column
df['c3'] = [(df['c1'] == n).sum() for n in df['c2']]
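
If the counts are wanted as a new column, rescanning c1 for every row can be avoided by tallying c1 once. A minimal sketch with collections.Counter, assuming the column name c3 from the line above and rebuilding the question's frame so the snippet runs alone (the name tally is only illustrative):

```python
from collections import Counter

import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "c1": ["luke skywalker", "leia organa", "darth vader", "han solo",
           "han solo", "finn", "r2d2"],
    "c2": ["han solo", "r2d2", "finn", "the emporer", "c3po",
           "leia organa", "darth vader"],
})

# Tally c1 in a single pass; a Counter returns 0 for keys it has never seen.
tally = Counter(df["c1"])

# Look up each c2 value in the tally and store the result as column c3.
df["c3"] = [tally[name] for name in df["c2"]]
print(df[["c2", "c3"]])
```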
