
Count occurrences matching a partial string by column in pandas (Python)

new_data is a pandas DataFrame with 4 columns.

To count the occurrences of an exact match in each column, I do this:

new_data[new_data == 'blank'].count()

Output:

A          0
B          0
C          0
D          2654

What if I want a partial match on the string 'bla' instead? I imagined something like this:

new_data[new_data in 'bla'].count()

But of course that does not work. What is the right way to do it?
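For context, here is a minimal sketch of why the attempt above fails. The small frame df is a hypothetical stand-in for new_data. Python evaluates x in 'bla' by asking the string whether it contains x, and a string only accepts another string on the left side, so a TypeError is raised before pandas gets involved.

```python
import pandas as pd

# Hypothetical stand-in for new_data, for illustration only.
df = pd.DataFrame({"A": ["blank", "s"]})

try:
    # Python calls "bla".__contains__(df), which rejects non-string operands.
    df in "bla"
    outcome = "no error"
except TypeError as exc:
    outcome = "TypeError"
    print(exc)

print(outcome)
```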

Use DataFrame.apply with Series.str.contains to build a boolean mask per column, then sum to count the True values:

import numpy as np
import pandas as pd

# Fix the seed so the sample data is reproducible
np.random.seed(1234)

new_data = pd.DataFrame(np.random.choice(['a blas', 's'], size=(2,4)), columns=list('ABCD'))
print(new_data)
        A       B       C  D
0       s       s  a blas  s
1  a blas  a blas  a blas  s

# Each True marks a cell containing 'bla'; summing counts them per column
print(new_data.apply(lambda x: x.str.contains('bla')).sum())
A    1
B    1
C    2
D    0
dtype: int64
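The example above assumes every cell is a string. As a hedged sketch of some options that often matter on real data, the snippet below uses a hypothetical frame df (not the asker's data) with a missing value and mixed case. The na, regex, and case arguments are parameters of Series.str.contains. Whether you need them depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only (not the asker's frame).
df = pd.DataFrame({
    "A": ["blank", np.nan, "a blas"],
    "B": ["s", "BLA", "s"],
    "C": ["x", "bla", "x"],
})

# na=False: missing cells count as non-matches instead of propagating NaN.
# regex=False: treat 'bla' as a literal substring, not a regular expression.
counts = df.apply(
    lambda col: col.str.contains("bla", regex=False, na=False)
).sum()
print(counts)

# case=False additionally matches 'BLA' in column B.
counts_ci = df.apply(
    lambda col: col.str.contains("bla", case=False, regex=False, na=False)
).sum()
print(counts_ci)
```

Note that the .str accessor is only available on text columns. A frame with numeric columns would need them converted or excluded first.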

Your solution, adapted to use the same boolean mask:

print(new_data[new_data.apply(lambda x: x.str.contains('bla'))].count())
A    1
B    1
C    2
D    0
dtype: int64
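To see why both forms give the same counts, here is a small sketch on a hypothetical two-column frame df. Indexing a DataFrame with a boolean DataFrame keeps the matching cells and replaces the rest with NaN, and count() then counts the non-missing cells per column. Summing the mask directly skips that intermediate frame.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"A": ["s", "a blas"], "B": ["a blas", "a blas"]})

# Boolean mask: True where the cell contains 'bla'
mask = df.apply(lambda col: col.str.contains("bla"))

# Non-matching cells become NaN; matching cells keep their value
masked = df[mask]
print(masked)

# count() counts non-missing cells, which equals the number of True values
print(masked.count())
print(mask.sum())
```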
