
Count and map number of appearances of strings

I am mapping specific keywords against text data using applymap in Python. Let's say I want to check how often the keyword "hello" appears in the text data, over all rows. applymap gives me the desired matrix layout, but each cell holds only "True" or "False" instead of the number of appearances.

I tried to combine count() with my applymap function, but I could not make it work.

A minimal working example is as follows:

import pandas as pd
import numpy as np

df = pd.DataFrame({'text': ['hello hello', 'yes no hello', 'good morning']})
keys = ['hello']
keyword = pd.DataFrame({0:keys})

res = []
for a in df['text']:
    res.append(keyword.applymap(lambda x: x in a))

map = pd.concat(res, axis=1).T
map.index = np.arange(len(map))

#Output
map
       0
0   True
1   True
2  False

#Desired Output with 'hello' appearing twice in the first row, once in the second and zero in the third of df.
   0
0  2
1  1
2  0

I am looking for a way to keep my applymap function so that I still get the matrix form, but with True (1) and False (0) replaced by the number of appearances, as in the desired output above.

Instead of testing whether the keyword occurs in the text, which only yields True or False:

res.append(keyword.applymap(lambda x: x in a))  # True if x is a substring of a

You should count the occurrences:

res.append(keyword.applymap(lambda x: str.count(a, x)))  # number of times x occurs in a
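As a side note, the same count matrix can probably be built without the explicit loop by using pandas' vectorized `Series.str.count`. The sketch below is an alternative, not the approach from the answer above; the helper name `count_matrix` and the second keyword `'no'` are made up for illustration. Because `Series.str.count` interprets its pattern as a regular expression, each keyword is escaped so that it matches literally.

```python
import re

import pandas as pd


def count_matrix(texts, keys):
    """Return a DataFrame with one row per text and one column per keyword.

    Hypothetical helper: column i holds how often keys[i] occurs in each text.
    """
    # Series.str.count takes a regex, so escape each keyword to match it literally.
    return pd.DataFrame(
        {i: texts.str.count(re.escape(k)) for i, k in enumerate(keys)}
    )


df = pd.DataFrame({'text': ['hello hello', 'yes no hello', 'good morning']})
counts = count_matrix(df['text'], ['hello', 'no'])  # 'no' added for illustration
print(counts)
```

Like `str.count`, this counts non-overlapping substring matches, so a keyword such as `'no'` would also be counted inside a longer word like `'know'`. If only whole words should count, the pattern would need word boundaries.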
