Counting word occurrences in a Python DataFrame

Question

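I searched a lot here but could not find an answer. I have a DataFrame with a "Description" column containing long strings, and I am trying to count the occurrences of a specific word, "restaurant":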

df['has_restaurants'] = 0
for index,text in enumerate(df['Description']):
    text = text.split()
    df['has_restaurants'][index] = (sum(map(lambda count : 1 if 'restaurant' in count else 0, text)))

The code above works, but it does not look like a good approach, and it also produces this "error":

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['has_restaurants'][index] = (sum(map(lambda count : 1 if 'restaurant' in count else 0, text)))

Answer 1

You can simplify this by using the .str.count method. Consider the following simple example:

import pandas as pd
df = pd.DataFrame({"description":["ABC DEF GHI","ABC ABC ABC","XYZ XYZ XYZ"]})
df['ABC_count'] = df.description.str.count("ABC")
print(df)

Output

   description  ABC_count
0  ABC DEF GHI          1
1  ABC ABC ABC          3
2  XYZ XYZ XYZ          0
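
Applied to the column from the question, the same idea might look like the sketch below. The sample descriptions are made up for illustration, and passing `flags=re.IGNORECASE` is an assumption about the desired behavior (the question's own code is case-sensitive); drop it to match case exactly.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the real "Description" column.
df = pd.DataFrame({"Description": [
    "A restaurant near two other restaurants",
    "Restaurant with a view",
    "No food here",
]})

# Vectorized count of the pattern in every row; no explicit loop and
# no element-wise assignment, so no chained-assignment warning.
df["has_restaurants"] = df["Description"].str.count(
    "restaurant", flags=re.IGNORECASE
)
print(df)
```

Because the pattern is matched as a substring, "restaurants" is counted as well.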

Answer 2

You can use Python's native .count() method:

# Build the whole column at once; assigning through
# df['has_restaurants'][index] is what triggers the warning in the question.
df['has_restaurants'] = [text.count('restaurant') for text in df['Description']]
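
Note that `str.count` and `.str.count` both count substring matches, so "restaurants" also contributes to the total. If only the standalone word should count, one option is a regular expression with word boundaries. This is a sketch with made-up sample text, and the column names `substring_count` and `whole_word_count` are hypothetical:

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"Description": [
    "restaurant restaurants restaurant.",
    "no match here",
]})

# Plain substring count: "restaurants" is included.
df["substring_count"] = df["Description"].str.count("restaurant")

# \b marks a word boundary, so "restaurants" is excluded,
# while "restaurant." still matches.
df["whole_word_count"] = df["Description"].str.count(r"\brestaurant\b")
print(df)
```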

Solution 1
2 accepted 2022-05-30 12:08:19

Solution 2
0 2022-05-30 12:08:01
