Count the substrings between two markers in a list

Question

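I want to count how many times each name is mentioned in the list below, and put the names and their frequencies in a dictionary.

So, for example, here is the dialogue list: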

dialogue = ["This is great! RT @user14: Can you believe this?",
            "That's right RT @user22: The dodgers are destined to win the west!",
            "This is about things @user14, how could you",
            "RT @user11: The season is looking great!"]

I want my output to be {user14:2, user22:1, user11:1}

I started writing the following to build a list of names, which I would then count into a dictionary, but I don't know how to get there:

user_name = [x.split('@')[1].split(':')[:-1] for x in tweets]

Answer 1

A regular expression is probably the best way to cope with the unknown characters that follow the username:

from collections import defaultdict
import re

result = defaultdict(int)

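for item in dialogue:
    # \w+ stops at the first non-word character (':', ',' or whitespace)
    match = re.search(r'(?<=@)\w+', item)
    if match:  # skip lines that contain no @mention
        result[match.group(0)] += 1

print(dict(result))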

This gives:

{'user14': 2, 'user22': 1, 'user11': 1}
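To see why a regex copes with whatever comes after the username, here is a minimal sketch. The helper name `first_mention` and the sample strings are made up for illustration; the idea is that `\w+` stops at the first character that is not a letter, digit, or underscore, so a colon, a comma, or the end of the string all end the match.

```python
import re

# Lookbehind for '@', then one or more word characters
MENTION = re.compile(r'(?<=@)\w+')

def first_mention(text):
    """Return the first @mentioned name in text, or None (hypothetical helper)."""
    match = MENTION.search(text)
    # search() returns None when there is no @mention, so check before .group()
    return match.group(0) if match else None

# Made-up samples where the username is followed by different characters
for sample in ["RT @user14: hi", "things @user14, how", "ends with @user14", "no mention"]:
    print(repr(sample), "->", first_mention(sample))
```

Note that this only looks at the first mention in each string; a line with two mentions would need `findall` or `finditer` instead.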

Answer 2

Use a collections.Counter object and the re.findall function in a single pass:

from collections import Counter
import re

...
# Join with a space so a name ending one line cannot merge with the next line
uname_counts = Counter(re.findall(r'@(\w+)', ' '.join(dialogue)))
print(dict(uname_counts))   # {'user14': 2, 'user22': 1, 'user11': 1}
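If "frequency" in the question means each user's share of all mentions rather than the raw count, something like the following might work. This is only a sketch under that assumption; `mention_counts` is a hypothetical helper name, not something from the answers above.

```python
from collections import Counter
import re

dialogue = ["This is great! RT @user14: Can you believe this?",
            "That's right RT @user22: The dodgers are destined to win the west!",
            "This is about things @user14, how could you",
            "RT @user11: The season is looking great!"]

def mention_counts(lines):
    """Count every @mention across all lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # findall returns every captured username on the line, not just the first
        counts.update(re.findall(r'@(\w+)', line))
    return counts

counts = mention_counts(dialogue)
total = sum(counts.values())

# Share of all mentions per user; an empty Counter simply yields an empty dict
frequencies = {user: n / total for user, n in counts.items()}

print(dict(counts))   # {'user14': 2, 'user22': 1, 'user11': 1}
print(frequencies)    # {'user14': 0.5, 'user22': 0.25, 'user11': 0.25}
```

Processing line by line also sidesteps the question of how to join the strings.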
