Counting occurrences of a word per chunk in Python (list comprehension)

Question

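I'm very new to programming, so I apologize if this is a silly question.

I'm trying to count all occurrences of a word per chunk, and then I need to plot the results. My text is Pride and Prejudice, and I'm trying to find the frequency of the name 'Mr. Darcy' per 3000 words. So I tried the following, without success.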

x = [chunk.count('Mr. Darcy') for chunk in partition(100000, text1_pride)]

Can anyone help? Thanks a lot.

Answer 1

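As mentioned, "Mr. Darcy" counts as 2 words once the text is split on whitespace. If you only want to look for "Darcy", and your string is called text1_pride, you could do this: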

words = text1_pride.split()
chunks = [words[x:x+3000] for x in range(0, len(words), 3000)]
darcy_counts = [chunk.count('Darcy') for chunk in chunks]

This could all be done in one line with nested list comprehensions.
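As a rough sketch of that compact form, the splitting, chunking, and counting can be wrapped in one small function. The function name `count_word_per_chunk` and the punctuation stripping are additions for illustration, not part of the answer above. The stripping is there because `list.count` only matches whole elements, so a token such as "Darcy," would otherwise be missed. Possessives like "Darcy's" are still not counted by this version.

```python
import string


def count_word_per_chunk(text, word, chunk_size=3000):
    """Count exact matches of `word` in consecutive chunks of `chunk_size` words."""
    # Strip leading/trailing punctuation so "Darcy," and "Darcy." match "Darcy".
    words = [w.strip(string.punctuation) for w in text.split()]
    # One count per chunk; the last chunk may be shorter than chunk_size.
    return [
        words[i:i + chunk_size].count(word)
        for i in range(0, len(words), chunk_size)
    ]


# Tiny made-up sample, chunked 4 words at a time to keep it readable.
sample = "Mr. Darcy bowed. Elizabeth saw Darcy, and Darcy smiled."
print(count_word_per_chunk(sample, "Darcy", chunk_size=4))  # [1, 2, 0]
```

With the real text you would call `count_word_per_chunk(text1_pride, "Darcy")` and keep the default chunk size of 3000.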

Answer 2

A minimal version of what you want to do, based on random data, is:

import random
import loremipsum


text = ' '.join(loremipsum.get_sentences(400)).split() # split into words

# where to replace part with Mr. Darcy
where = [random.randint(1, len(text) - 1) for _ in range(1000)]

for p in where:
    text[p] = "Mr. Darcy"

# text stays a list of words here; each chunk is joined into a string further down

chunk_size = 100

# check for chunk_size list elements (some containing "Mr. Darcy" - most not)

# joins each chunk into a text then looks for Mr. Darcy    
x = [' '.join(chunk).count('Mr. Darcy') for chunk in (
    text[i: i + chunk_size] for i in range(0, len(text), chunk_size))]
    
print(x)

Output (varies from run to run, since the positions are random):

[34, 28, 28, 34, 35, 22, 25, 31, 26, 32, 23, 21, 37, 32, 29, 40, 30,
28, 40, 29, 35, 31, 25, 34, 28, 31, 32, 11]

For your file, you would need to do:

with open("yourfile.txt") as f:
    text = f.read().split()

chunk_size = 3000
chunks = [' '.join(text[i: i + chunk_size]) for i in range(0, len(text), chunk_size)]

Then count the occurrences within each chunk.
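That last counting step could be sketched as follows. The helper name `count_phrase_per_chunk` and the sample sentence are made up for illustration. Because each chunk is joined back into a string, `str.count` can look for the two-word phrase "Mr. Darcy" directly. One caveat: a phrase that straddles a chunk boundary is not counted, as the second call shows.

```python
def count_phrase_per_chunk(words, phrase, chunk_size=3000):
    """Join every `chunk_size` words into a string and count `phrase` in each."""
    chunks = [
        ' '.join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
    # str.count counts non-overlapping substring matches within each chunk.
    return [chunk.count(phrase) for chunk in chunks]


# Made-up sample of 12 words.
words = "Mr. Darcy came in and Mr. Darcy left while Mr. Bingley stayed".split()

print(count_phrase_per_chunk(words, "Mr. Darcy", chunk_size=4))  # [1, 1, 0]
# With chunks of 6 words the second "Mr." / "Darcy" pair is split across
# two chunks, so that occurrence is lost.
print(count_phrase_per_chunk(words, "Mr. Darcy", chunk_size=6))  # [1, 0]
```

The resulting list has one number per chunk, so it can be handed to a plotting library as is; with matplotlib, for example, `plt.plot(counts)` would draw the counts in chunk order.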
