TypeError using regular expressions for text analysis in Python

I'm trying to write some code that scans for every string matching the regular expression "PP+" and tells me how many times it occurs. Here is my code:

with open ('testfile.txt') as f:
    data = f.read()
    data = data.split()

import re


the_sum = 0

prolist = []

for word in data:
    pronoun = re.compile(r'PP+')
    result = pronoun.match(data)
    if word == result:
        the_sum += 1

print the_sum

I get this error message:

Traceback (most recent call last):
  File "C:/Python27/RE_counter.py", line 14, in <module>
    result = pronoun.match(data)
TypeError: expected string or buffer

Can someone tell me what I'm doing wrong?

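You are passing the entire list on every iteration, which is what raises the TypeError. You are also not checking the match result correctly: match() returns a match object (or None), never the word itself, so word == result is never true: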

for word in data:
    pronoun = re.compile(r'PP+')
    result = pronoun.match(word)  # ← you had pronoun.match(data)
    if result is not None:        # ← you had if word == result
        the_sum += 1
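
To see both problems in isolation, here is a minimal sketch that works on an in-memory string instead of testfile.txt. The sample text and the helper name count_pp_words are assumptions made up for illustration; the pattern is the one from the question, compiled once outside the loop.

```python
import re

PRONOUN = re.compile(r'PP+')  # compiled once, outside the loop


def count_pp_words(text):
    """Count whitespace-separated words that start with 'PP'."""
    total = 0
    for word in text.split():
        # match() returns a match object on success and None otherwise
        if PRONOUN.match(word) is not None:
            total += 1
    return total


sample = "PP he PPS NN PPO dog xPP"  # hypothetical data
print(count_pp_words(sample))  # prints 3

# Passing the list itself reproduces the error from the question
try:
    PRONOUN.match(sample.split())
except TypeError as exc:
    print("TypeError: %s" % exc)
```

Note that match() only looks at the start of each word, so "xPP" is not counted; use search() instead if a hit anywhere in the word should count.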

You can also get the count directly with re.findall:

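import re

with open('testfile.txt') as f:
    data = f.read()
    # (?<!\S) anchors at the start of a whitespace-delimited word; PP+ is "P, then one or more P"
    print(len(re.findall(r"(?<!\S)PP+", data)))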
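
One way to check that a findall pattern counts the same words as the loop is to run both on a small sample. The sketch below is only an illustration under stated assumptions: words are separated by whitespace, the sample text is invented, and the function names count_with_loop and count_with_findall are hypothetical.

```python
import re


def count_with_loop(text):
    """Count words that start with 'PP' by testing each word separately."""
    pattern = re.compile(r'PP+')
    return sum(1 for word in text.split() if pattern.match(word))


def count_with_findall(text):
    """Count the same words in one pass over the whole text."""
    # (?<!\S) means "not preceded by a non-space character", i.e. word start.
    # Note: r'PP\+' would instead match the literal text 'PP+'.
    return len(re.findall(r'(?<!\S)PP+', text))


sample = "PP he PPS\nNN PPPO dog xPP PP+"  # hypothetical data
print("%d %d" % (count_with_loop(sample), count_with_findall(sample)))
```

On this sample both functions should report the same count, since each whitespace-delimited word has exactly one starting position at which the pattern can match.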

