
How do I count word occurrences in each line if the word is in a dictionary

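I am trying to count the number of positive, negative, and neutral words in each line. I have a text file named reviews.txt that contains one review per line.

My code: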

poswords = {} #contains positive words
negwords = {} #contains negative words
with open(path + "reviews.txt", 'r') as f:
    possum = 0
    negsum = 0
    neutsum = 0
    for line in f.readlines():
        lower = line.lower()
        for word in lower.split():
            if word in poswords:
                possum += 1
            elif word in negwords:
                negsum += 1
            else:
                neutsum += 1
print(possum)
print(negsum)
print(neutsum)

Output:

1401
633
18351

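How do I display the counts for each line, instead of counting the positive, negative, and neutral words for the whole text file?

Move the last three print statements inside the for loop, and reset the counters at the start of each line. Like this: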

poswords = {} #contains positive words
negwords = {} #contains negative words
with open(path + "reviews.txt", 'r') as f:
    for line in f.readlines():
        possum = 0
        negsum = 0
        neutsum = 0
        lower = line.lower()
        for word in lower.split():
            if word in poswords:
                possum += 1
            elif word in negwords:
                negsum += 1
            else:
                neutsum += 1
        print("Line: ", line)
        print(possum)
        print(negsum)
        print(neutsum)

Set the count variables to zero for each line, then print them once that line has been processed.

poswords = {} #contains positive words
negwords = {} #contains negative words
with open(path + "reviews.txt", 'r') as f:
    for line in f.readlines():
        possum = 0
        negsum = 0
        neutsum = 0
        lower = line.lower()
        for word in lower.split():
            if word in poswords:
                possum += 1
            elif word in negwords:
                negsum += 1
            else:
                neutsum += 1
        print("\n", line)
        print(possum)
        print(negsum)
        print(neutsum)
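
The per-line reset described in the answers above can also be wrapped in a small function, which makes it easier to check on sample input. This is only a sketch: the word sets and the names `POSWORDS`, `NEGWORDS`, `count_line`, and `count_per_line` are hypothetical, since the real word lists are not shown in the question, and `io.StringIO` stands in for the opened reviews.txt.

```python
import io

# Hypothetical word lists for illustration; the real ones are not shown in the question.
POSWORDS = {"good", "great", "love"}
NEGWORDS = {"bad", "awful", "hate"}


def count_line(line, poswords=POSWORDS, negwords=NEGWORDS):
    """Return (positive, negative, neutral) word counts for one line."""
    pos = neg = neut = 0
    for word in line.lower().split():
        if word in poswords:
            pos += 1
        elif word in negwords:
            neg += 1
        else:
            neut += 1
    return pos, neg, neut


def count_per_line(lines):
    """Return one (positive, negative, neutral) tuple per input line."""
    return [count_line(line) for line in lines]


# io.StringIO stands in for open(path + "reviews.txt"); a file object iterates the same way.
sample = io.StringIO("Great phone, love it\nBad battery and awful screen\n")
results = count_per_line(sample)
for number, (pos, neg, neut) in enumerate(results, start=1):
    print(f"line {number}: positive={pos} negative={neg} neutral={neut}")
```

Note that `split()` leaves punctuation attached, so a token such as "phone," or "good," is compared with its comma and counted as neutral, exactly as in the original loop.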

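This can also be done with re:

import re

poswords = {...}  # set of positive words
negwords = {...}  # set of negative words
# Escape each word so regex metacharacters in it are matched literally
pos = '|'.join(map(re.escape, poswords))
neg = '|'.join(map(re.escape, negwords))

with open("reviews.txt", 'r') as f:
    # Raw f-string keeps \b and \w intact; \b stops matches inside longer words
    matches = re.findall(rf'\b(?:({pos})|({neg})|(\w+))\b', f.read())
# Each match fills exactly one of the three groups; count the non-empty ones
positive, negative, neutral = (sum(map(bool, g)) for g in zip(*matches))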
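
The regex approach above counts over the whole file, whereas the question asks for counts per line. One way to adapt it, sketched here under assumptions, is to compile the pattern once and apply it to each line. The word sets and the names `POSWORDS`, `NEGWORDS`, `PATTERN`, and `count_line_re` are hypothetical, and the `re.IGNORECASE` flag is an addition that stands in for the `lower()` call used in the loop-based answers.

```python
import re

# Hypothetical word lists for illustration only.
POSWORDS = {"good", "great", "love"}
NEGWORDS = {"bad", "awful", "hate"}

# Escape the words and wrap the alternation in \b so only whole words match.
pos = "|".join(map(re.escape, POSWORDS))
neg = "|".join(map(re.escape, NEGWORDS))
PATTERN = re.compile(rf"\b(?:({pos})|({neg})|(\w+))\b", re.IGNORECASE)


def count_line_re(line):
    """Return (positive, negative, neutral) counts for one line."""
    counts = [0, 0, 0]
    for match in PATTERN.finditer(line):
        # lastindex is the number (1, 2 or 3) of the capturing group that matched.
        counts[match.lastindex - 1] += 1
    return tuple(counts)


lines = ["Great phone, love it", "Bad battery and awful screen", ""]
per_line = [count_line_re(line) for line in lines]
for number, (p, n, u) in enumerate(per_line, start=1):
    print(f"line {number}: positive={p} negative={n} neutral={u}")
```

Because `\w+` skips punctuation, "good," counts as positive here, unlike in the `split()` version. Counting with `finditer` also returns zeros for an empty line, where unpacking `zip(*matches)` would raise a `ValueError`.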

