
NLTK Stopword List

Original post 2014-03-31 13:45:24 0 1 python / nltk / stop-words

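I have the code below, in which I am trying to apply a stopword list to a list of words. However, the results still show words such as "a" and "the", which I thought this process would have removed. Any ideas on what is going wrong would be great.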

import nltk
from nltk.corpus import stopwords

word_list = open("xxx.y.txt", "r")
filtered_words = [w for w in word_list if not w in stopwords.words('english')]
print filtered_words

1 Answer

A few things of note.

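If you are going to check membership against a collection over and over, I would use a set instead of a list.
stopwords.words('english') returns a list of lowercase stopwords. Your source most likely contains capital letters, so those words do not match.
You are not reading the file properly: you are iterating over the file object, which yields whole lines, rather than over a list of words split on whitespace.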
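To see why the original comprehension removes nothing, here is a minimal sketch. It assumes a tiny hand-written stop set and an in-memory `io.StringIO` object as stand-ins for `stopwords.words('english')` and the real file, so it runs without NLTK or `xxx.y.txt`. The names `stops`, `fake_file`, `unfiltered`, and `filtered` are illustrative only.

```python
import io

# Hypothetical stand-in for stopwords.words('english'); the real list is much longer.
stops = {"a", "the", "is"}

# Stand-in for open("xxx.y.txt", "r"): iterating a text file yields whole lines.
fake_file = io.StringIO("The cat\na dog\n")

lines = list(fake_file)
print(lines)  # each item is a full line, trailing newline included

# A whole line never equals a single stopword, so nothing is filtered out.
unfiltered = [w for w in lines if w not in stops]
print(unfiltered)

# Splitting each line on whitespace and lowercasing makes the comparison meaningful.
filtered = [w for line in lines for w in line.split() if w.lower() not in stops]
print(filtered)
```

The first comprehension compares strings like "The cat\n" against the stop set, which is roughly what the code in the question is doing.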

Putting it all together:

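from nltk.corpus import stopwords

# Build the set once so each membership test is a fast hash lookup.
stops = set(stopwords.words('english'))

# Iterating over the file yields lines, so split each line into words.
with open("xxx.y.txt", "r") as word_list:
    for line in word_list:
        for w in line.split():
            # The stopword list is lowercase, so compare lowercased words.
            if w.lower() not in stops:
                print(w)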
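If you would rather end up with a list like the `filtered_words` in the question instead of printing each word, something along these lines should work. This is only a sketch: `filter_stopwords` is a hypothetical helper name, and the small `demo_stops` set stands in for `set(stopwords.words('english'))` so the example runs without the NLTK corpus.

```python
def filter_stopwords(lines, stops):
    """Return the words in `lines` that are not in `stops`, compared case-insensitively."""
    return [w for line in lines for w in line.split() if w.lower() not in stops]


# Tiny stand-in stop set (an assumption for this demo, not the NLTK list).
demo_stops = {"a", "the", "of"}

# Any iterable of strings works here, including an open file object.
demo_lines = ["The list of words", "A second line"]

filtered_words = filter_stopwords(demo_lines, demo_stops)
print(filtered_words)
```

With NLTK you would pass an open file as `lines` and `set(stopwords.words('english'))` as `stops`. The original casing of the kept words is preserved; only the comparison is lowercased.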


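Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you need to repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.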
