![](/img/trans.png)
[英]Keyword extraction: same word in plural/singular/past_tense/-ing format
[英]Is there a way to correctly remove the tense or plural from a word?
是否可以使用nltk將諸如跑步,幫助,做飯,發現和快樂地變成跑步,幫助,做飯,尋找和快樂的詞?
在nltk
實現了一些詞干算法。 看來Lancaster
阻止算法將為您工作。
>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('happily')
'happy'
>>> st.stem('cooks')
'cook'
>>> st.stem('helping')
'help'
>>> st.stem('running')
'run'
>>> st.stem('finds')
'find'
>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> ls = ['running', 'helping', 'cooks', 'finds']
>>> [wnl.lemmatize(i) for i in ls]
['running', 'helping', u'cook', u'find']
>>> ls = [('running', 'v'), ('helping', 'v'), ('cooks', 'v'), ('finds','v')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
[u'run', u'help', u'cook', u'find']
>>> ls = [('running', 'n'), ('helping', 'n'), ('cooks', 'n'), ('finds','n')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
['running', 'helping', u'cook', u'find']
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.