
How do I use Regexp Tagger in nltk?

If I try this code:

import nltk
pattern = [(r'(March)$','MAR')]
tagger=nltk.RegexpTagger(pattern)
print(tagger.tag('He was born in March 1991'))

I get output like this:

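[('H', None), ('e', None), (' ', None), ('w', None), ('a', None), ('s', None), (' ', None), ('b', None), ('o', None), ('r', None), ('n', None), (' ', None), ('i', None), ('n', None), (' ', None), ('M', None), ('a', None), ('r', None), ('c', None), ('h', None), (' ', None), ('1', None), ('9', None), ('9', None), ('1', None)]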

What I actually want is for the tagger to tag the word 'March' as 'MAR'.

Try this:

import nltk
pattern = [(r'(March)$','MAR')]
tagger = nltk.RegexpTagger(pattern)
print(tagger.tag(nltk.word_tokenize('He was born in March 1991')))

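You have to tokenize the sentence into words first. tag() expects a list of tokens; if you pass it a plain string, it iterates over the string one character at a time, which is why every character was tagged separately.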

This is the output I get:

[('He', None), ('was', None), ('born', None), ('in', None), ('March', 'MAR'), ('1991', None)]
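
To see why the untokenized string gets tagged character by character, here is a rough stand-in for the tagging loop that uses only the standard re module. This is a sketch, not NLTK's actual code: regexp_tag is a hypothetical helper written for illustration, and it uses str.split() in place of nltk.word_tokenize so it runs without any NLTK data.

```python
import re

def regexp_tag(tokens, patterns):
    """Hypothetical helper: give each item the tag of the first pattern
    that matches it from the start, or None if no pattern matches."""
    compiled = [(re.compile(regex), tag) for regex, tag in patterns]
    tagged = []
    for token in tokens:  # iterating over a str yields single characters
        tag = next((t for rx, t in compiled if rx.match(token)), None)
        tagged.append((token, tag))
    return tagged

patterns = [(r'(March)$', 'MAR')]
sentence = 'He was born in March 1991'

# Passing the raw string: every "token" is one character, so nothing matches.
by_chars = regexp_tag(sentence, patterns)

# Passing a list of words: 'March' is now a single token and gets tagged.
by_words = regexp_tag(sentence.split(), patterns)

print(by_chars[:3])
print(by_words)
```

The same reasoning applies to the real tagger: the pattern is fine, it just never sees 'March' as one token until the input is split into words.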
