Select only 'NN' and 'VB' words from NLTK pos_tag

Question

I need to print only the 'NN' and 'VB' words from an entered sentence.

import nltk
import re
import time

var = raw_input("Please enter something: ")


exampleArray = [var]


def processLanguage():
    try:
        for item in exampleArray:
            tokenized = nltk.word_tokenize(item)
            tagged = nltk.pos_tag(tokenized)
            print tagged

            time.sleep(555)


    except Exception, e:
        print str(e)

processLanguage()

Answer 1

How about changing

    print tagged

to

    print [(word, tag) for word, tag in tagged if tag in ('NN', 'VB')]
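If only the words themselves are wanted, rather than (word, tag) pairs, the same filter can drop the tag from the output. The following is a minimal sketch in Python 3 syntax; the `tagged` list is hand-written here to stand in for the result of nltk.pos_tag, and its tags are illustrative assumptions rather than real tagger output.

```python
# Hand-written stand-in for the output of nltk.pos_tag (illustrative tags).
tagged = [("I", "PRP"), ("want", "VBP"), ("to", "TO"),
          ("buy", "VB"), ("a", "DT"), ("book", "NN")]

# Keep only the words whose tag is exactly 'NN' or 'VB'.
words = [word for word, tag in tagged if tag in ("NN", "VB")]
print(words)  # ['buy', 'book']
```

Note that "want" is dropped because its tag is 'VBP', not 'VB'; an exact match keeps only the two tags listed.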

Answer 2

You might need to match on the first two characters of the POS tag, because the tagger also returns finer-grained tags such as 'NNS' or 'VBD'; see "NLTK - Get and Simplify List of Tags".

nn_vb_tagged = [(word,tag) for word, tag in tagged 
                if tag.startswith('NN') or tag.startswith('VB')]
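To see the difference between exact and prefix matching, here is a small sketch in Python 3 syntax. The `tagged` list is a hand-written assumption standing in for pos_tag output, so the example runs without NLTK data; `str.startswith` accepts a tuple of prefixes, which shortens the condition.

```python
# Hand-written stand-in for the output of nltk.pos_tag (illustrative tags).
tagged = [("The", "DT"), ("dogs", "NNS"), ("chased", "VBD"),
          ("a", "DT"), ("cat", "NN")]

# Exact match: only tags equal to 'NN' or 'VB' survive.
exact = [(w, t) for w, t in tagged if t in ("NN", "VB")]

# Prefix match: any tag starting with 'NN' or 'VB' survives.
by_prefix = [(w, t) for w, t in tagged if t.startswith(("NN", "VB"))]

print(exact)      # [('cat', 'NN')]
print(by_prefix)  # [('dogs', 'NNS'), ('chased', 'VBD'), ('cat', 'NN')]
```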

Answer 3

You can try this:

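import nltk
from nltk.tokenize import word_tokenize

# Requires the NLTK tokenizer and tagger data (install via nltk.download()).
example = "This is a sample sentence, showing off the stop words filtration.!"
word_tokens = word_tokenize(example)
pos = nltk.pos_tag(word_tokens)
selective_pos = ['NN', 'VB']  # tags to keep (exact match)
selective_pos_words = []
for word, tag in pos:
    if tag in selective_pos:
        selective_pos_words.append((word, tag))
print(selective_pos_words)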

By adding the parts of speech you want to the list selective_pos, you can select words with any tags you prefer.
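The same idea can be wrapped in a small helper so the wanted tags are passed as an argument. This is only a sketch: `select_by_tag` and its `match_prefix` flag are hypothetical names introduced for illustration, not part of NLTK, and the `tagged` list is hand-written to stand in for pos_tag output.

```python
def select_by_tag(tagged, wanted, match_prefix=False):
    """Return the (word, tag) pairs whose tag is in `wanted`.

    With match_prefix=True, a tag matches if it starts with any
    entry of `wanted` (so 'VB' also matches 'VBZ', 'VBD', ...).
    """
    wanted = tuple(wanted)
    if match_prefix:
        return [(w, t) for w, t in tagged if t.startswith(wanted)]
    return [(w, t) for w, t in tagged if t in wanted]


# Hand-written stand-in for the output of nltk.pos_tag (illustrative tags).
tagged = [("This", "DT"), ("is", "VBZ"), ("a", "DT"),
          ("sample", "JJ"), ("sentence", "NN")]

nouns_and_adjectives = select_by_tag(tagged, ["NN", "JJ"])
verbs = select_by_tag(tagged, ["VB"], match_prefix=True)

print(nouns_and_adjectives)  # [('sample', 'JJ'), ('sentence', 'NN')]
print(verbs)                 # [('is', 'VBZ')]
```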

Answer 1: 5, accepted, 2015-07-04 13:25:14
Answer 2: 1, 2015-07-05 19:12:43
Answer 3: 0, 2019-10-06 10:42:52
