Search for an index in a list of tuples

Question

I have a sentence = "hi there, my car number is H 11231, and my card number is 11122". I tokenized the sentence and then POS-tagged the tokens. I want to grab the car number, so I wrote a loop that checks whether the current index is at a number, say 11231, and then checks whether the tuple just before or after it has the tag NNP (proper noun, which is how the single letter is tagged here).

import nltk

sentence = 'hi there, my car number is H 11231, and my card number is 11122'

tokenizedSent = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokenizedSent)
print(tagged)

output = []
for i in tagged:
    if i[1] == 'CD':
        output.append(i[0])
    elif i[1] == 'NNP':
        output.append(i[0])

The sentence has two numbers, 11231 and 11122. However, only one of them is the car number: the one whose preceding token is tagged NNP.

Answer 1

Your solution traverses the collection of tags and keeps every token tagged CD or NNP, without relating the two.

You want only numbers.

So the first thing to do is get the numbers, together with their positions:

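from collections import namedtuple

# pair each tagged token with its position in the list
IndexTag = namedtuple('IndexTag', ['tag', 'index'])
numbers = []
for i, tag in enumerate(tagged):
    if tag[1] == 'CD':  # 'CD' is the cardinal-number tag
        numbers.append(IndexTag(tag, i))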

Now that you have your numbers, you can check that the previous tag is NNP:

car_ids = []
for number in numbers:
    # index > 0 guards against a number at the very start of the sentence
    if number.index > 0 and tagged[number.index - 1][1] == 'NNP':
        car_ids.append(number)
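
The question asks about the tuple before or after the number, while the loop above only looks at the previous one. A minimal sketch of a single pass that checks both neighbours might look like the following. The function name `numbers_next_to_tag` and the `sample` list are hypothetical: the list is hand-written in the (token, tag) shape that `nltk.pos_tag` returns, and its tags are illustrative rather than real tagger output.

```python
def numbers_next_to_tag(tagged, number_tag="CD", neighbour_tag="NNP"):
    """Return (index, token) for every number_tag token that has a
    neighbour_tag token directly before or directly after it."""
    matches = []
    for i, (token, tag) in enumerate(tagged):
        if tag != number_tag:
            continue
        # Bounds checks keep the first and last tokens from indexing out of range.
        before = i > 0 and tagged[i - 1][1] == neighbour_tag
        after = i + 1 < len(tagged) and tagged[i + 1][1] == neighbour_tag
        if before or after:
            matches.append((i, token))
    return matches


# Hand-written sample (assumed tags, not actual NLTK output).
sample = [
    ("car", "NN"), ("number", "NN"), ("is", "VBZ"),
    ("H", "NNP"), ("11231", "CD"), (",", ","),
    ("and", "CC"), ("my", "PRP$"), ("card", "NN"),
    ("number", "NN"), ("is", "VBZ"), ("11122", "CD"),
]

print(numbers_next_to_tag(sample))  # [(4, '11231')]
```

With real data you would pass the `tagged` list from the question instead of `sample`. If the tagger labels the letter differently in your sentence, change `neighbour_tag` accordingly.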
