Getting second element of a tuple (in a list of tuples) as a string
I have an output that is a list of tuples. It looks like this:
annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"),
(415L, u'[He very seldom has them in this show or his movies]')…
I need only the second part of each tuple, so that I can apply 'split' to it and get each word in the sentence separately.
At this point, I'm not able to isolate the second part of the tuple (the text).
This is my code:
def scope_match(annot1):
    scope = annot1[1:]
    scope_string = ''.join(scope)
    scope_set = set(scope_string.split(' '))
But I get:
TypeError: sequence item 0: expected string, tuple found
I tried to use annot1[1] but it gives me the second index of the text instead of the second element of the tuple.
You can do something like this with list comprehensions:
annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"),
(415L, u'[He very seldom has them in this show or his movies]')]
print [a[1].strip('[]').encode('utf-8').split() for a in annot1]
Output:
[["It's", 'very', 'seldom', 'that', "you're", 'blessed', 'to', 'find', 'your', 'equal'], ['He', 'very', 'seldom', 'has', 'them', 'in', 'this', 'show', 'or', 'his', 'movies']]
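The snippet above is Python 2 (the `L` suffix, the `u` prefix and the `print` statement). A minimal Python 3 sketch of the same idea follows. The helper name `words_per_annotation` is hypothetical, and the sample data is the question's with the Python 2 literals rewritten. No `encode` step is used because Python 3 strings are already Unicode.

```python
# Python 3 sketch; sample data mirrors the question without the Python 2 `L` suffix.
annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]


def words_per_annotation(annotations):
    """Return one list of words per (id, text) tuple (hypothetical helper)."""
    # Unpack each tuple, drop the surrounding brackets, split on whitespace.
    return [text.strip("[]").split() for _id, text in annotations]


words = words_per_annotation(annot1)
print(words)
```

Unpacking each tuple as `_id, text` does the same job as indexing with `a[1]`, and makes it explicit which element is being used.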
You can calculate the intersection of the strings at corresponding positions in annot1 and annot2 like this:
for x, y in zip(annot1, annot2):
    print set(x[1].strip('[]').encode('utf-8').split()).intersection(y[1].strip('[]').encode('utf-8').split())
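Since `annot2` is not shown in the question, here is a self-contained Python 3 sketch of the pairwise intersection. The contents of `annot2` are made up purely for illustration, and `word_set` is a hypothetical helper name.

```python
# Python 3 sketch of the pairwise intersection.
annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]
# Hypothetical second list: the question does not show annot2.
annot2 = [
    (402, "[very seldom that you're blessed]"),
    (415, "[has them in this show]"),
]


def word_set(text):
    """Strip the surrounding brackets and return the set of words (hypothetical helper)."""
    return set(text.strip("[]").split())


# zip pairs the tuples by position; & is set intersection.
common = [word_set(x[1]) & word_set(y[1]) for x, y in zip(annot1, annot2)]
for shared in common:
    print(sorted(shared))
```

Note that `zip` stops at the shorter list, so any unmatched trailing tuples are silently ignored.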
annot1 is a list of tuples. To get the string from each of the elements, you can do something like this:
def scope_match(annot1):
    for pair in annot1:
        string = pair[1]
        print string  # or whatever you want to do
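Putting this together with the question's goal of a set of words, one possible Python 3 sketch of `scope_match` is shown below. It assumes the asker wants a single set covering every annotation and wants the surrounding brackets removed; neither is stated outright in the question.

```python
# Python 3 sketch; assumes one combined word set is wanted for the whole list.
def scope_match(annotations):
    """Return the set of all words found in the text part of each tuple."""
    # pair[1] is the second element (the text) of each (id, text) tuple.
    texts = [pair[1].strip("[]") for pair in annotations]
    # Join with a space so the last word of one text and the first word
    # of the next do not run together, then split on whitespace.
    return set(" ".join(texts).split())


annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]
result = scope_match(annot1)
print(sorted(result))
```

The original `TypeError` came from passing the tuples themselves to `join`; here only the text element of each tuple is joined.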