Convert a "list of lists of tuples" with 2 elements to a "list of tuples" with 3 elements
I want to convert this list of (token list, label) pairs, where each token is a 2-element tuple, into lists of 3-element tuples:
[([('Yes', 'UH'),
(',', ','),
('it', 'PRP'),
("'s", 'VBZ'),
('annoying', 'JJ'),
('and', 'CC'),
('cumbersome', 'JJ'),
('to', 'TO'),
('separate', 'VB'),
('your', 'PRP$'),
('rubbish', 'NN'),
('properly', 'RB'),
('all', 'PDT'),
('the', 'DT'),
('time', 'NN'),
('.', '.')],
'P'),
([('Three', 'CD'),
('different', 'JJ'),
('bin', 'JJ'),
('bags', 'NNS'),
('stink', 'VBP'),
('away', 'RB'),
('in', 'IN'),
('the', 'DT'),
('kitchen', 'NN'),
('and', 'CC'),
('have', 'VB'),
('to', 'TO'),
('be', 'VB'),
('sorted', 'VBN'),
('into', 'IN'),
('different', 'JJ'),
('wheelie', 'NN'),
('bins', 'NNS'),
('.', '.')],
'P'),
([('But', 'CC'),
('still', 'RB'),
('Germany', 'NNP'),
('produces', 'VBZ'),
('way', 'RB'),
('too', 'RB'),
('much', 'JJ'),
('rubbish', 'NN')],
'P'),
([('and', 'CC'),
('too', 'RB'),
('many', 'JJ'),
('resources', 'NNS'),
('are', 'VBP'),
('lost', 'VBN'),
('when', 'WRB'),
('what', 'WP'),
('actually', 'RB'),
('should', 'MD'),
('be', 'VB'),
('separated', 'VBN'),
('and', 'CC'),
('recycled', 'VBN'),
('is', 'VBZ'),
('burnt', 'VBN'),
('.', '.')],
'P'),
([('We', 'PRP'),
('Berliners', 'NNS'),
('should', 'MD'),
('take', 'VB'),
('the', 'DT'),
('chance', 'NN'),
('and', 'CC'),
('become', 'VB'),
('pioneers', 'NNS'),
('in', 'IN'),
('waste', 'NN'),
('separation', 'NN'),
('!', '.')],
'C')]
To this list:
[('Yes', 'UH', 'B-P'),
(',', ',','I-P'),
('it', 'PRP','I-P'),
("'s", 'VBZ','I-P'),
('annoying', 'JJ','I-P'),
('and', 'CC','I-P'),
('cumbersome', 'JJ','I-P'),
('to', 'TO', 'I-P'),
('separate', 'VB', 'I-P'),
('your', 'PRP$','I-P'),
('rubbish', 'NN','I-P'),
('properly', 'RB','I-P'),
('all', 'PDT','I-P'),
('the', 'DT','I-P'),
('time', 'NN','I-P'),
('.', '.','I-P')],
.
.
.
.
[('We', 'PRP','B-C'),
('Berliners', 'NNS','I-C'),
('should', 'MD','I-C'),
('take', 'VB','I-C'),
('the', 'DT','I-C'),
('chance', 'NN','I-C'),
('and', 'CC','I-C'),
('become', 'VB','I-C'),
('pioneers', 'NNS','I-C'),
('in', 'IN','I-C'),
('waste', 'NN','I-C'),
('separation', 'NN','I-C'),
('!', '.','I-C')]
As you can see:
wherever the label is P, we add B-P (for the first token of the list) or I-P (for every following token) as the third member of the tuple;
wherever the label is C, we add B-C (for the first token of the list) or I-C (for every following token) as the third member of the tuple. This is called BIO tagging:
https://medium.com/analytics-vidhya/bio-tagged-text-to-original-text-99b05da6664#:~:text=The%20BIO%20%2F%20IOB%20format%20(short,named%2Dentity%20recognition)
I have tried different approaches but still could not find a solution:
listtoken=[]
listsent=[]
for lst in a:
    for tpl,l in zip(lst,b):
        c=(*tpl, l)
        listtoken.append(c)
    listsent.append(listtoken)
To add a single item to a tuple, you can use + with a one-element tuple (written as (element,)).
So (1,2)+(3,) => (1,2,3)
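As a small illustration of the point above, the sketch below appends a tag to one of the question's tuples. The variable names are chosen here for illustration only.

```python
pair = ('Yes', 'UH')

# Concatenating with a one-element tuple builds a new 3-element tuple.
triple = pair + ('B-P',)
print(triple)            # ('Yes', 'UH', 'B-P')

# Unpacking into a new tuple literal gives the same result.
same = (*pair, 'B-P')
print(same == triple)    # True

# Tuples are immutable, so the original is unchanged.
print(pair)              # ('Yes', 'UH')

# Without the trailing comma, ('B-P') is just a string, not a tuple,
# and tuple + str raises TypeError.
try:
    pair + ('B-P')
except TypeError as exc:
    print('TypeError:', exc)
```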
A list comprehension should do the job easily:
# with your list as L
R = [ [t+('BI'[i>0]+'-'+lb,) for i,t in enumerate(T)] for T,lb in L ]
Output:
print(R)
[
[ ('Yes', 'UH', 'B-P'), (',', ',', 'I-P'), ('it', 'PRP', 'I-P'), ("'s", 'VBZ', 'I-P'), ('annoying', 'JJ', 'I-P'), ('and', 'CC', 'I-P'), ('cumbersome', 'JJ', 'I-P'), ('to', 'TO', 'I-P'), ('separate', 'VB', 'I-P'), ('your', 'PRP$', 'I-P'), ('rubbish', 'NN', 'I-P'), ('properly', 'RB', 'I-P'), ('all', 'PDT', 'I-P'), ('the', 'DT', 'I-P'), ('time', 'NN', 'I-P'), ('.', '.', 'I-P')],
[ ('Three', 'CD', 'B-P'), ('different', 'JJ', 'I-P'), ('bin', 'JJ', 'I-P'), ('bags', 'NNS', 'I-P'), ('stink', 'VBP', 'I-P'), ('away', 'RB', 'I-P'), ('in', 'IN', 'I-P'), ('the', 'DT', 'I-P'), ('kitchen', 'NN', 'I-P'), ('and', 'CC', 'I-P'), ('have', 'VB', 'I-P'), ('to', 'TO', 'I-P'), ('be', 'VB', 'I-P'), ('sorted', 'VBN', 'I-P'), ('into', 'IN', 'I-P'), ('different', 'JJ', 'I-P'), ('wheelie', 'NN', 'I-P'), ('bins', 'NNS', 'I-P'), ('.', '.', 'I-P')],
[ ('But', 'CC', 'B-P'), ('still', 'RB', 'I-P'), ('Germany', 'NNP', 'I-P'), ('produces', 'VBZ', 'I-P'), ('way', 'RB', 'I-P'), ('too', 'RB', 'I-P'), ('much', 'JJ', 'I-P'), ('rubbish', 'NN', 'I-P')],
[ ('and', 'CC', 'B-P'), ('too', 'RB', 'I-P'), ('many', 'JJ', 'I-P'), ('resources', 'NNS', 'I-P'), ('are', 'VBP', 'I-P'), ('lost', 'VBN', 'I-P'), ('when', 'WRB', 'I-P'), ('what', 'WP', 'I-P'), ('actually', 'RB', 'I-P'), ('should', 'MD', 'I-P'), ('be', 'VB', 'I-P'), ('separated', 'VBN', 'I-P'), ('and', 'CC', 'I-P'), ('recycled', 'VBN', 'I-P'), ('is', 'VBZ', 'I-P'), ('burnt', 'VBN', 'I-P'), ('.', '.', 'I-P')],
[ ('We', 'PRP', 'B-C'), ('Berliners', 'NNS', 'I-C'), ('should', 'MD', 'I-C'), ('take', 'VB', 'I-C'), ('the', 'DT', 'I-C'), ('chance', 'NN', 'I-C'), ('and', 'CC', 'I-C'), ('become', 'VB', 'I-C'), ('pioneers', 'NNS', 'I-C'), ('in', 'IN', 'I-C'), ('waste', 'NN', 'I-C'), ('separation', 'NN', 'I-C'), ('!', '.', 'I-C')]
]
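If the one-liner feels dense, the same logic can be written as an explicit loop. This is only a sketch: the function name `bio_tag` and the shortened sample data are assumptions made for illustration, not part of the original answer.

```python
def bio_tag(labelled_sentences):
    """Append a BIO tag to every (word, pos) tuple.

    labelled_sentences is a list of (tokens, label) pairs, where tokens
    is a list of (word, pos) tuples and label is a string like 'P' or 'C'.
    """
    result = []
    for tokens, label in labelled_sentences:
        tagged = []
        for i, tok in enumerate(tokens):
            # The first token of each list begins the chunk; the rest are inside it.
            prefix = 'B' if i == 0 else 'I'
            tagged.append(tok + (prefix + '-' + label,))
        result.append(tagged)
    return result


# Shortened sample in the same shape as the question's data.
sample = [
    ([('But', 'CC'), ('still', 'RB')], 'P'),
    ([('We', 'PRP')], 'C'),
]
print(bio_tag(sample))
# [[('But', 'CC', 'B-P'), ('still', 'RB', 'I-P')], [('We', 'PRP', 'B-C')]]
```

In the comprehension, `'BI'[i>0]` does the same job as the `prefix` line here: `i>0` is a bool, which indexes the string as 0 or 1.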
You can use a list comprehension with unpacking:
d = [([('Yes', 'UH'), (',', ','), ('it', 'PRP'), ("'s", 'VBZ'), ('annoying', 'JJ'), ('and', 'CC'), ('cumbersome', 'JJ'), ('to', 'TO'), ('separate', 'VB'), ('your', 'PRP$'), ('rubbish', 'NN'), ('properly', 'RB'), ('all', 'PDT'), ('the', 'DT'), ('time', 'NN'), ('.', '.')], 'P'), ([('Three', 'CD'), ('different', 'JJ'), ('bin', 'JJ'), ('bags', 'NNS'), ('stink', 'VBP'), ('away', 'RB'), ('in', 'IN'), ('the', 'DT'), ('kitchen', 'NN'), ('and', 'CC'), ('have', 'VB'), ('to', 'TO'), ('be', 'VB'), ('sorted', 'VBN'), ('into', 'IN'), ('different', 'JJ'), ('wheelie', 'NN'), ('bins', 'NNS'), ('.', '.')], 'P'), ([('But', 'CC'), ('still', 'RB'), ('Germany', 'NNP'), ('produces', 'VBZ'), ('way', 'RB'), ('too', 'RB'), ('much', 'JJ'), ('rubbish', 'NN')], 'P'), ([('and', 'CC'), ('too', 'RB'), ('many', 'JJ'), ('resources', 'NNS'), ('are', 'VBP'), ('lost', 'VBN'), ('when', 'WRB'), ('what', 'WP'), ('actually', 'RB'), ('should', 'MD'), ('be', 'VB'), ('separated', 'VBN'), ('and', 'CC'), ('recycled', 'VBN'), ('is', 'VBZ'), ('burnt', 'VBN'), ('.', '.')], 'P'), ([('We', 'PRP'), ('Berliners', 'NNS'), ('should', 'MD'), ('take', 'VB'), ('the', 'DT'), ('chance', 'NN'), ('and', 'CC'), ('become', 'VB'), ('pioneers', 'NNS'), ('in', 'IN'), ('waste', 'NN'), ('separation', 'NN'), ('!', '.')], 'C')]
new_d = [[(*a, f'B-{c}'), *[(*j, f'I-{c}') for j in b]] for [a, *b], c in d]
Output:
[[('Yes', 'UH', 'B-P'), (',', ',', 'I-P'), ('it', 'PRP', 'I-P'), ("'s", 'VBZ', 'I-P'), ('annoying', 'JJ', 'I-P'), ('and', 'CC', 'I-P'), ('cumbersome', 'JJ', 'I-P'), ('to', 'TO', 'I-P'), ('separate', 'VB', 'I-P'), ('your', 'PRP$', 'I-P'), ('rubbish', 'NN', 'I-P'), ('properly', 'RB', 'I-P'), ('all', 'PDT', 'I-P'), ('the', 'DT', 'I-P'), ('time', 'NN', 'I-P'), ('.', '.', 'I-P')], [('Three', 'CD', 'B-P'), ('different', 'JJ', 'I-P'), ('bin', 'JJ', 'I-P'), ('bags', 'NNS', 'I-P'), ('stink', 'VBP', 'I-P'), ('away', 'RB', 'I-P'), ('in', 'IN', 'I-P'), ('the', 'DT', 'I-P'), ('kitchen', 'NN', 'I-P'), ('and', 'CC', 'I-P'), ('have', 'VB', 'I-P'), ('to', 'TO', 'I-P'), ('be', 'VB', 'I-P'), ('sorted', 'VBN', 'I-P'), ('into', 'IN', 'I-P'), ('different', 'JJ', 'I-P'), ('wheelie', 'NN', 'I-P'), ('bins', 'NNS', 'I-P'), ('.', '.', 'I-P')], [('But', 'CC', 'B-P'), ('still', 'RB', 'I-P'), ('Germany', 'NNP', 'I-P'), ('produces', 'VBZ', 'I-P'), ('way', 'RB', 'I-P'), ('too', 'RB', 'I-P'), ('much', 'JJ', 'I-P'), ('rubbish', 'NN', 'I-P')], [('and', 'CC', 'B-P'), ('too', 'RB', 'I-P'), ('many', 'JJ', 'I-P'), ('resources', 'NNS', 'I-P'), ('are', 'VBP', 'I-P'), ('lost', 'VBN', 'I-P'), ('when', 'WRB', 'I-P'), ('what', 'WP', 'I-P'), ('actually', 'RB', 'I-P'), ('should', 'MD', 'I-P'), ('be', 'VB', 'I-P'), ('separated', 'VBN', 'I-P'), ('and', 'CC', 'I-P'), ('recycled', 'VBN', 'I-P'), ('is', 'VBZ', 'I-P'), ('burnt', 'VBN', 'I-P'), ('.', '.', 'I-P')], [('We', 'PRP', 'B-C'), ('Berliners', 'NNS', 'I-C'), ('should', 'MD', 'I-C'), ('take', 'VB', 'I-C'), ('the', 'DT', 'I-C'), ('chance', 'NN', 'I-C'), ('and', 'CC', 'I-C'), ('become', 'VB', 'I-C'), ('pioneers', 'NNS', 'I-C'), ('in', 'IN', 'I-C'), ('waste', 'NN', 'I-C'), ('separation', 'NN', 'I-C'), ('!', '.', 'I-C')]]
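One caveat with the `[a, *b]` pattern: it needs at least one token per sentence, so an empty token list would raise a ValueError. If a single flat list is wanted instead of one list per sentence (the question's title could be read that way), a possible sketch follows. The name `bio_tag_flat` and the shortened sample are assumptions for illustration.

```python
from itertools import chain


def bio_tag_flat(labelled_sentences):
    """Return one flat list of (word, pos, tag) tuples across all sentences."""
    nested = (
        [(*tok, f"{'B' if i == 0 else 'I'}-{label}") for i, tok in enumerate(tokens)]
        for tokens, label in labelled_sentences
    )
    # chain.from_iterable joins the per-sentence lists into one sequence.
    return list(chain.from_iterable(nested))


# Shortened sample, including an empty token list to show it is tolerated.
d_small = [
    ([('But', 'CC'), ('still', 'RB')], 'P'),
    ([], 'P'),
    ([('We', 'PRP')], 'C'),
]
print(bio_tag_flat(d_small))
# [('But', 'CC', 'B-P'), ('still', 'RB', 'I-P'), ('We', 'PRP', 'B-C')]
```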