
Conditionally concatenate string values of tuples in a list in Python based on their elements

Here is a list of (word, tag) tuples, where each tag gives the word's type:

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]

The goal is to check whether consecutive tuples are tagged 'ORGANIZATION' and, if so, concatenate their words with a space, repeating this over the entire list.

Expected output:

Wall Mart, Thomas Cook

org_list = ''
for x in t:
    if x[1] == 'ORGANIZATION':
        # appends every tagged word separately: ' | Wall | Mart | Thomas | Cook'
        org_list = org_list + ' | ' + x[0]

With this loop I was only able to extract the individual names; I could not find a way to concatenate consecutive words tagged as organization.

Referred to another question: [Link] Concatenate elements of a tuple in a list in python


Assuming there is always at least one 'OTHER' tuple between two separate organization names, one approach is to use itertools.groupby to group consecutive tuples by their second element, and str.join the first items of each group whose key is 'ORGANIZATION':

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),
     ('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),
     ('Cook','ORGANIZATION')]

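from itertools import groupby
from operator import itemgetter

# group consecutive tuples by tag, then join the words of each ORGANIZATION group
[' '.join(word for word, _ in group)
 for tag, group in groupby(t, key=itemgetter(1)) if tag == 'ORGANIZATION']
# ['Wall Mart', 'Thomas Cook']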
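
To see why this works, it can help to look at what groupby produces before any filtering. The sketch below only inspects that intermediate result; the name runs is introduced here for illustration and is not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

t = [('The', 'OTHER'), ('name', 'OTHER'), ('is', 'OTHER'),
     ('Wall', 'ORGANIZATION'), ('Mart', 'ORGANIZATION'), ('and', 'OTHER'),
     ('Thomas', 'ORGANIZATION'), ('Cook', 'ORGANIZATION')]

# groupby only merges *adjacent* items that share a key, so each run of
# identical tags becomes its own (tag, words) pair.
runs = [(tag, [word for word, _ in group])
        for tag, group in groupby(t, key=itemgetter(1))]

for tag, words in runs:
    print(tag, words)
# OTHER ['The', 'name', 'is']
# ORGANIZATION ['Wall', 'Mart']
# OTHER ['and']
# ORGANIZATION ['Thomas', 'Cook']
```

Because the two organization names are separated by an 'OTHER' run, they end up in separate groups; two names placed directly next to each other would be merged into one group.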

If you prefer a for-loop solution without any imports, you can use the following. Note that it only works when each organization name consists of exactly two consecutive tagged words:

f = False
out = []
for i in t:
    if i[1] == 'ORGANIZATION':
        if not f:
            out.append(i[0])
            f = True
        else:
            out[-1] += f' {i[0]}'
            f = False

print(out)
# ['Wall Mart', 'Thomas Cook']
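
If names can span one, three, or more words, the flag can instead be reset whenever a differently tagged word is seen. The following is a minimal sketch of that variation, not the original answer's code; the function name join_tagged_runs and its tag parameter are hypothetical names chosen for illustration.

```python
def join_tagged_runs(pairs, tag='ORGANIZATION'):
    """Join the words of each run of consecutive tuples carrying `tag`."""
    out = []
    in_run = False  # True while the previous tuple carried `tag`
    for word, label in pairs:
        if label == tag:
            if in_run:
                out[-1] += f' {word}'  # continue the current name
            else:
                out.append(word)       # start a new name
            in_run = True
        else:
            in_run = False             # any other tag ends the run
    return out


t = [('The', 'OTHER'), ('name', 'OTHER'), ('is', 'OTHER'),
     ('Wall', 'ORGANIZATION'), ('Mart', 'ORGANIZATION'), ('and', 'OTHER'),
     ('Thomas', 'ORGANIZATION'), ('Cook', 'ORGANIZATION')]

print(join_tagged_runs(t))
# ['Wall Mart', 'Thomas Cook']
```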

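Another option is to collect consecutive 'ORGANIZATION' words into sublists, starting a new sublist whenever a different tag follows a non-empty one, and to join each non-empty sublist at the end: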

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]

result = [[]]
for word, tag in t:
    if tag == 'ORGANIZATION':
        result[-1].append(word)   # extend the current run
    elif result[-1]:
        result.append([])         # run ended: start a new, empty group

result = [' '.join(words) for words in result if words]
# ['Wall Mart', 'Thomas Cook']
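
Each of the approaches above yields a list, while the question asks for a single comma-separated string. Assuming the list is bound to a variable (called names here purely for illustration), one more str.join gives that form:

```python
# names stands in for the list produced by any of the approaches above
names = ['Wall Mart', 'Thomas Cook']

output = ', '.join(names)
print(output)
# Wall Mart, Thomas Cook
```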


 