
How to group contents in a list from item to item based on whether those items are in another list?

I have the following list:

x = ['0001', 'Random message XYX', 'Random second message IAI', '0002', 'Random message IAM', 'Random second message OMA', 'Random third message OMA', '0003', 'Random message XAK', 'Random second message YAB', '0004', ' Random message INA']

I also have another list

y = ['0001', '0002', '0003', '0004']

I want to group list x based on group y so that the output is:

x = [['0001', 'Random message XYX', 'Random second message IAI'], ['0002', 'Random message IAM', 'Random second message OMA', 'Random third message OMA'], ['0003', 'Random message XAK', 'Random second message YAB'], ['0004', ' Random message INA']]

I have tried:

x = ['0001', 'Random message XYX', 'Random second message IAI', '0002', 'Random message IAM', 'Random second message OMA', 'Random third message OMA', '0003', 'Random message XAK', 'Random second message YAB', '0004', ' Random message INA']

y = ['0001', '0002','0003', '0004']

grouped_list = []
for entry in x:
    if entry in y:
        new_list = []
        new_list.append(entry)
        for i in range(x.index(entry)+1, len(x)):
            if(x[i][0] not in y):
                new_list.append(x[i])
            else:
                break
        grouped_list.append(list(new_list))
print (grouped_list)

However this just prints []

Can someone please show me what I need to do to print the output I am after?

Edit:

I made some changes using y.luis' answer, which worked for this example. However, I have discovered an issue with my actual data: both lists contain duplicate entries, which causes the data in the x list to be overwritten rather than just grouped. If this code is run, the last part of the x list is overwritten:

x = ['0001', 'Random message XYX', 'Random second message IAI', '0002', 'Random message IAM', 'Random second message OMA', 'Random third message OMA', '0003', 'Random message XAK', 'Random second message YAB', '0004', ' Random message INA', '0001', 'Random message ryryry', 'Random second message ryyryyryryry']

y = ['0001', '0002','0003', '0004', '0001', '0002']

grouped_list = []
for entry in x:
    if entry in y:
        new_list = []
        new_list.append(entry)
        for i in range(x.index(entry)+1, len(x)):
            if(x[i] not in y):
                new_list.append(x[i])
            else:
                break
        grouped_list.append(list(new_list))
print (grouped_list)

Can someone show me how to avoid this?

There is an error in your innermost if:

if(x[i][0] not in y):

Here you are checking whether the first character of the item is in the list, not the item itself. It should be:

if(x[i] not in y):
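
To see the difference concretely, here is a minimal sketch using one key from the question (the variable names item, first_char_is_key and whole_item_is_key are only illustrative):

```python
y = ['0001', '0002', '0003', '0004']
item = '0002'

# Indexing a string with [0] yields its first character, here '0',
# and the single character '0' is not an element of y.
first_char_is_key = item[0] in y

# Testing the whole string is what the grouping logic needs.
whole_item_is_key = item in y

print(item[0], first_char_is_key, whole_item_is_key)
```

Because the first-character test is never true for this data, the inner loop never reaches its break and keeps appending items past the next key.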

If you want to avoid duplicated group keys, you can use a dictionary:

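grouped_list = []
d = {}  # maps each key from y to the messages that follow it in x
i = 0
current_key = None

while i < len(x):

    if x[i] in y:
        # Start (or resume) the group for this key
        current_key = x[i]
        if current_key not in d:  # dict.has_key() only exists in Python 2
            d[current_key] = []
        i += 1
        continue

    # Collect every following item until the next key
    while i < len(x) and x[i] not in y:
        d[current_key].append(x[i])
        i += 1

for k in d:
    grouped_list.append([k] + d[k])

print(grouped_list)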
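
The dictionary merges repeated keys into a single group. If each occurrence of a key should instead start its own group, a single pass over x is enough, and it avoids x.index(entry), which always returns the position of the first match and is the likely cause of the problem described in the edit. A minimal sketch under that assumption (group_by_markers is a hypothetical helper name, not part of the original answer):

```python
def group_by_markers(items, markers):
    """Split items into sublists, starting a new one at every marker."""
    marker_set = set(markers)  # set lookup; duplicates in markers do not matter
    groups = []
    for item in items:
        if item in marker_set or not groups:
            # A marker opens a new group. Items that appear before the
            # first marker are collected in a leading group of their own.
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups


x = ['0001', 'Random message XYX', 'Random second message IAI',
     '0002', 'Random message IAM', 'Random second message OMA',
     'Random third message OMA', '0003', 'Random message XAK',
     'Random second message YAB', '0004', ' Random message INA',
     '0001', 'Random message ryryry', 'Random second message ryyryyryryry']
y = ['0001', '0002', '0003', '0004', '0001', '0002']

grouped_list = group_by_markers(x, y)
print(grouped_list)
```

With the data from the edit, the second '0001' ends up in its own final group instead of being merged with the first one.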

How about a two liner? (Sorry, couldn't do it in one line)

# At the top of your .py file    
from __future__ import print_function

x = ['0001', 'Random message XYX', 'Random second message IAI', '0002', 'Random message IAM', 'Random second message OMA', 'Random third message OMA', '0003', 'Random message XAK', 'Random second message YAB', '0004', ' Random message INA']
y = ['0001', '0002', '0003', '0004']

indexes = [x.index(tok) for tok in y]
print([x[i:j] for i, j in zip(indexes, indexes[1:]+[len(x)])])

Gives me

[['0001', 'Random message XYX', 'Random second message IAI'],
 ['0002',
  'Random message IAM',
  'Random second message OMA',
  'Random third message OMA'],
 ['0003', 'Random message XAK', 'Random second message YAB'],
 ['0004', ' Random message INA']]
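
Note that x.index() only ever finds the first occurrence of a value, so this approach assumes every key appears exactly once in x and that y lists the keys in the same order. A possible variation for data with repeated keys, as in the question's edit, is to take the start positions from x itself with enumerate. This is only a sketch; the names markers, starts and groups are illustrative:

```python
x = ['0001', 'Random message XYX', 'Random second message IAI',
     '0002', 'Random message IAM', 'Random second message OMA',
     'Random third message OMA', '0003', 'Random message XAK',
     'Random second message YAB', '0004', ' Random message INA',
     '0001', 'Random message ryryry', 'Random second message ryyryyryryry']
y = ['0001', '0002', '0003', '0004', '0001', '0002']

markers = set(y)
# Position of every key in x, including repeated keys
starts = [i for i, item in enumerate(x) if item in markers]
# Slice from each key up to the next key (or the end of x).
# Anything in x before the first key is left out.
groups = [x[i:j] for i, j in zip(starts, starts[1:] + [len(x)])]
print(groups)
```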
