
How to reorganize a list in a specific way in Python

Suppose I have the following list:

example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

I want to reorganise it like this:

example_list=['This is', 'an', 'example list', '.']

Notice how the QQQQQs are used as placeholders. Basically, I want everything between two QQQQQs to become one list element. How do I do that?

I have seen other posts about the join() function, but my problem is putting a space between the words when an element contains more than one word.

Use a simple loop: start a new group whenever the placeholder appears, then join each group with a space.

Example:

example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

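res = [[]]  # start with one empty group
for i in example_list:
    if i == "QQQQQ":
        res.append([])  # placeholder: start a new group
    else:
        res[-1].append(i)  # regular word: add it to the current group
print([" ".join(i) for i in res])  # join each group with a space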

Output:

['This is', 'an', 'example list', '.']
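The loop above can also be wrapped in a reusable function. The following is only a sketch: the name join_between and its sep parameter are made up for illustration, and, unlike the loop above, it drops the empty groups that leading, trailing, or adjacent placeholders would otherwise produce.

```python
def join_between(items, placeholder, sep=" "):
    """Join runs of items that are separated by placeholder (hypothetical helper)."""
    groups = [[]]
    for item in items:
        if item == placeholder:
            groups.append([])  # placeholder closes the current group
        else:
            groups[-1].append(item)
    # Assumption: empty groups (from leading, trailing, or adjacent
    # placeholders) are not wanted, so they are filtered out here.
    return [sep.join(group) for group in groups if group]


example_list = ['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']
result = join_between(example_list, 'QQQQQ')
print(result)  # ['This is', 'an', 'example list', '.']
```

If empty strings for adjacent placeholders are actually wanted, removing the "if group" filter restores the behaviour of the original loop.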

You can use itertools.groupby():

>>> from itertools import groupby
>>> example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']
>>> [' '.join(g) for k, g in groupby(example_list, lambda x: x == 'QQQQQ') if not k]
['This is', 'an', 'example list', '.']

Or even with the placeholder's own .__eq__ method as the key, as suggested by @tobias_k in the comments:

>>> [' '.join(g) for k, g in groupby(example_list, key='QQQQQ'.__eq__) if not k]
['This is', 'an', 'example list', '.']
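To see why the "if not k" filter works, it may help to look at what groupby() actually yields here. The sketch below only materialises the groups for inspection; the variable name runs is made up for illustration.

```python
from itertools import groupby

example_list = ['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

# Each run of consecutive items with the same key becomes one group.
# The key is True for the placeholder and False for every other word.
runs = [(is_placeholder, list(group))
        for is_placeholder, group in groupby(example_list, key='QQQQQ'.__eq__)]

for is_placeholder, group in runs:
    print(is_placeholder, group)
# False ['This', 'is']
# True ['QQQQQ']
# False ['an']
# True ['QQQQQ']
# False ['example', 'list']
# True ['QQQQQ']
# False ['.']
```

Note that 'QQQQQ'.__eq__ returns NotImplemented rather than False when given a non-string, so this key is best kept to lists that contain only strings; the lambda form compares with == and does not have that limitation.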

Try join() together with strip() to get rid of the extra whitespace:

answer = [s.strip() for s in ' '.join(map(str, example_list)).split('QQQQQ')]
print(answer)

Output:

['This is', 'an', 'example list', '.']
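One limitation is worth keeping in mind: split() matches the placeholder as a substring of the joined text, so it also cuts through any word that happens to contain it. The sketch below uses a made-up list and the short placeholder 'Q' purely to illustrate this, and compares the result with an element-wise comparison as in the groupby() answer above.

```python
from itertools import groupby

# Hypothetical input: 'FAQ' contains the placeholder 'Q' as a substring.
words = ['FAQ', 'page', 'Q', 'index']
placeholder = 'Q'

# Join, split on the bare placeholder, then strip: 'FAQ' gets cut apart.
by_split = [s.strip() for s in ' '.join(words).split(placeholder)]
print(by_split)  # ['FA', 'page', 'index']

# Comparing whole elements keeps 'FAQ' intact.
by_element = [' '.join(group)
              for is_placeholder, group in groupby(words, key=lambda w: w == placeholder)
              if not is_placeholder]
print(by_element)  # ['FAQ page', 'index']
```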

Simple solution: join the list with spaces, then split on the placeholder padded with a space on each side.

Example:

example_list = ['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

print(' '.join(example_list).split(' QQQQQ '))

Result:

['This is', 'an', 'example list', '.']

or, more generally:

place_holder = 'QQQQQ'
split_arg = ' {} '.format(place_holder)
example_list = ' '.join(example_list).split(split_arg)

Edit after a comment by tobias_k:

"Of course, this only works if the placeholder actually is a string, and if that string does not appear in any of the other words. Ie it would not work if the placeholder was, eg, None, 'Q', or '' – tobias_k"

That is true, so I made a more general solution that should work for any placeholder.

import random
import string

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.']
place_holder = None
# create a random string of length 10
random_place_holder = ''.join(random.choices(string.ascii_uppercase + string.digits, k=10))
# replace every occurrence of the old placeholder with the random string
example_list = [x if x != place_holder else random_place_holder for x in example_list]
split_arg = ' {} '.format(random_place_holder)
example_list = ' '.join(example_list).split(split_arg)
print(example_list)

To be honest, you might be better off using one of the other solutions if you have an inconvenient placeholder such as those mentioned by tobias_k.
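As an illustration of that point, the groupby() approach from the earlier answer handles a None placeholder directly, without swapping in a random string. This is a sketch under the assumption that every non-placeholder element is a string, which join() requires.

```python
from itertools import groupby

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.']
place_holder = None

# Compare whole elements against the placeholder; keep only the
# runs of real words and join each run with a space.
result = [' '.join(group)
          for is_placeholder, group in groupby(example_list, key=lambda x: x == place_holder)
          if not is_placeholder]
print(result)  # ['This is', 'an', 'example list', '.']
```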

I decided to time it, using:

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.'] * 10000
place_holder = None

I used a longer list so that creating the random string is not a significant part of the run time; timing is not very meaningful on small lists anyway.

This solution: 11.6 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Rakesh's loop solution: 25.8 ms ± 819 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

RoadRunner's groupby: 34.4 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
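The figures above look like IPython %timeit output. A rough way to repeat the comparison with the standard timeit module might look like the sketch below; the three function names are made up for illustration, and the absolute numbers will differ by machine and Python version.

```python
import random
import string
import timeit
from itertools import groupby

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.'] * 10000
place_holder = None


def split_solution(items, placeholder):
    # Swap the placeholder for a random 10-character token, then join and split.
    token = ''.join(random.choices(string.ascii_uppercase + string.digits, k=10))
    swapped = [token if x == placeholder else x for x in items]
    return ' '.join(swapped).split(' {} '.format(token))


def loop_solution(items, placeholder):
    res = [[]]
    for x in items:
        if x == placeholder:
            res.append([])
        else:
            res[-1].append(x)
    return [' '.join(g) for g in res]


def groupby_solution(items, placeholder):
    return [' '.join(g)
            for k, g in groupby(items, key=lambda x: x == placeholder)
            if not k]


solutions = (split_solution, loop_solution, groupby_solution)

# Sanity check: all three approaches should agree on this input.
results = [fn(example_list, place_holder) for fn in solutions]
assert results[0] == results[1] == results[2]

for fn in solutions:
    seconds = timeit.timeit(lambda: fn(example_list, place_holder), number=10)
    print('{}: {:.1f} ms per call'.format(fn.__name__, seconds / 10 * 1000))
```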


 