简体   繁体   中英

Python lists and their splitting

For example, I have such code

a = ["a;b", "c;d",...,"y;z"]

I want to split every list element into to items of the same list. So i wanna get something like this:

["a", "b", "c", "d", ...., "y", "z"]

How can I do such thing? Thanks for your answers.

Using only string operations seem to be simplest (this is subjective, of course) and fastest (by a huge margin, compared to other solutions posted so far).

>>> a = ["a;b", "c;d", "y;z"]
>>> ";".join(a).split(";")
['a', 'b', 'c', 'd', 'y', 'z']

Proof / benchmarks

Sorted in ascending order of elapsed time:

python -mtimeit -s'a=["a;b","x;y","p;q"]*99' '";".join(a).split(";")'
10000 loops, best of 3: 48.2 usec per loop

python -mtimeit -s'a=["a;b","x;y","p;q"]*99' '[single for pair in a for single in pair.split(";")]'
1000 loops, best of 3: 347 usec per loop

python -mtimeit -s'from itertools import chain; a=["a;b","x;y","p;q"]*99' 'list(chain(*(s.split(";") for s in a)))'
1000 loops, best of 3: 350 usec per loop

python -mtimeit -s'a=["a;b","x;y","p;q"]*99' 'sum([x.split(";") for x in a],[])'
1000 loops, best of 3: 1.13 msec per loop

python -mtimeit -s'a=["a;b","x;y","p;q"]*99' 'sum(map(lambda x: x.split(";"), a), [])'
1000 loops, best of 3: 1.22 msec per loop

python -mtimeit -s'a=["a;b","x;y","p;q"]*99' 'reduce(lambda x,y:x+y, [pair.split(";") for pair in a])'
1000 loops, best of 3: 1.24 msec per loop

You can use itertools.chain :

>>> a = ["a;b", "c;d","y;z"]
>>> list(itertools.chain(*(s.split(';') for s in a)))
['a', 'b', 'c', 'd', 'y', 'z']

A bit more functional approach:

>>> l = ["a;b", "c;d", "e;f", "y;z"]
>>> sum(map(lambda x: x.split(';'), l), [])
['a', 'b', 'c', 'd', 'e', 'f', 'y', 'z']

That's work :

l = []
for item in ["a;b", "c;d", "e;f"]:
     l += item.split(";")

print l

It gives :

['a', 'b', 'c', 'd', 'e', 'f']
a = ["a;b", "c;d","y;z"]
print [atom for pair in a for atom in pair.split(';')]

gives what you want:

['a', 'b', 'c', 'd', 'y', 'z']

note: i can't tell you how to get from '...' to '....' in the middle of your array :)

l = []
for current in [c.split(';') for c in a]:
   l.extend(current)

You might want to read up on list comprehensions http://docs.python.org/tutorial/datastructures.html#list-comprehensions

a = ["a;b", "c;d","e;f","y;z"]
b = []
for i in a:
    c = i.split(';')
    b = b + c

print b

A bit longer than Felix Kling's answer, but here goes. First split the list into sub-lists

>>> a_split = [i.split(";", 1) for i in a]

This will result in a list of the form:

[[a,b], [c,d], ..., [y,z]]

You now need to 'merge' the inner and outer lists in some way. The builtin reduce() function is a perfect fit for this:

>>> reduce(lambda x, y: x + y, a_split)

Voila:

['a', 'b', 'c', 'd', ... 'y', 'z']

Strings can be used for this:

>>> a = ["a;b", "c;d","y;z"]
>>> list(''.join(a).replace(';', ''))
['a', 'b', 'c', 'd', 'y', 'z']

This solution is one of the fastest suggested so far:

# Shawn Chin's solution (the fastest so far, by far):
python -mtimeit -s'a=["a;b","x;y","p;q"]*99' '";".join(a).split(";")'
10000 loops, best of 3: 27.4 usec per loop

# This solution:
python -mtimeit -s'a=["a;b","x;y","p;q"]*99' "list(''.join(a).replace(';', ''))"
10000 loops, best of 3: 33.5 usec per loop

The morale is that lists represented by strings can be quite efficient in this case, possibly because of a simpler memory handling (characters are stored in consecutive memory locations).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM