I know there are many other similar questions posted, but there is a difference in mine that makes it unsolvable with their answers.
I have several lists of characters that may have multiple consecutive spaces, of which I need to keep only one. Repetitions of any other character should remain. I did it in the following way:
myList = ['o', 'e', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u']
myList_copy = [myList[0]]
for i in range(1, len(myList)):
    if not (myList[i] == ' ' and myList[i-1] == ' '):
        myList_copy.append(myList[i])
which successfully gives me
['o', 'e', 'i', ' ', 'l', 'k', ' ', 'j', 'u']
I don't really think this is a very good, fast way to do it.
I have seen other posts with similar questions. However, note that I need to remove only the repeated spaces. Maybe what I need is help using groupby to do this, hence the new post.
Thanks in advance.
Yes, using groupby is a good idea:
import itertools
myList = ['o', 'e', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u']
result = [key for key, group in itertools.groupby(myList)]
# ['o', 'e', 'i', ' ', 'l', 'k', ' ', 'j', 'u']
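Note that the version above collapses consecutive repeats of every element, not only spaces. If you want to keep consecutive duplicates of the other elements, you can use this: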
myList = ['o', 'e', 'i', 'i' , ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u']
result = []
for key, group in itertools.groupby(myList):
    if key != ' ':
        # keep every element of a non-space run
        for j in group:
            result.append(j)
    else:
        # a run of spaces collapses to a single space
        result.append(key)
print(result)
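If you need this in more than one place, the loop above could be wrapped in a small function. This is only a sketch: the name collapse_spaces is made up for illustration, and it assumes the input is any iterable whose space elements are exactly the string ' '.

```python
from itertools import groupby

def collapse_spaces(chars):
    """Collapse runs of ' ' to one space; keep repeats of anything else."""
    result = []
    for key, group in groupby(chars):
        if key == ' ':
            # a whole run of spaces contributes a single space
            result.append(key)
        else:
            # non-space runs are copied over unchanged
            result.extend(group)
    return result

print(collapse_spaces(['o', 'e', 'i', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', 'j', 'u']))
# ['o', 'e', 'i', 'i', ' ', 'l', 'k', ' ', 'j', 'u']
```

Using extend(group) instead of the inner for loop does the same thing in one call.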
Another simple way to do it: join myList to create a string, split it on whitespace, and join the pieces back together with single spaces:
myList = ['o', 'e', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u']
new = list(' '.join(''.join(myList).split()))
print(new)
['o', 'e', 'i', ' ', 'l', 'k', ' ', 'j', 'u']
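One caveat with the join/split trick: split() with no arguments also drops leading and trailing whitespace and treats tabs and newlines as separators. If that matters for your lists, a regular expression that targets only runs of the space character is one possible alternative. The sketch below assumes every element is a single-character string; collapse_spaces_re is a hypothetical name.

```python
import re

def collapse_spaces_re(chars):
    # Join into one string, replace each run of 2+ spaces with a single
    # space, then split the result back into individual characters.
    return list(re.sub(r' {2,}', ' ', ''.join(chars)))

chars = [' ', ' ', 'a', ' ', ' ', 'b', 'b', ' ']

print(collapse_spaces_re(chars))
# [' ', 'a', ' ', 'b', 'b', ' ']

# For comparison, join/split loses the leading and trailing space:
print(list(' '.join(''.join(chars).split())))
# ['a', ' ', 'b', 'b']
```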
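This is the same as yours but in one line. The x == 0 check keeps the first element from being compared against the last one through index -1:
myList_copy = [myList[x] for x in range(len(myList)) if x == 0 or not (myList[x] == ' ' and myList[x-1] == ' ')]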
How about using numpy? Try this code.
import numpy as np
myList = ['o', 'e', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u']
myList = np.array(myList)
myList = [myList[0]] + list(myList[1:][~((myList[1:] == myList[:-1]) & (myList[1:] == ' '))])
print(myList)
You can use zip in a list comprehension to compare each character with the previous one and exclude spaces that are preceded by another space:
myList = [ c for p,c in zip([""]+myList,myList) if (p,c) != (' ',' ') ]
The same approach can be used on a string:
myList = [ c for p,c in zip("."+myString, myString) if (p,c) != (' ',' ') ]
But split() would probably be more concise if you have a string and want a string as output:
myString = " ".join(myString.split())
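To check that the pairwise comparison only touches spaces, here is a small sketch that works on both lists and strings. The helper name dedupe_spaces is hypothetical, and None is used as the "no previous element" placeholder instead of "" or ".".

```python
def dedupe_spaces(seq):
    items = list(seq)
    # Pair each element with its predecessor; None stands in for
    # "no predecessor" so the first element is always kept.
    return [c for p, c in zip([None] + items, items) if (p, c) != (' ', ' ')]

print(dedupe_spaces(['a', 'a', ' ', ' ', 'b']))
# ['a', 'a', ' ', 'b']

print(''.join(dedupe_spaces("x   y  zz")))
# x y zz
```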
What about using a pandas Series and shifting the results?
import pandas as pd
serie = pd.Series(['o', 'e', 'i', ' ', ' ', ' ', 'l', 'k', ' ', ' ', ' ', ' ', ' ', 'j', 'u'])
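# Drop an element only when both it and the previous element are spaces,
# so repeats of other characters are kept
index = ~((serie == ' ') & (serie.shift(1) == ' '))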
serie = serie[index]