My dataframe had a column of strings (col A). I tokenized it and now I have:
Input:
Col A
'A', 'B', 'C', 'dog', 'C', 'C', 'C', 'C'
'A', 'B', 'B', 'dog', 'D', 'A', 'C', 'C', 'D'
I want to get the 2 items right before and right after the word 'dog' and put them in a new column B. So I want something like this:
Output:
Col B
'B', 'C', 'dog', 'C', 'C'
'B', 'B', 'dog', 'D', 'A'
How do I get that?
This works as long as every row in your column contains exactly one 'dog':
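import pandas as pd

df = pd.DataFrame({'Col A': ["'A', 'B', 'C', 'dog', 'C', 'C', 'C', 'C'", "'A', 'B', 'B', 'dog', 'D', 'A', 'C', 'C', 'D'"]})

def extract(l):
    # Remove the whitespace left around each token by the split on ','
    l = [e.strip() for e in l]
    # Position of the (single) dog token; raises ValueError if it is missing
    idx = l.index("'dog'")
    # Clamp the start at 0 so a dog near the beginning does not wrap around
    return l[max(idx - 2, 0):idx + 3]

df['Col B'] = df['Col A'].str.split(',').apply(extract)
print(df)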
                                           Col A                        Col B
0       'A', 'B', 'C', 'dog', 'C', 'C', 'C', 'C'  ['B', 'C', 'dog', 'C', 'C']
1  'A', 'B', 'B', 'dog', 'D', 'A', 'C', 'C', 'D'  ['B', 'B', 'dog', 'D', 'A']
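The answer above assumes each row is a comma-separated string with exactly one 'dog'. If the column already holds Python lists (the question says it was tokenized) or some rows have no 'dog', a variant like the following might be easier to adapt. It is only a sketch: the helper name `window_around`, its `target` and `n` parameters, and the choice to return an empty list when the target is missing are all assumptions, not part of the original answer.

```python
import pandas as pd

def window_around(tokens, target="dog", n=2):
    """Return up to n tokens on each side of the first occurrence of target."""
    if target not in tokens:
        return []  # assumption: rows without the target give an empty list
    idx = tokens.index(target)  # first occurrence only
    # max(...) keeps the start from going negative when target is near the front
    return tokens[max(idx - n, 0):idx + n + 1]

# Hypothetical data: the column holds lists of tokens rather than strings
df = pd.DataFrame({
    "Col A": [
        ["A", "B", "C", "dog", "C", "C", "C", "C"],
        ["A", "B", "B", "dog", "D", "A", "C", "C", "D"],
        ["A", "B"],  # no 'dog' in this row
    ]
})
df["Col B"] = df["Col A"].apply(window_around)
print(df)
```

If a row has several 'dog' tokens, this only looks at the first one; collecting a window for every occurrence would need a loop over all matching indices.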