I'm trying to create a subset of a pandas DataFrame based on values in a list. However, I need to match on parts of each string rather than only on whole values. I'll demonstrate with an example:
Here is my dataframe:
import pandas as pd
df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})
Here is what it looks like:
A
0 1-2
1 2
2 3
3 3-8
4 4
I have a list of values I want to use to select rows from my dataframe.
list1 = ['2', '3']
I can use the .isin() method to select rows from my dataframe using my list items.
subset = df[df['A'].isin(list1)]
print(subset)
A
1 2
2 3
However, I want any value that includes '2' or '3'. This is my desired output:
A
0 1-2
1 2
2 3
3 3-8
Can I use string indexing in my .isin() call? I am struggling to come up with another workaround.
Use str.split with isin and any:
Newdf = df[df.A.str.split('-', expand=True).isin(['2', '3']).any(axis=1)].copy()
Newdf
Out[189]:
A
0 1-2
1 2
2 3
3 3-8
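To make the one-liner above easier to follow, here is a minimal sketch that runs the same chain step by step. The intermediate names (parts, hits, mask) are mine, not from the answer, and it assumes the values are always separated by '-'.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})
list1 = ['2', '3']

# Step 1: split each string on '-' into separate columns.
# Rows with fewer pieces are padded with missing values.
parts = df['A'].str.split('-', expand=True)

# Step 2: test every piece against the list (element-wise, gives booleans).
hits = parts.isin(list1)

# Step 3: keep a row if at least one of its pieces matched.
mask = hits.any(axis=1)

subset = df[mask].copy()
print(subset)
```

Because each piece is compared as a whole, a value such as '12' would not be selected by this approach.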
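You can try a regular expression:
import re
# Build an alternation of the escaped list values, e.g. "2|3"
pattern = re.compile("|".join(map(re.escape, list1)))
# search() looks for the pattern anywhere in the string
print(df.loc[df['A'].apply(lambda x: bool(pattern.search(x)))])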
Output:
A
0 1-2
1 2
2 3
3 3-8
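One thing to watch with the regex route: it matches substrings, so a value like '12' would also be selected because it contains '2'. The sketch below compares a plain substring pattern with a word-boundary pattern using pandas' own str.contains. The extra '12' row and the variable names are hypothetical, added only to show the difference.

```python
import re
import pandas as pd

# Same data as the question, plus a hypothetical '12' row for illustration.
df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4', '12']})
list1 = ['2', '3']

# Escape each value so regex metacharacters in the list are taken literally.
alternation = '|'.join(re.escape(v) for v in list1)

# Substring match: True wherever '2' or '3' appears anywhere in the string.
substring_mask = df['A'].str.contains(alternation, regex=True)

# Word-boundary match: the value must stand alone, e.g. between '-' separators.
bounded_mask = df['A'].str.contains(r'\b(?:' + alternation + r')\b', regex=True)

print(df[substring_mask])  # includes the '12' row
print(df[bounded_mask])    # excludes the '12' row
```

Which of the two you want depends on whether '12' should count as "including 2" in your data.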