I have a DataFrame with a column like this:
col_1
W (2) L / W (1) L
W (1) D / W (2) L
NaN
W (1) L
W (2) D / W (1) D
W (1) D
I want to select rows with values starting with W (1) L, W (2) L, or W (2) D, so the result will be:
col_1
W (2) L / W (1) L
W (1) L
W (2) D / W (1) D
I tried this, but it didn't work:
df.loc[df.col_1.str.startswith('W \(1\) L')]
and this didn't work either:
df.loc[df.col_1.str.contains('^W\(L\).+', regex=True)]
Using str.contains:
df.loc[df['col_1'].str.contains(r'^W \([12]\) L|^W \(2\) D', regex=True, na=False), :]
Once the regex is correct, the trick is to pass na=False so that df.loc[] works correctly. Otherwise the NaN row leaves a missing value in the mask, and indexing with it raises an error.
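To make the effect of na=False easier to see, here is a minimal, self-contained sketch. It rebuilds the sample column from the question (the variable names pattern, mask and result are only illustrative) and groups the alternatives in one non-capturing group so that a single ^ anchors all of them:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [
        "W (2) L / W (1) L",
        "W (1) D / W (2) L",
        np.nan,
        "W (1) L",
        "W (2) D / W (1) D",
        "W (1) D",
    ]
})

# Anchored pattern: "W (1) L", "W (2) L" or "W (2) D" at the start of the string.
# The group is non-capturing, so str.contains has no capture groups to complain about.
pattern = r"^W \((?:[12]\) L|2\) D)"

# na=False maps the NaN row to False, so the mask contains only booleans.
mask = df["col_1"].str.contains(pattern, regex=True, na=False)
result = df.loc[mask]
print(result)
```

The rows kept are the ones at index 0, 3 and 4, which matches the result requested in the question.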
An example using str.match:
df['col_1'] = df['col_1'].fillna('none')
df[df['col_1'].str.match(r'^W\s\((1\)\sL|2\)\sL|2\)\sD)')]
In preparation, the first line cleans the data a bit and replaces any NaN values with a 'none' string. Then the regex takes care of the filtering, in line two.
Output:
col_1
0 W (2) L / W (1) L
3 W (1) L
4 W (2) D / W (1) D
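If you would rather not overwrite the missing values with a placeholder string, str.match also accepts na=False. The sketch below is one possible variant, assuming the same sample data as in the question (the names prefixes and matched are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [
        "W (2) L / W (1) L",
        "W (1) D / W (2) L",
        np.nan,
        "W (1) L",
        "W (2) D / W (1) D",
        "W (1) D",
    ]
})

# str.match only matches at the start of each string, so no leading ^ is needed.
prefixes = r"W \((?:1\) L|2\) L|2\) D)"

# na=False treats the NaN row as "no match" and leaves the column itself untouched.
matched = df[df["col_1"].str.match(prefixes, na=False)]
print(matched)
```

This selects the same three rows as above while the NaN in the original column stays a NaN.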