Use a regular expression or escape characters with the pandas startswith function

Question

I have a DataFrame containing a column like this:

col_1                
W (2) L / W (1) L 
W (1) D / W (2) L
NaN
W (1) L
W (2) D / W (1) D
W (1) D

I want to select the rows whose values start with W (1) L, W (2) L, or W (2) D, so the result will be:

col_1                
W (2) L / W (1) L 
W (1) L
W (2) D / W (1) D

I tried this, but it didn't work:

df.loc[df.col_1.str.startswith('W \(1\) L')]

and this didn't work either:

df.loc[df.col_1.str.contains('^W\(L\).+', regex=True)]

Answer 1

Using str.contains:

df.loc[df['col_1'].str.contains(r'^W \([12]\) L|^W \(2\) D', regex=True, na=False), :]

Once the regex is correct, the trick is to pass na=False so that the NaN row is treated as a non-match. Otherwise the mask itself contains NaN, and df.loc[] raises an error when asked to index with it.
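For reference, here is a minimal, self-contained sketch of the approach above. It rebuilds the sample column from the question (the variable names `pattern`, `mask`, `result` and `literal_mask` are just illustrative). It also shows why the original `str.startswith` attempt likely returned nothing: `str.startswith` compares literal text, so the backslashes in `'W \(1\) L'` are looked for verbatim instead of acting as escapes.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col_1": [
    "W (2) L / W (1) L",
    "W (1) D / W (2) L",
    np.nan,
    "W (1) L",
    "W (2) D / W (1) D",
    "W (1) D",
]})

# Raw string so the backslashes reach the regex engine unchanged.
# Each alternative is anchored with ^ so it only matches at the start.
pattern = r"^W \([12]\) L|^W \(2\) D"

# na=False turns the NaN row into False, giving a clean boolean mask.
mask = df["col_1"].str.contains(pattern, regex=True, na=False)
result = df.loc[mask, :]
print(result)

# str.startswith takes a literal prefix, so parentheses need no escaping.
literal_mask = df["col_1"].str.startswith("W (1) L", na=False)
print(df.loc[literal_mask, :])
```

The first print should show rows 0, 3 and 4; the second only row 3, since `str.startswith` here tests a single literal prefix.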

Answer 2

An example using str.match:

df['col_1'] = df['col_1'].replace(np.nan, 'none')
df[df['col_1'].str.match(r'^W\s\((1\)\sL|2\)\sL|2\)\sD)')]

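The first line prepares the data by replacing any NaN values with the string 'none', so that str.match returns only True or False. The regex in the second line then does the filtering; str.match already anchors at the start of each string, so the leading ^ is optional.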

Output:

               col_1
0  W (2) L / W (1) L
3            W (1) L
4  W (2) D / W (1) D
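A variation on this answer, sketched here as an assumption rather than as the answerer's own code: instead of overwriting NaN with a placeholder string, pass na=False to str.match, and build the pattern from the literal prefixes with `re.escape` so the parentheses do not have to be escaped by hand. The `prefixes` list and the other variable names are illustrative only.

```python
import re

import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col_1": [
    "W (2) L / W (1) L",
    "W (1) D / W (2) L",
    np.nan,
    "W (1) L",
    "W (2) D / W (1) D",
    "W (1) D",
]})

# Literal prefixes we want the values to start with.
prefixes = ["W (1) L", "W (2) L", "W (2) D"]

# re.escape backslash-escapes regex metacharacters such as ( and ),
# so each prefix is matched as plain text.
pattern = "|".join(re.escape(p) for p in prefixes)

# str.match only matches at the start of each string;
# na=False maps the NaN row to False without modifying the column.
mask = df["col_1"].str.match(pattern, na=False)
filtered = df[mask]
print(filtered)
```

This leaves the NaN in the original column untouched, which may matter if later steps rely on missing values being real NaN rather than the string 'none'.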

solution1
1 ACCEPTED 2020-09-27 18:28:56

solution2
1 2020-09-27 18:36:41
