
Remove rows from a dataframe where the string after a certain character (",") is longer than a given length

Input: I have a dataframe whose name column contains two values separated by ",":

id name
1  xy,ab
2  cv,asdf
3  piy,bs

Expected output: I want to remove every row in which the part of name after "," is longer than 2 characters.

id name
1  xy,ab
3  piy,bs

Code I tried:

df = df[~df['name'].str.split().str.len().ge(2)]
df

This code only removes rows based on a length greater than 2 in general, but I want the check to apply to the part after ",".

You can use Series.str.match with a regex:

>>> df[df['name'].str.match(r'.*?,\w{0,2}$')]

   id    name
0   1   xy,ab
2   3  piy,bs

Or you can split the values on the comma, take the last piece, and check whether its length is less than or equal to 2:

>>> df[df['name'].str.split(',').str[-1].str.len().le(2)]
   id    name
0   1   xy,ab
2   3  piy,bs
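
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample dataframe from the question and applies both filters side by side. The variable names by_regex and by_split are only illustrative, and the data is assumed to be exactly the three rows shown in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"id": [1, 2, 3], "name": ["xy,ab", "cv,asdf", "piy,bs"]})

# Approach 1: regex. str.match anchors at the start of the string and `$`
# anchors the end, so at most two word characters may follow the comma.
by_regex = df[df["name"].str.match(r".*?,\w{0,2}$")]

# Approach 2: split on the comma and measure the length of the last piece.
by_split = df[df["name"].str.split(",").str[-1].str.len().le(2)]

print(by_regex)
print(by_split)
```

The two approaches agree on this data but are not identical in general. The regex only accepts word characters (\w) after the comma, so a value like "ab,c-" would be dropped by the regex and kept by the split version. A value with no comma at all fails the regex, while the split version measures the whole string instead.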
