
Pythonic way to drop rows where the length of the list in a column is X

I would like to drop the rows where a certain column holds a list of length X. What is the most Pythonic or efficient way to do this without looping?

Code example:

import pandas as pd

data = {'column_1': ['1', '2', '3'] ,
    'column_2': [['A','B'], ['A','B','C'], ['A']], 
    "column_3": ['a', 'b', 'c']}

df = pd.DataFrame.from_dict(data)

Drop the rows where the length of the list is 3. In this case, the second row (index 1) should be deleted, since its list has length 3.

Use Series.str.len to build a boolean mask and index the DataFrame with it:

new_df = df[df["column_2"].str.len().ne(3)]


  column_1 column_2 column_3
0        1   [A, B]        a
2        3      [A]        c

Or, if you want to remove rows where the list length is equal to or greater than 3:

new_df = df[df["column_2"].str.len().le(2)]

print(df["column_2"].str.len().ne(3))
#0     True
#1    False
#2     True
#Name: column_2, dtype: bool
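One thing worth checking before relying on this mask is how it treats rows that have no list at all. The sketch below uses a hypothetical variant of the question's data (the fourth row, with None in column_2, is an assumption added for illustration). As far as I know, Series.str.len returns NaN for a missing entry, and NaN compares as "not equal to 3", so that row is kept, whereas calling len on every element raises a TypeError.

```python
import pandas as pd

# Hypothetical variant of the question's data: the last row has no list at all.
df = pd.DataFrame({
    "column_1": ["1", "2", "3", "4"],
    "column_2": [["A", "B"], ["A", "B", "C"], ["A"], None],
})

lengths = df["column_2"].str.len()  # the missing entry becomes NaN instead of raising
mask = lengths.ne(3)                # NaN != 3 evaluates to True, so that row is kept
new_df = df[mask]

# Series.apply(len) calls len() on every element, so the missing entry raises.
try:
    df["column_2"].apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True

print(new_df)
print(apply_failed)
```

If rows without a list should be dropped as well, one option is to combine the mask with df["column_2"].notna().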

Alternatively, use Series.apply with the built-in len:

res = df[df["column_2"].apply(len).le(2)]
print(res)

Output

  column_1 column_2 column_3
0        1   [A, B]        a
2        3      [A]        c
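Note that .le(2) and .ne(3) give the same result on the question's data only because no list there is longer than 3. The following sketch, which assumes an extra fourth row with a four-item list (not part of the original data), shows where the two masks differ. The reset_index call at the end is optional and only renumbers the rows.

```python
import pandas as pd

# Question's data plus one assumed extra row whose list has 4 items.
df = pd.DataFrame({
    "column_1": ["1", "2", "3", "4"],
    "column_2": [["A", "B"], ["A", "B", "C"], ["A"], ["A", "B", "C", "D"]],
})

lengths = df["column_2"].apply(len)

exact = df[lengths.ne(3)]      # drops only the row whose list has exactly 3 items
threshold = df[lengths.le(2)]  # drops every row whose list has 3 or more items

# Optional: renumber the rows so the index has no gaps after filtering.
exact = exact.reset_index(drop=True)

print(exact)
print(threshold)
```

So if the goal is strictly "length equals X", .ne(X) is the closer match; .le(X - 1) is the threshold variant.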

