pandas df masking specific row by list

I have a pandas DataFrame (df) with 7000 rows and 7 columns, and a list (row_list) of the values I want to filter on. I want to filter out the rows of df that contain a value from the list. This is what I got when I tried:

"Empty DataFrame Columns: [A,B,C,D,E,F,G] Index: []"

import pandas as pd

df = pd.read_csv('filename.csv')
df1 = pd.read_csv('filename1.csv', names = 'A')

row_list = []
for index, rows in df1.iterrows():
    my_list = [rows.A]
    row_list.append(my_list)

boolean_series = df.D.isin(row_list)
filtered_df = df[boolean_series]
print(filtered_df)

Replace

boolean_series = df.RightInsoleImage.isin(row_list)

with

boolean_series = df.RightInsoleImage.isin(df1.A)

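and let us know the result. (RightInsoleImage appears to be the column written as D in the question's code.) The loop in the question builds row_list as a list of one-element lists rather than a flat list of values, which is the likely reason nothing matches; passing the column df1.A directly avoids that. If it still doesn't work, show a sample of df and df1.A.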
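A minimal sketch of the suggestion above, using small made-up frames in place of the two CSV files (the column values are hypothetical and only for illustration). It shows the nested list that the question's loop produces next to the flat values that isin can match against:

```python
import pandas as pd

# Hypothetical stand-ins for filename.csv and filename1.csv.
df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "D": ["a.png", "b.png", "c.png", "d.png"]})
df1 = pd.DataFrame({"A": ["b.png", "d.png"]})

# What the loop in the question builds: a list of one-element lists.
nested = [[value] for value in df1["A"]]

# A flat sequence of values; df1["A"] itself could be passed as well.
flat = df1["A"].tolist()
mask = df["D"].isin(flat)

kept = df[mask]      # rows whose D value appears in df1.A
dropped = df[~mask]  # rows whose D value does not appear in df1.A
```

Use `df[mask]` to keep the matching rows, or `df[~mask]` if "filter out" means removing them.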

(1) Generate a separate DataFrame for each condition, concatenate them, then drop duplicates (slow).

(2) Use a custom function to add a bool column (False by default, set to True where the condition is fulfilled), then filter on that column.

(3) Keep a list of the positions of all rows that contain your row_list values, then select those rows with iloc.
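The three options could be sketched roughly as follows. The frame, the column names C and D, and the contents of row_list are invented for illustration, and isin stands in for whatever condition you actually need:

```python
import numpy as np
import pandas as pd

# Hypothetical data; row_list is a flat list of values to match in column D.
df = pd.DataFrame({"C": [10, 20, 30, 40], "D": ["w", "x", "y", "z"]})
row_list = ["x", "z"]

# Option (1): one DataFrame per value, concatenated and de-duplicated.
parts = [df[df["D"] == value] for value in row_list]
by_concat = pd.concat(parts).drop_duplicates()

# Option (2): annotate with a bool column (False by default), then filter on it.
annotated = df.copy()
annotated["matched"] = False
annotated.loc[annotated["D"].isin(row_list), "matched"] = True
by_flag = annotated[annotated["matched"]].drop(columns="matched")

# Option (3): collect the integer positions of matching rows, then use iloc.
positions = np.flatnonzero(df["D"].isin(row_list).to_numpy())
by_position = df.iloc[positions]
```

On this toy data all three select the same two rows. Option (1) returns rows in row_list order rather than in the original row order.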

Without an MRE (minimal reproducible example), sample data, or a reason why your method didn't work, it's difficult to provide a more specific answer.

