
Finding the labels in a pandas DataFrame efficiently

I have a large number of PNG files where each filename is a unique ID with corresponding data in a large pandas DataFrame. I can list the filenames with os.listdir and then find the corresponding row with "index = df['image_id']==name". However, this is a very slow process because the whole column is scanned once per file. Is there a more efficient approach?

import os
files = os.listdir(path)
for file in files:
    name = file.split(".")[0]
    index = df['image_id']==name
    print(df.loc[index].values[0][1])

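One option is to turn the filename list into a set and then use the isin method to get all the matching rows at once, instead of comparing the column against one name per loop iteration. It is hard to be more specific because the question does not include an example DataFrame.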

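import os
files = os.listdir(path)
# Strip the extension from each filename to get the image IDs
# (the loop variable is named f so it does not shadow path)
names = {f.split('.')[0] for f in files}
# Boolean mask: True for every row whose image_id has a matching file
mask = df['image_id'].isin(names)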
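
Since no example DataFrame was given, here is a minimal sketch on made-up data showing how the mask might be used, plus an alternative that looks labels up through an index. The column name 'label', the sample IDs, and the hard-coded files list (standing in for os.listdir(path)) are all assumptions for illustration, not taken from the question.

```python
import os
import pandas as pd

# Hypothetical stand-in for the real DataFrame; 'label' is an assumed column name.
df = pd.DataFrame({
    "image_id": ["a1", "b2", "c3", "d4"],
    "label": [0, 1, 0, 1],
})

# Hypothetical stand-in for os.listdir(path).
files = ["b2.png", "d4.png", "zz.png"]

# os.path.splitext removes only the final extension, so IDs containing dots survive.
names = [os.path.splitext(f)[0] for f in files]

# Option 1: one vectorized membership test, then select the matching rows.
matched = df[df["image_id"].isin(names)]

# Option 2: index by image_id once, then fetch labels in file order.
# Files with no matching row come back as NaN. This relies on image_id being unique.
labels = df.set_index("image_id")["label"].reindex(names)

print(matched)
print(labels)
```

Option 1 returns rows in DataFrame order, while Option 2 keeps the order of the files, which may be more convenient if each label has to be paired with its image.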

