I am looking for the fastest way to compare string elements in Python.
import os, glob, numpy as np
with open('fname.txt', 'r') as fi:  # this input file contains 9 thousand string elements
    all_list = fi.read().splitlines()
existing_list = glob.glob('*jpg')  # this contains 5 thousand elements
existing_list = [os.path.basename(f) for f in existing_list]
remaining_list = [f for f in all_list if f not in existing_list]
for i in remaining_list:
    print(i)
How can I do this in NumPy?
all_list = np.array(all_list)
existing_list = np.array(existing_list)
remaining_list = ???
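You can optimize this without NumPy by using a set. A membership test on a set takes constant time on average, whereas "not in" on a list scans the list for every element: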
existing_set = {os.path.basename(f) for f in existing_list} # set comprehension, python2.7+
# alternatively: set(os.path.basename(f) for f in existing_list)
remaining_list = [f for f in all_list if f not in existing_set]
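To see the difference, here is a rough timing sketch. The file names and counts are synthetic, made up only to roughly match the sizes in the question, and actual timings will depend on your machine:

```python
import timeit

# Synthetic data: hypothetical names, sizes roughly matching the question
all_list = ['img_%05d.jpg' % i for i in range(9000)]
existing_list = all_list[::2] + ['extra_%05d.jpg' % i for i in range(500)]
existing_set = set(existing_list)

def with_list():
    # each "not in" scans the list: O(len(existing_list)) per lookup
    return [f for f in all_list if f not in existing_list]

def with_set():
    # each "not in" is a hash lookup: O(1) on average
    return [f for f in all_list if f not in existing_set]

list_time = timeit.timeit(with_list, number=1)
set_time = timeit.timeit(with_set, number=1)
print('list: %.4f s, set: %.4f s' % (list_time, set_time))
```

Both functions return the same elements in the same order; only the lookup cost differs.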
I doubt that you'd gain a lot of performance here by using numpy even if you figured out a way to do it...
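That said, if you still want a NumPy version, a minimal sketch might look like the following. It assumes NumPy 1.13 or later (for np.isin), and the sample names are hypothetical stand-ins for the two lists:

```python
import numpy as np

# Hypothetical sample data standing in for the two lists
all_list = np.array(['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'])
existing_list = np.array(['b.jpg', 'd.jpg', 'z.jpg'])

# Boolean mask: True where the name is NOT among the existing ones
mask = np.isin(all_list, existing_list, invert=True)
remaining_list = all_list[mask]  # keeps the original order of all_list

# Alternative: set difference, returned sorted and without duplicates
remaining_sorted = np.setdiff1d(all_list, existing_list)
```

np.isin with a mask preserves the order and any duplicates in all_list, while np.setdiff1d returns a sorted array of unique values, so pick whichever matches what you need.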