Fastest way to compare string elements in Python

Question

I am looking for the fastest way to compare string elements in Python.

import os, glob, numpy as np

with open('fname.txt', 'r') as fi:   # this input file contains about 9 thousand strings
    all_list = fi.read().splitlines()

existing_list = glob.glob('*jpg')   # this contains about 5 thousand elements
existing_list = [os.path.basename(f) for f in existing_list]

# keep every name from the file that has no matching jpg on disk
remaining_list = [f for f in all_list if f not in existing_list]
for i in remaining_list:
    print(i)

How can I do this in NumPy?

all_list = np.array(all_list)
existing_list = np.array(existing_list)
remaining_list = ???

Answer 1

You can optimize this without numpy by using a set:

existing_set = {os.path.basename(f) for f in existing_list}  # set comprehension, python2.7+
# alternatively:  set(os.path.basename(f) for f in existing_list)

remaining_list = [f for f in all_list if f not in existing_set]
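
The set helps because each `f not in existing_list` scans the whole list, while `f not in existing_set` is a hash lookup that takes roughly constant time on average. A rough way to check this on your own machine is sketched below; the file names are made up and the sizes are scaled down from the question's, so treat the timings as illustrative only.

```python
import timeit

# Hypothetical stand-ins for the question's data (sizes scaled down to keep the demo quick).
all_list = ["img_%05d.jpg" % i for i in range(2000)]
existing_list = all_list[::2]        # pretend every other file already exists
existing_set = set(existing_list)

def remaining_with_list():
    # Each `not in` scans the list: O(len(existing_list)) per lookup.
    return [f for f in all_list if f not in existing_list]

def remaining_with_set():
    # Each `not in` is a hash lookup: O(1) on average.
    return [f for f in all_list if f not in existing_set]

# Both approaches should produce the same list, in the same order.
same_result = remaining_with_list() == remaining_with_set()

if __name__ == "__main__":
    print("same result:", same_result)
    print("list lookup: %.4f s" % timeit.timeit(remaining_with_list, number=3))
    print("set lookup:  %.4f s" % timeit.timeit(remaining_with_set, number=3))
```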

I doubt you'd gain much performance by using numpy here, even if you figured out a way to do it...
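
For completeness, one possible way to fill in the `???` from the question is `np.isin` (available in NumPy 1.13 and later), which builds a boolean membership mask. This is only a sketch with made-up sample names, not something the answer above proposes, and it is not claimed to be faster than the set version.

```python
import numpy as np

# Hypothetical sample data standing in for the question's lists.
all_list = np.array(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
existing_list = np.array(["b.jpg", "d.jpg"])

# np.isin returns a boolean mask; invert=True marks elements NOT in existing_list.
mask = np.isin(all_list, existing_list, invert=True)
remaining_list = all_list[mask]

print(remaining_list)
```

`np.setdiff1d(all_list, existing_list)` is a related option, but it returns the sorted unique values, so it drops duplicates and the original order.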

Answer 1: accepted, 2014-03-27 15:51:56
