How to check if all elements of a list are contained in another one
I have two lists and need to check whether all elements of list_1 are contained in list_2; if not, save the missing elements in a new list.
So I have this:
list_1 = ['item', 'item', 'item']
list_2 = ['item_2', 'item_2', 'item_2']
list_3 = []
for i in range(len(list_1)):
    flag = True  # assume list_1[i] is missing until a match is found
    aux = list_1[i]
    for j in range(len(list_2)):
        if aux == list_2[j]:
            flag = False  # found in list_2, so it is not missing
            break
    if flag:
        list_3.append(aux)
But this is very slow. Is there a way to improve the speed?
Maybe some built-in pandas function?
Edit.
I don't need pandas for this, but the lists are in fact columns of two DataFrames. I wrote it with lists because that is the more general case.
IIUC, I think you could use the isin and all methods of pd.Series:
import pandas as pd
l1, l2 = map(pd.Series, [list_1, list_2])

In [3]: l1
Out[3]:
0    item
1    item
2    item
dtype: object

In [4]: l2
Out[4]:
0    item_2
1    item_2
2    item_2
dtype: object

In [5]: l1.isin(l2)
Out[5]:
0    False
1    False
2    False
dtype: bool

In [6]: l1.isin(l2).all()
Out[6]: False
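Since the question asks for the missing elements and not only a yes/no answer, the same boolean mask can also be inverted to select them. This is a minimal sketch, assuming pandas is installed; the sample values and the names `mask`, `all_contained` and `missing` are made up here for illustration:

```python
import pandas as pd

# Hypothetical sample data; in practice these would be the two DataFrame columns.
l1 = pd.Series(['a', 'b', 'c', 'd'])
l2 = pd.Series(['b', 'd', 'e'])

# isin gives True where an element of l1 also occurs in l2.
mask = l1.isin(l2)

# all() answers "is every element of l1 contained in l2?"
all_contained = bool(mask.all())

# Inverting the mask with ~ keeps only the elements of l1 that are absent from l2.
missing = l1[~mask].tolist()

print(all_contained)  # False
print(missing)        # ['a', 'c']
```

The result keeps the order and any duplicates of l1, just like the nested loop in the question.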
If you want to find the elements of list_1 that are not in list_2 and save them in another list (list_3 in this example), you can use a list comprehension:
>>> list_3 = [i for i in list_1 if i not in list_2]
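For long lists, `i not in list_2` scans list_2 once per element, so the comprehension is still quadratic. One possible refinement, assuming the elements are hashable (strings are), is to build a set from list_2 first. The sample values and the name `lookup` are introduced here only for illustration:

```python
list_1 = ['a', 'b', 'c', 'b', 'z']
list_2 = ['b', 'c', 'd']

# Building the set costs one pass over list_2; membership tests are then O(1) on average.
lookup = set(list_2)

# Same result as the plain comprehension: order and duplicates from list_1 are kept.
list_3 = [i for i in list_1 if i not in lookup]

# True only if every element of list_1 occurs in list_2.
all_contained = not list_3

print(list_3)         # ['a', 'z']
print(all_contained)  # False
```

If order and duplicates do not matter, `set(list_1) - set(list_2)` gives the missing values directly.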
I find numba to be fun for loops; it really brings up the speed with minimal effort:
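from numba import jit

@jit
def jitstack():
    list_1 = ['item', 'item', 'item']
    list_2 = ['item_2', 'item_2', 'item_2']
    list_3 = []
    for i in range(len(list_1)):
        flag = True  # assume list_1[i] is missing until a match is found
        aux = list_1[i]
        for j in range(len(list_2)):
            if aux == list_2[j]:
                flag = False  # found in list_2, so it is not missing
                break
        if flag:
            list_3.append(aux)
    return list_3  # hand the missing elements back to the caller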
Timed with timeit in an IPython notebook: https://drive.google.com/file/d/0B0KNIF4xMP3UNW93LWlmWUFqbnc/view?usp=sharing
Original: 3.72 µs per loop
Numba: 22.4 ns per loop
And the list comprehension list_3 = [i for i in list_1 if i not in list_2]: 1.22 µs per loop
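These figures come from three-element lists, where call overhead dominates. To see how the pure-Python variants scale, something like the following could be run. It is only a sketch using the standard-library timeit; the list sizes and the function names `with_list` and `with_set` are arbitrary choices made here, and no timings are claimed:

```python
import timeit

# Arbitrary test data: the first half of list_1 is missing from list_2.
list_1 = [f'item_{n}' for n in range(0, 2000)]
list_2 = [f'item_{n}' for n in range(1000, 3000)]

def with_list():
    # Scans list_2 for every element of list_1.
    return [i for i in list_1 if i not in list_2]

def with_set():
    # Hashes list_2 once, then does constant-time lookups on average.
    lookup = set(list_2)
    return [i for i in list_1 if i not in lookup]

# Both variants must agree before their timings are worth comparing.
assert with_list() == with_set()

for func in (with_list, with_set):
    seconds = timeit.timeit(func, number=5)
    print(f'{func.__name__}: {seconds / 5:.6f} s per call')
```

Growing the ranges should make the gap between the two variants more visible, since only the first one rescans list_2 for each element.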