Comparing two lists and adding a new column with the results
I want to compare each list in column A with findKBs and add a new column holding the findKBs values that the list does not contain.
import pandas as pd

df = pd.DataFrame({'A': [['10', '20', '30', '40'], ['50', '60', '70', '80']],
                   'B': [['a', 'b'], ['c', 'd']]})
findKBs = ['10', '90']
                  A       B
0  [10, 20, 30, 40]  [a, b]
1  [50, 60, 70, 80]  [c, d]
This is the desired output:
                  A       B         C
0  [10, 20, 30, 40]  [a, b]      [90]
1  [50, 60, 70, 80]  [c, d]  [10, 90]
Thanks in advance.
We can use np.isin:
import numpy as np

df['C'] = [find_kb[~np.isin(find_kb, a)]
           for a, find_kb in zip(df['A'], np.array([findKBs] * len(df)))]
print(df)
                  A       B         C
0  [10, 20, 30, 40]  [a, b]      [90]
1  [50, 60, 70, 80]  [c, d]  [10, 90]
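To see what the np.isin expression does, here is a minimal sketch on a single row. The names find_kbs, row_a, mask and missing are only illustrative and do not appear in the original answer.

```python
import numpy as np

find_kbs = np.array(['10', '90'])
row_a = ['10', '20', '30', '40']

# True where an element of find_kbs also appears in row_a
mask = np.isin(find_kbs, row_a)

# Inverting the mask keeps only the values that are missing from row_a
missing = find_kbs[~mask]

print(mask.tolist())     # [True, False]
print(missing.tolist())  # ['90']
```

Note that each cell of column C then holds a NumPy array rather than a Python list; calling .tolist() on it converts it if a list is needed.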
Or we can use filter:
df['C'] = [list(filter(lambda val: val not in a, find_kb))
for a, find_kb in zip(df['A'],[findKBs] * len(df))]
#df['C'] = df['A'].map(lambda list_a: list(filter(lambda val: val not in list_a,
# findKBs)
# )
# )
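If the nested lambda is hard to follow, the same filtering can probably be written as a list comprehension. This is a sketch, not part of the original answer: the helper missing_values is a hypothetical name, and converting each row to a set is an assumption that only pays off when the lists are long.

```python
def missing_values(wanted, present):
    """Return the items of `wanted` not found in `present`, keeping their order."""
    present_set = set(present)  # set membership tests are O(1) on average
    return [val for val in wanted if val not in present_set]


find_kbs = ['10', '90']
rows = [['10', '20', '30', '40'], ['50', '60', '70', '80']]

result = [missing_values(find_kbs, row) for row in rows]
print(result)  # [['90'], ['10', '90']]
```

With the DataFrame from the question this would be df['C'] = [missing_values(findKBs, a) for a in df['A']].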
filter is harder to read, but it was the fastest option in the timings below (measured on the two-row DataFrame from the question):
%%timeit
df['C'] = [list(filter(lambda val: val not in a, find_kb))
for a, find_kb in zip(df['A'],[findKBs] * len(df))]
194 µs ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
df['C'] = [find[~np.isin(find, a)] for a, find in zip(df['A'], np.array([findKBs] * len(df)))]
334 µs ± 38.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
df['C'] = df['A'].map(lambda x: np.setdiff1d(findKBs,x))
534 µs ± 17.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
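The timings above were taken on a two-row DataFrame, so the ranking may change with more rows or longer lists. A rough sketch for re-measuring outside IPython with the standard timeit module follows. The 1000-row size, the function names and the run count are arbitrary choices made for illustration.

```python
import timeit

import numpy as np
import pandas as pd

findKBs = ['10', '90']
base = [['10', '20', '30', '40'], ['50', '60', '70', '80']]
df = pd.DataFrame({'A': base * 500})  # 1000 rows; size chosen arbitrarily


def with_filter():
    return [list(filter(lambda val: val not in a, findKBs)) for a in df['A']]


def with_isin():
    kb = np.array(findKBs)
    return [kb[~np.isin(kb, a)] for a in df['A']]


def with_setdiff1d():
    return df['A'].map(lambda x: np.setdiff1d(findKBs, x))


# Total wall-clock seconds for 5 calls of each variant
timings = {
    func.__name__: timeit.timeit(func, number=5)
    for func in (with_filter, with_isin, with_setdiff1d)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 5 runs")
```

No expected numbers are shown because they depend on the machine and on the library versions.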
You can also use np.setdiff1d:
df['C'] = df['A'].map(lambda x: np.setdiff1d(findKBs,x))
                  A       B         C
0  [10, 20, 30, 40]  [a, b]      [90]
1  [50, 60, 70, 80]  [c, d]  [10, 90]
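One detail worth knowing: np.setdiff1d returns the sorted, unique values of its first argument that are not in the second, so the original order and any duplicates in findKBs are not preserved. A small sketch with made-up input shows this:

```python
import numpy as np

# The first argument is deliberately unsorted and contains a duplicate
out = np.setdiff1d(['90', '10', '90'], ['50', '60'])

print(out.tolist())  # ['10', '90']
```

If the order of findKBs matters, the np.isin or filter approaches keep it.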
To avoid the lambda, you can use functools.partial:
from functools import partial
diff = partial(np.setdiff1d, findKBs)
df['C'] = df['A'].map(diff)
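As a quick check of what partial does here: it fixes findKBs as the first positional argument of np.setdiff1d, so calling diff(x) should be equivalent to np.setdiff1d(findKBs, x). The variable row below is only an illustrative input.

```python
from functools import partial

import numpy as np

findKBs = ['10', '90']
diff = partial(np.setdiff1d, findKBs)  # findKBs is bound as the first argument

row = ['50', '60', '70', '80']
via_partial = diff(row)
via_direct = np.setdiff1d(findKBs, row)

print(via_partial.tolist())  # ['10', '90']
```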
Subtract from a set:
df['C'] = (set(findKBs) - df.A.map(set)).map(list)
df
Out[253]:
                  A       B         C
0  [10, 20, 30, 40]  [a, b]      [90]
1  [50, 60, 70, 80]  [c, d]  [10, 90]
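Because sets are unordered, list(...) on a set difference does not guarantee the order of the values in each cell. A possible variation, not the code from the answer above, uses an explicit map with sorted to make the result deterministic. The name kb_set is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [['10', '20', '30', '40'], ['50', '60', '70', '80']]})
findKBs = ['10', '90']

kb_set = set(findKBs)
# sorted() turns each set difference into a list with a stable order
df['C'] = df['A'].map(lambda a: sorted(kb_set - set(a)))

print(df['C'].tolist())  # [['90'], ['10', '90']]
```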