Remove duplicates from array and elements in matching positions in another array
I have two NumPy arrays. I want to remove every value that occurs more than once in the first array (including its first occurrence) and remove the items at the matching positions in the second array.
For example:
a = [1, 2, 2, 3]
b = ['a', 'd', 'f', 'c']
Becomes:
a = [1, 3]
b = ['a', 'c']
I need to do this efficiently, without resorting to the naive solution, which is time-consuming.
Here's one approach with np.unique:
unq,idx,c = np.unique(a, return_index=True, return_counts=True)
unq_idx = np.sort(idx[c==1])
a_out = a[unq_idx]
b_out = b[unq_idx]
Sample run:
In [34]: a
Out[34]: array([1, 2, 2, 3])
In [35]: b
Out[35]: array(['a', 'd', 'f', 'c'], dtype='|S1')
In [36]: unq,idx,c = np.unique(a, return_index=1, return_counts=1)
...: unq_idx = np.sort(idx[c==1])
...: a_out = a[unq_idx]
...: b_out = b[unq_idx]
In [37]: a_out
Out[37]: array([1, 3])
In [38]: b_out
Out[38]: array(['a', 'c'], dtype='|S1')
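The np.sort step matters when the first array is not already sorted: np.unique returns the first-occurrence indices in order of the sorted unique values, not in their original order. Below is a minimal sketch of the same approach wrapped in a helper; the function name drop_duplicated and the unsorted sample input are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def drop_duplicated(a, b):
    """Keep only the positions whose value in `a` occurs exactly once.

    Hypothetical helper name, used here for illustration only.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    # idx: index of the first occurrence of each distinct value
    # c:   how many times each distinct value occurs
    _, idx, c = np.unique(a, return_index=True, return_counts=True)
    # Values seen once have c == 1; sorting restores the original order
    keep = np.sort(idx[c == 1])
    return a[keep], b[keep]

# Unsorted input: without np.sort the result would come back as [1, 3]
a = np.array([3, 2, 2, 1])
b = np.array(['c', 'd', 'f', 'a'])
a_out, b_out = drop_duplicated(a, b)
print(a_out)  # [3 1]
print(b_out)  # ['c' 'a']
```

Because np.unique sorts internally, this runs in roughly O(n log n) time, which avoids the quadratic cost of comparing every pair of elements.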
Since you are open to NumPy, you may also wish to consider Pandas, which uses NumPy internally:
import pandas as pd
a = pd.Series([1, 2, 2, 3])
b = pd.Series(['a', 'd', 'f', 'c'])
flags = ~a.duplicated(keep=False)
idx = flags[flags].index
a = a[idx].values
b = b[idx].values
Result:
print(a, b, sep='\n')
[1 3]
['a' 'c']
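If the data already lives in NumPy arrays, the same idea can be written as a boolean mask applied directly to both arrays. This is a sketch under that assumption, not the answer's exact code; the names mask, a_out, and b_out are chosen here for illustration.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 2, 3])
b = np.array(['a', 'd', 'f', 'c'])

# keep=False flags every member of a duplicated group, including the first,
# so negating the result keeps only the values that occur exactly once
mask = ~pd.Series(a).duplicated(keep=False).to_numpy()

a_out = a[mask]
b_out = b[mask]
print(a_out)  # [1 3]
print(b_out)  # ['a' 'c']
```

A boolean mask preserves the original order without any extra sorting step, and the same mask can be reused for any number of aligned arrays.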