How can I replace recurring values in a numpy array by the index of the unique value from another array?
I have an array a with recurring elements, and a second array, b, containing the sorted, unique values from a (as well as an auxiliary "index array", c):
import numpy as np
a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])
b = sorted(np.unique(a))
c = np.arange(len(b))
>>> b
array(['Anna', 'Bob', 'Charly'])
>>> c
array([0, 1, 2])
What I would like to have is an array, d, where the values from a are replaced by their index in b. The expected result should look like this:
>>> d
array([1, 0, 1, 2, 1])
Any suggestions on how to get the expected result would be greatly appreciated.
Use the following code.
d = [b.index(i) for i in a]
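The list comprehension above calls b.index once per element of a, and each call scans b from the start. A minimal sketch of the same idea with a dictionary lookup follows; the name position is a hypothetical helper introduced for illustration, not part of the original answer.

```python
import numpy as np

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])
b = sorted(np.unique(a))

# Hypothetical helper: build the value -> index mapping once,
# so each lookup is a dict access instead of a linear b.index() scan.
position = {value: i for i, value in enumerate(b)}

# Replace every value in a by its index in b.
d = np.array([position[value] for value in a])
print(d)
```

This should matter mostly when b has many unique values; for a handful of names either version is fine.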
No need to create b or c; you can use np.unique and have it return the inverse:
d = np.unique(a, return_inverse = True)[1]
>>> d
array([1, 0, 1, 2, 1])
For reference:
return_inverse : bool, optional
If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.
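To illustrate what "can be used to reconstruct ar" means, here is a small sketch that indexes the unique values with the inverse and compares the result to a. It also shows np.searchsorted as one possible alternative for the case where a sorted b already exists; that alternative is an addition for illustration, not something the answer proposes, and it assumes every value of a is present in b.

```python
import numpy as np

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])

# uniques is sorted; inverse holds, for each element of a, its index in uniques.
uniques, inverse = np.unique(a, return_inverse=True)

# Indexing the unique values with the inverse rebuilds the original array.
reconstructed = uniques[inverse]
assert (reconstructed == a).all()

# Assumption: b is already sorted and contains every value of a.
# Then a binary search per element yields the same indices.
b = uniques
d = np.searchsorted(b, a)
print(inverse, d)
```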
import pandas as pd
pd.Categorical(a).codes
array([1, 0, 1, 2, 1], dtype=int8)
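If the unique values are needed alongside the codes, pd.factorize is a related option. The sketch below assumes pandas is installed; note that sort=True is required to get indices into the sorted unique values, since the default numbers values in order of first appearance.

```python
import numpy as np
import pandas as pd

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])

# sort=True makes the codes refer to the sorted unique values,
# matching the b array from the question.
codes, uniques = pd.factorize(a, sort=True)

# Without sort=True the codes follow order of first appearance instead.
codes_by_appearance, _ = pd.factorize(a)
print(codes, list(uniques), codes_by_appearance)
```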
Similar to Avin's answer, you can do:
a = ['Bob', 'Anna', 'Bob', 'Charly', 'Bob']
b = sorted(list(set(a)))
c = [b.index(x) for x in a]
However, I just wanted to add that numpy is a numerical computing library. You can/should just use lists for this.