How do I replace the repeated values in a numpy array with the indices of the unique values from another array?

Question

I have an array a with recurring elements, and a second array, b, containing the sorted, unique values from a (as well as an auxiliary "index array", c):

import numpy as np

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])
b = sorted(np.unique(a))  # np.unique already sorts; sorted() returns a list
c = np.arange(len(b))
>>> b
['Anna', 'Bob', 'Charly']  # list repr under NumPy 1.x
>>> c
array([0, 1, 2])

What I would like to have is an array, d, in which each value from a is replaced by its index in b. The expected result should look like this:

>>> d
array([1, 0, 1, 2, 1])

Any suggestions on how to get the expected result would be greatly appreciated.

Answer 1

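Use the following code, which looks up the position of each element of a in b. Converting b to a list first means it works whether b is a list or a numpy array.

b_list = list(b)
d = [b_list.index(i) for i in a]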
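
Because b holds the sorted unique values, the same lookup can likely be vectorized with np.searchsorted, which locates each element by binary search rather than scanning b once per element. This is a minimal sketch, assuming every value in a really occurs in b; for a missing value searchsorted would silently return an insertion point instead of raising an error.

```python
import numpy as np

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])
b = np.unique(a)  # sorted unique values, kept as an array

# Binary-search every element of a in the sorted array b.
# Assumption: each value of a is present in b.
d = np.searchsorted(b, a)
print(d)  # [1 0 1 2 1]
```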

Answer 2

There is no need to create b or c; you can use np.unique and have it return the inverse:

d = np.unique(a, return_inverse=True)[1]
>>> d
array([1, 0, 1, 2, 1])

For reference, from the np.unique documentation:

return_inverse : bool, optional

If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.
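
To illustrate what "can be used to reconstruct ar" means, here is a small sketch that unpacks both return values instead of taking only element [1]. The names uniques, inverse, and reconstructed are chosen here for illustration; uniques plays the role of b and inverse the role of d.

```python
import numpy as np

a = np.array(['Bob', 'Anna', 'Bob', 'Charly', 'Bob'])

# With return_inverse=True, np.unique returns a tuple:
# the sorted unique values and, for each element of a, its index into them.
uniques, inverse = np.unique(a, return_inverse=True)

# Indexing the unique values with the inverse rebuilds the original array.
reconstructed = uniques[inverse]
print(inverse)                       # [1 0 1 2 1]
print((reconstructed == a).all())    # True
```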

Alternative pandas solution:

import pandas as pd
>>> pd.Categorical(a).codes
array([1, 0, 1, 2, 1], dtype=int8)

Answer 3

Similar to Avin's answer, you can do:

a = ['Bob', 'Anna', 'Bob', 'Charly', 'Bob']
b = sorted(list(set(a)))
c = [b.index(x) for x in a]

However, I just wanted to add that numpy is a numerical computing library. You can/should just use lists for this.
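
If you stay with plain lists, note that b.index(x) scans b for every element of a. One possible variation, sketched here with a hypothetical helper dictionary named index_of, builds the value-to-index mapping once and then does constant-time lookups:

```python
a = ['Bob', 'Anna', 'Bob', 'Charly', 'Bob']
b = sorted(set(a))

# index_of is an illustrative name: it maps each unique value to its position in b.
index_of = {value: i for i, value in enumerate(b)}

# One dictionary lookup per element instead of one list scan per element.
c = [index_of[x] for x in a]
print(c)  # [1, 0, 1, 2, 1]
```

For an array as small as the one in the question the difference is negligible; it only starts to matter when b has many distinct values.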

Answer 1
0 2019-08-16 16:38:17

Answer 2
0 (accepted) 2019-08-16 16:42:17

Answer 3
0 2019-08-16 16:43:33
