How to convert one-hot encodings into integers?

Question

I have a numpy array data set with shape (100,10). Each row is a one-hot encoding. I want to convert it into an ndarray with shape (100,), so that each row vector becomes an integer giving the index of its nonzero entry. Is there a quick way of doing this using numpy or tensorflow?

Answer 1

You can use numpy.argmax or tf.argmax. Example:

import numpy as np
a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
print('np.argmax(a, axis=1): {0}'.format(np.argmax(a, axis=1)))

Output:

np.argmax(a, axis=1): [1 0 3]

You may also want to look at sklearn.preprocessing.LabelBinarizer.inverse_transform.
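One caveat worth keeping in mind: np.argmax returns the first position of the maximum, so an all-zero row silently decodes to 0. Below is a minimal sketch of a guard, assuming the input is meant to be strictly one-hot. The helper name onehot_to_int_checked is hypothetical and not part of NumPy.

```python
import numpy as np

def onehot_to_int_checked(one_hots):
    """Decode one-hot rows to integer indices, rejecting malformed rows."""
    one_hots = np.asarray(one_hots)
    # Every entry must be 0 or 1, and each row must contain exactly one 1.
    is_binary = np.isin(one_hots, (0, 1)).all()
    one_per_row = (one_hots.sum(axis=1) == 1).all()
    if not (is_binary and one_per_row):
        raise ValueError("input rows are not strictly one-hot")
    return np.argmax(one_hots, axis=1)

a = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])
indices = onehot_to_int_checked(a)
print(indices)  # [1 0 3]
```

If the data is known to be clean, plain np.argmax(a, axis=1) is enough and the check can be skipped.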

Answer 2

As pointed out by Franck Dernoncourt, since a one-hot encoding has a single 1 and the rest are zeros, you can use argmax for this particular example. In general, if you want to find a value in a numpy array, you'll probably want to consult numpy.where. Also, see this Stack Exchange question:

Is there a NumPy function to return the first index of something in an array?

Since a one-hot vector is a vector with all 0s and a single 1, you can do something like this:

>>> import numpy as np
>>> a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
>>> [np.where(r==1)[0][0] for r in a]
[1, 0, 3]

This just builds a list of the index of the 1 in each row. The [0][0] indexing is just to ditch the structure (a tuple with an array) returned by np.where, which is more than you asked for.

For any particular row, you just want to index into a. For example, in the zeroth row the 1 is found at index 1.

>>> np.where(a[0]==1)[0][0]
1
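The list comprehension above loops over the rows in Python. Here is a sketch of a vectorized alternative, assuming every row contains exactly one 1. np.nonzero returns indices in row-major order, so the column indices come out already ordered by row.

```python
import numpy as np

a = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])

# np.nonzero returns one index array per dimension, in row-major order.
rows, cols = np.nonzero(a)

# With exactly one 1 per row, rows is 0, 1, 2, ... and cols holds the answer.
assert np.array_equal(rows, np.arange(len(a)))
print(cols)  # [1 0 3]
```

If a row had no 1 or more than one, the assertion on rows would fail, which makes malformed input visible instead of silently shifting the result.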

Answer 3

Simply use np.argmax(x, axis=1)

Example:

import numpy as np
array = np.array([[0, 1, 0, 0], [0, 0, 0, 1]])
print(np.argmax(array, axis=1))
> [1 3]

Answer 4

While I strongly suggest using numpy for speed, mpu.ml.one_hot2indices(one_hots) shows how to do it without numpy. Simply pip install mpu --user --upgrade.

Then you can do

>>> import mpu.ml
>>> mpu.ml.one_hot2indices([[1, 0], [1, 0], [0, 1]])
[0, 0, 1]

Answer 5

What I do in these cases is something like this. The idea is to interpret the one-hot vector as an index into a 0,1,2,3,4... array.

# Define stuff
import numpy as np
one_hots = np.zeros([100,10])
for k in range(100):
    one_hots[k,:] = np.random.permutation([1,0,0,0,0,0,0,0,0,0])

# Finally, the trick
ramp = np.tile(np.arange(0,10),[100,1])
integers = ramp[one_hots==1].ravel()

I prefer this trick because I feel np.argmax and other suggested solutions may be slower than indexing (although indexing may consume more memory).
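Whether indexing actually beats np.argmax depends on array size and NumPy version, so it is worth measuring rather than assuming. Here is a rough sketch using the standard-library timeit module. It also includes a third variant built on the same ramp idea, a matrix-vector product with np.arange. The variable names and repeat count are illustrative choices, and no timings are claimed here.

```python
import timeit
import numpy as np

# Build 100 random one-hot rows of width 10 from known labels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=100)
one_hots = np.eye(10)[labels]

ramp = np.tile(np.arange(10), (100, 1))

by_argmax = np.argmax(one_hots, axis=1)
by_ramp = ramp[one_hots == 1]
by_dot = (one_hots @ np.arange(10)).astype(int)

# All three approaches should agree with the labels used to build the data.
assert np.array_equal(by_argmax, labels)
assert np.array_equal(by_ramp, labels)
assert np.array_equal(by_dot, labels)

for name, stmt in [
    ("argmax", lambda: np.argmax(one_hots, axis=1)),
    ("ramp indexing", lambda: ramp[one_hots == 1]),
    ("dot product", lambda: (one_hots @ np.arange(10)).astype(int)),
]:
    seconds = timeit.timeit(stmt, number=10_000)
    print(f"{name}: {seconds:.4f} s for 10,000 runs")
```

Run it on your own data shape before choosing one approach over another.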

Answer 6

def int_to_onehot(n, n_classes):
    v = [0] * n_classes
    v[n] = 1
    return v

def onehot_to_int(v):
    return v.index(1)


>>> v = int_to_onehot(2, 5)
>>> v
[0, 0, 1, 0, 0]


>>> i = onehot_to_int(v)
>>> i
2

Answer 7

You can use this simple code:

a = [[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]]
# Walk the first row and print the position of each 1.
for j, i in enumerate(a[0]):
    if i == 1:
        print(j)

5

Answer 8

from numpy import argmax

def one_hot_decode(encoded_seq):
    return [argmax(vector) for vector in encoded_seq]
