One-hot encodings in Keras without for loops
I want to generate one-hot encodings for a list of sequences.
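import numpy as np
import keras

def encode_output(sequences, vocab_size):
    # Pre-allocate the (n_sequences, seq_len, vocab_size) output array
    y = np.zeros([sequences.shape[0], sequences.shape[1], vocab_size], dtype='int16')
    # One-hot encode one sequence (row) at a time
    for i in range(sequences.shape[0]):
        y[i] = keras.utils.to_categorical(sequences[i], num_classes=vocab_size, dtype='int16')
    return y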
sequences is a 2-D NumPy array:
array([[ 23, 4, 563, ..., 0, 0, 0],
[3480, 3, 86, ..., 0, 0, 0],
[ 9, 930, 6, ..., 0, 0, 0],
...,
[ 507, 1408, 0, ..., 0, 0, 0],
[4447, 13, 642, ..., 0, 0, 0],
[ 1, 195, 2618, ..., 0, 0, 0]], dtype=int32)
My code works fine, but is there a way to do this without the for loop?
You can simply use array assignment:
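import numpy as np

def encode_vectorized(a, n, dtype=int):
    # Output gains a trailing axis of length n (the number of classes)
    out = np.zeros(a.shape + (n,), dtype=dtype)
    # Write a 1 at position a[i, j] along the last axis of every entry
    np.put_along_axis(out, a[..., None], 1, axis=-1)
    return out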
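As a quick check of the answer above, here is a minimal sketch that runs encode_vectorized on a small made-up array and compares it against another common loop-free idiom, indexing into an identity matrix. The sample data and the helper name encode_eye are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def encode_vectorized(a, n, dtype=int):
    out = np.zeros(a.shape + (n,), dtype=dtype)
    np.put_along_axis(out, a[..., None], 1, axis=-1)
    return out

def encode_eye(a, n, dtype=int):
    # Hypothetical alternative: row k of the identity matrix is the
    # one-hot vector for class k, so fancy indexing does the encoding.
    return np.eye(n, dtype=dtype)[a]

# Made-up sample data standing in for the padded sequences in the question
sequences = np.array([[2, 0, 3], [1, 1, 0]], dtype=np.int32)
vocab_size = 4

y = encode_vectorized(sequences, vocab_size, dtype='int16')
y_eye = encode_eye(sequences, vocab_size, dtype='int16')

print(y.shape)                  # (n_sequences, seq_len, vocab_size)
print(np.array_equal(y, y_eye))
```

The identity-matrix variant builds a vocab_size by vocab_size array first, so for a vocabulary of several thousand tokens the put_along_axis version is likely the lighter choice.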
For one-hot encoding (OHE), I always use pd.get_dummies.
Here is a simple example:
import pandas as pd
s = pd.Series(list('abca'))
pd.get_dummies(s)
   a  b  c
0  1  0  0
1  0  1  0
2  0  0  1
3  1  0  0
Resource:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html
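One thing to watch with pd.get_dummies is that it only creates columns for values that actually occur in the data, whereas the question needs a fixed width of vocab_size. A possible workaround, sketched here under the assumption that the full vocabulary is known up front (the vocab list below is made up), is to convert the data to a Categorical with explicit categories first:

```python
import pandas as pd

# Hypothetical full vocabulary; 'd' never appears in the data below
vocab = ['a', 'b', 'c', 'd']
s = pd.Series(list('abca'))

# Declaring the categories explicitly makes get_dummies emit one column
# per vocabulary entry, including entries that are absent from s.
cat = pd.Series(pd.Categorical(s, categories=vocab))
dummies = pd.get_dummies(cat, dtype=int)

print(dummies)
```

This handles one 1-D sequence at a time, so for the 2-D integer array in the question the NumPy approach is probably the more direct fit.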