Concatenating 'N' 2D arrays in NumPy with varying dimensions into one 3D array

Original post: 2019-12-13 01:06:19. Tags: python, numpy, keras, numpy-ndarray

I have N samples of 2D features with variable dimensions along one axis. For example:

Sample 1 : (100,20)
Sample 2 : (150,20)
Sample 3 : (90,20)

Is there a way to combine all N samples into a 3D array so that the first dimension (N,?,?) denotes the sample number?

PS: I wish to avoid padding and reshaping, and want to find a way to input the features with their dimensions intact into an LSTM network in Keras. Any other suggestions to achieve the same are welcome.

1 Answer

Keras does allow variable-length input to an LSTM, but within a single batch all inputs must have the same length. One way to reduce the padding needed is to batch your input sequences together based on their length and pad only up to the maximum length within each batch. For example, you could have one batch with sequence length 100 and another with sequence length 150. But I'm afraid there is no way to completely avoid padding. During inference you can use any sequence length.
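
To make the batching idea concrete, here is a minimal NumPy sketch. The helper name bucket_and_pad, the batch_size value, and the choice of zero as the padding value are all assumptions made for illustration, not part of Keras or NumPy. It sorts the samples by length, groups neighbours into batches, and zero-pads each batch only up to its own longest sequence.

```python
import numpy as np

def bucket_and_pad(samples, batch_size):
    """Group (timesteps, features) arrays into batches of similar length.

    Hypothetical helper: each batch is zero-padded only up to the longest
    sequence it contains. Returns a list of (batch, lengths, indices).
    """
    # Sort sample indices by sequence length so neighbours have similar lengths.
    order = sorted(range(len(samples)), key=lambda i: samples[i].shape[0])
    batches = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        lengths = [samples[i].shape[0] for i in idx]
        max_len = max(lengths)
        n_features = samples[idx[0]].shape[1]
        # Allocate a zero-filled 3D array: (batch, max_len, features).
        batch = np.zeros((len(idx), max_len, n_features),
                         dtype=samples[idx[0]].dtype)
        for row, i in enumerate(idx):
            # Copy the real timesteps; the tail of each row stays zero (padding).
            batch[row, :lengths[row]] = samples[i]
        batches.append((batch, lengths, idx))
    return batches

# Shapes from the question: (100, 20), (150, 20), (90, 20)
samples = [np.ones((100, 20)), np.ones((150, 20)), np.ones((90, 20))]
batches = bucket_and_pad(samples, batch_size=2)
shapes = [b.shape for b, _, _ in batches]
print(shapes)  # [(2, 100, 20), (1, 150, 20)]
```

With the three samples from the question, only the 90-step sample gets padded, and only by 10 timesteps. Each batch could then be fed to a model whose LSTM input shape leaves the time dimension unspecified, e.g. (None, 20); if the padded steps should be ignored, a Masking layer with mask_value=0.0 in front of the LSTM is one option, with the caveat that genuine all-zero timesteps would be masked too.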