Combining two arrays of different dimensions in Python

Question

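I am working on an emotion classification project that uses audio and text. I pass the audio and the text through a 1D CNN and get the following output arrays: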

audio_features_shape = (396, 63, 64)
text_features_shape = (52, 1, 64)

Now I want to stack these two arrays of different dimensions into one, so that I can pass a single array to an LSTM. The shape I want is:

expected_array_shape = (448, 64, 128)

I tried the following approaches, but none of them gave the output I want.

x = np.column_stack((audio_features, text_features))
x = np.concatenate((audio_features,text_features), axis=2)
x = np.append(audio_features, text_features)
x = np.transpose([np.tile(audio_features, len(text_features)), np.repeat(text_features, len(audio_features))])
x = np.array([np.append(text_features,x) for x in audio_features])

Any help would be appreciated. Thanks!

Answer 1

How should the values of the 2 arrays be distributed in the result?

audio_features_shape = (396, 63, 64)
text_features_shape = (52, 1, 64)

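text_features has to be "expanded" to (52, 63, 64), either by repeating its values 63 times along the middle axis, or by placing the array into a target array of zeros. Either way, it becomes 63 times larger.

Once the arrays match on every dimension except the first, they can be concatenated along the first axis.

But the real question is: which of these makes sense for the way the LSTM will be used?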
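The two expansion options can be sketched in plain NumPy. This is only an illustration under assumptions: the random arrays stand in for the real CNN outputs, and the names `text_repeated`, `text_padded`, `stacked_repeat`, and `stacked_padded` are made up for this example. Note that it produces (448, 63, 64), not the (448, 64, 128) shape requested in the question.

```python
import numpy as np

# Random stand-ins for the real CNN outputs (assumption for illustration).
rng = np.random.default_rng(0)
audio_features = rng.normal(size=(396, 63, 64))
text_features = rng.normal(size=(52, 1, 64))

# Option 1: repeat the single text step 63 times along the middle axis.
text_repeated = np.repeat(text_features, audio_features.shape[1], axis=1)

# Option 2: place the text step into a target array of zeros.
text_padded = np.zeros(
    (text_features.shape[0],) + audio_features.shape[1:],
    dtype=text_features.dtype,
)
text_padded[:, :1, :] = text_features

# Both expanded arrays now match audio_features on axes 1 and 2,
# so they can be concatenated along axis 0.
stacked_repeat = np.concatenate([audio_features, text_repeated], axis=0)
stacked_padded = np.concatenate([audio_features, text_padded], axis=0)

print(stacked_repeat.shape, stacked_padded.shape)
```

Which option is appropriate depends on what the rows along the first axis are supposed to mean to the LSTM; the sketch only shows that the shapes line up.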

Answer 2

Depending on what exactly you want, and on whether you are only interested in using Tensorflow, you could try the following:

import tensorflow as tf

audio_features = tf.random.normal((396, 63, 64))
text_features = tf.random.normal((52, 1, 64))

# Option 1: repeat the single text step along axis 1 until it matches the audio length (63).
repeated_text = tf.repeat(text_features, repeats=audio_features.shape[1], axis=1)
repeat_features = tf.concat([audio_features, repeated_text], axis=0)

# Option 2: zero-pad the text features along axis 1 up to the audio length.
paddings = tf.constant([[0, 0], [0, audio_features.shape[1] - text_features.shape[1]], [0, 0]])
pad_features = tf.concat([audio_features, tf.pad(text_features, paddings, "CONSTANT")], axis=0)

print('Using tf.repeat --> ', audio_features.shape, text_features.shape, repeat_features.shape)
print('Using tf.pad --> ', audio_features.shape, text_features.shape, pad_features.shape)

Using tf.repeat -->  (396, 63, 64) (52, 1, 64) (448, 63, 64)
Using tf.pad -->  (396, 63, 64) (52, 1, 64) (448, 63, 64)
