python: Navigating a 4-D numpy array
I am working with a 4-D array that is the input to a CNN. The input array has the following shape:
print('X_train shape: ', X_train.shape)
X_train shape: (47204, 1, 100, 4)
Data description:
The input data consists of 47204 instances (fixed-length segments, as the CNN requires). Each instance has shape (1, 100, 4), i.e. 1 segment contains 100 GPS points, and for each point the 4 corresponding point kinematics (max_speed, avg_speed, max_acc, avg_acc) are stored, hence (1, 100, 4). Labels are stored in a separate y_train array of shape (47204,) for 5 classes [0..4].
print(y_train)
[3 3 0 ... 2 3 4]
To get a better sense of my X_train array, I show the first 3 elements below:
print(X_train[:3])
[
[[[ 3.82280987e+00 2.16802350e-01 7.49917451e-02 3.44416369e-04]
[ 3.38707371e+00 2.02210055e-01 1.61751110e-03 1.93745950e-03]
[ 2.49202215e+00 1.60605262e-01 8.43561351e-03 2.40057917e-03]
...
[ 2.00022316e+00 2.70020923e-01 5.40441673e-02 3.57212151e-03]
[ 3.25199744e-01 9.06990382e-02 1.46808316e-02 1.65841315e-03]
[2.96587589e-01 0.00000000e+00 6.13293351e-04 4.16518187e-03]]]
[[[ 1.07209176e+00 7.27038312e-02 6.62777026e-03 2.04611951e-04]
[ 1.06194285e+00 5.05005456e-02 4.05676569e-03 3.72293433e-04]
[ 1.02849748e+00 2.12558178e-02 2.95477005e-03 5.56584054e-04]
...
[ 4.51962909e-03 5.63125736e-04 5.98474074e-04 1.63036715e-05]
[ 2.83026181e-03 2.35855075e-03 1.25789358e-03 2.15331510e-06]
[8.49078543e-03 2.16840434e-19 9.43423077e-04 1.29198906e-05]]]
[[[ 7.51127665e+00 3.14033478e-01 6.85170617e-02 7.73415075e-04]
[ 7.42307262e+00 1.33868251e-01 4.10564823e-02 1.16131460e-03]
[ 7.35818066e+00 1.23886976e-02 3.02312582e-02 1.28312101e-03]
...
[ 7.40826167e+00 1.19388656e-01 4.00874715e-02 2.04909489e-04]
[ 7.23779176e+00 1.33269965e-01 1.20430502e-02 1.58195900e-04]
[ 7.11697001e+00 4.68002105e-02 5.42478400e-02 3.58101318e-05]]]
]
Task:
I am required to create a machine learning model (e.g. random forest) using the 4 kinematics (max_speed, avg_speed, max_acc, avg_acc) as features. This requires navigating each instance and getting these features for the 100 points in the instance.
The number of samples will then be 4720400 (i.e. 47204 x 100), so I also need to match each point to the label of its instance, i.e. y_train will then have shape (4720400,).
The expected input would then look like:
max_speed avg_speed max_acc avg_acc class
0 3.82280987e+00 2.16802350e-01 7.49917451e-02 3.44416369e-04 3
1 3.38707371e+00 2.02210055e-01 1.61751110e-03 1.93745950e-03 3
2 2.49202215e+00 1.60605262e-01 8.43561351e-03 2.40057917e-03 3
...
I have been thinking about how to do this all week, but every idea has evaporated. How do I do this, please?
You can reshape your X_train array from (47204, 1, 100, 4) to (4720400, 4) simply with:
X_train_reshaped = X_train.reshape(4720400, 4)
This preserves the data order, and the total number of elements stays the same: the 100 rows of each instance remain contiguous and in their original order.
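To check the ordering claim on something small enough to inspect, here is a minimal sketch on a toy array. The names X_small, X_flat and the sizes (3 instances of 5 points) are made up for illustration; only the layout (instances, 1, points, 4) mirrors the question. It also uses -1 so that numpy infers the row count instead of hard-coding 4720400.

```python
import numpy as np

# Toy stand-in for X_train (hypothetical sizes): 3 instances, 1 segment each,
# 5 points per segment, 4 features per point.
n_instances, n_points, n_features = 3, 5, 4
X_small = np.arange(n_instances * n_points * n_features, dtype=float).reshape(
    n_instances, 1, n_points, n_features
)

# -1 lets numpy infer the number of rows (n_instances * n_points).
X_flat = X_small.reshape(-1, n_features)

# Row i * n_points + j of the flat array should be point j of instance i.
i, j = 2, 3
row_matches = np.array_equal(X_flat[i * n_points + j], X_small[i, 0, j])

print(X_flat.shape, row_matches)
```

The same index relation (row = instance * 100 + point) should hold for the real array, assuming it is an ordinary numpy array reshaped in the default C order.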
Similarly, you can expand the y_train array using the repeat command:
import numpy
y_train_reshaped = numpy.repeat(y_train, 100)
Note the 100 in the repeat command. Since you have one label per 100 data points, each label is repeated 100 times. This command preserves the data order too, so every point of an instance gets that instance's original label.
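As a rough illustration of how the repeated labels line up with the flattened rows, the sketch below uses toy stand-ins (X_small, y_small and the sizes are hypothetical, not taken from the question). It also contrasts numpy.repeat with numpy.tile, which cycles through the labels and would misalign them, and stacks features and class side by side in the layout the question asks for.

```python
import numpy as np

# Toy stand-ins (hypothetical sizes): 3 instances of 2 points, 4 features each.
n_instances, n_points, n_features = 3, 2, 4
X_small = np.arange(n_instances * n_points * n_features, dtype=float).reshape(
    n_instances, 1, n_points, n_features
)
y_small = np.array([3, 0, 2])

X_flat = X_small.reshape(-1, n_features)  # shape (6, 4)
y_flat = np.repeat(y_small, n_points)     # each label repeated in place

# np.tile repeats the whole label array instead, which does NOT match
# the row order produced by reshape.
y_tiled = np.tile(y_small, n_points)

# Features and class side by side, one row per point.
table = np.column_stack([X_flat, y_flat])

print(y_flat)    # [3 3 0 0 2 2]
print(y_tiled)   # [3 0 2 3 0 2]
print(table.shape)
```

X_flat and y_flat can then be passed as the feature matrix and target to whichever classifier is used; the combined table is only needed if a single array with a class column is preferred.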