
python: Navigating a 4-D numpy array

I am working with a 4-D array that is the input to a CNN. The input array has the following shape:

print('X_train shape: ', X_train.shape)
X_train shape:  (47204, 1, 100, 4)

Data description:

The input data consists of 47204 instances (fixed-length segments, as the CNN requires). Each instance has shape (1, 100, 4): one segment contains 100 GPS points, and for each point 4 kinematic features (max_speed, avg_speed, max_acc, avg_acc) are stored. Labels are stored in a separate y_train array of shape (47204,), covering 5 classes [0..4].

print(y_train)
[3 3 0 ... 2 3 4]

To give a better sense of my X_train array, I show the first 3 elements below:

print(X_train[:3])
[
 [[[ 3.82280987e+00 2.16802350e-01  7.49917451e-02  3.44416369e-04]
   [ 3.38707371e+00 2.02210055e-01  1.61751110e-03  1.93745950e-03]
   [ 2.49202215e+00 1.60605262e-01  8.43561351e-03  2.40057917e-03]
   ...
   [ 2.00022316e+00 2.70020923e-01  5.40441673e-02  3.57212151e-03]
   [ 3.25199744e-01 9.06990382e-02  1.46808316e-02  1.65841315e-03]
   [2.96587589e-01  0.00000000e+00  6.13293351e-04 4.16518187e-03]]]

 [[[ 1.07209176e+00 7.27038312e-02 6.62777026e-03  2.04611951e-04]
   [ 1.06194285e+00 5.05005456e-02 4.05676569e-03  3.72293433e-04]
   [ 1.02849748e+00 2.12558178e-02 2.95477005e-03  5.56584054e-04]
   ...
   [ 4.51962909e-03 5.63125736e-04 5.98474074e-04  1.63036715e-05]
   [ 2.83026181e-03 2.35855075e-03  1.25789358e-03 2.15331510e-06]
   [8.49078543e-03  2.16840434e-19 9.43423077e-04 1.29198906e-05]]]

 [[[ 7.51127665e+00 3.14033478e-01  6.85170617e-02  7.73415075e-04]
   [ 7.42307262e+00 1.33868251e-01  4.10564823e-02  1.16131460e-03]
   [ 7.35818066e+00  1.23886976e-02  3.02312582e-02  1.28312101e-03]
   ...
   [ 7.40826167e+00 1.19388656e-01 4.00874715e-02  2.04909489e-04]
   [ 7.23779176e+00 1.33269965e-01  1.20430502e-02  1.58195900e-04]
   [ 7.11697001e+00 4.68002105e-02  5.42478400e-02  3.58101318e-05]]]
]

Task:

I need to build a machine learning model (e.g. a random forest) using the 4 kinematic features (max_speed, avg_speed, max_acc, avg_acc). This requires walking through each instance and extracting these features for each of its 100 points.

The number of samples will then be 4720400 (i.e. 47204 x 100), and each point must also be matched to the label of the instance it came from, so y_train will then have shape (4720400,).

The expected input would then look like this:

   max_speed       avg_speed       max_acc         avg_acc         class
0  3.82280987e+00  2.16802350e-01  7.49917451e-02  3.44416369e-04  3
1  3.38707371e+00  2.02210055e-01  1.61751110e-03  1.93745950e-03  3
2  2.49202215e+00  1.60605262e-01  8.43561351e-03  2.40057917e-03  3
...

I have been thinking about how to do this all week, and every idea has come to nothing. How can I do this?

You can reshape your X_train array from (47204, 1, 100, 4) to (4720400, 4) simply with:

X_train_reshaped = X_train.reshape(4720400, 4)

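This preserves the data order, and the total number of elements stays the same: reshape keeps the default row-major order, so the 100 points of each segment remain contiguous and in their original sequence.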
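To see the order-preserving behaviour for yourself, here is a minimal sketch on a toy array. The names X_toy, n_segments, n_points and n_features are made up for illustration, and the toy sizes (3 segments of 5 points) stand in for the real 47204 segments of 100 points. Passing -1 lets NumPy infer the row count instead of hard-coding 4720400.

```python
import numpy as np

# Hypothetical toy stand-in for X_train: 3 segments instead of 47204,
# 5 points per segment instead of 100, still 4 features per point.
n_segments, n_points, n_features = 3, 5, 4
X_toy = np.arange(n_segments * n_points * n_features, dtype=float).reshape(
    n_segments, 1, n_points, n_features
)

# -1 asks NumPy to infer the number of rows (n_segments * n_points).
X_flat = X_toy.reshape(-1, n_features)
print(X_flat.shape)  # (15, 4)

# Row i * n_points + j of the flat array is point j of segment i,
# so the points of each segment stay together and in order.
i, j = 2, 3
assert np.array_equal(X_flat[i * n_points + j], X_toy[i, 0, j])
```

On the real data, X_train.reshape(-1, 4) should give the same result as X_train.reshape(4720400, 4) without having to compute the row count by hand.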

Similarly, you can expand the y_train array using numpy.repeat:

import numpy
Y_train_reshaped = numpy.repeat(y_train, 100)

Note the 100 passed to repeat. Since you had one label per 100 data points, each label is repeated 100 times. repeat also preserves the data order, so every point keeps the label of its original instance.
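Putting both steps together, the following is a small sketch on made-up data that checks the features and labels stay aligned. X_toy, y_toy and the toy sizes are assumptions for illustration only, and stacking everything into one array with np.column_stack is just one possible way to get the table layout from the question.

```python
import numpy as np

# Hypothetical toy data: 4 segments of 3 points each, 4 features per point.
rng = np.random.default_rng(0)
n_segments, n_points, n_features = 4, 3, 4
X_toy = rng.random((n_segments, 1, n_points, n_features))
y_toy = np.array([3, 0, 2, 4])  # one label per segment

# One row per point, and one label per point.
X_flat = X_toy.reshape(-1, n_features)
y_flat = np.repeat(y_toy, n_points)
print(y_flat)  # [3 3 3 0 0 0 2 2 2 4 4 4]

# np.tile would give [3 0 2 4 3 0 2 4 ...] instead, which would NOT
# line up with the flattened points.
assert X_flat.shape[0] == y_flat.shape[0]

# Optional: one table with the 4 features followed by the class column.
# The labels become floats here because the array has a single dtype.
table = np.column_stack([X_flat, y_flat])
print(table.shape)  # (12, 5)
```

X_flat and y_flat can then be passed as the feature matrix and target vector to a classifier such as a random forest. Keep in mind that points from the same segment share a label, so you may want to split train and test data by segment rather than by individual point.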
