
How to divide each row of data into 3 matrices in Python?

I have data with 1034 columns, and I want to divide each row of it into 3 matrices of 49*7. That uses 3 * 49 * 7 = 1029 columns and leaves 5 columns over, which should be deleted. How can I do this in Python?

First, I removed the last 5 columns from the data.

import pandas as pd

rawData = pd.read_csv('../input/smartgrid/data/data.csv')  # import the data

# remove the last 5 columns
rawData.pop('2016/9/9')
rawData.pop('2016/9/8')
rawData.pop('2016/9/7')
rawData.pop('2016/9/6')
rawData.pop('2016/9/5')

Then the data is preprocessed. After that, it is fed to this function, which is supposed to divide each row into three matrices: week1, week2 and week3.

def CNN2D(X_train, X_test, y_train, y_test):
    print('2D - Convolutional Neural Network:')
    # Transform every row of the training set into a 2D array
    n_array_X_train = X_train.to_numpy()
    # Divide n_array_X_train into 3 matrices so they can be fed to a convolution layer like RGB color channels
    week1 = []  # the first matrix
    week2 = []  # the second matrix
    week3 = []  # the third matrix

Here's a way to do what you're asking:

import pandas as pd
import numpy as np
#rawData = pd.read_csv('../input/smartgrid/data/data.csv')#import the data
rawData = pd.DataFrame([[x * 5 + i for x in range(1034)] for i in range(2)], columns=range(1034))

numRowsPerMatrix = len(rawData.columns) // 7 // 3
numColsNeeded = 3 * 7 * numRowsPerMatrix
rawData = rawData.T.iloc[:numColsNeeded].T

for i in range(len(rawData.index)):
    n_array_X_train = rawData.iloc[i].to_numpy()
    week1= np.reshape(n_array_X_train[:49 * 7], (49, 7)) # the first matrix
    week2= np.reshape(n_array_X_train[49 * 7: 2 * 49 * 7], (49, 7)) # the second matrix
    week3= np.reshape(n_array_X_train[2 * 49 * 7:], (49, 7)) # the third matrix

The line rawData = rawData.T.iloc[:numColsNeeded].T transposes the DataFrame, slices only the required rows (which were columns in the original df: all but the last 5), then transposes it back.
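As a side note, the same column selection can probably be written without transposing. The sketch below reuses the synthetic frame from the answer (not the real CSV), spells out the arithmetic behind numRowsPerMatrix and numColsNeeded, and checks that positional slicing with iloc[:, :numColsNeeded] gives the same values as the double transpose. The names viaTranspose and viaSlice are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real CSV: 2 rows, 1034 columns.
rawData = pd.DataFrame(
    [[x * 5 + i for x in range(1034)] for i in range(2)],
    columns=range(1034),
)

numRowsPerMatrix = len(rawData.columns) // 7 // 3  # 1034 // 7 = 147, then 147 // 3 = 49
numColsNeeded = 3 * 7 * numRowsPerMatrix           # 3 * 7 * 49 = 1029, so 5 columns are dropped

# Approach from the answer: transpose, slice rows, transpose back.
viaTranspose = rawData.T.iloc[:numColsNeeded].T

# Alternative: slice the columns directly by position.
viaSlice = rawData.iloc[:, :numColsNeeded]

print(numRowsPerMatrix, numColsNeeded, viaSlice.shape)
print(np.array_equal(viaTranspose.to_numpy(), viaSlice.to_numpy()))
```

Slicing columns directly avoids two transposes, which may matter for a large frame, and it also keeps per-column dtypes intact if the columns are not all of the same type.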

The assignments to week1, week2 and week3 slice successive thirds of the 1D numpy array in the current row of rawData and reshape each into a 49 row by 7 column matrix.
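If all rows are needed at once (for example as input to a convolution layer, as the question suggests), the per-row loop can be replaced by a single reshape. This is only a sketch on synthetic data, assuming the array has already been trimmed to 1029 columns; the names X, stacked and channels_last are hypothetical, and the channels-last layout is an assumption about what the downstream framework expects.

```python
import numpy as np

n_rows = 2
# Synthetic data standing in for the trimmed training set: n_rows x 1029.
X = np.arange(n_rows * 1029).reshape(n_rows, 1029)

# One reshape for all rows: (rows, 3 matrices, 49, 7).
stacked = X.reshape(n_rows, 3, 49, 7)
week1, week2, week3 = stacked[:, 0], stacked[:, 1], stacked[:, 2]

# Sanity check against the per-row slicing used in the loop above.
row0 = X[0]
assert np.array_equal(week1[0], row0[:49 * 7].reshape(49, 7))
assert np.array_equal(week2[0], row0[49 * 7:2 * 49 * 7].reshape(49, 7))
assert np.array_equal(week3[0], row0[2 * 49 * 7:].reshape(49, 7))

# Optional: move the 3 matrices to the last axis, like RGB channels,
# giving (rows, 49, 7, 3).
channels_last = stacked.transpose(0, 2, 3, 1)

print(stacked.shape, channels_last.shape)
```

Because NumPy reshapes in row-major order by default, each consecutive block of 343 values in a row becomes one 49 by 7 matrix, which matches the slicing in the loop.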
