Standardize data column-wise before using a Keras model

Question

I'm working with a large dataset that I want to standardize for use with a CNN.

Does Keras have a quick utility for standardizing a block of numbers column-wise that can be used inside a Sequential model? I'm asking because I expect the model to eventually be used online, so ideally the standardization could also be applied to incoming data, i.e. using a trailing moving average of the mean and std to normalize each new batch.

import numpy as np
import pandas as pd

np.random.seed(42)

col_names = ['Column' + str(x + 1) for x in range(5)]
training_data = pd.DataFrame(np.random.randint(1, 10**6, 50).reshape(-1, 5), columns=col_names)

Answer 1

I'm not sure about the online case, but sklearn's StandardScaler() should do the right thing, as described here.
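For the online part of the question, one possible approach is StandardScaler's partial_fit method, which updates the per-column mean and variance one batch at a time. The sketch below is only an illustration: the batch sizes and the random stream are made up, and partial_fit weights all past batches equally, so it is a cumulative estimate rather than the trailing moving average the question mentions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical stream: 10 batches of 20 rows with 5 columns each
batches = [rng.integers(1, 10**6, size=(20, 5)).astype(float) for _ in range(10)]

scaler = StandardScaler()
for batch in batches:
    # Update the running per-column mean and variance with this batch only
    scaler.partial_fit(batch)

# The accumulated statistics can be compared against the full data
full = np.vstack(batches)
full_mean = full.mean(axis=0)
full_std = full.std(axis=0)

# Standardize a batch column-wise using the statistics seen so far
scaled_new = scaler.transform(batches[-1])
```

If recent data should count more than old data, the statistics would have to be tracked separately (for example with an exponentially weighted mean and variance), since StandardScaler does not offer a windowed mode.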

Answer 2

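We can do this with sklearn:

from sklearn.preprocessing import StandardScaler
# StandardScaler standardizes each column (feature) on its own, so no transpose is needed
training_data = pd.DataFrame(StandardScaler().fit_transform(training_data), columns=training_data.columns)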
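As for doing this inside a Sequential model, recent Keras versions ship a Normalization preprocessing layer that learns per-column mean and variance through adapt() and then applies them as the first step of the model. The following is a minimal sketch, assuming TensorFlow 2.6 or later; the array shape and the one-layer model are placeholders chosen for illustration, not part of the original answers.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(42)

# Stand-in for the training data: 10 rows, 5 columns
x_train = rng.integers(1, 10**6, size=(10, 5)).astype("float32")

# axis=-1 keeps a separate mean and variance for each column
norm = keras.layers.Normalization(axis=-1)
norm.adapt(x_train)  # computes the statistics from the training data

# The layer sits at the front of the model, so incoming data
# is standardized with the same statistics at inference time
model = keras.Sequential([keras.Input(shape=(5,)), norm])
scaled = model.predict(x_train, verbose=0)
```

The statistics are fixed once adapt() has run, so handling drift in an online setting would mean calling adapt() again on newer data or maintaining the running statistics outside the model.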

Answer 1
1 2020-06-24 23:16:47

Answer 2
1 2020-06-24 23:17:44