How do I scale all columns except the last one?

Question

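I am using Python 3.7.6.

I am working on a classification problem.

I want to scale the feature columns of my dataframe (df). The dataframe has 56 columns: 55 feature columns, with the target in the last column.

Only the feature columns should be scaled.

This is how I currently do it: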

y = df.iloc[:,-1]
target_name = df.columns[-1]
from FeatureScaling import feature_scaling
df = feature_scaling.scale(df.iloc[:,0:-1], standardize=False)
df[target_name] = y

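But this seems inefficient, because I have to rebuild the dataframe (adding the target column back onto the scaled result).

Is there an efficient way to scale only some columns without changing the others? (That is, the result of scale would contain the scaled columns plus the one column that is not scaled.)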

Answer 1

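Using column indexes for scaling or other preprocessing is not a good idea, because the code breaks every time you create a new feature. Use column names instead. For example:

Using scikit-learn: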

import pandas as pd
from sklearn.preprocessing import StandardScaler

# placeholder names: replace with the columns you want to standardize
features = ["feature_1", "feature_2"]
scaler = StandardScaler()
# fit_transform returns a 2D numpy.array, so we wrap it in a pd.DataFrame;
# reusing df.index keeps the rows aligned when we concatenate below
standardized_features = pd.DataFrame(scaler.fit_transform(df[features]), columns=features, index=df.index)
old_shape = df.shape
# drop the unscaled features from the dataframe
df = df.drop(columns=features)
# join the standardized features back (they end up as the last columns)
df = pd.concat([df, standardized_features], axis=1)
assert old_shape == df.shape, "something went wrong!"

Alternatively, if you don't like splitting and concatenating the data, you can use a function like this.

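import numpy as np

def normalize(x):
    # x is a whole column (a pandas Series), not a single value
    std = np.std(x)
    if std == 0:
        raise ValueError('Constant column')
    return (x - np.mean(x)) / std

for col in features:
    # pass the whole column; Series.map would call normalize on each
    # element separately, where the standard deviation is always 0
    df[col] = normalize(df[col])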
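
Another way to skip the split-and-concatenate step is to assign the scaler output straight back to the selected columns. The following is only a minimal sketch: the toy dataframe and its column names (`f1`, `f2`, `target`) are made up for illustration, and `StandardScaler` stands in for whichever scaler you actually use.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame: two feature columns, with the target column last
df = pd.DataFrame({
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [10.0, 20.0, 30.0, 40.0],
    "target": [0, 1, 0, 1],
})

# Names of every column except the last one
features = list(df.columns[:-1])

scaler = StandardScaler()
# Assigning to a column selection overwrites only those columns;
# the target column and the column order stay as they were
df[features] = scaler.fit_transform(df[features])
```

Because the selection is built from column names, the same two lines keep working when `features` is replaced by an explicit list of names.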

Answer 2

You can slice the columns you need:

df.iloc[:, :-1] = feature_scaling.scale(df.iloc[:, :-1], standardize=False)
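
For a self-contained illustration of the same slicing idea, here is a hedged sketch that swaps in scikit-learn's `MinMaxScaler` for `feature_scaling.scale`, since the `FeatureScaling` module is not shown in the question. The toy dataframe and its column names are assumptions, and the feature columns are floats so the scaled values fit the existing column dtype.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy frame: two float feature columns, with the target column last
df = pd.DataFrame({
    "f1": [1.0, 2.0, 3.0],
    "f2": [10.0, 30.0, 50.0],
    "target": [0, 1, 0],
})

# Scale every column except the last and write the result back in place;
# MinMaxScaler is a stand-in here, not the scaler from the question
df.iloc[:, :-1] = MinMaxScaler().fit_transform(df.iloc[:, :-1])
```

Each feature column now runs from 0 to 1, while the target column is untouched and the dataframe never has to be rebuilt.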

Solution 1
1 Accepted 2020-04-02 18:43:13

Solution 2
0 2020-04-02 18:38:20
