Python/sklearn - preprocessing.MinMaxScaler 1d deprecation
I'd like to scale a column of a dataframe to have values between 0 and 1. For this I'm using a MinMaxScaler, which works fine, but is sending me mixed messages.
I'm doing:
x = df['Activity'].values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df['Activity'] = pd.Series(x_scaled)
Message numero uno for this code is a warning:
DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
Okay, so apparently having 1d arrays is gonna be a no-no soon, so let's try to reshape it as advised:
x = df['Activity'].values.reshape(-1, 1)
Now the code doesn't even run: Exception: Data must be 1-dimensional is thrown.
So I'm confused. 1d is going to be deprecated soon, but the data also has to be 1d?? How to do this safely? What's the issue here?
EDIT as requested by @sascha
x looks like this:
array([ 0.00568953, 0.00634314, 0.00718003, ..., 0.01976002,
0.00575024, 0.00183782])
And after reshaping:
array([[ 0.00568953],
[ 0.00634314],
[ 0.00718003],
...,
[ 0.01976002],
[ 0.00575024],
[ 0.00183782]])
The whole warning:
/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py:321: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py:356: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
The error when I reshape:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-132-df180aae2d1a> in <module>()
2 min_max_scaler = preprocessing.MinMaxScaler()
3 x_scaled = min_max_scaler.fit_transform(x)
----> 4 telecom['Activity'] = pd.Series(x_scaled)
/usr/local/lib/python3.5/dist-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
225 else:
226 data = _sanitize_array(data, index, dtype, copy,
--> 227 raise_cast_failure=True)
228
229 data = SingleBlockManager(data, index, fastpath=True)
/usr/local/lib/python3.5/dist-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
2918 elif subarr.ndim > 1:
2919 if isinstance(data, np.ndarray):
-> 2920 raise Exception('Data must be 1-dimensional')
2921 else:
2922 subarr = _asarray_tuplesafe(data, dtype=dtype)
Exception: Data must be 1-dimensional
You can simply drop pd.Series:
import pandas as pd
from sklearn import preprocessing
df = pd.DataFrame({'Activity': [ 0.00568953, 0.00634314, 0.00718003,
0.01976002, 0.00575024, 0.00183782]})
x = df['Activity'].values.reshape(-1, 1) #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df['Activity'] = x_scaled
Or you can explicitly get the first column of x_scaled:
df['Activity'] = pd.Series(x_scaled[:, 0])
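The two messages come from different libraries: the scaler wants 2-D input (samples by features) and returns a 2-D array of shape (n, 1), while pd.Series only accepts 1-D data. As a rough sketch of one more way to satisfy both, assuming the same example values as above, you can select the column with double brackets (which gives a one-column DataFrame, so no reshape is needed) and flatten the result before wrapping it. The index=df.index argument is my own addition, not something the original code used; it is there so the values stay aligned if the dataframe does not have a default 0..n-1 index.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({'Activity': [0.00568953, 0.00634314, 0.00718003,
                                0.01976002, 0.00575024, 0.00183782]})

# Double brackets select a one-column DataFrame, which is already 2-D,
# so the scaler accepts it without any reshape.
scaler = MinMaxScaler()
x_scaled = scaler.fit_transform(df[['Activity']])
scaled_shape = x_scaled.shape  # (n_samples, 1): the output is still 2-D

# pd.Series needs 1-D data, so flatten the single column first.
# index=df.index (an assumption for illustration) keeps rows aligned.
df['Activity'] = pd.Series(x_scaled.ravel(), index=df.index)
```

Here x_scaled.ravel() plays the same role as x_scaled[:, 0] in the line above; either one turns the (n, 1) result into the 1-D array that pd.Series expects.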