Input contains NaN, infinity or a value too large for dtype('float64') when I scale my data
I am trying to normalize my data like this:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
trainX = scaler.fit_transform(X_data_train)
and I get this error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
X_data_train is a pandas DataFrame of size (95538, 550). What is really odd is that when I write
print (X_data_train.min().min())
it gives -5482.4473, and similarly for the max I get 28738212.0, which do not seem to me to be extremely high values.
Moreover, based on the command given by the 54+ voted answer, I checked that I have no NaN or Infinity for sure. I also don't have blanks in my csv or anything like that, as I checked the dimensions.
So, where is the problem?
You can also check for NaNs and inf:
import numpy as np
import pandas as pd

df = pd.DataFrame({'B':[4,5,4,5,5,np.inf],
                   'C':[7,8,9,4,2,3],
                   'D':[np.nan,3,5,7,1,0],
                   'E':[5,3,6,9,2,4]})
print (df)
          B  C    D  E
0  4.000000  7  NaN  5
1  5.000000  8  3.0  3
2  4.000000  9  5.0  6
3  5.000000  4  7.0  9
4  5.000000  2  1.0  2
5       inf  3  0.0  4
nan = df[df.isnull().any(axis=1)]
print (nan)
     B  C   D  E
0  4.0  7 NaN  5
inf = df[df.eq(np.inf).any(axis=1)]
print (inf)
     B  C    D  E
5  inf  3  0.0  4
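Note that comparing with np.inf only matches positive infinity. A minimal sketch of a check that also catches negative infinity, assuming every column is numeric (the -np.inf placed in column C here is an illustrative change to the sample data, not part of the original example):

```python
import numpy as np
import pandas as pd

# Same sample frame as above, except C now ends with -inf (illustrative change)
df = pd.DataFrame({'B': [4, 5, 4, 5, 5, np.inf],
                   'C': [7, 8, 9, 4, 2, -np.inf],
                   'D': [np.nan, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4]})

# np.isinf flags both +inf and -inf; df.eq(np.inf) alone would miss column C
inf_mask = np.isinf(df.to_numpy(dtype=float))

inf_rows = df[inf_mask.any(axis=1)]        # rows holding at least one infinity
inf_cols = df.columns[inf_mask.any(axis=0)]  # columns holding at least one infinity

print(inf_rows)
print(list(inf_cols))
```

Converting with to_numpy(dtype=float) first keeps the check independent of how a given pandas version dispatches NumPy functions on a DataFrame.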
If you want to find the index labels of all rows with at least one NaN:
print (df.index[np.isnan(df).any(axis=1)])
Int64Index([0], dtype='int64')
And columns:
print (df.columns[np.isnan(df).any()])
Index(['D'], dtype='object')
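On a wide frame like the (95538, 550) one in the question, a per-column summary may be easier to read than printing rows. The following is only a sketch under the assumption that all columns are numeric; non_finite_report is a hypothetical helper name, and dropping the affected rows is just one possible cleanup, not necessarily the right one for your data:

```python
import numpy as np
import pandas as pd

def non_finite_report(frame):
    """Count NaN/+inf/-inf cells per column, keeping only columns that have any.

    Hypothetical helper for illustration; assumes all columns are numeric.
    """
    bad = ~np.isfinite(frame.to_numpy(dtype=float))
    counts = pd.Series(bad.sum(axis=0), index=frame.columns)
    return counts[counts > 0]

df = pd.DataFrame({'B': [4, 5, 4, 5, 5, np.inf],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [np.nan, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4]})

report = non_finite_report(df)
print(report)

# One possible cleanup before scaling: turn infinities into NaN,
# then drop every row that still contains a NaN
clean = df.replace([np.inf, -np.inf], np.nan).dropna()
print(clean.shape)
```

If the report comes back empty for your real data, the non-finite values are not in the frame you inspected, so it is worth confirming that the object passed to fit_transform is the same one you checked.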