
Input data scaling issue in Deep Learning

I am trying to build a project for cancer detection from CSV data. I have split the cancer data into two files, X_data.csv and Y_data.csv. Please see the code below if you are interested in helping me solve the problem.

Import all the needed libraries and submodules:

import tensorflow as tf

import keras.backend as K
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical
import keras

import numpy as np

from keras.layers import BatchNormalization
from keras.layers import Dropout
from keras import regularizers

import pandas as pd

import sklearn
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format='retina'

Import the input (x) and output (y) data, and assign them to df1 and df2:

df1 = pd.read_csv('X_data.csv')
df2 = pd.read_csv('Y_data.csv')

Scale the input data:

df1 = preprocessing.scale(df1)    # I faced the error here

The scaling error is given below:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-aec70d746687> in <module>
      1 # Scale input data
      2 
----> 3 df1 = preprocessing.scale(df1)

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     71                           FutureWarning)
     72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73         return f(**kwargs)
     74     return inner_f
     75 

~/anaconda3/lib/python3.8/site-packages/sklearn/preprocessing/_data.py in scale(X, axis, with_mean, with_std, copy)
    139 
    140     """  # noqa
--> 141     X = check_array(X, accept_sparse='csc', copy=copy, ensure_2d=False,
    142                     estimator='the scale function', dtype=FLOAT_DTYPES,
    143                     force_all_finite='allow-nan')

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     71                           FutureWarning)
     72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73         return f(**kwargs)
     74     return inner_f
     75 

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    597                     array = array.astype(dtype, casting="unsafe", copy=False)
    598                 else:
--> 599                     array = np.asarray(array, order=order, dtype=dtype)
    600             except ComplexWarning:
    601                 raise ValueError("Complex data not supported\n"

~/anaconda3/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: could not convert string to float: 'discrete'

The last line states the problem directly: ValueError: could not convert string to float: 'discrete'. If you print your data ( df1.head() ), you'll see that it contains string values, as the error suggests, which the preprocessing function can't handle.
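Besides df1.head(), checking the column dtypes is a quick way to locate the offending columns. The sketch below uses a small made-up DataFrame as a stand-in for X_data.csv; the column names (age, tumor_size, feature_type) are assumptions for illustration only, since the question does not show the real ones.

```python
import pandas as pd

# Hypothetical stand-in for X_data.csv; the real columns are not shown in the question.
df1 = pd.DataFrame({
    "age": [45, 61, 38],
    "tumor_size": [1.2, 3.4, 2.1],
    "feature_type": ["discrete", "continuous", "discrete"],
})

# Numeric columns show up as int64/float64; string columns show up as a non-numeric dtype.
print(df1.dtypes)

# Collect every column that is not numeric; these are the ones scale() cannot convert.
non_numeric_cols = df1.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

Any column listed in non_numeric_cols has to be converted or dropped before calling preprocessing.scale.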

So you must clean the data first (convert strings to int/float, handle any missing data, etc.). Look into something like LabelEncoder() from sklearn, or a one-hot encoder, to take care of the string-to-int conversion.
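As a rough sketch of that cleaning step, the example below one-hot encodes the string columns with pandas.get_dummies and then scales the result. The DataFrame and its column names are hypothetical stand-ins for X_data.csv, and one-hot encoding is just one of the options mentioned above; whether it suits the real data depends on what the 'discrete' values actually represent.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical stand-in for X_data.csv; the real columns are not shown in the question.
df1 = pd.DataFrame({
    "age": [45, 61, 38],
    "tumor_size": [1.2, 3.4, 2.1],
    "feature_type": ["discrete", "continuous", "discrete"],
})

# Find the string (non-numeric) columns that preprocessing.scale cannot handle.
non_numeric_cols = df1.select_dtypes(exclude="number").columns.tolist()

# One-hot encode them: each distinct string value becomes its own 0/1 column.
df1_encoded = pd.get_dummies(df1, columns=non_numeric_cols, dtype=float)

# Every column is numeric now, so scaling no longer raises ValueError.
X_scaled = preprocessing.scale(df1_encoded)
print(df1_encoded.columns.tolist())
print(X_scaled.shape)
```

If a string such as 'discrete' turns out to be stray metadata inside an otherwise numeric column rather than a genuine category, dropping or correcting those rows may be more appropriate than encoding them.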


 