
After normalising my data using DataPreparer while using random forest and SVM, why do my data values become negative?

I am working on predictive modeling where I need to predict whether an online customer ends up purchasing a product on a website or not, and I am using Random Forest Classifier and SVM since it's a classification problem.

After creating the fitting splits for the training, testing, and validation sets, I dummify, standardize, and normalize my data. However, after I normalize the sets, the values all become negative. Why does this happen, and is there a way to change it?

The code that I am using to normalize my fitting sets is as below:

data_preparer = DataPreparer(one_hot_encoder, standard_scaler)
data_preparer.prepare_data(fitting_splits.train_set).head()
data_preparer.prepare_data(fitting_splits.validation_set).head()

I think the documentation for sklearn.preprocessing.StandardScaler can help here:

The standard score of a sample x is calculated as:

z = (x - u) / s

where u is the mean of the training samples or zero if with_mean=False, and s is the standard deviation of the training samples or one if with_std=False.

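Based on this equation, if x (the individual value currently being scaled) is less than the mean of the variable, then your scaled value will be negative. With the default with_mean=True, the scaled training values of each column average to zero, so unless every value in a column is identical, some will be negative and some positive. Note that .head() shows only the first five rows, so seeing only negative values there does not mean the whole column is negative.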
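To make the formula concrete, here is a minimal pure-Python sketch of z = (x - u) / s. The DataPreparer class in the question is custom and not shown, so the helper names `standardize` and `min_max_scale` and the sample values are hypothetical and only illustrate the arithmetic. The second helper shows min-max rescaling, which is what sklearn.preprocessing.MinMaxScaler does with its default range of 0 to 1, in case non-negative features are really needed.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Apply z = (x - u) / s using the population standard deviation,
    which is the estimator StandardScaler uses."""
    u = fmean(values)
    s = pstdev(values)
    return [(x - u) / s for x in values]


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical feature column, used only for illustration.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]

z = standardize(prices)
print([round(v, 3) for v in z])   # [-1.414, -0.707, 0.0, 0.707, 1.414]
print(min_max_scale(prices))      # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Values below the mean of 3.0 come out negative and values above it come out positive, which is the expected result of standardization rather than a bug. Neither a random forest nor an SVM requires non-negative inputs, so the negative values can usually be left as they are.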

