
Scaling data with large range in Machine learning preprocessing

I am very new to Machine Learning, and I am trying to apply ML to data containing nearly 50 features. Some features range from 0 to 1,000,000, while others range from 0 to 100 or even less. When I apply feature scaling with MinMaxScaler to the range (0, 1), I think the features with large ranges get scaled down to very small values, and this might prevent me from getting good predictions.

I would like to know if there is an efficient way to do scaling so that all the features are scaled appropriately.

I also tried StandardScaler, but accuracy did not improve. Also, can I use one scaling function for some features and a different one for the remaining features?

Thanks in advance!

Feature scaling, or data normalization, is an important part of training a machine learning model. It is generally recommended that the same scaling approach be used for all features. If the scales of different features are wildly different, this can have a knock-on effect on the model's ability to learn (depending on which methods you are using). By standardizing feature values, all features are implicitly weighted equally in their representation.
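As a minimal sketch of the point above, the snippet below (with made-up data mimicking the ranges in the question: one feature spanning 0 to 1,000,000, another spanning 0 to 100) shows that MinMaxScaler maps both features onto the same [0, 1] interval, so neither dominates purely by magnitude:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: column 0 spans 0..1,000,000, column 1 spans 0..100.
X = np.array([
    [1_000_000.0, 100.0],
    [  500_000.0,  50.0],
    [        0.0,   0.0],
])

scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X)
print(X_scaled)
# Both columns now span [0, 1]: [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
```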

Two common methods of normalization are:

  • Rescaling (also known as min-max normalization):

    x' = (x - min(x)) / (max(x) - min(x))

    where x is an original value, and x' is the normalized value. For example, suppose that we have the students' weight data, and the students' weights span [160 pounds, 200 pounds]. To rescale this data, we first subtract 160 from each student's weight and divide the result by 40 (the difference between the maximum and minimum weights).
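The weight example above can be worked through directly (the four weights here are illustrative values chosen to span the stated [160, 200] range):

```python
# Min-max rescaling: subtract the minimum (160) and divide by the
# range (200 - 160 = 40).
weights = [160.0, 170.0, 180.0, 200.0]
lo, hi = min(weights), max(weights)
rescaled = [(w - lo) / (hi - lo) for w in weights]
print(rescaled)  # [0.0, 0.25, 0.5, 1.0]
```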

  • Mean normalization

    x' = (x - mean(x)) / (max(x) - min(x))

    where x is an original value, and x' is the normalized value.
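Applying mean normalization to the same illustrative weights shows the difference from min-max rescaling: values are centered on the mean rather than mapped onto [0, 1].

```python
# Mean normalization: center on the mean, then divide by the range.
weights = [160.0, 170.0, 180.0, 200.0]
mean = sum(weights) / len(weights)   # 177.5
rng = max(weights) - min(weights)    # 40.0
normalized = [(w - mean) / rng for w in weights]
print(normalized)  # [-0.4375, -0.1875, 0.0625, 0.5625]
```

Note that the normalized values sum to zero, which is the defining property of centering on the mean.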
