Hi, I am using a random forest to build a model and I am trying to deal with null values. Does anyone know how to force the random forest to treat null values as their own separate band? That is, null values should never be banded up with other value ranges, so that in a decision tree the null values of a measure always have their own branch.
I don't want to replace nulls with the mean, because the model would then band the null values together with other values close to the mean, and I don't want to remove the nulls either.
I want the decision tree to always treat the null values of a measure as their own branch.
Thanks:)
You could try either of these two approaches.
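Example 1: replace nulls with a sentinel value
Let 'feature' be the name of a column with only positive values. A negative value then lies outside the range of the real data, so it can stand in for null, and a tree can separate the nulls from every real value with a single split.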
dataframe.loc[dataframe['feature'].isna(), 'feature'] = -100
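The fixed -100 above relies on every real value being positive. If that is not known in advance, one option is to derive the sentinel from the data instead. The following is only a rough sketch of that idea: the helper name `fill_with_sentinel` and its `margin` parameter are made up for illustration, and it assumes the column is numeric and has at least one non-null value.

```python
import numpy as np
import pandas as pd


def fill_with_sentinel(df, column, margin=1.0):
    """Return (copy of df, sentinel) with nulls in `column` replaced by
    a value that lies below every observed value in that column.

    Hypothetical helper; assumes a numeric column with at least one
    non-null entry.
    """
    out = df.copy()
    observed_min = out[column].min()   # min() skips NaN by default
    sentinel = observed_min - margin   # strictly below all real values
    out[column] = out[column].fillna(sentinel)
    return out, sentinel


# Small made-up frame to show the effect
dataframe = pd.DataFrame({"feature": [3.0, np.nan, 7.5, 1.5, np.nan]})
filled, sentinel = fill_with_sentinel(dataframe, "feature")
print(filled)
print("sentinel used:", sentinel)
```

Because the sentinel sits below every real value, a split on `feature` with a threshold between the sentinel and the smallest real value puts all the former nulls on one side. The trees are still free to choose other splits, so this makes a separate branch possible rather than guaranteed.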
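Example 2: add a null-indicator column
Let 'feature' be the name of a column with null values. Add a second column that is 1 where 'feature' is null and 0 otherwise, so the model can split on nullness directly.
dataframe['feature_isnull'] = 0  # null-tracking column
dataframe.loc[dataframe['feature'].isna(), 'feature_isnull'] = 1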
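The indicator column on its own leaves the NaN values in 'feature', and some random forest implementations refuse NaN inputs. A possible way to combine the two ideas is sketched below, assuming a numeric column with positive real values; the sample data and the `FILL_VALUE` constant are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data with two missing entries
dataframe = pd.DataFrame({"feature": [4.0, np.nan, 9.0, np.nan, 2.0]})

# 1 where the value was missing, 0 otherwise
dataframe["feature_isnull"] = dataframe["feature"].isna().astype(int)

# Hypothetical fill value; it only has to be a number outside the real
# range, since the indicator column already records which rows were null.
FILL_VALUE = -100
dataframe["feature"] = dataframe["feature"].fillna(FILL_VALUE)

# NaN-free feature matrix that can be handed to a random forest
X = dataframe[["feature", "feature_isnull"]].to_numpy()
print(dataframe)
```

A split on `feature_isnull` sends every former null down its own branch regardless of the fill value, which is the closest of the two approaches to the behaviour asked for in the question.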