
Decision Trees combined with Logistic Regression

Basically, my question relates to the following paper (it is enough to read only Section 1 Introduction, the beginning of Section 3 Prediction model structure, and Section 3.1 Decision tree feature transforms; everything else can be skipped):

https://pdfs.semanticscholar.org/daf9/ed5dc6c6bad5367d7fd8561527da30e9b8dd.pdf

This paper suggests that binary classification can perform better with a combination of decision trees and a linear classifier (e.g. logistic regression) than with ONLY decision trees or ONLY a linear classifier (not both).

Simply speaking, the trick is that we have several decision trees (assume 2 trees for simplicity, the 1st tree with 3 leaf nodes and the 2nd tree with 2 leaf nodes) and some real-valued feature vector x which is fed as input to all of the trees.

So,
- if the first tree's decision is leaf node 1 and the second tree's decision is leaf node 2, then the linear classifier will receive the binary string [ 1 0 0 0 1 ]
- if the first tree's decision is leaf node 2 and the second tree's decision is leaf node 1, then the linear classifier will receive the binary string [ 0 1 0 1 0 ]

and so on
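
Here is a tiny sketch of this encoding in Python, using the toy leaf counts from the example above (leaf node 1 corresponds to index 0, and so on); the helper `encode_leaves` is purely illustrative:

```python
# Toy illustration of the leaf-to-binary encoding described above.
# Tree 1 has 3 leaves, tree 2 has 2 leaves; leaf indices are 0-based here.

def encode_leaves(leaf_indices, leaves_per_tree):
    """One-hot encode one leaf index per tree into a single binary vector."""
    vec = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        one_hot = [0] * n_leaves
        one_hot[leaf] = 1
        vec.extend(one_hot)
    return vec

# Tree 1 lands in leaf node 1, tree 2 lands in leaf node 2 -> [1, 0, 0, 0, 1]
print(encode_leaves([0, 1], [3, 2]))
# Tree 1 lands in leaf node 2, tree 2 lands in leaf node 1 -> [0, 1, 0, 1, 0]
print(encode_leaves([1, 0], [3, 2]))
```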

If we used only decision trees (without the linear classifier), each tree would clearly output its own class: 100 / 010 / 001 for the 1st tree and 10 / 01 for the 2nd tree. But in this scheme the outputs of the trees are combined into a binary string which is fed to the linear classifier. So it is not clear how to train these decision trees. What we have is the aforementioned vector x and the click/no-click label, which is the output of the linear classifier, not of the trees.

Any ideas?

In my view, you need to train boosted decision trees by minimizing the log-loss criterion (binary classification). Once you have trained your trees (assume you have 2 trees with 3 and 2 leaves), you then predict, for each instance, the leaf index of each tree.
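
A minimal sketch of this step, assuming scikit-learn's GradientBoostingClassifier (the paper uses its own boosted-tree implementation; the dataset X, y below is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for the real feature vectors x and click/no-click labels.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Boosted trees; the default criterion is the binomial log-loss,
# i.e. the binary-classification objective mentioned above.
gbt = GradientBoostingClassifier(n_estimators=2, max_depth=2, random_state=0)
gbt.fit(X, y)

# For each instance, the index of the leaf it falls into in every tree.
# Shape: (n_samples, n_estimators); these indices are what get one-hot encoded.
leaf_indices = gbt.apply(X)[:, :, 0]
print(leaf_indices[:3])
```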

Example: if for an instance you get leaf 1 for the first tree and leaf 2 for the second tree, i.e. you get the vector (1, 0, 0, 0, 1) (it is a binary vector, not a string), then you have two strategies:

  1. You train a linear classifier (e.g. logistic regression) on the result of your trees' predictions; your dataset has dimension N*5, where N is the number of instances. You train a logistic regression on binary data.

  2. You concatenate your 5-dimensional vector with your initial feature vector and train a linear classifier on the result. You train logistic regression on both real-valued and binary data. (A sketch of both strategies follows this list.)
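
A sketch of both strategies, continuing from the fitted `gbt` and `leaf_indices` above; OneHotEncoder and LogisticRegression are scikit-learn's, and this is an outline under those assumptions rather than the paper's exact pipeline:

```python
from scipy.sparse import hstack
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# One-hot encode the per-tree leaf indices -> the binary vectors described above.
encoder = OneHotEncoder(handle_unknown="ignore")
leaf_features = encoder.fit_transform(leaf_indices)  # shape (N, total number of leaves)

# Strategy 1: logistic regression on the binary leaf features only.
lr_leaves = LogisticRegression(max_iter=1000).fit(leaf_features, y)

# Strategy 2: concatenate the binary leaf features with the original
# real-valued features and train logistic regression on both.
combined = hstack([leaf_features, X])
lr_combined = LogisticRegression(max_iter=1000).fit(combined, y)

print(lr_leaves.score(leaf_features, y), lr_combined.score(combined, y))
```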
