
Decision Trees combined with Logistic Regression

Basically, my question relates to the following paper (it is enough to read only Section 1 Introduction, the beginning of Section 3 Prediction model structure, and Section 3.1 Decision tree feature transforms; everything else can be skipped):

https://pdfs.semanticscholar.org/daf9/ed5dc6c6bad5367d7fd8561527da30e9b8dd.pdf

This paper suggests that binary classification can perform better with a combination of decision trees and a linear classifier (e.g. logistic regression) than with ONLY decision trees or ONLY a linear classifier (not both).

Simply speaking, the trick is that we have several decision trees (assume 2 trees for simplicity, the 1st tree with 3 leaf nodes and the 2nd tree with 2 leaf nodes) and some real-valued feature vector x which is fed as input to all of the trees.

So,
- if the first tree's decision is leaf node 1 and the second tree's decision is leaf node 2, then the linear classifier will receive the binary string [ 1 0 0 0 1 ]
- if the first tree's decision is leaf node 2 and the second tree's decision is leaf node 1, then the linear classifier will receive the binary string [ 0 1 0 1 0 ]

and so on
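
Here is a tiny sketch of this encoding in Python, using the toy leaf counts from the example above (leaf node 1 corresponds to index 0, and so on); the helper `encode_leaves` is purely illustrative:

```python
# Toy illustration of the leaf-to-binary encoding described above.
# Tree 1 has 3 leaves, tree 2 has 2 leaves; leaf indices are 0-based here.

def encode_leaves(leaf_indices, leaves_per_tree):
    """One-hot encode one leaf index per tree into a single binary vector."""
    vec = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        one_hot = [0] * n_leaves
        one_hot[leaf] = 1
        vec.extend(one_hot)
    return vec

# Tree 1 lands in leaf node 1, tree 2 lands in leaf node 2 -> [1, 0, 0, 0, 1]
print(encode_leaves([0, 1], [3, 2]))
# Tree 1 lands in leaf node 2, tree 2 lands in leaf node 1 -> [0, 1, 0, 1, 0]
print(encode_leaves([1, 0], [3, 2]))
```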

If we used only decision trees (without the linear classifier), each tree would clearly output its own class: 100 / 010 / 001 for the 1st tree and 10 / 01 for the 2nd tree. But in this scheme the outputs of the trees are combined into a binary string which is fed to the linear classifier. So it is not clear how to train these decision trees. What we have is the aforementioned vector x and the click/no-click label, which is the output of the linear classifier, not of the trees.

Any ideas?

In my view, you need to train boosted decision trees by minimizing the log-loss criterion (binary classification). Once you have trained your trees (assume you have 2 trees with 3 and 2 leaves), you then predict, for each instance, the leaf index of each tree.
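
A minimal sketch of this step, assuming scikit-learn's GradientBoostingClassifier (the paper uses its own boosted-tree implementation; the dataset X, y below is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for the real feature vectors x and click/no-click labels.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Boosted trees; the default criterion is the binomial log-loss,
# i.e. the binary-classification objective mentioned above.
gbt = GradientBoostingClassifier(n_estimators=2, max_depth=2, random_state=0)
gbt.fit(X, y)

# For each instance, the index of the leaf it falls into in every tree.
# Shape: (n_samples, n_estimators); these indices are what get one-hot encoded.
leaf_indices = gbt.apply(X)[:, :, 0]
print(leaf_indices[:3])
```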

Example: if for an instance you get leaf 1 for the first tree and leaf 2 for the second tree, i.e. you get the vector (1, 0, 0, 0, 1) (it is a binary vector, not a string), then you have two strategies:

  1. You train a linear classifier (e.g. logistic regression) on the result of your trees' predictions; your dataset has dimension N*5, where N is the number of instances. You train a logistic regression on binary data.

  2. You concatenate your 5-dimensional vector with your initial feature vector and train a linear classifier on the result. You train logistic regression on both real-valued and binary data. (A sketch of both strategies follows this list.)
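
A sketch of both strategies, continuing from the fitted `gbt` and `leaf_indices` above; OneHotEncoder and LogisticRegression are scikit-learn's, and this is an outline under those assumptions rather than the paper's exact pipeline:

```python
from scipy.sparse import hstack
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# One-hot encode the per-tree leaf indices -> the binary vectors described above.
encoder = OneHotEncoder(handle_unknown="ignore")
leaf_features = encoder.fit_transform(leaf_indices)  # shape (N, total number of leaves)

# Strategy 1: logistic regression on the binary leaf features only.
lr_leaves = LogisticRegression(max_iter=1000).fit(leaf_features, y)

# Strategy 2: concatenate the binary leaf features with the original
# real-valued features and train logistic regression on both.
combined = hstack([leaf_features, X])
lr_combined = LogisticRegression(max_iter=1000).fit(combined, y)

print(lr_leaves.score(leaf_features, y), lr_combined.score(combined, y))
```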
