
Does scikit-learn's DecisionTreeRegressor do true multi-output regression?

I have run into an ML problem that requires us to use a multi-dimensional Y. Right now we are training independent models on each dimension of this output, which does not take advantage of the fact that the outputs are correlated.
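For concreteness, the "independent models per dimension" setup might look like the following sketch (the data, `Ridge` estimator, and shapes are my own illustrative choices, not from the question):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy data: 100 samples, 5 features, 3 correlated output dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = X @ rng.normal(size=(5, 3)) + 0.1 * rng.normal(size=(100, 3))

# One independent model per output dimension -- this ignores any
# correlation between the columns of Y
models = [Ridge().fit(X, Y[:, j]) for j in range(Y.shape[1])]
pred = np.column_stack([m.predict(X) for m in models])
print(pred.shape)  # (100, 3)
```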

I have been reading this to learn more about the few ML algorithms which have been truly extended to handle multidimensional outputs. Decision Trees are one of them.

Does scikit-learn use "multi-target regression trees" when fit(X, Y) is given a multi-dimensional Y, or does it fit a separate tree for each dimension? I spent some time looking at the code but couldn't figure it out.

After more digging, I found that the only difference between a tree fit on points with single-dimensional labels and one fit on points with multi-dimensional labels is the Criterion object it uses to decide splits. A Criterion can handle multi-dimensional labels, so fitting a DecisionTreeRegressor yields a single regression tree regardless of the dimension of Y.

This implies that, yes, scikit-learn does use true multi-target regression trees, which can leverage correlated outputs to positive effect.
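This behavior is easy to check empirically. The sketch below (toy data of my own construction) fits one DecisionTreeRegressor on a two-column Y and inspects the fitted tree: predictions come back with both outputs at once, and the internal `tree_.value` array stores one value per output dimension at every node, confirming a single shared tree structure:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Two strongly correlated targets driven by the same features
Y = np.column_stack([X[:, 0] + X[:, 1], 2.0 * (X[:, 0] + X[:, 1])])

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, Y)

# A single fitted tree predicts both targets simultaneously...
print(tree.predict(X[:2]).shape)        # (2, 2)
# ...and each node stores one value per output dimension
print(tree.tree_.value.shape)           # (n_nodes, 2, 1)
```

Because every split is chosen to reduce impurity averaged over all output dimensions, correlated outputs share the same partition of the feature space rather than each getting an independently grown tree.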
