
Python Scikit Decision Tree with variable number of outputs

I'm looking to set up a multi-output decision tree using scikit-learn in Python. The problem I'm facing, however, is that it's not a simple fixed "n_outputs" classification: some samples will have 3 outputs, some 4, some 5. I'm not sure what the best way is to convey this to the library.

I'm considering using the maximum number of outputs and adding a "no output" class. So if I train on a set where every sample is coerced to 5 outputs, any sample that originally has only 3 classifications would be padded to 5 with that "no output" class.

Do you think that would work? Any other ways to do a multi-output decision tree with variable number of outputs?

It sounds like you are trying to do multi-label classification, not multi-output classification. Multi-label is most easily done by providing an indicator matrix: a binary array of shape (n_samples, n_classes) in which entry (i, j) is 1 if sample i belongs to class j and 0 otherwise. A sample with 3 labels then simply has three 1s in its row, so no "no output" padding is needed.

Have a look at the multi-label documentation and see if that fits your use-case.
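As a rough sketch of the indicator-matrix approach, assuming your labels arrive as one variable-length list per sample: `MultiLabelBinarizer` builds the (n_samples, n_classes) binary array, and `DecisionTreeClassifier` accepts it directly as `y`. The feature values and label names below are made up purely for illustration.

```python
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: four samples with 3, 4, 5 and 3 labels respectively.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [
    ["a", "b", "c"],
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d", "e"],
    ["b", "d", "e"],
]

# Turn the variable-length label lists into a binary indicator matrix:
# one column per distinct label, 1 where the sample carries that label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # shape (n_samples, n_classes)

# A decision tree fits a 2-D y as one binary target per column.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, Y)

# Map the predicted indicator rows back to label tuples.
predicted = mlb.inverse_transform(clf.predict(X))
print(mlb.classes_)
print(predicted)
```

Each predicted row can contain any number of 1s, so the number of labels per sample is free to vary at prediction time as well.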
