
Random Forest for multi-label classification

I am building an application for multilabel text classification. I've tried several machine learning algorithms.

Without a doubt, the SVM with a linear kernel gets the best results.

I have also tried the Random Forest algorithm, and the results I obtained were very poor: both recall and precision are very low.

The fact that the linear kernel gives the best results suggests that the different categories are linearly separable.

Is there any reason the Random Forest results are so low?
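The setup described above might look something like the following sketch. This assumes scikit-learn; the documents, labels, and the one-vs-rest wrapping are hypothetical stand-ins for the asker's actual pipeline.

```python
# Hypothetical sketch of multilabel text classification with a linear SVM
# (scikit-learn assumed; data and label names are made up for illustration).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

docs = [
    "the team won the football match",
    "new python release improves performance",
    "stock markets fell sharply today",
    "the striker scored in the final minute",
    "faster compilers and new language features",
    "investors worry about rising interest rates",
]
labels = [
    ["sports"], ["tech"], ["finance"],
    ["sports"], ["tech"], ["finance"],
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)            # binary indicator matrix, one column per label
X = TfidfVectorizer().fit_transform(docs)

# One binary linear SVM per label (one-vs-rest handles the multilabel case).
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, Y)
pred = mlb.inverse_transform(clf.predict(X))
```

A `RandomForestClassifier` could be dropped in place of the `OneVsRestClassifier(LinearSVC())` estimator for comparison, since it supports multilabel indicator targets directly.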

Random forest ensembles perform well across many domains and types of data. They are excellent at reducing error from variance and don't overfit if the trees are kept simple enough.

I would expect a forest to perform comparably to an SVM with a linear kernel.

The SVM will tend to overfit more, because it does not benefit from being an ensemble.

If you are not using cross-validation of some kind, or at minimum measuring performance on unseen data with a train/test split, then I could see you obtaining this type of result.

Go back and make sure performance is measured on unseen data; you will likely see the RF performing more comparably.
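A minimal sketch of what "measure on unseen data" means in practice, assuming scikit-learn and a synthetic multilabel dataset in place of the asker's text features:

```python
# Hold out a test split, or use k-fold cross-validation, so the score
# reflects generalization rather than memorization (scikit-learn assumed).
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic multilabel data standing in for the real TF-IDF features.
X, Y = make_multilabel_classification(n_samples=300, n_features=20,
                                      n_labels=2, random_state=0)

# Option 1: a single held-out test set.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, Y_train)
test_f1 = f1_score(Y_test, rf.predict(X_test), average="micro")

# Option 2: k-fold cross-validation over the whole set.
cv_scores = cross_val_score(RandomForestClassifier(random_state=0),
                            X, Y, cv=5, scoring="f1_micro")
print(test_f1, cv_scores.mean())
```

Micro-averaged F1 is one reasonable summary of recall and precision for multilabel output; the same evaluation applied to both the SVM and the forest makes the comparison fair.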

Good luck.

It is very hard to answer this question without looking at the data in question.

SVMs do have a history of working better on text classification, but machine learning is by definition context dependent.

Consider the parameters with which you are running the random forest algorithm. How many trees are you using, and how deep are they? Are you pruning branches? Are you searching a larger parameter space for the SVM, and therefore more likely to find a better optimum?
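Those parameters can be searched the same way one would tune an SVM. A hedged sketch, assuming scikit-learn and synthetic data; the grid values are illustrative, not recommendations:

```python
# Hypothetical sketch: grid search over the random forest's main knobs
# (scikit-learn assumed; parameter values chosen only for illustration).
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, Y = make_multilabel_classification(n_samples=200, n_features=20,
                                      random_state=0)

param_grid = {
    "n_estimators": [50, 200],      # more trees: lower variance
    "max_depth": [5, None],         # depth limits act like pruning
    "min_samples_leaf": [1, 5],     # larger leaves: simpler trees
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="f1_micro")
search.fit(X, Y)
print(search.best_params_, round(search.best_score_, 3))
```

If the SVM was tuned over a wide grid while the forest ran on defaults, an equally thorough search for the forest is the fairer comparison.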
