
Class imbalance issue in multi-class classification

I need multi-class classifier code that can handle the following class imbalance problem:

  • class 1 --> 80%
  • class 2 --> 7.5%
  • class 3 --> 6%
  • class 4 --> 4%
  • class 5 --> 2.5%

There are only 130 instances in total, and each instance has about 5000 features.

I found some multi-class SVM code, but I do not think it accounts for the class imbalance. I also need to do some kind of k-fold cross-validation.

Python or MATLAB code would really help.

I believe most people who want to use an SVM within MATLAB use libSVM, which has a MATLAB interface. It handles multiclass problems. 5000 features and 130 instances should be fine.

I'm not sure whether you want to treat your class imbalance using class weights/priors or using cost-sensitive learning, but you can achieve either with a little extra work; see here and here for some ideas.
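As a rough illustration of the class-weight route, here is a minimal Python sketch that derives one weight per class, inversely proportional to class frequency. The per-class counts (104, 10, 8, 5, 3) are my own rounding of the percentages in the question, and `balanced_class_weights` is a hypothetical helper name, not part of libSVM. The formula n_samples / (n_classes * class_count) is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)}.

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 130 instances split roughly as in the question.
# The counts 104/10/8/5/3 are rounded assumptions, not real data.
labels = [1] * 104 + [2] * 10 + [3] * 8 + [4] * 5 + [5] * 3

weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(f"class {cls}: weight {weights[cls]:.2f}")
```

With libSVM, weights like these would typically be passed through the `-wi` training option (for example `-w1 0.25 -w5 8.67`), which scales the C parameter for class i. Treat the values as a starting point to tune, not as the right answer.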

k-fold cross-validation can be done in MATLAB using cvpartition from Statistics Toolbox (and it is pretty straightforward to code yourself even if you don't have Statistics Toolbox).
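To show how little code a hand-rolled version needs, here is a minimal Python sketch of a stratified k-fold split using only the standard library. Stratifying (spreading each class evenly over the folds) is my own addition, motivated by the imbalance in the question. `stratified_kfold` is a hypothetical helper, and the label counts are again rounded assumptions.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Return k (train_idx, test_idx) pairs with each class spread over the folds."""
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    next_fold = 0  # carried across classes so fold sizes stay balanced
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class's samples round-robin into the folds.
        for idx in indices:
            folds[next_fold].append(idx)
            next_fold = (next_fold + 1) % k

    splits = []
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j in range(k) if j != i for idx in folds[j])
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical labels: rounded counts for the 130 instances in the question.
labels = [1] * 104 + [2] * 10 + [3] * 8 + [4] * 5 + [5] * 3

splits = stratified_kfold(labels, k=3)
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    test_classes = sorted({labels[i] for i in test_idx})
    print(f"fold {fold_no}: {len(train_idx)} train, {len(test_idx)} test, "
          f"test classes {test_classes}")
```

Note that with only about 3 instances in the smallest class, any k above 3 leaves some test folds without that class, so a small k is probably the practical choice here.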

