

Result verification with Weka Experiment tab with individual classifier models

Original post: 2021-11-01 05:53:13 | Tags: machine-learning, weka, mean-square-error, kappa

I ran different classifiers on the same dataset and got some statistical values after running them.

This is the summary of all classifiers:

I am using Weka to train the models. Weka itself has a way to compare different algorithms: the Experiment tab. I have used this option as well, on the same dataset.

Weka gave me this result for the Kappa statistic when using the Experiment tab:

Root mean squared error:

Relative absolute error:

and so on.

Now I am unable to understand how the values I got from the Experiment tab relate to the values that I shared in table format in the first picture.

1 Answer

I presume that the initial table was populated with statistics obtained from cross-validation runs in the Weka Explorer.

The Explorer aggregates the predictions across a single cross-validation run, so that it appears as if you had a single test set the size of the whole dataset. It is only meant to be used as an exploratory tool, hence the name.

The Experimenter records the metrics (such as accuracy, RMSE, etc.) generated from each train/test fold pair across the number of runs that you perform during your experiment. The metrics collected across multiple classifiers and/or datasets can then be analyzed using significance tests. By default, 10 runs of 10-fold CV are used, which is recommended for such comparisons. This results in 100 individual values for each metric, from which the mean and standard deviation are computed. */v indicate whether there is a statistically significant loss/win.
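
To make the difference between the two sets of numbers concrete, here is a minimal Python sketch. It does not use Weka. It assumes a toy numeric target and a trivial predictor that always outputs the training-set mean. The names `cv_errors` and `rmse` are hypothetical helpers made up for this illustration. The first figure pools all held-out predictions from one 10-fold run, as the Explorer does. The second computes one RMSE per fold over 10 runs of 10-fold CV and then takes the mean and standard deviation of those 100 values, as the Experimenter does.

```python
import math
import random
import statistics


def rmse(errors):
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def cv_errors(targets, k, rng):
    """Run one shuffled k-fold CV with a mean predictor (an assumption made
    for this sketch) and return one list of held-out errors per fold."""
    idx = list(range(len(targets)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    per_fold = []
    for test in folds:
        held_out = set(test)
        train = [targets[i] for i in idx if i not in held_out]
        prediction = statistics.fmean(train)  # predict the training mean
        per_fold.append([targets[i] - prediction for i in test])
    return per_fold


rng = random.Random(1)
targets = [rng.gauss(0.0, 1.0) for _ in range(100)]  # toy data, not from the question

# Explorer-style: one CV run, all held-out errors pooled into one "test set".
single_run = cv_errors(targets, k=10, rng=rng)
pooled_rmse = rmse([e for fold in single_run for e in fold])

# Experimenter-style: 10 runs x 10 folds, one RMSE per fold -> 100 values.
fold_rmses = []
for _ in range(10):
    fold_rmses.extend(rmse(fold) for fold in cv_errors(targets, k=10, rng=rng))

mean_rmse = statistics.fmean(fold_rmses)
std_rmse = statistics.stdev(fold_rmses)

print(f"pooled RMSE (1 x 10-fold CV): {pooled_rmse:.4f}")
print(f"mean +/- std over {len(fold_rmses)} folds: {mean_rmse:.4f} +/- {std_rmse:.4f}")
```

The two figures are usually close but not identical, because an average of per-fold values is not the same quantity as a single value computed over pooled predictions. That would explain why the Experiment tab output resembles, without exactly matching, a table built from Explorer runs.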