
How to output Classification Report of Sklearn into a csv file?

Does anyone know if there is any way to output the classification report as a text file or CSV file?

This line, print(metrics.classification_report(y_test, y_pred)), in Python gives me the classification report. I want to have this report in CSV format.

I tried to copy and paste, but the columns would be lumped together! Any help appreciated!

The function has a parameter which solves this exact problem: output_dict=True makes it return a dictionary instead of a formatted string.

import pandas as pd
from sklearn.metrics import classification_report

report_dict = classification_report(y_true, y_pred, output_dict=True)
pd.DataFrame(report_dict)

After converting the dictionary into a dataframe, you can write it to a CSV file, plot it, or run any other operation on it.
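As a minimal sketch of the write-to-CSV step: the label lists and the file name report_from_dict.csv below are illustrative assumptions, not part of the answer above. The snippet writes the dataframe out and reads it back, which shows that the columns stay separate.

```python
import pandas as pd
from sklearn.metrics import classification_report

# Illustrative labels; replace with your own y_true / y_pred.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# output_dict=True returns a nested dict instead of a formatted string.
report_dict = classification_report(y_true, y_pred, output_dict=True)

# Columns are the classes and averages; rows are the metrics.
report_df = pd.DataFrame(report_dict)

# Hypothetical output path; the index (metric names) becomes the first column.
report_df.to_csv("report_from_dict.csv")

# Read the file back to confirm the columns are not lumped together.
round_trip = pd.read_csv("report_from_dict.csv", index_col=0)
print(round_trip)
```

In this layout each class is a column. If you would rather have one row per class, the dataframe can be transposed before writing.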

It is possible, but you need to create a function.

Let's say that I want to write the report to my report.csv file (pandas creates the file if it does not already exist, and overwrites it if it does).

Full example:

from sklearn.metrics import classification_report
import pandas as pd

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

def classification_report_csv(report):
    report_data = []
    lines = report.split('\n')
    # lines[0] is the header and lines[1] is blank; the per-class rows follow
    # and end at the next blank line (before the accuracy / average rows)
    for line in lines[2:]:
        if not line.strip():
            break
        # split from the right so class names containing spaces stay intact
        row_data = line.rsplit(maxsplit=4)
        row = {}
        row['class'] = row_data[0].strip()
        row['precision'] = float(row_data[1])
        row['recall'] = float(row_data[2])
        row['f1_score'] = float(row_data[3])
        row['support'] = float(row_data[4])
        report_data.append(row)
    dataframe = pd.DataFrame.from_dict(report_data)
    dataframe.to_csv('report.csv', index = False)

# call classification_report first and then our new function

report = classification_report(y_true, y_pred, target_names=target_names)
classification_report_csv(report)

Hope this helps. Open the csv file to see the result.

[Screenshot of the resulting report.csv]

I found Rabeez Riaz's solution much easier. I would like to add that you can transpose the dataframe built from report_dict, so that each class becomes a row:

df = pandas.DataFrame(report_dict).transpose()

From here on, you are free to use the standard pandas methods to generate your desired output formats (CSV, HTML, LaTeX, ...).
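A short sketch of that workflow, under a few assumptions: the labels, target_names and the file name report_transposed.csv are made up for illustration. Setting the scalar "accuracy" entry aside is an optional extra step, not something the answer requires; without it, the single accuracy value is repeated across every metric column.

```python
import pandas as pd
from sklearn.metrics import classification_report

# Illustrative data; replace with your own labels.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ["class 0", "class 1", "class 2"]

report_dict = classification_report(
    y_true, y_pred, target_names=target_names, output_dict=True
)

# "accuracy" is a single number rather than a per-metric dict, so it is set
# aside here; pop(..., None) keeps this working if the key is absent.
accuracy = report_dict.pop("accuracy", None)

# After transposing, each row is a class (or average) and each column a metric.
df = pd.DataFrame(report_dict).transpose()

# Hypothetical file name; index_label names the first CSV column.
df.to_csv("report_transposed.csv", index_label="class")

# The same frame rendered as an HTML table.
html_table = df.to_html()
print(df)
```

df.to_latex() follows the same pattern, though recent pandas versions may need the optional jinja2 dependency for it.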

Source link: https://intellipaat.com/community/15701/scikit-learn-output-metrics-classificationreport-into-csv-tab-delimited-format

In addition to sera's answer, I find the following approach helpful. It uses precision_recall_fscore_support, so there is no need to parse the classification report string:

from sklearn.metrics import precision_recall_fscore_support
from sklearn.utils.multiclass import unique_labels


def classification_report_to_csv_pandas_way(ground_truth,
                                            predictions,
                                            full_path="test_pandas.csv"):
    """
    Saves the classification report to csv using the pandas module.
    :param ground_truth: list: the true labels
    :param predictions: list: the predicted labels
    :param full_path: string: the path to the file.csv where results will be saved
    :return: None
    """
    import pandas as pd

    # get unique labels / classes
    # - assuming all labels are in the sample at least once
    labels = unique_labels(ground_truth, predictions)

    # get results
    precision, recall, f_score, support = precision_recall_fscore_support(ground_truth,
                                                                          predictions,
                                                                          labels=labels,
                                                                          average=None)
    # a pandas way:
    results_pd = pd.DataFrame({"class": labels,
                               "precision": precision,
                               "recall": recall,
                               "f_score": f_score,
                               "support": support
                               })

    results_pd.to_csv(full_path, index=False)


def classification_report_to_csv(ground_truth,
                                 predictions,
                                 full_path="test_simple.csv"):
    """
    Saves the classification report to csv.
    :param ground_truth: list: the true labels
    :param predictions: list: the predicted labels
    :param full_path: string: the path to the file.csv where results will be saved
    :return: None
    """
    # get unique labels / classes
    # - assuming all labels are in the sample at least once
    labels = unique_labels(ground_truth, predictions)

    # get results
    precision, recall, f_score, support = precision_recall_fscore_support(ground_truth,
                                                                          predictions,
                                                                          labels=labels,
                                                                          average=None)

    # or a non-pandas way:
    with open(full_path, "w") as fp:
        fp.write("class,precision,recall,f_score,support\n")
        for row in zip(labels, precision, recall, f_score, support):
            fp.write(",".join(str(value) for value in row) + "\n")

if __name__ == '__main__':
    # dummy data
    ground_truth = [1, 1, 4, 1, 3, 1, 4]
    prediction = [1, 1, 3, 4, 3, 1, 1]

    # test
    classification_report_to_csv(ground_truth, prediction)
    classification_report_to_csv_pandas_way(ground_truth, prediction)

Output in either case:

class,precision,recall,f_score,support
1,0.75,0.75,0.75,4
3,0.5,1.0,0.6666666666666666,1
4,0.0,0.0,0.0,2
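The non-pandas route can also be sketched with the standard csv module, which takes care of separators, line endings and quoting (useful if a class label ever contains a comma). This is a variation on the answer, not its original code, and the file name test_csv_module.csv is a made-up example.

```python
import csv
from sklearn.metrics import precision_recall_fscore_support
from sklearn.utils.multiclass import unique_labels

# Same dummy data as in the answer above.
ground_truth = [1, 1, 4, 1, 3, 1, 4]
predictions = [1, 1, 3, 4, 3, 1, 1]

labels = unique_labels(ground_truth, predictions)
precision, recall, f_score, support = precision_recall_fscore_support(
    ground_truth, predictions, labels=labels, average=None
)

# newline="" is what the csv module documentation recommends for output files.
with open("test_csv_module.csv", "w", newline="") as fp:
    writer = csv.writer(fp)
    writer.writerow(["class", "precision", "recall", "f_score", "support"])
    # .tolist() converts NumPy scalars to plain Python numbers before writing.
    writer.writerows(
        zip(labels.tolist(), precision.tolist(), recall.tolist(),
            f_score.tolist(), support.tolist())
    )

# Read the file back to inspect what was written.
with open("test_csv_module.csv", newline="") as fp:
    rows = list(csv.reader(fp))
print(rows)
```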

To get a csv similar to the output of classification_report, you can use this:

    report_dict = classification_report(targcol, predcol, output_dict=True)
    repdf = pd.DataFrame(report_dict).round(2).transpose()
    # the index already holds the class names followed by "accuracy",
    # "macro avg" and "weighted avg", so write it out as the "class" column
    repdf.to_csv("results.csv", index_label="class")

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original source.