I implemented a simple Naive Bayes classifier, following the example in Spark's tutorials almost exactly. Here is my implementation:
import org.apache.spark.ml.classification.NaiveBayes;
import org.apache.spark.ml.classification.NaiveBayesModel;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public void applyNaiveBayes(String fileWithBinaryLabelsPath) {
    // Load the data in LIBSVM format.
    Dataset<Row> dataFrame =
        sparkBase.getSpark().read().format("libsvm").load(fileWithBinaryLabelsPath);

    // Split the data into training (80%) and test (20%) sets, with a fixed seed.
    Dataset<Row>[] splits = dataFrame.randomSplit(new double[]{0.8, 0.2}, 1234L);
    Dataset<Row> train = splits[0];
    Dataset<Row> test = splits[1];

    // Train the model and generate predictions for the test set.
    NaiveBayes nb = new NaiveBayes();
    NaiveBayesModel model = nb.fit(train);
    Dataset<Row> predictions = model.transform(test);
    predictions.show();

    // Evaluate accuracy on the test set.
    MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
        .setLabelCol("label")
        .setPredictionCol("prediction")
        .setMetricName("accuracy");
    double accuracy = evaluator.evaluate(predictions);
    System.out.println("Test set accuracy = " + accuracy);
}
It works well, but I need one more thing. Here I use 20% of my data as test data. After the calculations I want to get the resulting data, i.e. what Naive Bayes predicted for every row. How can I do that in Java?
To save the predictions Dataset to a file, convert the Dataset into a JavaRDD and write it out with predictions.javaRDD().saveAsTextFile(<file path>);
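For example, a minimal sketch, assuming predictions is the Dataset produced by model.transform(test) in your method and that the output paths (placeholders here) are writable and do not already exist:

```java
// Write the whole predictions Dataset (one Row per test example) as text.
// Each output line contains the label, the feature vector, and the predicted class.
predictions.javaRDD().saveAsTextFile("/tmp/nb-predictions");

// Alternatively, keep only the columns of interest before saving,
// which gives one "[label,prediction]" line per test row:
predictions.select("label", "prediction")
    .javaRDD()
    .saveAsTextFile("/tmp/nb-label-vs-prediction");
```

Note that saveAsTextFile fails if the target directory already exists, and on a cluster it writes one part file per partition rather than a single file.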
The metrics supported by the MulticlassClassificationEvaluator are listed here:
https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/ml/evaluation/MulticlassClassificationEvaluator.html#metricName--
Since you're using a Naive Bayes model for binary classification, you should use the BinaryClassificationEvaluator instead:
https://spark.apache.org/docs/2.0.1/api/java/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.html
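A sketch of swapping in the binary evaluator, assuming the same predictions Dataset as above. Unlike the multiclass evaluator, it scores the rawPrediction (confidence) column rather than the prediction column, and it reports area under the ROC curve by default:

```java
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator;

// Evaluate the binary classifier by area under the ROC curve.
BinaryClassificationEvaluator binaryEvaluator = new BinaryClassificationEvaluator()
    .setLabelCol("label")
    .setRawPredictionCol("rawPrediction")
    .setMetricName("areaUnderROC"); // the other supported metric is "areaUnderPR"
double auc = binaryEvaluator.evaluate(predictions);
System.out.println("Test set AUC = " + auc);
```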