
Spark ClassCastException: JavaRDD cannot be cast to org.apache.spark.mllib.linalg.Vector

I want to implement an ARIMA time series model in Java. I have the following Scala code:

import com.cloudera.sparkts.models.ARIMA
import org.apache.spark.mllib.linalg.Vectors

object SingleSeriesARIMA {
  def main(args: Array[String]): Unit = {
    // The dataset is sampled from an ARIMA(1, 0, 1) model generated in R.
    val lines = scala.io.Source.fromFile("../data/R_ARIMA_DataSet1.csv").getLines()
    val ts = Vectors.dense(lines.map(_.toDouble).toArray)
    val arimaModel = ARIMA.fitModel(1, 0, 1, ts)
    println("coefficients: " + arimaModel.coefficients.mkString(","))
    val forecast = arimaModel.forecast(ts, 20)
    println("forecast of next 20 observations: " + forecast.toArray.mkString(","))
  }
}

I tried the following Java translation:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

import com.cloudera.sparkts.models.ARIMA;
import com.cloudera.sparkts.models.ARIMAModel;

public class JavaARIMA {

    public static void main(String[] args) {
        System.setProperty("hadoop.home.dir", "C:/winutils");
        SparkConf conf = new SparkConf()
                .setAppName("Spark-TS Ticker Example")
                .setMaster("local")
                .set("spark.sql.warehouse.dir",
                     "file:///C:/Users/devanshi/Downloads/Spark/sparkdemo/spark-warehouse/");
        JavaSparkContext context = new JavaSparkContext(conf);

        JavaRDD<String> lines = context.textFile("path/inputfile");

        JavaRDD<Vector> ts = lines.map(
            new Function<String, Vector>() {
                public Vector call(String s) {
                    String[] sarray = s.split(",");
                    double[] values = new double[sarray.length];
                    for (int i = 0; i < sarray.length; i++) {
                        values[i] = Double.parseDouble(sarray[i]);
                    }
                    return Vectors.dense(values);
                }
            }
        );

        double[] total = {1.0, 0.0, 1.0};
        //DenseVector dv = new DenseVector(total);
        //convert(dv, toBreeze());
        //ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, dv, true, "css-cgd", null);
        ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, (Vector) ts, false, "css-cgd", total);

        //arimaModel = ARIMA.fitModel(1, 0, 1, ts);
        System.out.println("coefficients: " + arimaModel.coefficients());
        Vector forcst = arimaModel.forecast((Vector) ts, 20);
        System.out.println("forecast of next 20 observations: " + forcst);
    }
}

But I got:

Exception in thread "main" java.lang.ClassCastException:
org.apache.spark.api.java.JavaRDD cannot be cast to
org.apache.spark.mllib.linalg.Vector

Please help me if possible.

You cannot cast a JavaRDD to a Vector; you need to use rdd.foreach to get at the individual Vectors instead. So the code could look like this:

ts.foreach(new VoidFunction<Vector>() {
    @Override
    public void call(Vector v) throws Exception {
        double[] total = { 1.0, 0.0, 1.0 };
        ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, (Vector) v, false, "css-cgd", total);

        System.out.println("coefficients: " + arimaModel.coefficients());
        Vector forcst = arimaModel.forecast((Vector) v, 20);
        System.out.println("forecast of next 20 observations: " + forcst);
    }
});
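Note that foreach runs once per line of the input file, so it fits a separate model for each line. If, as in the Scala version, the whole file holds a single series with one observation per line, another option is to collect the parsed values back to the driver and build one Vector before fitting. This is a sketch, not code from the original post; the input path, the class name, and passing null for the initial parameters of ARIMA.fitModel are assumptions:

```java
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

import com.cloudera.sparkts.models.ARIMA;
import com.cloudera.sparkts.models.ARIMAModel;

// Hypothetical class name; illustrates the collect-then-fit approach.
public class JavaARIMACollect {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ARIMA Collect").setMaster("local");
        JavaSparkContext context = new JavaSparkContext(conf);

        // Bring all lines back to the driver; fine for a single short series.
        List<String> lines = context.textFile("path/inputfile").collect();

        // Flatten the whole file into one double[] (one observation per line assumed).
        double[] values = new double[lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            values[i] = Double.parseDouble(lines.get(i).trim());
        }
        Vector ts = Vectors.dense(values);

        // Now ts really is a single Vector, so no cast is needed.
        ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, ts, false, "css-cgd", null);
        System.out.println("coefficients: " + arimaModel.coefficients());

        Vector forecast = arimaModel.forecast(ts, 20);
        System.out.println("forecast of next 20 observations: " + forecast);

        context.stop();
    }
}
```

The key difference from the original Java attempt is that collect() turns the distributed JavaRDD<String> into a local List<String>, which can then be flattened into the single Vector that ARIMA.fitModel expects.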

Hope this helps...
