Spark ClassCastException: JavaRDD cannot be cast to org.apache.spark.mllib.linalg.Vector
I want to implement an ARIMA time series model in Java. I have the following Scala code:
import com.cloudera.sparkts.models.ARIMA
import org.apache.spark.mllib.linalg.Vectors

object SingleSeriesARIMA {
  def main(args: Array[String]): Unit = {
    // The dataset is sampled from an ARIMA(1, 0, 1) model generated in R.
    val lines = scala.io.Source.fromFile("../data/R_ARIMA_DataSet1.csv").getLines()
    val ts = Vectors.dense(lines.map(_.toDouble).toArray)
    val arimaModel = ARIMA.fitModel(1, 0, 1, ts)
    println("coefficients: " + arimaModel.coefficients.mkString(","))
    val forecast = arimaModel.forecast(ts, 20)
    println("forecast of next 20 observations: " + forecast.toArray.mkString(","))
  }
}
I tried the following Java translation:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import com.cloudera.sparkts.models.ARIMA;
import com.cloudera.sparkts.models.ARIMAModel;

public class JavaARIMA {
    public static void main(String[] args) {
        System.setProperty("hadoop.home.dir", "C:/winutils");
        SparkConf conf = new SparkConf()
                .setAppName("Spark-TS Ticker Example")
                .setMaster("local")
                .set("spark.sql.warehouse.dir",
                     "file:///C:/Users/devanshi/Downloads/Spark/sparkdemo/spark-warehouse/");
        JavaSparkContext context = new JavaSparkContext(conf);

        JavaRDD<String> lines = context.textFile("path/inputfile");
        // Parse each CSV line into a dense Vector.
        JavaRDD<Vector> ts = lines.map(new Function<String, Vector>() {
            public Vector call(String s) {
                String[] sarray = s.split(",");
                double[] values = new double[sarray.length];
                for (int i = 0; i < sarray.length; i++) {
                    values[i] = Double.parseDouble(sarray[i]);
                }
                return Vectors.dense(values);
            }
        });

        double[] total = {1.0, 0.0, 1.0};
        // Earlier attempts, left commented out:
        // DenseVector dv = new DenseVector(total);
        // convert(dv, toBreeze());
        // ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, dv, true, "css-cgd", null);
        ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, (Vector) ts, false, "css-cgd", total);
        // arimaModel = ARIMA.fitModel(1, 0, 1, ts);
        System.out.println("coefficients: " + arimaModel.coefficients());
        Vector forcst = arimaModel.forecast((Vector) ts, 20);
        System.out.println("forecast of next 20 observations: " + forcst);
    }
}
But I get:
Exception in thread "main" java.lang.ClassCastException:
org.apache.spark.api.java.JavaRDD cannot be cast to
org.apache.spark.mllib.linalg.Vector
Any help would be appreciated.
You cannot cast a JavaRDD to a Vector. Instead, use rdd.foreach to get each individual Vector. The code could look like this:
ts.foreach(new VoidFunction<Vector>() {
    @Override
    public void call(Vector v) throws Exception {
        double[] total = {1.0, 0.0, 1.0};
        // v is already a Vector here, so no cast is needed.
        ARIMAModel arimaModel = ARIMA.fitModel(1, 0, 1, v, false, "css-cgd", total);
        System.out.println("coefficients: " + arimaModel.coefficients());
        Vector forcst = arimaModel.forecast(v, 20);
        System.out.println("forecast of next 20 observations: " + forcst);
    }
});
Hope this helps...
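Note that foreach fits a separate model per line of the file. If the goal is to fit one model over the whole series, as the Scala version does, another option is to read the lines on the driver and flatten them into a single double[] before calling Vectors.dense and ARIMA.fitModel. A minimal, Spark-free sketch of just that parsing step (the SeriesParser class and toSeries method names are illustrative, not part of any library):

```java
import java.util.Arrays;
import java.util.List;

public class SeriesParser {
    // Flatten CSV lines (one or more comma-separated doubles per line)
    // into a single array suitable for Vectors.dense(...).
    static double[] toSeries(List<String> lines) {
        return lines.stream()
                .flatMap(s -> Arrays.stream(s.split(",")))
                .mapToDouble(Double::parseDouble)
                .toArray();
    }

    public static void main(String[] args) {
        // In the real program these lines would come from the input file.
        List<String> lines = Arrays.asList("1.0,2.5", "3.0");
        System.out.println(Arrays.toString(toSeries(lines)));
        // The result would then be passed to Vectors.dense(...)
        // and on to ARIMA.fitModel(1, 0, 1, ts).
    }
}
```

This mirrors what the Scala example does with scala.io.Source: the whole series lives in driver memory as one Vector, which is what ARIMA.fitModel expects.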