

Save to Cassandra in Spark: the parallelize method is not available in Java

I am trying to save a single row to a Cassandra table using Spark in Java (the row is the result of long processing in Spark). I am using the newer approach of connecting to Cassandra through a Spark session, as follows:

    SparkSession spark = SparkSession
        .builder()
        .appName("App")
        .config("spark.cassandra.connection.host", "cassandra1.example.com")
        .config("spark.cassandra.connection.port", "9042")
        .master("spark://cassandra.example.com:7077")
        .getOrCreate();

The connection succeeds and works well, as I have Spark installed on the same nodes as Cassandra. After reading some RDDs from Cassandra, I want to save to another Cassandra table, so I am following the documentation here, specifically the part about saving to Cassandra:

List<Person> people = Arrays.asList(
    new Person(1, "John", new Date()),
    new Person(2, "Troy", new Date()),
    new Person(3, "Andrew", new Date())
);
JavaRDD<Person> rdd = spark.sparkContext().parallelize(people);
javaFunctions(rdd).writerBuilder("ks", "people", mapToRow(Person.class)).saveToCassandra();

The problem I am facing is that the parallelize method is not accepted; only a Scala version appears to be available. The error is:

The method parallelize(Seq<T>, int, ClassTag<T>) in the type 
SparkContext is not applicable for the arguments (List<Person>) 

How can I use this in Java to save to a Cassandra table?

To parallelize a java.util.List you can use JavaSparkContext (not SparkContext), something like this:

import org.apache.spark.api.java.JavaSparkContext;

// Wrap the existing Scala SparkContext; JavaSparkContext.parallelize accepts a java.util.List
JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
JavaRDD<Person> rdd = jsc.parallelize(people);
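Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes the Spark Cassandra Connector is on the classpath, that a keyspace `ks` with a table `people` exists, and that the table's columns match the `Person` bean's properties (`id`, `name`, `date`); the `Person` class shown here is a hypothetical stand-in for the one in the question.

```java
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

import java.io.Serializable;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public class SaveToCassandraExample {

    // Hypothetical bean matching the question's Person(int, String, Date) usage.
    // mapToRow maps JavaBean properties to Cassandra columns, so getters/setters are required.
    public static class Person implements Serializable {
        private Integer id;
        private String name;
        private Date date;

        public Person() {}

        public Person(Integer id, String name, Date date) {
            this.id = id;
            this.name = name;
            this.date = date;
        }

        public Integer getId() { return id; }
        public void setId(Integer id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Date getDate() { return date; }
        public void setDate(Date date) { this.date = date; }
    }

    public static void main(String[] args) {
        // Connection settings taken from the question; adjust to your cluster
        SparkSession spark = SparkSession
            .builder()
            .appName("App")
            .config("spark.cassandra.connection.host", "cassandra1.example.com")
            .config("spark.cassandra.connection.port", "9042")
            .master("spark://cassandra.example.com:7077")
            .getOrCreate();

        // JavaSparkContext wraps the Scala SparkContext and exposes
        // a parallelize overload that accepts a java.util.List
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        List<Person> people = Arrays.asList(
            new Person(1, "John", new Date()),
            new Person(2, "Troy", new Date()),
            new Person(3, "Andrew", new Date()));

        JavaRDD<Person> rdd = jsc.parallelize(people);

        // Write the RDD to keyspace "ks", table "people", mapping bean properties to columns
        javaFunctions(rdd)
            .writerBuilder("ks", "people", mapToRow(Person.class))
            .saveToCassandra();

        spark.stop();
    }
}
```

For a single row, the same pattern works with a one-element list; `parallelize` does not require more than one item.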
