
Cassandra Scala Spark - saving RDD to Cassandra

I have the following RDD:

RDD[(String, Seq[((String, Double), Int)])]

An example element would be:

("a", Seq((("b", 2.0), 1), (("c", 3.0), 2)))

I want to insert it into my Cassandra table, which has the following schema:

String (PK), String, Double, Int

In the end, for the given example, I want the following rows in my DB:

"a", "b", 2.0, 1
"a", "c", 3.0, 2

What Scala code does this? I tried to use saveToCassandra, but my input isn't in the form of RDD[(String, String, Double, Int)]. Should I flatten it first?

All you need here is a flatMap:

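import org.apache.spark.rdd.RDD

// Input: each key is paired with a sequence of ((String, Double), Int) values.
val rdd: RDD[(String, Seq[((String, Double), Int)])] = ???

// Emit one flat (key, String, Double, Int) tuple per element of each sequence.
val flattened: RDD[(String, String, Double, Int)] = rdd.flatMap {
  case (k, vs) => vs.map { case ((v1, v2), v3) => (k, v1, v2, v3) }
}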
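
To check the flattening logic without a Spark cluster, the same flatMap can be tried on an ordinary Scala Seq. This is only a minimal sketch: the names FlattenDemo, Nested, Row and flatten are made up for illustration and are not part of Spark or the Cassandra connector.

```scala
object FlattenDemo {
  // Hypothetical aliases mirroring the element type of the RDD in the question.
  type Nested = (String, Seq[((String, Double), Int)])
  type Row = (String, String, Double, Int)

  // Same pattern match as the RDD version, applied to a local collection:
  // each (key, values) pair becomes one flat tuple per entry in values.
  def flatten(data: Seq[Nested]): Seq[Row] =
    data.flatMap { case (k, vs) =>
      vs.map { case ((v1, v2), v3) => (k, v1, v2, v3) }
    }

  def main(args: Array[String]): Unit = {
    // The example element from the question.
    val sample: Seq[Nested] = Seq(("a", Seq((("b", 2.0), 1), (("c", 3.0), 2))))
    flatten(sample).foreach(println)
    // prints:
    // (a,b,2.0,1)
    // (a,c,3.0,2)
  }
}
```

Once the RDD has this flat shape, it should be possible to save it with the Spark Cassandra Connector by importing com.datastax.spark.connector._ and calling something like flattened.saveToCassandra("my_keyspace", "my_table", SomeColumns("k", "v1", "v2", "v3")). The keyspace, table, and column names here are placeholders; substitute the ones from your own schema.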
