
Update one column in a Cassandra table

I have a Cassandra table person_master (personId: Int, customerId: Int, firstName: String, lastName: String, mrids: Set) with primary key (personId, customerId).

Suppose I have an input RDD with the structure [personId, customerId, firstName, lastName, messageType: String, source: String, sourceType: String].

Suppose an RDD record has the value [1001,119,None,None,{abc.xyz}] and the matching Cassandra row has the value [1001,119,Vikash,Singh,{aaa.bbb}].

I want to fetch the Cassandra row that matches each RDD record, update the mrids column of the Cassandra table, and take all other columns from the Cassandra row.

For example, in this case I want the final RDD value to be [1001,119,Vikash,Singh,{aaa.bbb,abc.xyz}], which I will write back to Cassandra later.

Can anybody give me a solution for doing this in Spark using the Cassandra connector?

Assuming sc is a SparkContext created like this:

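import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable and saveToCassandra

// SPARK_MASTER, SPARK_SCALA_APP_NAME, SPARK_SCALA_JAR and value are placeholders
val sparkConf = new SparkConf().setMaster(SPARK_MASTER)
                            .setAppName(SPARK_SCALA_APP_NAME)
                            .setJars(SPARK_SCALA_JAR)
sparkConf.set("spark.cassandra.connection.host", value)
sparkConf.set("spark.cassandra.auth.username", value)
sparkConf.set("spark.cassandra.auth.password", value)
val sc = new SparkContext(sparkConf)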

Read the rows you need. The where clause is optional; it is pushed down to Cassandra, so it only works on columns Cassandra can filter on, such as the partition key:

val selectedRow = sc.cassandraTable("keyspace", "tableName")
      .select("key", "column2", "column3")
      .where("key IN ?", keys)
      .as((key: String, column2: String, column3: Integer)
          =>(key, column2, column3))

Filter and modify your RDD as needed, then save it like this:

selectedRow.saveToCassandra("keyspace",
                           "tableName",
                           SomeColumns("key", "column2", "column3"))
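
For the person_master case in the question, the modification step could be sketched in plain Scala as below. This is only an illustration under assumptions: PersonRow, MridMerge, merge and mergeAll are hypothetical names that do not come from the connector, and mrids is assumed to be a set of strings (the question only says Set). It shows the merge rule itself: keep every column from the stored row and take the union of the two mrids sets.

```scala
// Hypothetical row type mirroring the table described in the question.
// firstName/lastName are Options because the incoming record carries None.
final case class PersonRow(personId: Int,
                           customerId: Int,
                           firstName: Option[String],
                           lastName: Option[String],
                           mrids: Set[String])

object MridMerge {
  // Keep all columns of the stored row; only mrids becomes the union of both sets.
  def merge(stored: PersonRow, incoming: PersonRow): PersonRow =
    stored.copy(mrids = stored.mrids ++ incoming.mrids)

  // Match incoming records to stored rows on the primary key (personId, customerId).
  // Incoming records without a stored row are dropped; that is a choice of this sketch.
  def mergeAll(stored: Seq[PersonRow], incoming: Seq[PersonRow]): Seq[PersonRow] = {
    val byKey = stored.map(r => (r.personId, r.customerId) -> r).toMap
    incoming.flatMap(in => byKey.get((in.personId, in.customerId)).map(merge(_, in)))
  }
}
```

With the values from the question, merging the stored row [1001,119,Vikash,Singh,{aaa.bbb}] with the incoming record [1001,119,None,None,{abc.xyz}] gives a row that keeps Vikash and Singh and whose mrids contains both aaa.bbb and abc.xyz. In Spark, the same merge function could be applied in a map over pairs of incoming and stored rows before calling saveToCassandra. The connector documentation also describes collection modifiers in SomeColumns (for example "mrids" append) that add elements to a set without reading the row first; whether that is available depends on the connector version, so treat it as something to verify.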
