I have successfully read from Cassandra in Spark, using this syntax:

val rddSelect = sc.cassandraTable("keyspace", "nametable").select("column1", "column2").take(100)

I need to aggregate (group by) on column1 and column2 in Spark. I have tried groupByKey and other transformations, but I get this error:

value reduceByKey is not a member of Array[com.datastax.spark.connector.CassandraRow]

Maybe anyone can give me a clue, thanks.

NB: I use Scala.
The error occurs because take(100) is an action, not a transformation: it collects the first 100 rows to the driver as a local Array[CassandraRow], and an Array has no reduceByKey. Drop the take(100) so you keep an RDD, map each row to a key/value pair, and then aggregate:

val rddSelect = sc.cassandraTable("keyspace", "nametable")
  .select("column1", "column2")

// Key each row by (column1, column2); the value 1L lets reduceByKey count rows per key.
val pairs = rddSelect.map(row =>
  ((row.getString("column1"), row.getString("column2")), 1L))

val counts = pairs.reduceByKey(_ + _)
counts.take(100).foreach(println)   // take(100) only at the end, after aggregating
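If you want to check the aggregation logic locally before running it on a cluster, the same map-then-reduceByKey pattern can be sketched on plain Scala collections; the tuples below are made-up stand-ins for (column1, column2) values, and a Spark RDD behaves the same way, just distributed:

```scala
// Stand-in rows, as if read from the (column1, column2) projection.
val rows = Seq(("a", "x"), ("a", "x"), ("b", "y"))

// Pair each composite key with a count of 1, then sum per key —
// the collections equivalent of rdd.map(k => (k, 1L)).reduceByKey(_ + _).
val counts: Map[(String, String), Long] =
  rows.map(k => (k, 1L))
      .groupBy(_._1)
      .map { case (key, pairs) => key -> pairs.map(_._2).sum }
```

Here counts maps ("a", "x") to 2 and ("b", "y") to 1, which is exactly what reduceByKey(_ + _) would produce per key on the cluster.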