
How to aggregate by value in Apache Spark

I have successfully read from Cassandra in Spark, using this syntax:

val rddSelect = sc.cassandraTable("keyspace", "nametable").select("column1", "column2").take(100)

I need to aggregate, grouping by column1 and column2, in Spark.

I have tried groupByKey and other transformations, but I get this error:

value reduceByKey is not a member of Array[com.datastax.spark.connector.CassandraRow]

Can anyone give me a clue? Thanks.

NB: I use Scala.

The error occurs because take(100) is an action: it returns a local Array[com.datastax.spark.connector.CassandraRow], not an RDD, so RDD transformations such as reduceByKey are not defined on it. Keep the data as an RDD, map each row to a key-value pair, and then aggregate. You can do it like this:

val modifiedRDD = sc.cassandraTable("keyspace", "nametable")
  .select("column1", "column2")  // no take(100), so this stays an RDD
  .map { row =>
    // key by (column1, column2); getString assumes both columns are text
    ((row.getString("column1"), row.getString("column2")), 1)
  }

// Key-value transformations are now available, for example:
modifiedRDD.groupByKey()        // Iterable of values per key
modifiedRDD.reduceByKey(_ + _)  // e.g. count rows per (column1, column2) pair
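
For reference, here is a minimal end-to-end sketch under the same assumptions: the keyspace, table, and column names from the question, both columns being text, and a Cassandra node reachable at the host set in the config. The count per pair is just an illustrative aggregation.

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object CassandraAggregate {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cassandra-aggregate")
      // assumption: a Cassandra node is reachable on localhost
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // read only the two columns; this is an RDD[CassandraRow], not an Array
    val rows = sc.cassandraTable("keyspace", "nametable")
      .select("column1", "column2")

    // group by (column1, column2) and count rows per pair
    val counts = rows
      .map(row => ((row.getString("column1"), row.getString("column2")), 1))
      .reduceByKey(_ + _)

    // take(100) belongs here, after the aggregation, if you only want a sample
    counts.take(100).foreach(println)

    sc.stop()
  }
}

Note that take(100) is an action, so it is called last, after reduceByKey, rather than on the initial read as in the question.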
