I am using Spark's Java API and read a lot of data with the following schema:
profits (Array of Double values):
---------------------------------
[1.0,2.0,3.0]
[2.0,3.0,4.0]
[4.0,6.0]
Once I have the DataFrame, I want to compute a new vector that is the element-wise sum of all the arrays:
Result:
[7.0,11.0,7.0]
I have seen some examples online of doing this in Scala and Python, but nothing for Java.
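For context, this is roughly how such a DataFrame can be built with the Java API. This is a minimal sketch: the SparkSession setup and the variable names are my own; only the "profits" array<double> column and the sample values come from the data above.

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

SparkSession spark = SparkSession.builder().appName("profit-sums").getOrCreate();

// A single array<double> column named "profits", matching the sample rows above.
StructType schema = new StructType()
        .add("profits", DataTypes.createArrayType(DataTypes.DoubleType));

List<Row> rows = Arrays.asList(
        RowFactory.create(Arrays.asList(1.0, 2.0, 3.0)),
        RowFactory.create(Arrays.asList(2.0, 3.0, 4.0)),
        RowFactory.create(Arrays.asList(4.0, 6.0)));

Dataset<Row> df = spark.createDataFrame(rows, schema);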
One way (shown here in Scala) is to pair each value with its index:
val withIndex = profits.zipWithIndex // ((a,0),(b,1),(c,2))
We then need to use the index as the key:
val indexKey = withIndex.map{case (k,v) => (v,k)} //((0,a),(1,b),(2,c))
Finally, sum the values that share the same key:
val counts = indexKey.reduceByKey((a, b) => a + b)
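Since the question asks for Java, here is a minimal sketch of the same idea with the Java API. It assumes df is the DataFrame above with an array<double> column called "profits"; the class, method, and variable names are my own, not from the original post.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import scala.Tuple2;

public class ProfitVectorSum {

    // Assumes df has one array<double> column named "profits".
    public static double[] sumProfits(Dataset<Row> df) {
        // Steps 1 and 2: pair every array element with its position, so the position becomes the key.
        JavaPairRDD<Integer, Double> indexKey = df.javaRDD().flatMapToPair(row -> {
            List<Double> profits = row.getList(row.fieldIndex("profits"));
            List<Tuple2<Integer, Double>> pairs = new ArrayList<>();
            for (int i = 0; i < profits.size(); i++) {
                pairs.add(new Tuple2<>(i, profits.get(i)));
            }
            return pairs.iterator();
        });

        // Step 3: reduceByKey adds up all values that share the same position.
        Map<Integer, Double> sums = indexKey.reduceByKey(Double::sum).collectAsMap();

        // Rebuild the result in position order; gives [7.0, 11.0, 7.0] for the sample data.
        double[] result = new double[sums.size()];
        sums.forEach((i, v) -> result[i] = v);
        return result;
    }
}

Note that if the arrays have different lengths (as in the sample, where the last row has only two entries), reduceByKey still works: positions that are missing from a row simply contribute nothing to that position's sum.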