
How do I construct a function that can be used to map a JavaRDD[org.apache.spark.sql.Row] in Spark/Scala?

val drdd = Seq(("a", 1), ("b", 2), ("a", 3)).toDF("name", "value").toJavaRDD
drdd.map{ (row: Row) => row.get(0) }

It seems that the anonymous function I passed has type Row => Any, while map expects an org.apache.spark.api.java.function.Function[org.apache.spark.sql.Row,?]:

<console>:35: error: type mismatch;
found   : org.apache.spark.sql.Row => Any
required: org.apache.spark.api.java.function.Function[org.apache.spark.sql.Row,?]
   drdd.map{ (row: Row) => row.get(0) }
                        ^

What is the difference between those function types, and how should I construct the one that map expects? Thanks!

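Answer:

JavaRDD.map belongs to Spark's Java API, so it takes an instance of the Java interface org.apache.spark.api.java.function.Function[T, R], which declares a single call method. A Scala function literal such as Row => Any is a scala.Function1, an unrelated type, and as the error shows, the compiler here does not convert one into the other. Implement the interface explicitly, for example: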

drdd.map(new org.apache.spark.api.java.function.Function[Row, String]() {
    override def call(row: Row): String = row.getString(0)
})
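If the Java API is not actually required, another option is to go back to the underlying Scala RDD, whose map accepts an ordinary Scala function. The following is a minimal sketch of both routes, assuming Spark SQL is on the classpath and a local SparkSession is acceptable. The names FirstColumnAsString, JavaRddMapSketch and firstColumnBothWays are hypothetical and only serve the illustration.

```scala
import scala.collection.JavaConverters._

import org.apache.spark.api.java.function.{Function => JFunction}
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical helper: a named, reusable implementation of Spark's Java
// Function interface. It reads column 0 of each Row as a String.
class FirstColumnAsString extends JFunction[Row, String] {
  override def call(row: Row): String = row.getString(0)
}

object JavaRddMapSketch {

  // Maps the same data twice: once through the Java API, once through the Scala API.
  def firstColumnBothWays(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._

    val df = Seq(("a", 1), ("b", 2), ("a", 3)).toDF("name", "value")
    val drdd = df.toJavaRDD // JavaRDD[Row], as in the question

    // Route 1: JavaRDD.map with an object implementing the Java interface.
    // collect() on a JavaRDD returns a java.util.List, hence asScala.
    val viaJavaApi = drdd.map(new FirstColumnAsString).collect().asScala.toList

    // Route 2: JavaRDD.rdd exposes the wrapped Scala RDD, whose map
    // takes a plain Scala function (Row => String).
    val viaScalaApi = drdd.rdd.map((row: Row) => row.getString(0)).collect().toList

    (viaJavaApi, viaScalaApi)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a single-threaded local master is enough for this demo.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("java-rdd-map-sketch")
      .getOrCreate()
    try {
      val (viaJavaApi, viaScalaApi) = firstColumnBothWays(spark)
      println(s"Java API:  $viaJavaApi")
      println(s"Scala API: $viaScalaApi")
    } finally {
      spark.stop()
    }
  }
}
```

Calling df.rdd directly instead of toJavaRDD avoids the round trip altogether. On Scala 2.12 or later, the compiler's conversion of lambdas to single-abstract-method interfaces may also let the original lambda compile against JavaRDD.map, but that depends on the Scala version Spark was built with, so treat it as something to verify rather than rely on.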

