
Scala Spark - Get Overloaded method when calling createDataFrame

I am trying to create a DataFrame from an array of arrays of Double (Array[Array[Double]]), like below:

val points : ArrayBuffer[Array[Double]] = ArrayBuffer(
Array(0.19238990024216676, 1.0, 0.0, 0.0),
Array(0.2864319929878242, 0.0, 1.0, 0.0),
Array(0.11160349352921925, 0.0, 2.0, 1.0),
Array(0.3659220026496052, 2.0, 2.0, 0.0),
Array(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

points.map(e => Row(e(0), e(1), e(2), e(3)))
val newDF = sqlContext.createDataFrame(points, myschema)

But I get this error:

<console>:113: error: overloaded method value createDataFrame with alternatives:
(data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (scala.collection.mutable.ArrayBuffer[Array[Double]], org.apache.spark.sql.types.StructType)
val newDF = sqlContext.createDataFrame(points, myschema)

I searched over the internet but couldn't find out how to fix it. If anyone has any idea about this, please help me!

There is no overload of createDataFrame that accepts an instance of ArrayBuffer[Array[Double]]. Also, your call to points.map wasn't being assigned to anything; map returns a new collection rather than operating in place. Try:

import scala.collection.JavaConverters._

val points: List[Seq[Double]] = List(
    Seq(0.19238990024216676, 1.0, 0.0, 0.0),
    Seq(0.2864319929878242, 0.0, 1.0, 0.0),
    Seq(0.11160349352921925, 0.0, 2.0, 1.0),
    Seq(0.3659220026496052, 2.0, 2.0, 0.0),
    Seq(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

// createDataFrame expects a java.util.List[Row] here, so convert with asJava
val newDF = sqlContext.createDataFrame(points.map(Row.fromSeq).asJava, myschema)
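To see why the unassigned points.map call in the question had no effect: map on Scala collections is never in-place, even on a mutable ArrayBuffer; it builds and returns a fresh collection, which must be captured. A minimal plain-Scala sketch (no Spark needed):

```scala
import scala.collection.mutable.ArrayBuffer

val xs = ArrayBuffer(1, 2, 3)

// Result discarded: xs itself is left unchanged.
xs.map(_ * 2)

// The result must be assigned to be used.
val doubled = xs.map(_ * 2)

// xs      is still ArrayBuffer(1, 2, 3)
// doubled is ArrayBuffer(2, 4, 6)
```

The same applies to the question's points.map(e => Row(...)): its result was thrown away, so points was still an ArrayBuffer[Array[Double]] when passed to createDataFrame.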

This works for me:

import org.apache.spark.sql._
import org.apache.spark.sql.types._
import scala.collection.mutable.ArrayBuffer

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val points : ArrayBuffer[Array[Double]] = ArrayBuffer(
  Array(0.19238990024216676, 1.0, 0.0, 0.0),
  Array(0.2864319929878242, 0.0, 1.0, 0.0),
  Array(0.11160349352921925, 0.0, 2.0, 1.0),
  Array(0.3659220026496052, 2.0, 2.0, 0.0),
  Array(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

val rdd = sc.parallelize(points.map(e => Row(e(0), e(1), e(2), e(3))))
val newDF = sqlContext.createDataFrame(rdd, myschema)

newDF.show
