Spark Scala - converting a DataFrame with one record and one column into Double
The Scala code I wrote gives me a data type error. The main method of testpredict_02 takes a Double.
val featuresMD = hiveContext.read.parquet("hdfs://machine01:9000/models/nb/metadata/features")

def testpredict_02(VData: Vector) = { MyModel.predict(VData) }

def outerpredict_02(argincome: String, argage: String, arggender: String) = {
  featuresMD.registerTempTable("features_md")
  val income = hiveContext.sql("select distinct income_index from features_md where income = argincome")
  val age = hiveContext.sql("select distinct age_index from features_md where age = argage")
  val gender = hiveContext.sql("select distinct gender_index from features_md where gender = arggender")
  testpredict_02(Vectors.dense(income.select("income_index"), age.select("age_index"), gender.select("gender_index")))
}
Error:
<console>:43: error: type mismatch;
found : org.apache.spark.sql.DataFrame
required: Double
testpredict_02(Vectors.dense(income.select("income_index"), age.select("age_index")))
Please help..
If you are sure that each of the 3 DataFrames contains exactly one column and one record, you can take the first column of the first record from each:
import org.apache.spark.sql.DataFrame
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Takes the first column of the first row as a Double
def getFirstCell(df: DataFrame): Double = df.first().getAs[Double](0)

val vector: Vector = Vectors.dense(
  getFirstCell(income.select("income_index")),
  getFirstCell(age.select("age_index")),
  getFirstCell(gender.select("gender_index"))
)

testpredict_02(vector)
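The extraction pattern behind getFirstCell can be illustrated without a Spark cluster. The Row and DataFrame types below are hypothetical stand-ins for Spark SQL's classes, kept only to show how first-row, first-column access reduces three single-cell frames to the three Doubles the vector needs; they are a sketch, not the real API:

```scala
// Stand-in types that mimic the small slice of the Spark SQL API used here.
// In real code these come from org.apache.spark.sql.
case class Row(values: Seq[Any]) {
  def getAs[T](i: Int): T = values(i).asInstanceOf[T]
}
case class DataFrame(rows: Seq[Row]) {
  def first(): Row = rows.head
}

// Same extraction as in the answer: first row, first column, as Double.
def getFirstCell(df: DataFrame): Double = df.first().getAs[Double](0)

object Demo extends App {
  // Each frame simulates a one-row, one-column query result.
  val income = DataFrame(Seq(Row(Seq(1.0))))
  val age    = DataFrame(Seq(Row(Seq(2.0))))
  val gender = DataFrame(Seq(Row(Seq(0.0))))

  // Stand-in for Vectors.dense(...): three scalars, not three DataFrames.
  val vector = Array(getFirstCell(income), getFirstCell(age), getFirstCell(gender))
  println(vector.mkString(","))  // 1.0,2.0,0.0
}
```

Note that if a query returns zero rows, `first()` throws; wrapping the result in `df.collect().headOption` would make the empty case explicit.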