Scala: Use evaluation of an expression to write dataframe to a csv file

I am trying to write a DataFrame to a csv file in Scala by evaluating an expression held in a string (Eval or a similar approach).

 import org.apache.spark.sql.{SaveMode, SparkSession, SQLContext, Row, DataFrame, Column}
 import scala.reflect.runtime.universe._
 import scala.tools.reflect.ToolBox
 import scala.reflect.runtime.currentMirror

 val df = Seq(("a", "b", "c"), ("a1", "b1", "c1")).toDF("A", "B", "C")
 val df_write = """df.coalesce(1).write.option("delimiter", "\u001F").csv("file:///var/tmp/test")"""

 // One of my failed attempts. I have also tried the interpreter (code not shown here).
 val toolbox = runtimeMirror(getClass.getClassLoader).mkToolBox()
 toolbox.eval(toolbox.parse(df_write))

 Errors are: 
 object coalesce is not a member of package df ....

Shiva, try the code below. The problem is that the object's variables are not in the toolbox's scope, so it cannot evaluate the expression.

package com.mansoor.test

import org.apache.spark.sql.{DataFrame, SparkSession}

object Driver extends App {

  // Parses and compiles a string of Scala source at runtime, then casts the result to T.
  def evalCode[T](code: String): T = {
    import scala.tools.reflect.ToolBox
    import scala.reflect.runtime.{currentMirror => m}
    val toolbox = m.mkToolBox()
    toolbox.eval(toolbox.parse(code)).asInstanceOf[T]
  }

  val sparkSession: SparkSession = SparkSession.builder().appName("Test")
    .master("local[2]")
    .getOrCreate()

  import sparkSession.implicits._
  val df: DataFrame = Seq(("a", "b", "c"), ("a1", "b1", "c1")).toDF("A", "B", "C")

  // The evaluated code cannot see names from this scope, so it imports Driver's members explicitly.
  val df_write =
    s"""
       |import com.mansoor.test.Driver._
       |
       |df.coalesce(1).write.option("delimiter", "\u001F").csv("file:///var/tmp/test")
       """.stripMargin

  evalCode[Unit](df_write)

  sparkSession.sparkContext.stop()
}
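
A different way around the same scoping problem, offered only as a sketch and not as part of the answer above: instead of importing the object's members inside the string, the string can be a function literal, and the value is passed in as an argument after evaluation. The example below uses a plain String so that it runs without Spark; the names EvalFunctionSketch and shout are made up for illustration. It assumes Scala 2 with scala-compiler on the classpath.

```scala
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

object EvalFunctionSketch {
  // Assumption: Scala 2 with scala-compiler available at runtime.
  private val toolbox = currentMirror.mkToolBox()

  // Compile a snippet at runtime and cast the result to the expected type.
  def evalCode[T](code: String): T =
    toolbox.eval(toolbox.parse(code)).asInstanceOf[T]

  // The snippet is a function literal, so it refers only to its own
  // parameter and needs nothing from the caller's scope.
  val shout: String => String =
    evalCode[String => String]("(s: String) => s.toUpperCase")
}
```

Calling EvalFunctionSketch.shout("abc") applies the compiled function to a value that the string itself never names. For the DataFrame case the same shape would presumably be a snippet of type DataFrame => Unit whose parameter is annotated with the fully qualified org.apache.spark.sql.DataFrame, applied to df afterwards; that variant is not tested here.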
