Make spark-sql UDFs available in the Scala Spark DataFrame DSL API
How can I access GeoMesa's UDFs in the Spark Scala DataFrame (not textual) API? That is, how can I make SQL UDFs that are available in the textual spark-sql API also usable in the Scala DataFrame DSL? I.e. how can I enable, instead of this expression
spark.sql("select st_asText(st_bufferPoint(geom,10)) from chicago where case_number = 1")
something similar to
df.select(st_asText(st_bufferPoint('geom, 10))).filter('case_number === 1)
How can I register GeoMesa's UDFs in a way that makes them available beyond the SQL text mode? SQLTypes.init(spark.sqlContext) from https://github.com/locationtech/geomesa/blob/f13d251f4d8ad68f4339b871a3283e43c39ad428/geomesa-spark/geomesa-spark-sql/src/main/scala/org/apache/spark/sql/SQLTypes.scala#L59-L66 only seems to register textual expressions.
I am already importing
import org.apache.spark.sql.functions._
so these functions
https://github.com/locationtech/geomesa/blob/828822dabccb6062118e36c58df8c3a7fa79b75b/geomesa-spark/geomesa-spark-sql/src/main/scala/org/apache/spark/sql/SQLSpatialFunctions.scala#L31-L41
should be available.
You can use the udf function from the org.apache.spark.sql.functions package you're already importing, e.g.
val myUdf = udf((x: String) => doSomethingWithX(x))
You can then use myUdf in the DSL, as in df.select(myUdf($"field")).
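As a self-contained illustration, any plain Scala function can be lifted into the DSL this way. The function body below is a hypothetical stand-in for real geometry logic, not a GeoMesa API:

```scala
import org.apache.spark.sql.functions.udf

// Hypothetical stand-in for real geometry logic: takes a WKT string
// and a buffer distance and returns a descriptive string.
def describeBuffer(wkt: String, dist: Double): String =
  s"buffer($wkt, $dist)"

// Lift the plain function into a column-level UDF usable in the DSL,
// e.g. df.select(stDescribeBuffer($"geom", lit(10.0)))
val stDescribeBuffer = udf(describeBuffer _)
```

The wrapped function runs row-by-row on the executors, so it should be a pure function of its arguments.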
Take a look at the callUDF function from org.apache.spark.sql.functions:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{callUDF, lit}

val spark = SparkSession.builder()
.appName("callUDF")
.master("local[*]")
.getOrCreate()
import spark.implicits._
val df = spark.createDataset(List("abcde", "bcdef", "cdefg")).toDF("str")
df.createTempView("view")
spark.sql("select length(substring(str, 2, 3)) from view").show()
df.select(callUDF("length", callUDF("substring", $"str", lit(2), lit(3)))).show()
spark.stop()
Tested with Spark 2.1
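Applied back to the original GeoMesa question, the same callUDF pattern should work once SQLTypes.init has registered the spatial functions by name. This is an untested sketch: it assumes the geomesa-spark-sql module is on the classpath and that df is a DataFrame over the chicago data with geom and case_number columns.

```scala
import org.apache.spark.sql.SQLTypes
import org.apache.spark.sql.functions.{callUDF, lit}
import spark.implicits._

// Register GeoMesa's UDFs and UDTs on the session first.
SQLTypes.init(spark.sqlContext)

// DSL equivalent of:
//   select st_asText(st_bufferPoint(geom, 10)) from chicago where case_number = 1
df.filter($"case_number" === 1)
  .select(callUDF("st_asText", callUDF("st_bufferPoint", $"geom", lit(10.0))))
```

Because callUDF looks the functions up by their registered names, this only works after SQLTypes.init has run on the same session.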