Scala-Spark: Cannot use UDF

I am having an issue using a UDF in Spark (Scala). Here is some sample code:

import org.apache.spark.sql.{SparkSession, DataFrame}
import org.apache.spark.sql.functions.{col, udf}

val spark = SparkSession.builder.appName("test") 
             .master("local[*]")
             .getOrCreate()
import spark.implicits._

def func(a: Array[Int]): Array[Int] = a
val funcUDF = udf((a: Array[Int]) => func(a))

var data = Seq(Array(1, 2, 3), Array(3, 4, 5), Array(6, 2, 4)).toDF("items")
data = data.withColumn("a", funcUDF(col("items")))
data.show()

The error I get is a ClassCastException saying that scala.collection.mutable.WrappedArray$ofRef cannot be cast to [I (the JVM name for Array[Int]). It is thrown from org.apache.spark.sql.catalyst.expressions.ScalaUDF.$anonfun$f$2. Part of the stack trace is below. In case it helps, I am using https://community.cloud.databricks.com/.

Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [I
    at org.apache.spark.sql.catalyst.expressions.ScalaUDF.$anonfun$f$2(ScalaUDF.scala:155)
    at org.apache.spark.sql.catalyst.expressions.ScalaUDF.eval(ScalaUDF.scala:1125)
    at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:156)
    at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(InterpretedMutableProjection.scala:83)
    at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$15.$anonfun$applyOrElse$70(Optimizer.scala:1557)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$15.applyOrElse(Optimizer.scala:1557)
    at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$15.applyOrElse(Optimizer.scala:1552)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:322)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:80)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:322)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:153)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:151)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:412)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:250)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:410)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:363)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:153)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:151)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:412)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:250)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:410)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:363)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:153)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:151)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:412)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:250)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:410)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:363)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:327)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:153)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:151)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:311)
    at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$.apply(Optimizer.scala:1552)
    at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$.apply(Optimizer.scala:1551)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:152)
    at scala.collection.IndexedSeqOptimized.foldLeft(IndexedSeqOptimized.scala:60)
    at scala.collection.IndexedSeqOptimized.foldLeft$(IndexedSeqOptimized.scala:68)
    at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:38)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:149)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:141)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:141)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:119)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:119)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:107)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:171)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:836)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:171)
    at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:104)
    at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:104)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$writePlans$4(QueryExecution.scala:246)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$.append(QueryPlan.scala:466)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$writePlans(QueryExecution.scala:246)
    at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:256)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:109)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:249)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:101)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:836)
    at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:199)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3700)
    at org.apache.spark.sql.Dataset.head(Dataset.scala:2711)
    at org.apache.spark.sql.Dataset.take(Dataset.scala:2918)
    at org.apache.spark.sql.Dataset.getRows(Dataset.scala:305)
    at org.apache.spark.sql.Dataset.showString(Dataset.scala:342)
    at org.apache.spark.sql.Dataset.show(Dataset.scala:838)
    at org.apache.spark.sql.Dataset.show(Dataset.scala:797)
    at org.apache.spark.sql.Dataset.show(Dataset.scala:806)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:14)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:164)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:166)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:168)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:170)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:172)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:174)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:176)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:178)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:180)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:182)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:184)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:186)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:188)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:190)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:192)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:194)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:196)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:198)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:200)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:202)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:204)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:206)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw$$iw.(command-1114467142343660:208)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw$$iw.(command-1114467142343660:210)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw$$iw.(command-1114467142343660:212)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw$$iw.(command-1114467142343660:214)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$$iw.(command-1114467142343660:216)
    at lineedcf33d032244134ad784ac9de826d3b265.$read.(command-1114467142343660:218)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$.(command-1114467142343660:222)
    at lineedcf33d032244134ad784ac9de826d3b265.$read$.(command-1114467142343660)
    at lineedcf33d032244134ad784ac9de826d3b265.$eval$.$print$lzycompute(:7)
    at lineedcf33d032244134ad784ac9de826d3b265.$eval$.$print(:6)
    at lineedcf33d032244134ad784ac9de826d3b265.$eval.$print()
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:219)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:204)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:773)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:726)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:204)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:431)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:239)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:234)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:231)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:276)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:269)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:408)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
    at java.lang.Thread.run(Thread.java:748)

The problem is that Spark passes the values of your "items" column to the UDF as a WrappedArray (the Scala wrapper it uses for array columns), not as a Scala Array. Your UDF declares an Array[Int] parameter, so the value is cast to Array[Int] at runtime, and that cast fails. I would suggest declaring the parameter as Seq[Int] instead, because WrappedArray is a subclass of Seq but not of Array, and converting with toArray inside the UDF.
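
The failing cast can be reproduced without Spark. The following is a minimal sketch, not Spark's actual code path: the object name CastDemo and its fields are made up for illustration, and Array(1, 2, 3).toSeq is only a stand-in for the wrapped value that reaches the UDF.

```scala
import scala.util.Try

// Hypothetical demo object; none of these names come from Spark.
object CastDemo {
  // Stand-in for the value handed to the UDF: a Seq wrapper around an array.
  val wrapped: Any = Array(1, 2, 3).toSeq

  // Succeeds, because the wrapper is a Seq.
  val asSeq: Try[Seq[Int]] = Try(wrapped.asInstanceOf[Seq[Int]])

  // Fails with a ClassCastException, because the wrapper is not an Array[Int].
  val asArray: Try[Array[Int]] = Try(wrapped.asInstanceOf[Array[Int]])

  def main(args: Array[String]): Unit = {
    println(s"cast to Seq[Int] succeeded: ${asSeq.isSuccess}")
    println(s"cast to Array[Int] failed: ${asArray.isFailure}")
  }
}
```

The second cast fails for the same reason as the UDF: a Seq wrapper is a different runtime class from a primitive int array, so only an explicit conversion such as toArray gets from one to the other.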

This works:

import org.apache.spark.sql.{SparkSession, DataFrame}
import org.apache.spark.sql.functions.{col, udf}

val spark = SparkSession.builder.appName("test") 
             .master("local[*]")
             .getOrCreate()
import spark.implicits._

def func(a: Array[Int]): Array[Int] = a
// Accept the array column as Seq[Int] and convert it before calling func.
val funcUDF = udf((a: Seq[Int]) => func(a.toArray))

var data = Seq(Array(1, 2, 3), Array(3, 4, 5), Array(6, 2, 4)).toDF("items")
data = data.withColumn("a", funcUDF(col("items")))
data.show()
