Simple join of two Spark DataFrames failing with "org.apache.spark.sql.AnalysisException: Cannot resolve column name"
org.apache.spark.sql.AnalysisException: cannot resolve `Column_name` Exception in Spark SQL
I am trying to read data from a Hive table and then add an extra column filled with null values. While doing this, I get the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`address_1`' given input columns: [postalcode, first_name, organization_name, application_number, type, last_name, country];;
The column address_1 does not actually exist in the Hive table; I am trying to add it with a default value of null.
What I have tried so far:
val ipa_agent = hiveContext.sql("select * from agent")
val df1 = ipa_agent.withColumn("address_1",lit("null"))
Is there any way to add a column other than withColumn?
List all the columns explicitly in the query and add the extra column there:
val ipa_agent = hiveContext.sql("select postalcode, first_name, organization_name, application_number, type, last_name, country, cast(null as string) as address_1 from agent")
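The same result can be had through the DataFrame API. A minimal sketch, assuming the same agent table and a hiveContext as in the question: note that lit("null") would store the four-character string "null", whereas lit(null) cast to the column's type produces a true SQL NULL.

```scala
import org.apache.spark.sql.functions.lit

// Read the table as in the question.
val ipa_agent = hiveContext.sql("select * from agent")

// Add address_1 as a real SQL NULL, not the string "null".
// lit(null) alone is untyped, so cast it to the intended column type.
val df1 = ipa_agent.withColumn("address_1", lit(null).cast("string"))
```

This is equivalent to the cast(null as string) in the SQL above, but avoids re-listing every column.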
I tried adding a new column with withColumn and it works fine for me:
import sqlContext.implicits._
import org.apache.spark.sql.functions._
val df = sc.parallelize(Array(("101",1),("102",2))).toDF("id","rank")
val df_added_column = df.withColumn("address1", lit("null"));
df_added_column.show
+---+----+--------+
| id|rank|address1|
+---+----+--------+
|101| 1| null|
|102| 2| null|
+---+----+--------+
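One caveat on the snippet above: lit("null") fills the column with the string "null", which prints the same as a real NULL in show but behaves differently under isNull checks and filters. A small sketch of the difference, reusing the same toy DataFrame:

```scala
import org.apache.spark.sql.functions.{col, lit}

val df = sc.parallelize(Array(("101", 1), ("102", 2))).toDF("id", "rank")

// String literal "null": no row is actually NULL.
df.withColumn("address1", lit("null"))
  .filter(col("address1").isNull)
  .count() // 0

// True SQL NULL: every row is NULL.
df.withColumn("address1", lit(null).cast("string"))
  .filter(col("address1").isNull)
  .count() // 2
```

If downstream code tests the column with isNull or IS NULL, use the lit(null).cast(...) form.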
Alternatively, as in the left-join case, you can list all the column names explicitly in the select and add address1 as null at the end.