How to assign a value to an existing empty DataFrame column in Scala?
I am reading a CSV file in which every line ends with a | delimiter. As a result, the load method in Spark 1.6 creates a last column in the DataFrame that has no name and no values.
df.withColumnRenamed(df.columns(83),"Invalid_Status").drop(df.col("Invalid_Status"))
val df = sqlContext.read.format("com.databricks.spark.csv").option("delimiter","|").option("header","true").load("filepath")
val df2 = df.withColumnRenamed(df.columns(83),"Invalid_Status")
I expected this result:
root
|-- FddCell: string (nullable = true)
|-- Trn_time: string (nullable = true)
|-- CELLNAME.FddCell: string (nullable = true)
|-- Invalid_Status: string (nullable = true)
But the actual output is:
root
|-- FddCell: string (nullable = true)
|-- Trn_time: string (nullable = true)
|-- CELLNAME.FddCell: string (nullable = true)
|-- : string (nullable = true)
The last column has no values, so I have to drop it and then create a new column in its place.
It is not completely clear what you want to do: just rename the column to Invalid_Status, or drop the column entirely. What I understand is that you are trying to operate (rename/drop) on the last column, which has no name.
But I will try to help you with both solutions:
To rename the column and keep its values (blanks) as they are:
val df2 = df.withColumnRenamed(df.columns.last,"Invalid_Status")
To drop the last column without knowing its name, use:
val df3 = df.drop(df.columns.last)
And then add the "Invalid_Status" column with a default value:
import org.apache.spark.sql.functions.lit  // lit wraps a constant as a Column
val requiredDf = df3.withColumn("Invalid_Status", lit("Any_Default_Value"))
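As background on why the nameless column appears at all, here is a minimal plain-Scala sketch. It is only an analogy: it uses String.split rather than the spark-csv parser, and the header string is a hypothetical one modeled on the schema in the question. It shows that a line ending with the delimiter contains one extra, empty field, and that addressing the last field by position avoids a hard-coded index such as 83.

```scala
object TrailingDelimiterDemo {
  // Hypothetical header line modeled on the question's schema; note the trailing "|".
  val header: String = "FddCell|Trn_time|CELLNAME.FddCell|"

  // A negative limit keeps trailing empty strings; the default split would discard them.
  val fields: Array[String] = header.split("\\|", -1)

  // Same idea as df.columns.last in the answer: replace the last name by position.
  val renamed: Array[String] = fields.init :+ "Invalid_Status"

  def main(args: Array[String]): Unit = {
    println(fields.length)          // 4: three named fields plus one empty field
    println(fields.last.isEmpty)    // true
    println(renamed.mkString(", ")) // FddCell, Trn_time, CELLNAME.FddCell, Invalid_Status
  }
}
```

If the input files are under your control, removing the trailing delimiter before loading would avoid the extra column altogether; otherwise the rename or drop-and-add approach above handles it after loading.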