
How to assign a value to an existing empty DataFrame column in Scala?

I am reading a CSV file in which every line ends with a trailing | delimiter. In Spark 1.6, the load method turns this into a last DataFrame column that has no name and no values.

df.withColumnRenamed(df.columns(83),"Invalid_Status").drop(df.col("Invalid_Status"))

val df = sqlContext.read.format("com.databricks.spark.csv").option("delimiter","|").option("header","true").load("filepath") 
val df2 = df.withColumnRenamed(df.columns(83),"Invalid_Status")

The result I expected:
root
 |-- FddCell: string (nullable = true)
 |-- Trn_time: string (nullable = true)
 |-- CELLNAME.FddCell: string (nullable = true)
 |-- Invalid_Status: string (nullable = true)

But the actual output is:
root
 |-- FddCell: string (nullable = true)
 |-- Trn_time: string (nullable = true)
 |-- CELLNAME.FddCell: string (nullable = true)
 |-- : string (nullable = true)

The last column has no name and no values, so I have to drop it and create a new column in its place.

It is not entirely clear whether you want to rename the column to Invalid_Status or drop it entirely. As I understand it, you are trying to operate on (rename or drop) the last column, which has no name.

I will try to cover both solutions:

To rename the column while keeping its existing (blank) values:

val df2 = df.withColumnRenamed(df.columns.last,"Invalid_Status")

To drop the last column without knowing its name:

val df3 = df.drop(df.columns.last)

Then add the "Invalid_Status" column with a default value:

import org.apache.spark.sql.functions.lit  // lit is not in scope without this import

val requiredDf = df3.withColumn("Invalid_Status", lit("Any_Default_Value"))
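
The drop-and-add steps above could be wrapped in one small helper. This is only a sketch: the object name TrailingColumnFix and the method replaceLastColumn are made up for illustration, it assumes the unnamed column really is the last one, and it has not been tried against the schema in the question.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

object TrailingColumnFix {
  // Hypothetical helper (not from the original post): drops the last column,
  // whatever its name is, then appends a new column filled with a constant.
  def replaceLastColumn(df: DataFrame, newName: String, default: String): DataFrame = {
    // df.columns.last is the name of the final column, possibly an empty string.
    val withoutLast = df.drop(df.columns.last)
    // lit wraps the constant so every row gets the same value.
    withoutLast.withColumn(newName, lit(default))
  }
}
```

With the df loaded in the question, it would be called as TrailingColumnFix.replaceLastColumn(df, "Invalid_Status", "Any_Default_Value").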
