Spark, add new Column with the same value in Scala

I have a problem with the withColumn function in a Spark-Scala environment. I would like to add a new column to my DataFrame, like this:

+---+----+---+
|  A|   B|  C|
+---+----+---+
|  4|blah|  2|
|  2|    |  3|
| 56| foo|  3|
|100|null|  5|
+---+----+---+

becomes:

+---+----+---+-----+
|  A|   B|  C|    D|
+---+----+---+-----+
|  4|blah|  2|  750|
|  2|    |  3|  750|
| 56| foo|  3|  750|
|100|null|  5|  750|
+---+----+---+-----+

Column D holds a single value, repeated for every row in my DataFrame.

The code is this:

var totVehicles : Double = df_totVehicles(0).getDouble(0); //return 750

The variable totVehicles returns the correct value, so this part works.

The second DataFrame has to compute two fields (id_zipcode, n_vehicles) and add a third column that holds the same value (750) in every row:

var df_nVehicles =
df_carPark.filter(
      substring($"id_time",1,4) < 2013
    ).groupBy(
      $"id_zipcode"
    ).agg(
      sum($"n_vehicles") as 'n_vehicles
    ).select(
      $"id_zipcode" as 'id_zipcode,
      'n_vehicles
    ).orderBy(
      'id_zipcode,
      'n_vehicles
    );

Finally, I add the new column with the withColumn function:

var df_nVehicles2 = df_nVehicles.withColumn(totVehicles, df_nVehicles("n_vehicles") + df_nVehicles("id_zipcode"))

But Spark returns this error:

 error: value withColumn is not a member of Unit
         var df_nVehicles2 = df_nVehicles.withColumn(totVehicles, df_nVehicles("n_vehicles") + df_nVehicles("id_zipcode"))

Can you help me? Thank you very much!

The lit function is for adding a literal value as a column:

import org.apache.spark.sql.functions._
df.withColumn("D", lit(750))
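
Applied to the question's case, a minimal sketch could look like the following. It assumes a local SparkSession and a small made-up stand-in for df_nVehicles; the helper name withConstant and the column name "totVehicles" are hypothetical. Note that the first argument of withColumn is the new column's name as a String, so passing the Double totVehicles there, as in the question, would not compile. The "not a member of Unit" error suggests that df_nVehicles was assigned the result of a call returning Unit (for example a trailing .show()), but that is a guess, since the posted code does not show it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

object AddConstantColumn {
  // Hypothetical helper: appends a column that holds the same value in every row.
  def withConstant(df: DataFrame, name: String, value: Double): DataFrame =
    df.withColumn(name, lit(value))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the snippet out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("lit-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up stand-in for the aggregated df_nVehicles from the question.
    val dfNVehicles = Seq(("10001", 120L), ("10002", 630L))
      .toDF("id_zipcode", "n_vehicles")

    // In the question this value comes from df_totVehicles(0).getDouble(0).
    val totVehicles: Double = 750.0

    // Column name first (a String), then the literal wrapped in lit(...).
    val dfNVehicles2 = withConstant(dfNVehicles, "totVehicles", totVehicles)
    dfNVehicles2.show()

    spark.stop()
  }
}
```

Keeping the DataFrame in a val and calling show() on a separate line, as above, avoids accidentally storing Unit instead of the DataFrame.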
