
Scala Spark - Select columns by name and list

I'm trying to select columns from a Scala Spark DataFrame using both single column names and names taken from a List. My current solution looks like this:

var cols_list = List("d", "e")

df
.select(
    col("a"),
    col("b"),
    col("c"),
    cols_list.map(col): _*)

However, it throws an error:

<console>:81: error: no `: _*' annotation allowed here
(such annotations are only allowed in arguments to *-parameters)
               cols_list.map(col): _*
                                        ^

Any help will be appreciated.

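select takes a varargs parameter (Column*), and the `: _*` annotation is only allowed when a single sequence is passed as the whole argument list for that parameter. You cannot mix individually listed columns with an expanded list, so you need to build one list and expand it, e.g.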

df.select(col("a") :: col("b") :: col("c") :: cols_list.map(col): _*)
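To see why the original call is rejected, here is a minimal sketch of the same rule without Spark. Note that `Column`, `col` and `select` below are hypothetical stand-ins I made up for illustration (they only mimic the shape of `select(cols: Column*)`), not the real Spark API:

```scala
object VarargsSketch {
  // Hypothetical stand-ins, NOT Spark classes: just enough to mimic select(cols: Column*).
  final case class Column(name: String)
  def col(name: String): Column = Column(name)
  def select(cols: Column*): Seq[String] = cols.map(_.name)

  val colsList = List("d", "e")

  // Does not compile: `: _*` must be the only argument given for the *-parameter.
  // select(col("a"), col("b"), col("c"), colsList.map(col): _*)

  // Compiles: build a single sequence first, then expand it.
  val allCols: List[Column] = List(col("a"), col("b"), col("c")) ++ colsList.map(col)
  val selected: Seq[String] = select(allCols: _*)
}
```

The `::` version above and the `++` version here do the same thing; which one you use is mostly a matter of taste.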

Selecting the columns from the list works fine for me, and you can also use the $ notation:

scala> df.select(cols_list.map(col):_*)
res8: org.apache.spark.sql.DataFrame = [d: int, e: int]

scala> df.select(cols_list.map(c => $"$c"):_*)
res9: org.apache.spark.sql.DataFrame = [d: int, e: int]

Maybe you just need to import spark.implicits._ (it is required for the $ notation).
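For completeness, a small self-contained sketch with the imports in place, selecting the fixed columns together with the ones from the list. The local session, the object name `SelectSketch` and the one-row sample data are assumptions made for the example, not taken from the question:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object SelectSketch {
  // Assumption: a throwaway local session, for illustration only.
  val spark: SparkSession =
    SparkSession.builder().master("local[*]").appName("select-sketch").getOrCreate()
  import spark.implicits._ // enables toDF and the $"..." column syntax

  // Hypothetical sample data with columns a..e.
  val df = Seq((1, 2, 3, 4, 5)).toDF("a", "b", "c", "d", "e")
  val colsList = List("d", "e")

  // Fixed names and list names combined into one sequence, then expanded with `: _*`.
  val withCol = df.select((List("a", "b", "c") ++ colsList).map(col): _*)

  // The same selection written with the $ notation.
  val withDollar = df.select((List($"a", $"b", $"c") ++ colsList.map(c => $"$c")): _*)
}
```

Both results should contain the columns a, b, c, d and e in that order.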

EXTRA: Also check your variable names. The naming convention in Scala is camel case, and you should try to avoid var (this is just a good-practice point, not related to your error at all):

val colsList = List("d", "e")
