Scala/Spark: How to select columns to read ONLY when list of columns > 0

I'm passing in a parameter fieldsToLoad: List[String]. I want to load ALL columns if this list is empty, and only the columns named in the list if it has one or more entries. This is what I have now, which reads the columns passed in the list:

    val parquetDf = sparkSession.read.parquet(inputPath:_*).select(fieldsToLoad.head, fieldsToLoad.tail:_*)

But how do I add a condition to load * (all columns) when the list is empty?

You could first use an if expression to replace the empty list with just * :

// Both branches are List[String], so cols.head and cols.tail type-check
val cols = if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")
sparkSession.read.parquet(inputPath:_*).select(cols.head, cols.tail:_*)
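
An alternative to passing "*" is to skip the select call entirely when the list is empty. The following is only a minimal sketch of that idea: the object name ColumnSelection and the helpers selectOrAll and loadParquet are hypothetical names introduced here for illustration, and it assumes inputPath is a Seq[String] of parquet paths, as in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnSelection {
  // Hypothetical helper: return df unchanged when no fields are requested,
  // otherwise project only the requested columns.
  def selectOrAll(df: DataFrame, fields: Seq[String]): DataFrame =
    if (fields.isEmpty) df else df.select(fields.map(col): _*)

  // Hypothetical wrapper around the read from the question
  // (assumes inputPath is a Seq[String] of parquet paths).
  def loadParquet(spark: SparkSession,
                  inputPath: Seq[String],
                  fieldsToLoad: Seq[String]): DataFrame =
    selectOrAll(spark.read.parquet(inputPath: _*), fieldsToLoad)
}
```

Mapping the names through col also avoids the head/tail split, because select accepts a varargs list of Column values.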

@Andy Hayden's answer is correct, but I want to show how the selectExpr function can simplify the selection:

scala> val df = Range(1, 4).toList.map(x => (x, x + 1, x + 2)).toDF("c1", "c2", "c3")
df: org.apache.spark.sql.DataFrame = [c1: int, c2: int ... 1 more field]

scala> df.show()
+---+---+---+
| c1| c2| c3|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
|  3|  4|  5|
+---+---+---+


scala> val fieldsToLoad = List("c2", "c3")
fieldsToLoad: List[String] = List(c2, c3)

scala> df.selectExpr((if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")):_*).show()
+---+---+
| c2| c3|
+---+---+
|  2|  3|
|  3|  4|
|  4|  5|
+---+---+


scala> val fieldsToLoad = List()
fieldsToLoad: List[Nothing] = List()

scala> df.selectExpr((if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")):_*).show()
+---+---+---+
| c1| c2| c3|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
|  3|  4|  5|
+---+---+---+
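
One caveat with selectExpr is that it parses each string as a SQL expression, so a column name containing spaces or other special characters needs backtick quoting. The sketch below shows one way to build the argument list with that in mind. The names SelectExprArgs, quoted and exprsOrStar are hypothetical, and it assumes every entry is a plain top-level column name (quoting a dotted name such as a.b would make it refer to a column literally called "a.b" rather than a nested field).

```scala
object SelectExprArgs {
  // Hypothetical helper: wrap a column name in backticks so selectExpr treats
  // it as a single identifier; a backtick inside the name is doubled.
  def quoted(name: String): String = "`" + name.replace("`", "``") + "`"

  // Arguments for selectExpr: the quoted names, or "*" when nothing was requested.
  def exprsOrStar(fields: Seq[String]): Seq[String] =
    if (fields.nonEmpty) fields.map(quoted) else Seq("*")
}
```

It would be used as df.selectExpr(SelectExprArgs.exprsOrStar(fieldsToLoad): _*). Declaring the parameter as Seq[String] also sidesteps the List[Nothing] type that a bare List() infers in the REPL session above.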
