
Select columns from a dataframe into another dataframe based on column datatype in Apache Spark Scala

I have a Spark dataframe:

inputDF: org.apache.spark.sql.DataFrame = [_id: string, Frequency: double, Monterary: double, Recency: double, CustID: string]

root
 |-- _id: string (nullable = false)
 |-- Frequency: double (nullable = false)
 |-- Monterary: double (nullable = false)
 |-- Recency: double (nullable = false)
 |-- CustID: string (nullable = false)

I want to create a new dataframe by dropping the string columns from this one. A specific condition is that I must not iterate over the column names. Does anyone have any idea?

If the schema is flat and contains only simple types, you can filter over the fields, but unless you have a crystal ball you cannot really avoid iteration:

import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.functions.col

// Build the projection from the schema: drop StringType fields, keep the rest.
df.select(df.schema.fields.flatMap(f => f.dataType match {
  case StringType => Nil           // skip string columns
  case _ => col(f.name) :: Nil     // keep all other columns
}): _*)
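
For comparison, here is a minimal sketch of the same idea using dtypes, which returns (columnName, typeString) pairs; the type string for string columns is "StringType". (df stands for any input dataframe, e.g. inputDF above.)

import org.apache.spark.sql.functions.col

// dtypes yields pairs such as ("CustID", "StringType"); keep every
// column whose type string is not "StringType".
val nonStringCols = df.dtypes.collect {
  case (name, dt) if dt != "StringType" => col(name)
}
df.select(nonStringCols: _*)
// Applied to inputDF above, the result schema would be
// [Frequency: double, Monterary: double, Recency: double]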

