How to dynamically create a struct column from a list of column names?

I have a DataFrame with hundreds of columns:

root
 |-- column1
 |-- column2
 |-- column3
 |-- column4
 |-- column5

I have a list of column names:

struct_list = ['column4','column3','column2']

Expected schema:

root
 |-- column1
 |-- column2
 |-- column3
 |-- column4
 |-- column5
 |-- prev_val 
       |-- column4
       |-- column3
       |-- column2

Currently I am hardcoding the columns like this:

df = df.withColumn("prev_val", f.struct(f.col("column4"), f.col("column3"), f.col("column2")))

Is there a way to pass the column names from the list dynamically?

You can use a list comprehension and unpack it into f.struct:

import pyspark.sql.functions as f

struct_list = ['column4','column3','column2']

df2 = df.withColumn(
    "prev_val",
    f.struct(*[f.col(c) for c in struct_list])
)
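
The star in front of the list comprehension is ordinary Python argument unpacking: each element of the list is passed as its own positional argument. A minimal pure-Python sketch of that mechanism, where collect_args is a hypothetical stand-in written for illustration and is not part of PySpark:

```python
def collect_args(*cols):
    """Hypothetical variadic function; returns the positional arguments it received."""
    return cols


struct_list = ['column4', 'column3', 'column2']

# Without the star, the whole list arrives as a single argument.
as_one = collect_args(struct_list)

# With the star, every element becomes a separate positional argument,
# equivalent to collect_args('column4', 'column3', 'column2').
as_many = collect_args(*struct_list)

print(len(as_one), len(as_many))
```

The order of the list is preserved, so the struct fields come out in the order given in struct_list.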

In fact, you don't even need f.col. You can pass the column names directly:

df2 = df.withColumn(
    "prev_val",
    f.struct(*struct_list)
)
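
With hundreds of columns, a typo in the list only surfaces when Spark analyzes the query. One possible guard, as a sketch only: the helpers missing_columns and with_struct below are hypothetical names introduced here, not part of PySpark, and they assume df is a PySpark DataFrame whose df.columns lists its top-level column names.

```python
def missing_columns(wanted, available):
    """Return the names in `wanted` that are absent from `available`, in order."""
    available = set(available)
    return [c for c in wanted if c not in available]


def with_struct(df, new_col, cols):
    """Hypothetical helper: add struct column `new_col` built from `cols`."""
    # Imported lazily so missing_columns can be used without Spark installed.
    import pyspark.sql.functions as f

    missing = missing_columns(cols, df.columns)
    if missing:
        raise ValueError("columns not in DataFrame: " + ", ".join(missing))
    # Same call as in the answer above, with the names unpacked from the list.
    return df.withColumn(new_col, f.struct(*cols))
```

Usage would then look like df2 = with_struct(df, "prev_val", struct_list).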

