

how to create schema of a delta table in databricks by using column names from text file

I have a text file which stores all the column names. For example, the text file contains the following data:

table1=['column1','2_column2','3_column3']
table2=['column4','5_column5','6_column6']

I need to fetch all the values and create a schema for each table mentioned in the text file. Also, some of the column names start with a number, as shown above. Output needed:

table1 = StructType([
        StructField("column1", StringType(), True),
        StructField("2_column2", StringType(), True),
        StructField("3_column3", StringType(), True)
    ])

table2 = StructType([
        StructField("column4", StringType(), True),
        StructField("5_column5", StringType(), True),
        StructField("6_column6", StringType(), True)
    ])

All of the columns will be of string type.

How can I achieve this using python/pyspark?

from pyspark.sql.types import StructType, StructField, StringType

table1 = ['column1', '2_column2', '3_column3']
fields = []
for i in table1:
    # build real StructField objects, not string representations of them
    fields.append(StructField(i, StringType(), True))
table1_schema = StructType(fields)
print(table1_schema)
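The text file itself still needs to be parsed before the loop above can run. A minimal sketch, assuming each line of the file has the exact form shown in the question (`name=['col1','col2',...]`); `ast.literal_eval` safely evaluates the right-hand side into a Python list, and the helper name `parse_table_columns` is hypothetical:

```python
import ast

def parse_table_columns(text):
    """Parse lines like table1=['column1','2_column2'] into a dict
    mapping table name -> list of column names."""
    tables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or '=' not in line:
            continue  # skip blank or malformed lines
        name, _, cols = line.partition('=')
        # literal_eval turns the string "['column1','2_column2']" into a list
        tables[name.strip()] = ast.literal_eval(cols.strip())
    return tables

# stand-in for reading the actual file, e.g. open('tables.txt').read()
text = """table1=['column1','2_column2','3_column3']
table2=['column4','5_column5','6_column6']"""

tables = parse_table_columns(text)
```

Each resulting list can then be fed into the StructField loop above, or built in one line per table: `StructType([StructField(c, StringType(), True) for c in cols])`. Column names beginning with a digit are valid in a StructField; they only need backtick quoting when referenced in Spark SQL expressions.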

