
Extract schema labels from a pyspark dataframe

From a pyspark dataframe, I want to create a Python list containing the schema labels at a specific schema "level".

The schema is:

root
 |-- DISPLAY: struct (nullable = true)
 |    |-- 1WO: struct (nullable = true)
 |    |    |-- JPY: struct (nullable = true)
 |    |    |    |-- CHANGE24HOUR: string (nullable = true)
 |    |    |    |-- CHANGEDAY: string (nullable = true)
 |    |-- AAVE: struct (nullable = true)
 |    |    |-- JPY: struct (nullable = true)
 |    |    |    |-- CHANGE24HOUR: string (nullable = true)
 |    |    |    |-- CHANGEDAY: string (nullable = true)

The expected output is:

list = 1WO, AAVE

The following code prints everything in the schema:

df.schema.jsonValue()

Is there an easy way to extract those labels?

Select the first layer using the asterisk notation, and then list the columns:

df.select('DISPLAY.*').columns
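
If you would rather work from the output of df.schema.jsonValue() mentioned in the question, the same labels can be read by walking that dict. The following is a minimal sketch, not part of the original answer: the helper name labels_at and the hand-written sample dict are hypothetical, and the sample only mirrors the layout that jsonValue() returns for the schema shown above (each struct has a "fields" list, and each field has a "name" and a "type").

```python
def labels_at(schema_json, path):
    """Return the field names of the struct reached by following `path`.

    schema_json: dict in the shape returned by df.schema.jsonValue().
    path: list of field names leading to the struct, e.g. ["DISPLAY"].
    """
    node = schema_json
    for name in path:
        # Look up the child field by name, then descend into its type.
        field = next((f for f in node["fields"] if f["name"] == name), None)
        if field is None:
            raise KeyError(f"no field named {name!r}")
        node = field["type"]
        # Primitive types are plain strings such as "string"; only structs are dicts.
        if not isinstance(node, dict) or node.get("type") != "struct":
            raise ValueError(f"{name!r} is not a struct")
    return [f["name"] for f in node["fields"]]


# Hypothetical stand-in for df.schema.jsonValue(), mirroring the schema above.
def _struct(fields):
    return {"type": "struct", "fields": fields}


def _field(name, type_):
    return {"name": name, "type": type_, "nullable": True, "metadata": {}}


_jpy = _struct([_field("CHANGE24HOUR", "string"), _field("CHANGEDAY", "string")])
sample_schema = _struct([
    _field("DISPLAY", _struct([
        _field("1WO", _struct([_field("JPY", _jpy)])),
        _field("AAVE", _struct([_field("JPY", _jpy)])),
    ])),
])

labels = labels_at(sample_schema, ["DISPLAY"])
print(labels)  # ['1WO', 'AAVE']
```

With a real dataframe you would call labels_at(df.schema.jsonValue(), ["DISPLAY"]). A longer path such as ["DISPLAY", "1WO"] returns the labels one level deeper. For a single level, the select-based one-liner above is shorter; the helper is mainly useful when the level is chosen at runtime.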
