Extract JSON column using spark.sql in Azure Synapse notebook
I have a Parquet file as my source, which I loaded in a PySpark notebook as follows:
df_Employee = spark.read.parquet(<filename>)
df_Employee.createOrReplaceTempView("employee_data")
This is what the table looks like:
Employee Table:
-Name
-Salary
-Company
-Address (datatype=string)
--street.name
--street.number
--postalcode
-JoiningDate
I have the following code, but I am stuck on how to extract street.name and street.number from the SQL table above. This is what I have so far:
df=spark.sql(f'''
select Name, Salary, Company, json_extract(Address,'$."street.name"') as StreetName
from employee_data
''')
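However, json_extract(Address,'$."street.name"') as StreetName throws an error. How can I extract this nested JSON field?
I reproduced this in my environment by creating a sample DataFrame based on the Employee table: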
dat1= [("vamsi", 20000, "MID", '{"street.name": "App socity", "street.number": "912", "postalcode": "523112"}', "2023-01-20"),
("rakesh",30000, "MID", '{"street.name": "Mind space", "street.number": "456", "postalcode": "600062"}', "2023-01-19")]
col = ["Name", "Salary", "Company", "Address", "JoiningDate"]
df1 = spark.createDataFrame(dat1, col)
df1.createOrReplaceTempView("sample_table")
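You can achieve the requirement with the following code, which uses json_tuple to pull both keys out of the Address string: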
df1 = spark.sql(f'''select Name, Salary, Company, json_tuple(Address, 'street.name', 'street.number') as (StreetName, StreetNumber)
from sample_table''')
df1.show()
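To make the per-row behaviour concrete, here is a minimal plain-Python sketch (no Spark involved) of what the query above asks json_tuple to do with each Address string. The helper json_tuple_py is a hypothetical stand-in written for illustration, not part of any library, and it only approximates the real function (for instance, it does not convert non-string values to text). The point it shows is that "street.name" is looked up as one literal top-level key, not as a path into a nested "street" object.

```python
import json

# Address strings copied from the sample data above.
addresses = [
    '{"street.name": "App socity", "street.number": "912", "postalcode": "523112"}',
    '{"street.name": "Mind space", "street.number": "456", "postalcode": "600062"}',
]


def json_tuple_py(json_str, *keys):
    """Hypothetical stand-in for json_tuple: exact-name lookup of top-level keys.

    Returns None for every key when the input is not a JSON object.
    """
    try:
        obj = json.loads(json_str)
    except (TypeError, ValueError):
        return tuple(None for _ in keys)
    if not isinstance(obj, dict):
        return tuple(None for _ in keys)
    # The dot in "street.name" is part of the key itself, so a plain
    # dictionary lookup is enough; no path traversal takes place.
    return tuple(obj.get(key) for key in keys)


rows = [json_tuple_py(a, "street.name", "street.number") for a in addresses]
for street_name, street_number in rows:
    print(street_name, street_number)
```

If the Address values really contained a nested "street" object instead of dotted keys, a flat lookup like this would return None for both fields, and a path-based approach would be needed instead.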