How to convert a PySpark DataFrame to a JSON list without double quotes for numbers and extra \/ in date columns
I am trying to convert a PySpark DataFrame to a JSON list so that I can pass the JSON values to an API. When I convert it, every value comes out wrapped in double quotes: for example, value = 12 becomes value = "12", and dates come out as 3\/7\/2022. How can I avoid these extra backslashes and double quotes?
I am using the code below:
loan = loan_detail_df.toPandas()
results = loan.to_json(orient='records')
My output is:
[{"Lead_Number":"","Number":"123456","user1_DOB":"3\/20\/1943","ExpectedRate":"0"}]
My desired output:
{"Lead_Number":'',"Number":123456,"user1_DOB":"3/20/1943","ExpectedRate":0}
If you use the following syntax:
df.write.json(<storage path>)
The output will look like this if the date column is in DateType() format:
{"number":1,"date":"2021-03-22"}
If the date column is in StringType() format with / as the separator between date components, the output will be:
{"number":1,"date":"2022/03/12"}
To get the JSON string values to send to an API, you can use the pyspark.sql.functions.to_json() transformation.
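import pyspark.sql.functions as f
from pyspark.sql import SparkSession

# Reuse the active session, or create one when running standalone
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([
    (1, '2022/03/12')
], ['number', 'date'])

# Pack the columns into a struct and serialize each row as a JSON string
df = (
    df
    .withColumn('json_value', f.to_json(f.struct(f.col('number'), f.col('date'))))
)
print(df.collect()[0]['json_value'])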
And the output is:
{"number":1,"date":"2022/03/12"}
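If the source columns are strings (which the quoted "123456" in the question suggests, though that is an assumption), numeric values will still be serialized with quotes. One possible workaround is to convert the collected rows in plain Python before serializing them with the standard json module, which does not escape forward slashes. The sketch below is only an illustration: coerce_number and rows_to_json are hypothetical helper names, and the rows list is a stand-in for something like [row.asDict() for row in loan_detail_df.collect()].

```python
import json


def coerce_number(value):
    """Return an int or float for numeric-looking strings, else the value unchanged."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return value  # not numeric (for example an empty string): leave as-is


def rows_to_json(rows, numeric_keys):
    """Serialize a list of dicts, converting only the named keys to numbers."""
    cleaned = [
        {k: coerce_number(v) if k in numeric_keys else v for k, v in row.items()}
        for row in rows
    ]
    # Compact separators match the style of the output shown in the question
    return json.dumps(cleaned, separators=(",", ":"))


# Stand-in for rows collected from the DataFrame (assumed to be all strings)
rows = [
    {"Lead_Number": "", "Number": "123456", "user1_DOB": "3/20/1943", "ExpectedRate": "0"}
]

payload = rows_to_json(rows, numeric_keys={"Number", "ExpectedRate"})
print(payload)
# [{"Lead_Number":"","Number":123456,"user1_DOB":"3/20/1943","ExpectedRate":0}]

# "\/" is a legal JSON escape for "/", so both forms decode to the same string
print(json.loads('"3\\/20\\/1943"') == "3/20/1943")  # True
```

Because \/ and / decode to the same character, an API that parses the JSON properly should treat the pandas output and this output identically for the date field; only the quoted numbers change the parsed types. Collecting rows to the driver is only reasonable for small DataFrames.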