Convert data frame to nested JSON in Python
I'm trying to convert df to nested JSON with the following code:
nested_json = (df.groupby(['prediction_probability','id','ts','prediction_value'], as_index=False)
    # 'records' spelled out: the 'r' shorthand is deprecated and rejected by newer pandas
    .apply(lambda x: x[[
        "first_create_date",
        "create_date",
        "update_timestamp",
        "revenue",
        "col",
        "x"]].to_dict('records'))
    .reset_index()
    .rename(columns={0:'features'})
    .to_json(orient='records'))
My problem is that the nested dict (key = 'features') is wrapped in square brackets. How can I avoid the square brackets? I know that I can treat my output as a string and strip the square brackets, but that is bad practice.
Output:
[
{
"pred": 0.50726,
"id": "0030X00002qMwFrQAKxxxx",
"ts": "2020-02-19T20:32:15.016586",
"value": "A",
"features": [
{
"first_create_date": 1582089665000,
"create_date": 1582089665000,
"update_timestamp": 1582142462000,
"revenue": null,
"col":"aaaa",
"x": null
}
]
},
{
"pred": 0.50895,
"id": "0030X00002qMvfHQASxxxxx",
"ts": "2020-02-19T20:32:15.016586",
"value": "A",
"features": [
{
"first_create_date": 1582077985000,
"create_date": 1582077985000,
"update_timestamp": 1582142462000,
"revenue": null,
"col":"aaaa",
"x": null
}
]
}
]
Desired output:
[
{
"pred": 0.50726,
"id": "0030X00002qMwFrQAKxxxx",
"ts": "2020-02-19T20:32:15.016586",
"value": "A",
"features":
{
"first_create_date": 1582089665000,
"create_date": 1582089665000,
"update_timestamp": 1582142462000,
"revenue": null,
"col":"aaaa",
"x": null
}
},
{
"pred": 0.50895,
"id": "0030X00002qMvfHQASxxxxx",
"ts": "2020-02-19T20:32:15.016586",
"value": "A",
"features":
{
"first_create_date": 1582077985000,
"create_date": 1582077985000,
"update_timestamp": 1582142462000,
"revenue": null,
"col":"aaaa",
"x": null
}
}
]
A simple dict comprehension will do the trick. Say you can reach a nested JSON shaped like your output, parsed into a list of dicts, and call it output. Then, to reach your desired output, the only thing you need to do is take the first element of each features list:
desired_output = [{k: (v[0] if k == 'features' else v) for k, v in x.items()} for x in output]
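For reference, here is a minimal sketch of the same idea applied to the JSON string that to_json returns, using only the standard library. The helper name unwrap_features and the sample string are made up for illustration; they are not from the question. It assumes each group holds exactly one row, and it leaves longer lists untouched rather than silently dropping rows.

```python
import json

def unwrap_features(records_json, key="features"):
    """Replace each single-element list stored under `key` with its only element.

    Hypothetical helper for illustration; lists with more than one element
    are left as they are so no rows are lost.
    """
    records = json.loads(records_json)
    result = []
    for record in records:
        flat = dict(record)  # shallow copy so the parsed input is not mutated
        value = flat.get(key)
        if isinstance(value, list) and len(value) == 1:
            flat[key] = value[0]
        result.append(flat)
    return result

# Made-up sample shaped like the output in the question
nested_json = '[{"id": "a", "features": [{"revenue": null, "col": "aaaa"}]}]'
desired_output = unwrap_features(nested_json)
print(json.dumps(desired_output, indent=2))
```

If a group can contain several rows, a list under features is the only lossless representation, so it is worth checking that before unwrapping.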