
How to remove redundant elements from a JSON string in Python

I have the JSON string below, which I converted from a pandas DataFrame.

[
   {
      "ID":"1",
      "Salary1":69.43,
      "Salary2":513.0,
      "Date":"2022-06-09",
      "Name":"john",
      "employeeId":12,
      "DateTime":"2022-09-0710:57:55"
   },
   {
      "ID":"2",
      "Salary1":691.43,
      "Salary2":5123.0,
      "Date":"2022-06-09",
      "Name":"john",
      "employeeId":12,
      "DateTime":"2022-09-0710:57:55"
   }
]

I want to change the above JSON to the format below.

[
   {
      "Date":"2022-06-09",
      "Name":"john",
      "DateTime":"2022-09-0710:57:55",
      "employeeId":12,
      "Results":[
         {
            "ID":1,
            "Salary1":69.43,
            "Salary2":513
         },
         {
            "ID":"2",
            "Salary1":691.43,
            "Salary2":5123
         }
      ]
   }
]

Kindly let me know how we can achieve this in Python.

Original DataFrame:

ID  Salary1  Salary2  Date        Name  employeeId  DateTime
1   69.43    513.0    2022-06-09  john  12          2022-09-0710:57:55
2   691.43   5123.0   2022-06-09  john  12          2022-09-0710:57:55

Thank you.

As @Harsha pointed out, you can adapt one of the answers from another question, with just a few minor tweaks to make it work for OP's case:

(
  df.groupby(["Date","Name","DateTime","employeeId"])[["ID","Salary1","Salary2"]]

    # to_dict(orient="records") - returns list of rows, where each row is a dict,
    # "oriented" like [{column -> value}, … , {column -> value}]
    .apply(lambda x: x.to_dict(orient="records")) 

    # groupby(...).apply(...) returns a Series with the grouping columns as its
    # index and the lists of dicts as its values.
    # That structure is not suitable for the next to_dict() call,
    # so reset_index() turns the Series into a DataFrame whose columns are the
    # former index levels, naming the values column "Results" along the way.
    .reset_index(name="Results")

    # Finally we can achieve the desired structure with the last call to to_dict():
    .to_dict(orient="records")
)
# [{'Date': '2022-06-09', 'Name': 'john', 'DateTime': '2022-09-0710:57:55', 'employeeId': 12, 
# 'Results': [
#   {'ID': 1, 'Salary1': 69.43, 'Salary2': 513.0}, 
#   {'ID': 2, 'Salary1': 691.43, 'Salary2': 5123.0}
# ]}]
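
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same approach. It assumes you start from the flat JSON string in the question rather than from an existing DataFrame. The names flat_json, group_cols, detail_cols, nested and _to_builtin are made up for illustration and are not part of the original answer. Because the question's JSON stores ID as a string, ID stays a string in the result here.

```python
import json

import pandas as pd

# The flat records from the question, as a JSON string.
flat_json = """
[
  {"ID": "1", "Salary1": 69.43, "Salary2": 513.0, "Date": "2022-06-09",
   "Name": "john", "employeeId": 12, "DateTime": "2022-09-0710:57:55"},
  {"ID": "2", "Salary1": 691.43, "Salary2": 5123.0, "Date": "2022-06-09",
   "Name": "john", "employeeId": 12, "DateTime": "2022-09-0710:57:55"}
]
"""

# Rebuild the DataFrame from the parsed records.
df = pd.DataFrame(json.loads(flat_json))

group_cols = ["Date", "Name", "DateTime", "employeeId"]  # shared per group
detail_cols = ["ID", "Salary1", "Salary2"]               # nested under "Results"

nested = (
    # sort=False keeps groups in order of first appearance.
    df.groupby(group_cols, sort=False)[detail_cols]
    # One list of row dicts per group.
    .apply(lambda g: g.to_dict(orient="records"))
    # Turn the grouping index back into columns; name the values "Results".
    .reset_index(name="Results")
    .to_dict(orient="records")
)


def _to_builtin(obj):
    # Fallback (an assumption for older pandas versions): convert NumPy
    # scalars that json cannot serialize into plain Python values.
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


nested_json = json.dumps(nested, indent=3, default=_to_builtin)
print(nested_json)
```

If you already have the DataFrame, skip the json.loads step and start from the groupby call. The final json.dumps is only needed if you want a JSON string rather than a list of Python dicts.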
