
Convert Pandas dataframe to nested JSON (without nesting as lists)

I'm aware there are other threads on this topic, but I'm running into an issue that no other answer seems to address.

Given the following Pandas dataframe:

a  b  c  d
a1 b1 c1 d1
a2 b2 c2 d2

I would like to create a nested JSON object with the following structure:

[
    {
        "a": "a1",
        "b": "b1",
        "nested_group":
            {
                "c": "c1",
                "d": "d1"
            }
    },
    {
        "a": "a2",
        "b": "b2",
        "nested_group":
            {
                "c": "c2",
                "d": "d2"
            }
    }
]

The solution I've found in other threads is to use the following code:

j = (df.groupby(['a','b']) 
      .apply(lambda x: x[['c','d']].to_dict('records')) 
      .reset_index() 
      .rename(columns={0:'nested_group'}) 
      .to_json(orient='records'))

However, the issue I'm running into is that the above code places each nested_group object inside square brackets, like so:

"nested_group": [
    {
        "c": "c2",
        "d": "d2"
    }
]

The object I'm trying to generate is intended to be the payload for an API call, and unfortunately the square brackets around each inner dictionary cause the API to return an error. (The outermost brackets at the very top/bottom of the object are fine.) How can I make Python NOT treat the inner dictionaries as lists?

Code to reproduce the example dataframe:

import numpy as np
import pandas as pd

array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
df = pd.DataFrame(data=array, columns=['a','b','c','d'])

Thank you in advance :)

Try building each record directly with a list comprehension:

out = [{'a':x['a'],'b':x['b'],'nested_group':x[['c','d']].to_dict()} for _,x in df.iterrows() ]
Out[284]: 
[{'a': 'a1', 'b': 'b1', 'nested_group': {'c': 'c1', 'd': 'd1'}},
 {'a': 'a2', 'b': 'b2', 'nested_group': {'c': 'c2', 'd': 'd2'}}]
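If the API needs an actual JSON string rather than a Python list, the same idea can be sketched without iterrows. This is only an illustration: the helper name nest_columns and its parameters are made up here, and the dataframe is rebuilt from plain Python lists so the snippet runs on its own.

```python
import json

import pandas as pd

# Same example data as in the question, built from plain lists.
df = pd.DataFrame(
    [["a1", "b1", "c1", "d1"], ["a2", "b2", "c2", "d2"]],
    columns=["a", "b", "c", "d"],
)


def nest_columns(frame, nested_cols, key="nested_group"):
    """Hypothetical helper: move `nested_cols` of each row under `key` as a plain dict."""
    records = []
    for row in frame.to_dict("records"):  # one flat dict per row
        nested = {col: row.pop(col) for col in nested_cols}
        row[key] = nested  # a dict, not a one-element list
        records.append(row)
    return records


# Serialize with the standard library; the inner objects carry no square brackets.
payload = json.dumps(nest_columns(df, ["c", "d"]))
print(payload)
```

Because each row becomes exactly one record, this keeps every row even when the (a, b) values repeat, which may or may not be what the API expects.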

Looking at the docs for to_dict, it seems we still have to use the records option. If we assume each group always yields a list of exactly one element (that is, each (a, b) pair appears in only one row), we can just take the 0th element in your original code:

>>> import numpy as np
>>> import pandas as pd
>>> array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
>>> df = pd.DataFrame(data=array, columns=['a','b','c','d'])
>>> (df.groupby(['a','b']) 
      .apply(lambda x: x[['c','d']].to_dict('records')[0]) 
      .reset_index() 
      .rename(columns={0:'nested_group'}) 
      .to_json(orient='records'))
'[{"a":"a1","b":"b1","nested_group":{"c":"c1","d":"d1"}},{"a":"a2","b":"b2","nested_group":{"c":"c2","d":"d2"}}]'
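The one-element assumption is worth checking before relying on [0], since any extra rows in a group would be dropped silently. A minimal sketch of such a check, using made-up data in which the (a1, b1) pair appears twice:

```python
import pandas as pd

# Hypothetical data: the (a1, b1) pair appears in two rows.
df = pd.DataFrame(
    [["a1", "b1", "c1", "d1"], ["a1", "b1", "c9", "d9"]],
    columns=["a", "b", "c", "d"],
)

# Rows whose (a, b) pair occurs more than once; keep=False marks every copy.
repeated = df[df.duplicated(subset=["a", "b"], keep=False)]

if not repeated.empty:
    print("Some (a, b) groups have several rows; [0] would keep only the first:")
    print(repeated)
```

If this prints anything for your real data, the nested_group for that pair cannot be a single object without deciding which row to keep.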
