Convert a flattened Excel table to nested JSON in pandas

I am fairly new to this and have spent the entire day reading numerous posts, trying to figure out how I can convert this flattened Excel table to nested JSON. Here is an example of the flattened table, as a dict:

    {'Sample': {0: '1A',
  1: '1A',
  2: '1A',
  3: '1A',
  4: '1A',
  5: '1A',
  6: '1A',
  7: '2A',
  8: '2A',
  9: '2A',
  10: '2A',
  11: '2A',
  12: '2A',
  13: '2A'},
 'Substance category': {0: 'Additive',
  1: 'Additive',
  2: 'Alkali',
  3: 'Alkali',
  4: 'Alkali',
  5: 'Alkali',
  6: 'Alkali',
  7: 'Additive',
  8: 'Additive',
  9: 'Alkali',
  10: 'Alkali',
  11: 'Alkali',
  12: 'Alkali',
  13: 'Alkali'},
 'Substance': {0: 'Irgafos 168',
  1: 'Alkylphenylphosphate',
  2: 'Calcium',
  3: 'Kalium',
  4: 'Lithium',
  5: 'Magnesium',
  6: 'Natrium',
  7: 'Irgafos 168',
  8: 'Alkylphenylphosphate',
  9: 'Calcium',
  10: 'Kalium',
  11: 'Lithium',
  12: 'Magnesium',
  13: 'Natrium'},
 'Value': {0: 0,
  1: 0,
  2: 2,
  3: 2,
  4: 1,
  5: 2,
  6: 3,
  7: 2,
  8: 3,
  9: 2,
  10: 3,
  11: 1,
  12: 2,
  13: 3}}

The table looks like this: [image: Sample table]

I used the following code, taken from this answer, to get nested JSON.

j = (df.groupby(['Sample','Substance category'])
       .apply(lambda x: x[['Substance','Value']].to_dict('records'))
       .reset_index()
       .rename(columns={0:'Substance'})
       .to_json(orient='records'))

I am getting the following JSON.

[
  {
    "Sample": "1A",
    "Substance": [
      {
        "Substance": "Irgafos 168",
        "Value": 0
      },
      {
        "Substance": "Alkylphenylphosphate",
        "Value": 0
      }
    ],
    "Substance category": "Additive"
  },
  {
    "Sample": "1A",
    "Substance": [
      {
        "Substance": "Calcium",
        "Value": 2
      },
      {
        "Substance": "Kalium",
        "Value": 2
      },
      {
        "Substance": "Lithium",
        "Value": 1
      },
      {
        "Substance": "Magnesium",
        "Value": 2
      },
      {
        "Substance": "Natrium",
        "Value": 3
      }
    ],
    "Substance category": "Alkali"
  },
  {
    "Sample": "2A",
    "Substance": [
      {
        "Substance": "Irgafos 168",
        "Value": 2
      },
      {
        "Substance": "Alkylphenylphosphate",
        "Value": 3
      }
    ],
    "Substance category": "Additive"
  },
  {
    "Sample": "2A",
    "Substance": [
      {
        "Substance": "Calcium",
        "Value": 2
      },
      {
        "Substance": "Kalium",
        "Value": 3
      },
      {
        "Substance": "Lithium",
        "Value": 1
      },
      {
        "Substance": "Magnesium",
        "Value": 2
      },
      {
        "Substance": "Natrium",
        "Value": 3
      }
    ],
    "Substance category": "Alkali"
  }
]

However, what I actually want is to define an additional level for the 'Substance category'. Despite all my efforts, I could not figure that out, and none of the answers I found helped.

Thank you very much in advance.

This would be my process:

  • Convert the dict to a dataframe.
  • Write the dataframe to JSON with to_json().

So the code looks like this:

#%%
import pandas as pd

d = {'Sample': {0: '1A',
  1: '1A',
  2: '1A',
  3: '1A',
  4: '1A',
  5: '1A',
  6: '1A',
  7: '2A',
  8: '2A',
  9: '2A',
  10: '2A',
  11: '2A',
  12: '2A',
  13: '2A'},
 'Substance category': {0: 'Additive',
  1: 'Additive',
  2: 'Alkali',
  3: 'Alkali',
  4: 'Alkali',
  5: 'Alkali',
  6: 'Alkali',
  7: 'Additive',
  8: 'Additive',
  9: 'Alkali',
  10: 'Alkali',
  11: 'Alkali',
  12: 'Alkali',
  13: 'Alkali'},
 'Substance': {0: 'Irgafos 168',
  1: 'Alkylphenylphosphate',
  2: 'Calcium',
  3: 'Kalium',
  4: 'Lithium',
  5: 'Magnesium',
  6: 'Natrium',
  7: 'Irgafos 168',
  8: 'Alkylphenylphosphate',
  9: 'Calcium',
  10: 'Kalium',
  11: 'Lithium',
  12: 'Magnesium',
  13: 'Natrium'},
 'Value': {0: 0,
  1: 0,
  2: 2,
  3: 2,
  4: 1,
  5: 2,
  6: 3,
  7: 2,
  8: 3,
  9: 2,
  10: 3,
  11: 1,
  12: 2,
  13: 3}}


# make dataframe
df = pd.DataFrame(d)

# %%  write to JSON
json_path = "C:\\test\\test.json"
df.to_json(json_path)
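Note that to_json() without an orient argument writes a DataFrame column by column, so the file above is keyed by column name and then by row index. A small sketch of the difference from orient='records', using a made-up two-row frame (the name `small` is only for illustration):

```python
import json
import pandas as pd

# Hypothetical two-row frame, reusing two column names from the question.
small = pd.DataFrame({"Sample": ["1A", "2A"], "Value": [0, 2]})

# Default orient for a DataFrame is "columns": {column -> {index -> value}}.
by_columns = json.loads(small.to_json())

# orient="records" gives one object per row: [{column -> value}, ...].
by_records = json.loads(small.to_json(orient="records"))

print(by_columns)
print(by_records)
```

Neither layout is nested by group; they only differ in whether rows or columns come first.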

The dataframe (before it is written to JSON) looks like this:

[image]

You can manipulate the dataframe as you wish from here.

Are you asking how to create a multilevel dataframe? If so, the final part is answered here:

How to create a multilevel dataframe in pandas?
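For reference, a minimal sketch of what a multilevel (MultiIndex) dataframe could look like for this data, assuming the three grouping columns become index levels. The four-row frame is a shortened stand-in for the question's table, and the names `multi` and `additives_1a` are only illustrative:

```python
import pandas as pd

# Shortened stand-in for the question's table (same column names).
df = pd.DataFrame({
    "Sample": ["1A", "1A", "1A", "2A"],
    "Substance category": ["Additive", "Additive", "Alkali", "Alkali"],
    "Substance": ["Irgafos 168", "Alkylphenylphosphate", "Calcium", "Calcium"],
    "Value": [0, 0, 2, 2],
})

# Move the grouping columns into a three-level index and sort it.
multi = (df.set_index(["Sample", "Substance category", "Substance"])
           .sort_index())

# Select everything under Sample "1A" / category "Additive".
additives_1a = multi.xs(("1A", "Additive"))

print(multi)
print(additives_1a)
```

This gives hierarchical row labels inside pandas; it does not by itself decide how the levels are written out to JSON.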

Well, creating a multilevel df was not a problem. But when I exported it to JSON, the nested structure of the indexes was not preserved. Anyway, I finally found an answer here (link). It was just a matter of searching on Google with the right keywords.
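For later readers, one way to get the extra 'Substance category' level is to group twice and build the nesting by hand. This is only a sketch and not necessarily what the linked answer does. The key names "Categories" and "Substances" and the helper `to_nested` are assumptions made up for this example, and the frame is a shortened stand-in for the question's table:

```python
import json
import pandas as pd

# Shortened stand-in for the question's table (same column names).
df = pd.DataFrame({
    "Sample": ["1A", "1A", "1A", "2A"],
    "Substance category": ["Additive", "Additive", "Alkali", "Alkali"],
    "Substance": ["Irgafos 168", "Alkylphenylphosphate", "Calcium", "Calcium"],
    "Value": [0, 0, 2, 2],
})

def to_nested(frame):
    """Build Sample -> categories -> substances as plain Python objects."""
    result = []
    # Outer level: one entry per sample, in order of first appearance.
    for sample, sample_df in frame.groupby("Sample", sort=False):
        categories = []
        # Inner level: one entry per substance category within the sample.
        for category, cat_df in sample_df.groupby("Substance category", sort=False):
            # Round-trip through to_json so values are plain JSON types.
            substances = json.loads(
                cat_df[["Substance", "Value"]].to_json(orient="records")
            )
            categories.append({
                "Substance category": category,
                "Substances": substances,   # hypothetical key name
            })
        result.append({"Sample": sample, "Categories": categories})  # hypothetical key name
    return result

nested = to_nested(df)
print(json.dumps(nested, indent=2))
```

The explicit loops are slower than a single groupby().apply() on large tables, but they make each nesting level visible and easy to rename.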
