How do I use Pandas to convert an Excel file to a nested JSON?

I am a novice programmer trying to convert an Excel file into nested JSON using Pandas.

Below are my code and the expected output, which I have not been able to produce so far. The problem is that the Excel columns I transform into nested data should fall under the key "addresses", and I can't figure out how to do that. I would be grateful for any advice.

This is what the Excel file looks like:

[image: screenshot of the Excel file]

import pandas as pd
import json

df = pd.read_excel("...")  # recent pandas versions reject an encoding argument here
df.fillna('', inplace = True)

def get_nested_entry(key, grp):
    entry = {}
    entry['Forename'] = key[0]
    entry['Middle Name'] = key[1]
    entry['Surname'] = key[2]

    for field in ['Address - Country']:
        entry[field] = list(grp[field].unique())
    return entry

entries = []
for key, grp in df.groupby(['Forename', 'Middle Name', 'Surname']):
    entry = get_nested_entry(key, grp)
    entries.append(entry)

print(entries)
# Plain UTF-8: the BOM written by "utf-8-sig" can trip up JSON parsers
with open("excel_to_json_output.json", "w", encoding = "utf-8") as f:
    json.dump(entries, f, indent = 4)

This is the expected outcome:

[
    {
        "firstName": "Angela",
        "middleName": "L.",
        "lastName": "Johnson",
        "addresses": [
            {
                "postcode": "32807",
                "city": "Orlando",
                "state": "FL",
                "country": "United States of America"
            }
        ],

What I get is this:

[
    {
        "Forename": "Angela",
        "Middle Name": "L.",
        "Surname": "Johnson",
        "Address - Country": [
            "United States of America"
        ]
    },

Try this:

import pandas as pd

# Sample data standing in for the Excel file
b = {'First_Name': ["Angela","Peter","John"],
     'Middle_Name': ["L","J","A"],
     'Last_Name': ["Johnson","Roth","Williams"],
     'City': ["chicago","seattle","st.loius"],
     'state': ["IL","WA","MO"],
     'zip': [60007,98105,63115],
     'country': ["USA","USA","USA"]}

df = pd.DataFrame(b)

# The first three columns hold the name; the remaining columns hold the address
predict = df.iloc[:,:3].to_dict(orient='records')
postdict = df.iloc[:,3:].to_dict(orient='records')

# Nest each row's address dict in a list under the "addresses" key
entities = []
for tm, address in zip(predict, postdict):
    tm["addresses"] = [address]
    entities.append(tm)

Output:

[{'First_Name': 'Angela',
  'Middle_Name': 'L',
  'Last_Name': 'Johnson',
  'addresses': [{'City': 'chicago',
    'state': 'IL',
    'zip': 60007,
    'country': 'USA'}]},
 {'First_Name': 'Peter',
  'Middle_Name': 'J',
  'Last_Name': 'Roth',
  'addresses': [{'City': 'seattle',
    'state': 'WA',
    'zip': 98105,
    'country': 'USA'}]},
 {'First_Name': 'John',
  'Middle_Name': 'A',
  'Last_Name': 'Williams',
  'addresses': [{'City': 'st.loius',
    'state': 'MO',
    'zip': 63115,
    'country': 'USA'}]}]
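
If one person can appear on several rows with different addresses, the same idea can be combined with the `groupby` loop from the question. The following is only a minimal sketch under stated assumptions: the real spreadsheet is not shown, so every "Address - ..." column name except "Address - Country" is a guess, the sample rows are made up for illustration, and the `NAME_KEYS` / `ADDRESS_KEYS` mappings are hypothetical helpers that rename columns to the keys in the expected output.

```python
import json

import pandas as pd

# Hypothetical sample data. Only "Forename", "Middle Name", "Surname" and
# "Address - Country" appear in the question; the other column names are assumed.
df = pd.DataFrame({
    "Forename": ["Angela", "Angela", "Peter"],
    "Middle Name": ["L.", "L.", "J."],
    "Surname": ["Johnson", "Johnson", "Roth"],
    "Address - Postcode": ["32807", "60007", "98105"],
    "Address - City": ["Orlando", "Chicago", "Seattle"],
    "Address - State": ["FL", "IL", "WA"],
    "Address - Country": ["United States of America"] * 3,
})

# Hypothetical mappings from spreadsheet columns to the desired JSON keys.
NAME_KEYS = {
    "Forename": "firstName",
    "Middle Name": "middleName",
    "Surname": "lastName",
}
ADDRESS_KEYS = {
    "Address - Postcode": "postcode",
    "Address - City": "city",
    "Address - State": "state",
    "Address - Country": "country",
}

entries = []
# sort=False keeps people in the order they first appear in the sheet.
for key, grp in df.groupby(list(NAME_KEYS), sort=False):
    # key is a tuple (forename, middle name, surname) for this group.
    entry = dict(zip(NAME_KEYS.values(), key))
    # Every distinct address row of this person becomes one nested dict.
    entry["addresses"] = (
        grp[list(ADDRESS_KEYS)]
        .rename(columns=ADDRESS_KEYS)
        .drop_duplicates()
        .to_dict(orient="records")
    )
    entries.append(entry)

json_text = json.dumps(entries, indent=4)
print(json_text)
```

Note that `groupby` drops rows whose key columns contain NaN by default, so the `fillna('')` call from the question (or `dropna=False` in pandas 1.1 and later) still matters when middle names are missing. To write a file instead of a string, `json.dump(entries, f, indent=4)` works as in the question.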
