Flattening a nested JSON file into a pandas DataFrame

Question

I have this JSON file:

{
    "OrderMaster": {
        "Order": {
            "item": [{
                "row_id": "1-2LDPVI0",
                "sequence_id": "3851101",
                "end_date": "",
                "name": "TV-Discount",
                "orderable": "Y",
                "period": "",
                "period_uom": "",
                "phone_number_flag": "N",
                "price_type": "Recurring",
                "product_category": "mobilepackage",
                "product_sub_category": "Discount",
                "product_type_code": "Product",
                "type": "PhoneOrder",
                "vendor_part_number": "",
                "created_date": "2018-02-16 09:09:24",
                "created_by": "id123",
                "last_updated_date": "2020-09-14 09:39:24",
                "last_updated_by": "id123",
                "ts_event_notification_time": "2020-09-14 09:40:69",
                "OrderItems": {
                    "item": [{
                        "original_list_price": "0",
                        "order_list_id": "1-4ABU",
                        "order_list_name": "SEK Pricelist",
                        "product_id": "1-2LDPUKX",
                        "start_date": "2018-02-17 00:00:00"
                    },
                    {
                        "original_list_price": "45",
                        "order_list_id": "1-4AFU",
                        "order_list_name": "SEK Pricelist",
                        "product_id": "1-2LGSDFUKX",
                        "start_date": "2018-02-18 00:04:20"
                    }]
                }
            },
            {
                "row_id": "1-2LDPVI0",
                "sequence_id": "3851101",
                "end_date": "",
                "name": "TV-Discount",
                "orderable": "Y",
                "period": "",
                "period_uom": "",
                "phone_number_flag": "N",
                "price_type": "Recurring",
                "product_category": "mobilepackage",
                "product_sub_category": "Discount",
                "product_type_code": "Product",
                "type": "PhoneOrder",
                "vendor_part_number": "",
                "created_date": "2018-02-16 09:19:24",
                "created_by": "id123",
                "last_updated_date": "2020-09-15 09:39:24",
                "last_updated_by": "id123",
                "ts_event_notification_time": "2020-09-14 09:40:28",
                "OrderItems": {
                    "item": [{
                        "original_list_price": "42",
                        "order_list_id": "1-4ABU",
                        "order_list_name": "SEK Pricelist",
                        "product_id": "1-2LDPUKX",
                        "start_date": "2018-02-19 00:00:00"
                    },
                    {
                        "original_list_price": "42",
                        "order_list_id": "1-4ASU",
                        "order_list_name": "SEK Pricelist",
                        "product_id": "1-2LDDAKX",
                        "start_date": "2018-02-12 00:00:00"
                    },
                    {
                        "original_list_price": "43",
                        "order_list_id": "1-4FDBU",
                        "order_list_name": "SEK Pricelist",
                        "product_id": "1-2LDFSDFKX",
                        "start_date": "2018-02-11 00:00:00"
                    }]
                }
            }]
        }
    }
}

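This is what I want to achieve:

So far I have managed to get this, but I have a problem with the last nested column, "OrderItems". I managed to extract it, but I am struggling to work out how to join it back to the parent rows so that the result looks like the target.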

Answer 1

I managed to solve this by using json_normalize with the right set of parameters:

import json

import pandas as pd

# file_path is the path to the JSON file shown in the question
with open(file_path) as f:
    data = json.load(f)

# Define feature list for dataframe
features = [
    "row_id",
    "sequence_id",
    "end_date",
    "name",
    "orderable",
    "period",
    "period_uom",
    "phone_number_flag",
    "price_type",
    "product_category",
    "product_sub_category",
    "product_type_code",
    "type",
    "vendor_part_number",
    "created_date",
    "created_by",
    "last_updated_date",
    "last_updated_by",
    "ts_event_notification_time"
]

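# Create the dataframe with pandas' json_normalize:
# record_path walks down to the nested OrderItems -> item lists (one row each),
# meta repeats the parent-level fields on every row.
df = pd.json_normalize(
    data["OrderMaster"]["Order"]["item"],
    record_path=["OrderItems", "item"],
    meta=features,
)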
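To make the effect of those two arguments easier to see, here is a minimal self-contained sketch. It assumes a trimmed, hypothetical sample (the name `sample`, the ids "A" and "B", and the reduced set of fields are made up for illustration) that only mirrors the nesting of the JSON in the question.

```python
import pandas as pd

# Hypothetical, trimmed sample with the same nesting as the question's JSON.
sample = {
    "OrderMaster": {
        "Order": {
            "item": [
                {
                    "row_id": "A",
                    "name": "TV-Discount",
                    "OrderItems": {
                        "item": [
                            {"order_list_id": "1-4ABU", "original_list_price": "0"},
                            {"order_list_id": "1-4AFU", "original_list_price": "45"},
                        ]
                    },
                },
                {
                    "row_id": "B",
                    "name": "TV-Discount",
                    "OrderItems": {
                        "item": [
                            {"order_list_id": "1-4ASU", "original_list_price": "42"},
                        ]
                    },
                },
            ]
        }
    }
}

flat = pd.json_normalize(
    sample["OrderMaster"]["Order"]["item"],
    record_path=["OrderItems", "item"],  # one output row per nested item
    meta=["row_id", "name"],             # parent fields copied onto each row
)
print(flat)
```

In this sketch the two orders hold three nested items in total, so `flat` has three rows, and each row carries the `row_id` and `name` of the order it came from. If a nested key had the same name as a parent key, `json_normalize` also accepts `record_prefix` and `meta_prefix` to keep the columns apart.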

The result is one complete row of data per nested item, with the parent order fields repeated on each row.
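For comparison, the extract-then-join route described in the question can probably be made to produce the same shape. The following is only a sketch under assumptions: the list `sample_items` and its ids are hypothetical, every order is assumed to have at least one nested item, and `DataFrame.explode` is assumed to be available (pandas 0.25 or later).

```python
import pandas as pd

# Hypothetical stand-in for data["OrderMaster"]["Order"]["item"].
sample_items = [
    {
        "row_id": "A",
        "name": "TV-Discount",
        "OrderItems": {
            "item": [
                {"order_list_id": "1-4ABU", "original_list_price": "0"},
                {"order_list_id": "1-4AFU", "original_list_price": "45"},
            ]
        },
    },
    {
        "row_id": "B",
        "name": "TV-Discount",
        "OrderItems": {
            "item": [
                {"order_list_id": "1-4ASU", "original_list_price": "42"},
            ]
        },
    },
]

# Step 1: flatten the parent level; the nested list stays in one column
# named "OrderItems.item".
parents = pd.json_normalize(sample_items)

# Step 2: give every nested item its own row, repeating the parent fields.
exploded = parents.explode("OrderItems.item").reset_index(drop=True)

# Step 3: expand each nested dict into columns and join side by side.
children = pd.json_normalize(exploded["OrderItems.item"].tolist())
joined = pd.concat([exploded.drop(columns="OrderItems.item"), children], axis=1)
print(joined)
```

This takes three steps where `record_path` and `meta` take one, so the single `json_normalize` call is usually the simpler choice. The explode route may still be handy when the parent frame already exists. Note that an order with an empty nested list would explode to a missing value and would need handling before step 3.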

Answer 1, posted 2021-02-10 21:26:41 (score 0)
