
JSON data formatting using pandas

I have input data in this format (JSON):

[
    {
        "timestamp": "2019-05-25T00:00:00",
        "name": "sample_name",
        "keys": ["Field 1", "Field 2", "Field 3", "Field 4", "Field 5", "Field 6"],
        "values": ["value1", "value2", "value3", "value4", "value5", "value6"],
        "is_accepted": false
    },
    {
        "timestamp": "2018-05-26T00:00:00",
        "name": "sample_name",
        "keys": ["Field 1", "Field 2", "Field 3", "Field 4", "Field 5", "Field 6"],
        "values": ["value11", "value21", "value31", "value41", "value51", "value61"],
        "is_accepted": false
    }
]

and I need to reformat it as follows:

{
    "info": {
        "timeColumn": "date",
        "name": "sample_name",
        "segments": ["Field 1", "Field 2", "Field 3", "Field 4", "Field 5", "Field 6"]
    },
    "data": [
        {
            "Field 1": "value1",
            "Field 2": "value2",
            "Field 3": "value3",
            "Field 4": "value4",
            "Field 5": "value5",
            "Field 6": "value6",
            "date": "2019-05-25T00:00:00",
            "is_accepted": false
        },
        {
            "Field 1": "value11",
            "Field 2": "value21",
            "Field 3": "value31",
            "Field 4": "value41",
            "Field 5": "value51",
            "Field 6": "value61",
            "date": "2018-05-26T00:00:00",
            "is_accepted": false
        }
    ]
}

I need to combine the values from the input JSON data into the data field. Since I'm new to coding, is there an effective approach for this using pandas?

You can simply use plain Python here to reformat your JSON file, like this:

# Define a helper function
def reformat(input_item):
    output_item = {
        key: value for key, value in zip(input_item["keys"], input_item["values"])
    }
    output_item["date"] = input_item["timestamp"]
    output_item["is_accepted"] = input_item["is_accepted"]
    return output_item

# Build a new dictionary
import json

with open("data.json", "r") as f:
    data = json.load(f)

new_data = {
    "info": {
        "timeColumn": "date",
        "name": "sample_name",
        "segments": ["Field 1", "Field 2", "Field 3", "Field 4", "Field 5", "Field 6"],
    },
    "data": [reformat(input_item) for input_item in data],
}

# Save to a new json file
with open("new_data.json", "w", encoding="utf-8") as f:
    json.dump(new_data, f, ensure_ascii=False, indent=4)
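
The script above hard-codes the name and segments values. If every record shares the same name and keys, the info block can instead be read from the first record. This is a minimal sketch under that assumption, not part of the original answer; build_output is a hypothetical helper name, and a shortened sample record is inlined so the snippet runs on its own:

```python
def reformat(input_item):
    # Pair each key with its value, then carry over the timestamp and flag
    output_item = dict(zip(input_item["keys"], input_item["values"]))
    output_item["date"] = input_item["timestamp"]
    output_item["is_accepted"] = input_item["is_accepted"]
    return output_item


def build_output(records):
    # Assumption: all records share the "name" and "keys" of the first one
    first = records[0]
    return {
        "info": {
            "timeColumn": "date",
            "name": first["name"],
            "segments": list(first["keys"]),
        },
        "data": [reformat(item) for item in records],
    }


# Shortened sample record (two fields instead of six)
sample = [
    {
        "timestamp": "2019-05-25T00:00:00",
        "name": "sample_name",
        "keys": ["Field 1", "Field 2"],
        "values": ["value1", "value2"],
        "is_accepted": False,
    },
]
result = build_output(sample)
```

Note that build_output raises an IndexError on an empty input list, so real input may need a guard for that case.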

In new_data.json, you will find the expected result:

{
    "info": {
        "timeColumn": "date",
        "name": "sample_name",
        "segments": [
            "Field 1",
            "Field 2",
            "Field 3",
            "Field 4",
            "Field 5",
            "Field 6"
        ]
    },
    "data": [
        {
            "Field 1": "value1",
            "Field 2": "value2",
            "Field 3": "value3",
            "Field 4": "value4",
            "Field 5": "value5",
            "Field 6": "value6",
            "date": "2019-05-25T00:00:00",
            "is_accepted": false
        },
        {
            "Field 1": "value11",
            "Field 2": "value21",
            "Field 3": "value31",
            "Field 4": "value41",
            "Field 5": "value51",
            "Field 6": "value61",
            "date": "2018-05-26T00:00:00",
            "is_accepted": false
        }
    ]
}
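
Since the question asks about pandas specifically, the same reshaping can also be routed through a DataFrame, although plain Python is arguably simpler for this task. The following is only a sketch: it assumes all records share the same keys, uses shortened sample records inlined in place of the file, and round-trips through DataFrame.to_json so the values come back as plain Python types that json.dump can serialize:

```python
import json

import pandas as pd

# Shortened sample records (two fields instead of six)
records = [
    {
        "timestamp": "2019-05-25T00:00:00",
        "name": "sample_name",
        "keys": ["Field 1", "Field 2"],
        "values": ["value1", "value2"],
        "is_accepted": False,
    },
    {
        "timestamp": "2018-05-26T00:00:00",
        "name": "sample_name",
        "keys": ["Field 1", "Field 2"],
        "values": ["value11", "value21"],
        "is_accepted": False,
    },
]

# One row per record: the entries of "keys" become column names
df = pd.DataFrame([dict(zip(r["keys"], r["values"])) for r in records])
df["date"] = [r["timestamp"] for r in records]
df["is_accepted"] = [r["is_accepted"] for r in records]

# Serialize the rows to JSON and parse them back into a list of plain dicts
data = json.loads(df.to_json(orient="records"))

new_data = {
    "info": {
        "timeColumn": "date",
        "name": records[0]["name"],
        "segments": records[0]["keys"],
    },
    "data": data,
}
```

The DataFrame is mainly useful if you also want to filter, sort, or aggregate the rows before writing them out; for a pure reshape it adds little over the list comprehension.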
