I am trying to pivot a JSON file using pandas into a specific format. I want to pivot it on certain columns

I searched the web for a way to convert the following JSON:

{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Forecast%": "67",
"Sponsor": "Jon R",
"IsActive": "1",
"InternalOrder": "null",
"Forecast": "null",
"BidStatus": "null",
"ProjectNotes": "null",
"EstimateTypeCode": "null",
"Start": "null",
"SponsoringDistrict": "null",
"LocationState": "null",
"Finish": "null",
"AreaManager": "null",
"CTG Vendor": "null"
}

into a format like the one below:

{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute":"Forecast%",
"AttrValue":"67",
},
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute":"Sponsor",
"AttrValue":"Jon R",
},
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute":"IsActive",
"AttrValue":"1",
},
...

As you can see, every attribute apart from the first four is converted into an Attribute / AttrValue pair and gets its own record, while the first four fields are repeated in each record.

I have searched the web for a solution but have not found one yet.

Please help if you can.

Thank you in advance.

Use pd.melt:

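import json

import pandas as pd

# Load the single JSON object from disk.
with open('data.json') as json_data:
    data = json.load(json_data)

# Build a one-row DataFrame, keep the four identifier columns fixed,
# and melt every other column into Attribute / AttrValue pairs.
out = pd.DataFrame.from_dict(data, orient='index').T \
        .melt(['eTask_ID', 'Organization', 'BidID', 'Project'],
              var_name='Attribute', value_name='AttrValue') \
        .to_json(orient='records', indent=4)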

Output:

>>> print(out)
[
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"Forecast%",
        "AttrValue":67
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"Sponsor",
        "AttrValue":"Jon R"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"IsActive",
        "AttrValue":1
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"InternalOrder",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"Forecast",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"BidStatus",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"ProjectNotes",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"EstimateTypeCode",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"Start",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"SponsoringDistrict",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"LocationState",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"Finish",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"AreaManager",
        "AttrValue":"null"
    },
    {
        "eTask_ID":100,
        "Organization":"Power",
        "BidID":2,
        "Project":"IPP - C",
        "Attribute":"CTG Vendor",
        "AttrValue":"null"
    }
]
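
If the file holds a list of such objects rather than a single one (an assumption; the question shows only one record), the same melt call should still apply. Below is a minimal sketch under that assumption. The second record and the names `records`, `id_cols` and `long_df` are hypothetical and only serve the illustration. Values stay as the strings found in the input.

```python
import pandas as pd

# Hypothetical input: two records shaped like the one in the question.
records = [
    {"eTask_ID": "100", "Organization": "Power", "BidID": "2.00",
     "Project": "IPP - C", "Forecast%": "67", "Sponsor": "Jon R"},
    {"eTask_ID": "101", "Organization": "Power", "BidID": "3.00",
     "Project": "IPP - D", "Forecast%": "40", "Sponsor": "Ann B"},
]

# Columns repeated unchanged in every output record.
id_cols = ["eTask_ID", "Organization", "BidID", "Project"]

long_df = (
    pd.DataFrame(records)
    .melt(id_vars=id_cols, var_name="Attribute", value_name="AttrValue")
    # melt groups rows by attribute; a stable sort regroups them by task
    # while keeping the attribute order within each task.
    .sort_values("eTask_ID", kind="stable")
    .reset_index(drop=True)
)

out = long_df.to_json(orient="records", indent=4)
print(out)
```

The sort is optional. Without it, the output lists every record's Forecast% first, then every record's Sponsor, and so on.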

With only the standard library:

import json
from typing import Dict

json_data = """
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Forecast%": "67",
"Sponsor": "Jon R",
"IsActive": "1",
"InternalOrder": "null",
"Forecast": "null",
"BidStatus": "null",
"ProjectNotes": "null",
"EstimateTypeCode": "null",
"Start": "null",
"SponsoringDistrict": "null",
"LocationState": "null",
"Finish": "null",
"AreaManager": "null",
"CTG Vendor": "null"
}
"""

parsed_json: Dict[str, str] = json.loads(json_data)

# Keys that are copied unchanged into every output record.
repeat_keys = ["eTask_ID", "Organization", "BidID", "Project"]

result = []
for key, value in parsed_json.items():
    if key in repeat_keys:
        continue
    # Start from the identifier fields, then add one attribute/value pair.
    attr_dict = {rep_key: parsed_json[rep_key] for rep_key in repeat_keys}
    attr_dict["Attribute"] = key
    attr_dict["AttrValue"] = value
    result.append(attr_dict)

result_json = json.dumps(result)

print(result_json)
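
The loop above can also be wrapped in a small reusable function. The sketch below is one possible way to do that, not part of the original answer: `melt_records` and its `skip_values` parameter are hypothetical names. It assumes you may want to leave out attributes whose value is the literal string "null"; whether those should be dropped is not stated in the question, so the filter is off by default.

```python
import json
from typing import Any, Dict, Iterable, List, Sequence


def melt_records(
    records: Iterable[Dict[str, Any]],
    id_keys: Sequence[str],
    skip_values: Sequence[Any] = (),
) -> List[Dict[str, Any]]:
    """Turn each non-identifier key of every record into its own row."""
    rows = []
    for record in records:
        # Identifier fields are repeated in every generated row.
        ids = {key: record[key] for key in id_keys}
        for key, value in record.items():
            if key in id_keys or value in skip_values:
                continue
            rows.append({**ids, "Attribute": key, "AttrValue": value})
    return rows


# Shortened copy of the record from the question.
sample = [
    {"eTask_ID": "100", "Organization": "Power", "BidID": "2.00",
     "Project": "IPP - C", "Forecast%": "67", "Sponsor": "Jon R",
     "IsActive": "1", "Start": "null"},
]
id_keys = ["eTask_ID", "Organization", "BidID", "Project"]

# Assumption: attributes holding the placeholder string "null" are unwanted.
rows = melt_records(sample, id_keys, skip_values=("null",))
print(json.dumps(rows, indent=4))
```

Calling `melt_records(sample, id_keys)` without `skip_values` keeps the "null" entries, which matches the behavior of the loop above.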
