I am trying to pivot a JSON file into a specific format using pandas, keeping certain columns fixed as identifiers.
I searched the web for a way to convert the following JSON:
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Forecast%": "67",
"Sponsor": "Jon R",
"IsActive": "1",
"InternalOrder": "null",
"Forecast": "null",
"BidStatus": "null",
"ProjectNotes": "null",
"EstimateTypeCode": "null",
"Start": "null",
"SponsoringDistrict": "null",
"LocationState": "null",
"Finish": "null",
"AreaManager": "null",
"CTG Vendor": "null"
}
into the format below:
[
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute": "Forecast%",
"AttrValue": "67"
},
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute": "Sponsor",
"AttrValue": "Jon R"
},
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Attribute": "IsActive",
"AttrValue": "1"
},
...
]
As you can see, every attribute apart from the first four is converted into an Attribute / AttrValue pair and gets its own record, while the first four keys are repeated in each record.
I have searched the web but have not found a solution yet.
Please help if you can.
Thank you in advance.
Use pd.melt (called here as the DataFrame.melt method), keeping the four identifier columns fixed:
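import json
import pandas as pd

# Load the single JSON object into a dict
with open('data.json') as json_data:
    data = json.load(json_data)

# Build a one-row DataFrame, then unpivot every column except the id columns
out = pd.DataFrame.from_dict(data, orient='index').T \
    .melt(['eTask_ID', 'Organization', 'BidID', 'Project'],
          var_name='Attribute', value_name='AttrValue') \
    .to_json(orient='records', indent=4)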
Output:
>>> print(out)
[
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"Forecast%",
"AttrValue":67
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"Sponsor",
"AttrValue":"Jon R"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"IsActive",
"AttrValue":1
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"InternalOrder",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"Forecast",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"BidStatus",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"ProjectNotes",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"EstimateTypeCode",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"Start",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"SponsoringDistrict",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"LocationState",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"Finish",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"AreaManager",
"AttrValue":"null"
},
{
"eTask_ID":100,
"Organization":"Power",
"BidID":2,
"Project":"IPP - C",
"Attribute":"CTG Vendor",
"AttrValue":"null"
}
]
Using only the standard library:
import json
from typing import Dict
json_data = """
{
"eTask_ID": "100",
"Organization": "Power",
"BidID": "2.00",
"Project": "IPP - C",
"Forecast%": "67",
"Sponsor": "Jon R",
"IsActive": "1",
"InternalOrder": "null",
"Forecast": "null",
"BidStatus": "null",
"ProjectNotes": "null",
"EstimateTypeCode": "null",
"Start": "null",
"SponsoringDistrict": "null",
"LocationState": "null",
"Finish": "null",
"AreaManager": "null",
"CTG Vendor": "null"
}
"""
parsed_json: Dict[str, str] = json.loads(json_data)
# Keys that are repeated in every output record
repeat_keys = ["eTask_ID", "Organization", "BidID", "Project"]
result = []
for key, value in parsed_json.items():
    if key in repeat_keys:
        continue
    # Copy the repeated keys, then add this attribute and its value
    attr_dict = {rep_key: parsed_json[rep_key] for rep_key in repeat_keys}
    attr_dict.update({"Attribute": key})
    attr_dict.update({"AttrValue": value})
    result.append(attr_dict)
result_json = json.dumps(result)
print(result_json)
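If the file holds a list of such objects rather than a single one, the same loop can be wrapped in a small function. The following is only a sketch under that assumption: the name melt_records, its parameters, and the second sample record (eTask_ID "101") are made up for illustration and do not come from the question.

```python
import json
from typing import Any, Dict, Iterable, List


def melt_records(
    records: Iterable[Dict[str, Any]],
    id_keys: List[str],
    var_name: str = "Attribute",
    value_name: str = "AttrValue",
) -> List[Dict[str, Any]]:
    """Turn each non-id key of every record into its own output record."""
    melted = []
    for record in records:
        # Values that are repeated in every output row for this record
        ids = {key: record[key] for key in id_keys}
        for key, value in record.items():
            if key in id_keys:
                continue
            melted.append({**ids, var_name: key, value_name: value})
    return melted


# The first record is a shortened copy of the question's data;
# the second one is hypothetical.
sample = [
    {"eTask_ID": "100", "Organization": "Power", "BidID": "2.00",
     "Project": "IPP - C", "Forecast%": "67", "Sponsor": "Jon R"},
    {"eTask_ID": "101", "Organization": "Power", "BidID": "3.00",
     "Project": "IPP - D", "Forecast%": "50", "Sponsor": "null"},
]
id_keys = ["eTask_ID", "Organization", "BidID", "Project"]
melted = melt_records(sample, id_keys)
print(json.dumps(melted, indent=4))
```

Each input record here yields two output records, one per non-identifier key, in the original key order. Values are passed through unchanged, so the string "null" stays a string.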