I have a deeply nested JSON file, as shown below:
data = [
  {
    "date":"2017-05-31",
    "sections":[
      {
        "item":"BalanceSheetFormat2Heading",
        "value":"None",
        "sections":[
          {
            "item":"TotalAssets",
            "value":"None",
            "sections":[
              {
                "item":"FixedAssets",
                "value":"None",
                "sections":[
                  {
                    "item":"IntangibleAssets",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"PropertyPlantEquipment",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"InvestmentsFixedAssets",
                    "value":"None",
                    "sections":[
                      {
                        "item":"LoansToGroupUndertakings",
                        "value":"None",
                        "sections":[]
                      },
                      {
                        "item":"OwnShares",
                        "value":"None",
                        "sections":[]
                      }
                    ]
                  },
                  {
                    "item":"InvestmentProperty",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"BiologicalAssetsNon-current",
                    "value":"None",
                    "sections":[]
                  }
                ]
              },
              {
                "item":"CurrentAssets",
                "value":"None",
                "sections":[
                  {
                    "item":"TotalInventories",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"BiologicalAssetsCurrent",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"Debtors",
                    "value":"None",
                    "sections":[
                      {
                        "item":"PrepaymentsAccruedIncome",
                        "value":"None",
                        "sections":[]
                      },
                      {
                        "item":"DeferredTaxAssetDebtors",
                        "value":"None",
                        "sections":[]
                      }
                    ]
                  },
                  {
                    "item":"CurrentAssetInvestments",
                    "value":"None",
                    "sections":[
                      {
                        "item":"InvestmentsInGroupUndertakings",
                        "value":"None",
                        "sections":[]
                      },
                      {
                        "item":"OwnShares",
                        "value":"None",
                        "sections":[]
                      }
                    ]
                  },
                  {
                    "item":"CashBankOnHand",
                    "value":"None",
                    "sections":[]
                  }
                ]
              },
              {
                "item":"PrepaymentsAccruedIncome",
                "value":"None",
                "sections":[]
              }
            ]
          },
          {
            "item":"TotalLiabilities",
            "value":"None",
            "sections":[
              {
                "item":"Equity",
                "value":9014904.0,
                "sections":[]
              },
              {
                "item":"ProvisionsFor",
                "value":"None",
                "sections":[
                  {
                    "item":"RetirementBenefitObligationsSurplus",
                    "value":"None",
                    "sections":[]
                  }
                ]
              },
              {
                "item":"Creditors",
                "value":"None",
                "sections":[
                  {
                    "item":"UseCurrentNon",
                    "value":"None",
                    "sections":[]
                  },
                  {
                    "item":"TradeCreditorsTradePayables",
                    "value":"None",
                    "sections":[]
                  }
                ]
              },
              {
                "item":"AccruedLiabilitiesNot",
                "value":"None",
                "sections":[]
              }
            ]
          }
        ]
      }
    ]
  }
]
What I want to achieve is to remove every object that has an empty sections list and a value equal to "None". The whole object should be removed from the dictionary, for instance:
{
  "item":"IntangibleAssets",
  "value":"None",
  "sections":[]
}
The final output should look like this:
[
  {
    "date":"2017-05-31",
    "sections":[
      {
        "item":"BalanceSheetFormat2Heading",
        "value":"None",
        "sections":[
          {
            "item":"TotalLiabilities",
            "value":"None",
            "sections":[
              {
                "item":"Equity",
                "value":9014904.0,
                "sections":[]
              }
            ]
          }
        ]
      }
    ]
  }
]
I have tried to check whether an object is empty using this function:
def is_single_element(obj):
    # print(obj)
    if isinstance(obj, dict):
        if "item" in obj and "value" in obj and "sections" in obj:
            if obj["value"] == "None" and len(obj["sections"]) == 0:
                return True
    return False
and to recursively walk the JSON and remove those objects using:
def remove_single_obj(dict_):
    if isinstance(dict_, dict):
        for k, v in list(dict_.items()):
            if is_single_element(v):
                remove_single_obj(v)
    if isinstance(dict_, list):
        for index in range(len(dict_)):
            if is_single_element(dict_[index]):
                dict_.pop(index)
                remove_single_obj(dict_)
    return dict_
But I still cannot get the needed result. Any help is much appreciated.
For starters, this looks suspicious:
for index in range(len(dict_)):
    if is_single_element(dict_[index]):
        dict_.pop(index)
Notice that dict_ here isn't a dict, but a list. Now say you have a list of 3 elements, [A, B, C], of which the first two should be removed. First you remove item A at index 0, making the list [B, C]. The loop then moves on to index 1, which is now C, so item B is never examined. Finally, it tries index 2, which is out of range, because range(len(dict_)) was computed before anything was removed.
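The skipped element can be reproduced with a small standalone sketch. The should_remove predicate and the three-letter sample list are made up for illustration and are not part of the question's code; safe_remove shows one possible fix (walking the indices backwards), not the only one.

```python
def should_remove(item):
    # Hypothetical predicate: "A" and "B" stand for the items to drop.
    return item in ("A", "B")


def buggy_remove(items):
    # Mirrors the loop from the question: the index range is computed once,
    # so it goes stale as soon as an element is popped.
    visited = []
    try:
        for index in range(len(items)):
            visited.append(items[index])
            if should_remove(items[index]):
                items.pop(index)
    except IndexError:
        visited.append("IndexError")
    return visited


def safe_remove(items):
    # Walking from the end keeps the not-yet-visited indices valid.
    for index in reversed(range(len(items))):
        if should_remove(items[index]):
            items.pop(index)
    return items


print(buggy_remove(["A", "B", "C"]))  # "B" never shows up among the visited items
print(safe_remove(["A", "B", "C"]))
```

Rebuilding the list with a comprehension, instead of popping in place, avoids the index bookkeeping altogether.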
Here is working code.
It assumes that the data root is a list, that all list items are dicts, and that every dict has a 'sections' list, recursively, so there is no need to check types via isinstance.
from pprint import pprint

def delete_emtpy_from_l(l):
    len0 = len(l)
    # keep a dict if it has a real value or a non-empty 'sections' list
    l[:] = [d for d in l if ('value' in d and d['value'] != 'None') or d['sections']]
    cnt = len0 - len(l)
    for d in l:
        cnt += delete_emtpy_from_l(d['sections'])
    # cnt is how many dicts were deleted
    return cnt

# data is the nested list from the question
# loop until no new dict is deleted
while delete_emtpy_from_l(data):
    pass

pprint(data)
Output:
[{'date': '2017-05-31',
  'sections': [{'item': 'BalanceSheetFormat2Heading',
                'sections': [{'item': 'TotalLiabilities',
                              'sections': [{'item': 'Equity',
                                            'sections': [],
                                            'value': 9014904.0}],
                              'value': 'None'}],
                'value': 'None'}]}]
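The repeated passes are needed because a parent only becomes empty after its children have been deleted. As an alternative sketch (not the answerer's method), the children can be pruned before the parent is judged, so one traversal is enough. The prune function and the shortened sample below are illustrative assumptions; the sample reuses item names from the question but is not the full data set. As in the code above, an entry without a "value" key survives only if it still has sections.

```python
def prune(nodes):
    """Return a new list without empty nodes, handling children first."""
    kept = []
    for node in nodes:
        # Prune the children before judging the parent, so a parent whose
        # children all disappear is itself treated as empty.
        children = prune(node.get("sections", []))
        has_value = node.get("value", "None") != "None"
        if has_value or children:
            # Copy the node so the input structure is left untouched.
            kept.append({**node, "sections": children})
    return kept


# Shortened, illustrative sample in the same shape as the question's data.
sample = [
    {
        "date": "2017-05-31",
        "sections": [
            {
                "item": "TotalAssets",
                "value": "None",
                "sections": [
                    {"item": "FixedAssets", "value": "None", "sections": []},
                ],
            },
            {
                "item": "TotalLiabilities",
                "value": "None",
                "sections": [
                    {"item": "Equity", "value": 9014904.0, "sections": []},
                    {"item": "Creditors", "value": "None", "sections": []},
                ],
            },
        ],
    }
]

result = prune(sample)
print(result)
```

Unlike the in-place version, this sketch returns a new structure, which is convenient when the original data is still needed afterwards.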