
How to iterate over JSON array to find value with changing keys?

I am trying to extract many values from a JSON array, so I iterate through it and read each value by its key. However, one of the keys changes depending on the item, and I get a KeyError when the loop reaches an item that uses the other key.

I've tried using try and except to catch this, but because I am looping through the entire array, the same exception is then raised for the other key instead.

Here is my code to extract the values:

df = []

for item in json_response["items"]:
    df.append({
        'AccountName': item["accountName"],
        'Action': item["action"],
        'Application': item["application"],
        'AppID': item["attributes"]["appId"],
        'AppName': item["attributes"]["AppName"],
        'Errors': item["attributes"]["errors"],
        'ContextID': item["contextid"],
        'Created': item["created"],
        'HostName': item["hostname"],
        'EventID': item["id"],
        'Info': item["info"],
        'ipaddr': item["ipaddr"],
        'EventSource': item["source"],
        'Stack': item["stack"],
        'Target': item["target"],
        'TrackingID': item["trackingId"],
        'Type': item["type"]
        })

Here is an example JSON from a larger array I am extracting from:

{
    "accountName": null,
    "action": "Disable",
    "application": "Application1",
    "attributes": {
        "appId": "7d264050024",
        "AppName": "Application1",
        "errors": [
            "Rule: Rule not found."
        ]
    },
    "contextid": null,
    "created": 1553194821098,
    "hostname": null,
    "id": "ac09ea0082",
    "info": null,
    "ipaddr": null,
    "source": "System1",
    "stack": null,
    "target": "TargetName1.",
    "trackingId": null,
    "type": null
}

This works, but sometimes "attributes" looks like this instead:

    "attributes": {
        "appId": "7d2451684288",
        "cloudAppName": "Application1",
        "RefreshFailure": true
    }

How can I extract either the "errors" value or the "RefreshFailure" value when iterating over the entire array?

Test whether each key exists in "attributes" before reading it:

df = []

for item in json_response["items"]:
    attributes = item["attributes"]

    # Use whichever of the two keys is present; fall back to "NA" if neither is.
    errors = "NA"
    if "errors" in attributes:
        errors = attributes["errors"]
    elif "RefreshFailure" in attributes:
        errors = attributes["RefreshFailure"]

    df.append({
        'AccountName': item["accountName"],
        'Action': item["action"],
        'Application': item["application"],
        'AppID': attributes["appId"],
        # The app name is stored under "AppName" or "cloudAppName" depending on the item.
        'AppName': attributes.get("AppName", attributes.get("cloudAppName")),
        'Errors': errors,
        'ContextID': item["contextid"],
        'Created': item["created"],
        'HostName': item["hostname"],
        'EventID': item["id"],
        'Info': item["info"],
        'ipaddr': item["ipaddr"],
        'EventSource': item["source"],
        'Stack': item["stack"],
        'Target': item["target"],
        'TrackingID': item["trackingId"],
        'Type': item["type"]
    })
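
The same key test can be factored into a small helper so that every key with more than one possible name is handled the same way. The following is only a minimal sketch: the helper name first_present, the trimmed-down sample data, and the choice of "NA" as the fallback are assumptions for illustration, not part of the original code.

```python
def first_present(mapping, keys, default="NA"):
    """Return the value of the first key in `keys` that exists in `mapping`."""
    for key in keys:
        if key in mapping:
            return mapping[key]
    return default


# Trimmed-down sample data (assumed for illustration); only "attributes" varies.
json_response = {
    "items": [
        {"id": "a", "attributes": {"appId": "7d264050024",
                                   "AppName": "Application1",
                                   "errors": ["Rule: Rule not found."]}},
        {"id": "b", "attributes": {"appId": "7d2451684288",
                                   "cloudAppName": "Application1",
                                   "RefreshFailure": True}},
        {"id": "c", "attributes": {}},
    ]
}

rows = []
for item in json_response["items"]:
    # dict.get returns the given default instead of raising KeyError.
    attrs = item.get("attributes", {})
    rows.append({
        "EventID": item["id"],
        "AppID": attrs.get("appId"),
        "AppName": first_present(attrs, ["AppName", "cloudAppName"], default=None),
        "Errors": first_present(attrs, ["errors", "RefreshFailure"]),
    })

for row in rows:
    print(row)
```

Listing the candidate keys in order of preference keeps the loop body flat; an item that has neither key gets the default rather than raising an exception.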

I tried to emulate your data to make the code work.

import json
from pprint import pprint


json_data = '''
{
    "items": [
        {
            "accountName": null,
            "action": "Disable",
            "application": "Application1",
            "attributes": {
                "appId": "7d264050024",
                "AppName": "Application1",
                "errors": [
                    "Rule: Rule not found."
                ]
            },
            "contextid": null,
            "created": 1553194821098,
            "hostname": null,
            "id": "ac09ea0082",
            "info": null,
            "ipaddr": null,
            "source": "System1",
            "stack": null,
            "target": "TargetName1.",
            "trackingId": null,
            "type": null
        },
        {
            "accountName": null,
            "action": "Disable",
            "application": "Application1",
            "attributes": {
                "appId": "7d2451684288",
                "cloudAppName": "Application1",
                "RefreshFailure": true
            },
            "contextid": null,
            "created": 1553194821098,
            "hostname": null,
            "id": "ac09ea0082",
            "info": null,
            "ipaddr": null,
            "source": "System1",
            "stack": null,
            "target": "TargetName1.",
            "trackingId": null,
            "type": null
        }
    ]
}'''
json_response = json.loads(json_data)


def capitalize(s):
    # Upper-case only the first character; s[:1] also handles an empty key safely
    return s[:1].upper() + s[1:]


df = []

for item in json_response["items"]:
    d = {}
    # Iterate over the key/value pairs of the JSON object and add them one by one
    # This keeps working even if the keys in json_response change, without changing the code
    for key, value in item.items():
        # "attributes" is itself a dictionary/json object
        # Its items have to be unpacked and added instead of adding it as a raw object
        if isinstance(value, dict):
            for k, v in value.items():
                d[capitalize(k)] = v
        else:
            d[capitalize(key)] = value

    df.append(d)

pprint(df)

Output:

[{'AccountName': None,
  'Action': 'Disable',
  'AppId': '7d264050024',
  'AppName': 'Application1',
  'Application': 'Application1',
  'Contextid': None,
  'Created': 1553194821098,
  'Errors': ['Rule: Rule not found.'],
  'Hostname': None,
  'Id': 'ac09ea0082',
  'Info': None,
  'Ipaddr': None,
  'Source': 'System1',
  'Stack': None,
  'Target': 'TargetName1.',
  'TrackingId': None,
  'Type': None},
 {'AccountName': None,
  'Action': 'Disable',
  'AppId': '7d2451684288',
  'Application': 'Application1',
  'CloudAppName': 'Application1',
  'Contextid': None,
  'Created': 1553194821098,
  'Hostname': None,
  'Id': 'ac09ea0082',
  'Info': None,
  'Ipaddr': None,
  'RefreshFailure': True,
  'Source': 'System1',
  'Stack': None,
  'Target': 'TargetName1.',
  'TrackingId': None,
  'Type': None}]

If you want the key name to be Errors even when the actual key name is RefreshFailure, you can add these lines of code before df.append(d):

...
if 'RefreshFailure' in d:
    d['Errors'] = d['RefreshFailure']
    del d['RefreshFailure']

df.append(d)

With these few extra lines of code, the output would look like this:

[{'AccountName': None,
  'Action': 'Disable',
  'AppId': '7d264050024',
  'AppName': 'Application1',
  'Application': 'Application1',
  'Contextid': None,
  'Created': 1553194821098,
  'Errors': ['Rule: Rule not found.'],
  'Hostname': None,
  'Id': 'ac09ea0082',
  'Info': None,
  'Ipaddr': None,
  'Source': 'System1',
  'Stack': None,
  'Target': 'TargetName1.',
  'TrackingId': None,
  'Type': None},
 {'AccountName': None,
  'Action': 'Disable',
  'AppId': '7d2451684288',
  'Application': 'Application1',
  'CloudAppName': 'Application1',
  'Contextid': None,
  'Created': 1553194821098,
  'Errors': True,
  'Hostname': None,
  'Id': 'ac09ea0082',
  'Info': None,
  'Ipaddr': None,
  'Source': 'System1',
  'Stack': None,
  'Target': 'TargetName1.',
  'TrackingId': None,
  'Type': None}]
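
If more than one key needs renaming (for example CloudAppName as well as RefreshFailure), the rename step can be driven by a lookup table instead of one if block per key. This is a hedged sketch: the ALIASES mapping and the normalize_keys function are hypothetical names introduced here, and the sample row is a shortened version of the second item above.

```python
# Hypothetical mapping from variant key names to the name they should be stored under.
ALIASES = {
    "RefreshFailure": "Errors",
    "CloudAppName": "AppName",
}


def normalize_keys(row, aliases=ALIASES):
    """Return a copy of `row` with every alias key renamed to its target name."""
    out = dict(row)
    for old, new in aliases.items():
        if old in out:
            value = out.pop(old)
            # setdefault keeps an existing target key instead of overwriting it.
            out.setdefault(new, value)
    return out


row = {
    "AppId": "7d2451684288",
    "CloudAppName": "Application1",
    "RefreshFailure": True,
}
normalized = normalize_keys(row)
print(normalized)
```

In the loop above, this would replace the if block, so the last step becomes df.append(normalize_keys(d)).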

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.