Python (json.load) - How to extract required information

I have some JSON that looks like this:

{
"Volumes": [
    {
        "Attachments": [],
        "Tags": [
            {
                "Value": "snapshot",
                "Key": "Name"
            },
            {
                "Value": "00:00",
                "Key": "Start"
            },
            {
                "Value": "00:20",
                "Key": "Finish"
            },
            {
                "Value": "2",
                "Key": "Retention"
            }
        ],
        "VolumeId": "vol-11111111"
    },
    {
        "Attachments": [],
        "Tags": [
            {
                "Value": "snapshot",
                "Key": "Name"
            },
            {
                "Value": "00:00",
                "Key": "Start"
            },
            {
                "Value": "00:20",
                "Key": "Finish"
            },
            {
                "Value": "2",
                "Key": "Retention"
            }
        ],
        "VolumeId": "vol-22222222"
    },
    {
        "Attachments": [],
        "Tags": [
            {
                "Value": "snapshot",
                "Key": "Name"
            },
            {
                "Value": "00:00",
                "Key": "Start"
            },
            {
                "Value": "00:20",
                "Key": "Finish"
            },
            {
                "Value": "2",
                "Key": "Retention"
            }
        ],
        "VolumeId": "vol-33333333"
    }
]
}

If you're familiar with AWS, it's a redacted bit of JSON that contains some values that I need, specifically:

VolumeId and values of Start/Finish/Retention tags

I get the JSON in bash, then write it to a file for Python to read. The code is:

# Determines the number of snapshots that have been discovered. This will be used to iterate over volumes later
snap_num=$(python -c '
import json,sys
obj=json.load(sys.stdin)
print (len(obj["Volumes"]))
' <tmp)

# Decrements the volume count by 1 so it can serve as the last zero-based index (1 becomes 0, 2 becomes 1, etc.)
snap_num=$(( snap_num - 1 ))

# Iterates over volumes and pulls required properties
for ((i=0;i<=snap_num;i++)); do
        # Exports the current iteration (bash var) to the python child process
        export ITER=$i

a=$(python -c '
import json
import sys
import os
iter=os.environ["ITER"]
iter=int(iter)
obj=json.load(sys.stdin)
print obj["Volumes"][iter]["VolumeId"]
' <tmp)

# Converts all discovered volumes properties into an array
mapfile -t b <<< "$a"

echo "$a"
done

I'm sure this is not the best way to do it, but I can pull the VolumeId from each volume:

vol-11111111
vol-22222222
vol-33333333

The Challenge

I cannot figure out how to pull the Start/Finish/Retention values out of the JSON and join them to the VolumeIds. The ideal output would be something like:

vol-11111111_Start-00:00_Finish-00:20_Retention:2
vol-22222222_Start-00:00_Finish-00:20_Retention:2
vol-33333333_Start-00:00_Finish-00:20_Retention:2

The format is not so important, as long as it is on one line and I can extract the information I need per volume.

Please let me know if I haven't been clear or if you need any further information. I am beginning the transition from Bash to Python (enter: boto); however, I have a pressing need to get this written, and I know I can do it in bash if I can get Python to do the JSON parsing.

Cheers!

Edit: value of obj

{u'Volumes': [{u'VolumeId': u'vol-11111111', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-22222222', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-33333333', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}]}

(The same line is printed three times.)

Edit2 - error message

Traceback (most recent call last):
File "<string>", line 7, in <module>
NameError: name 'Volumes' is not defined
{u'Volumes': [{u'VolumeId': u'vol-11111111', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-22222222', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-33333333', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}]}
Traceback (most recent call last):
File "<string>", line 7, in <module>
NameError: name 'Volumes' is not defined
{u'Volumes': [{u'VolumeId': u'vol-11111111', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-22222222', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-33333333', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}]}
Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'Volumes' is not defined
{u'Volumes': [{u'VolumeId': u'vol-11111111', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-22222222', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}, {u'VolumeId': u'vol-33333333', u'Attachments': [], u'Tags': [{u'Key': u'Name', u'Value': u'snapshot'}, {u'Key': u'Start', u'Value': u'00:00'}, {u'Key': u'Finish', u'Value': u'00:20'}, {u'Key': u'Retention', u'Value': u'2'}]}]}

Use the following Python code, assuming your parsed JSON is stored in the variable a:

volume = a['Volumes']
vol = []
for i in volume:
    r = []
    r.append(i['VolumeId'])
    for tags in i['Tags']:
        if tags['Key'] in ['Start', 'Finish', 'Retention']:
            r.append(tags['Key'] + '-' + tags['Value'])
    vol.append("_".join(r))

This produces the output you asked for, stored in the list vol:

['vol-11111111_Start-00:00_Finish-00:20_Retention-2', 'vol-22222222_Start-00:00_Finish-00:20_Retention-2', 'vol-33333333_Start-00:00_Finish-00:20_Retention-2']
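A slightly more defensive variant of the loop above, as a minimal sketch: it first collapses each volume's Tags list into a dict, so the fields always come out in Start/Finish/Retention order regardless of their order in the JSON, and a missing tag does not raise. The names format_volume and WANTED and the "NA" placeholder are assumptions made for illustration; they are not part of the original answer or of any AWS tooling.

```python
import json

# Tag keys to report, in output order (assumed from the question).
WANTED = ("Start", "Finish", "Retention")


def format_volume(volume, missing="NA"):
    """Return 'VolumeId_Start-.._Finish-.._Retention-..' for one volume dict."""
    # Collapse the list of {"Key": ..., "Value": ...} pairs into a plain dict.
    tags = {t["Key"]: t["Value"] for t in volume.get("Tags", [])}
    parts = [volume["VolumeId"]]
    for key in WANTED:
        # Fall back to the placeholder when a tag is absent.
        parts.append("{}-{}".format(key, tags.get(key, missing)))
    return "_".join(parts)


# Small sample in the same shape as the JSON in the question.
sample = json.loads("""
{"Volumes": [
  {"Attachments": [],
   "Tags": [{"Value": "2", "Key": "Retention"},
            {"Value": "00:00", "Key": "Start"},
            {"Value": "00:20", "Key": "Finish"}],
   "VolumeId": "vol-11111111"},
  {"Attachments": [], "Tags": [], "VolumeId": "vol-22222222"}
]}
""")

lines = [format_volume(v) for v in sample["Volumes"]]
# One volume per line, which is easy to consume from bash.
print("\n".join(lines))
```

Printing one volume per line means the bash side can read the result with mapfile or a while-read loop, without re-running Python for each index.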

Your code:

a=$(python -c '
import json
import sys
import os
iter=os.environ["ITER"]
iter=int(iter)
obj=json.load(sys.stdin)
print obj["Volumes"][iter]["VolumeId"]
' <tmp)

New code:

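a=$(python -c '
import json
import sys
# Use only double quotes inside this block: a single quote would end the
# bash single-quoted string and lead to NameError: name "Volumes" is not defined
obj = json.load(sys.stdin)
vol = []
for i in obj["Volumes"]:
    r = [i["VolumeId"]]
    for tags in i["Tags"]:
        if tags["Key"] in ["Start", "Finish", "Retention"]:
            r.append(tags["Key"] + "-" + tags["Value"])
    vol.append("_".join(r))

# One volume per line, so bash can read the result line by line
print("\n".join(vol))
' <tmp)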
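The NameError in the question's Edit2 is what one would expect when Python string literals written with single quotes sit inside a bash single-quoted python -c argument: the inner quotes close and reopen the shell string, so Python never sees them. The sketch below illustrates this with shlex.split from the standard library, used here only as a stand-in for bash's word splitting (it follows POSIX quoting rules but is not bash itself). The two command strings are made-up examples.

```python
import shlex

# The command as typed in bash: Python string literals in single quotes
# inside a single-quoted -c argument.
broken = "python -c 'print(obj['Volumes'])'"
# The same command with double quotes for the Python literal.
fixed = "python -c 'print(obj[\"Volumes\"])'"

# shlex.split applies POSIX shell quoting rules by default.
broken_args = shlex.split(broken)
fixed_args = shlex.split(fixed)

# The third argument is the program text that Python actually receives.
print(broken_args[2])  # the quotes around Volumes are gone
print(fixed_args[2])   # the double quotes survive
```

In the first case Python receives obj[Volumes], a reference to an undefined variable, which matches the traceback. Using double quotes inside the -c block, or moving the Python into its own .py file, avoids the problem.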
