I am trying to write a script that will print the unique keys of a JSON file in dot notation so as to quickly profile the structure.
For example let's say I have 'myfile.json' with the following format:
{
    "a": "one",
    "b": "two",
    "c": {
        "d": "four",
        "e": "five",
        "f": [
            {
                "x": "six",
                "y": "seven"
            },
            {
                "x": "eight",
                "y": "nine"
            }
        ]
    }
}
Running the following will produce a unique set of keys, but it is missing the lineage.
import json

with open("myfile.json") as json_data:
    jdata = json.load(json_data)

def get_keys(dl, keys_list):
    # Collect every key name, recursing into nested dicts and lists.
    if isinstance(dl, dict):
        keys_list += dl.keys()
        for value in dl.values():
            get_keys(value, keys_list)
    elif isinstance(dl, list):
        for item in dl:
            get_keys(item, keys_list)

keys = []
get_keys(jdata, keys)
all_keys = list(set(keys))
print '\n'.join([str(x) for x in sorted(all_keys)])
The output below doesn't show that 'x' and 'y' are nested within the 'f' array:
a
b
c
d
e
f
x
y
I can't figure out how to loop through the nested structure to append the parent keys.
The ideal output would be:
a
b
c.d
c.e
c.f.x
c.f.y
I'd recommend a recursive generator function that uses the yield statement rather than building a list internally. Note that it labels list items with their index, e.g. c.f[0].x. In Python 2.6+, the following works:
import json

json_data = json.load(open("myfile.json"))

def walk_keys(obj, path=""):
    if isinstance(obj, dict):
        # Extend the path with each key, dot-separated.
        for k, v in obj.iteritems():
            for r in walk_keys(v, path + "." + k if path else k):
                yield r
    elif isinstance(obj, list):
        # Extend the path with the element's index, e.g. "[0]".
        for i, v in enumerate(obj):
            s = "[" + str(i) + "]"
            for r in walk_keys(v, path + s if path else s):
                yield r
    else:
        # Leaf value: the accumulated path is complete.
        yield path

for s in sorted(walk_keys(json_data)):
    print s
In Python 3.3+, you can use yield from as syntactic sugar for recursive generation, as follows:
import json

json_data = json.load(open("myfile.json"))

def walk_keys(obj, path=""):
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield from walk_keys(v, path + "." + k if path else k)
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            s = "[" + str(i) + "]"
            yield from walk_keys(v, path + s if path else s)
    else:
        yield path

for s in sorted(walk_keys(json_data)):
    print(s)
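To make the behaviour of the Python 3 version concrete, here is a minimal self-contained sketch. It assumes the sample document from the question, inlined as a string (the SAMPLE name is my own) so that it runs without myfile.json on disk.

```python
import json

# Assumption: an inline copy of the question's sample, used instead of
# reading myfile.json so the example is self-contained.
SAMPLE = """
{
    "a": "one",
    "b": "two",
    "c": {
        "d": "four",
        "e": "five",
        "f": [
            {"x": "six", "y": "seven"},
            {"x": "eight", "y": "nine"}
        ]
    }
}
"""

def walk_keys(obj, path=""):
    """Yield the path to every leaf value, with list indices in brackets."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield from walk_keys(v, path + "." + k if path else k)
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            s = "[" + str(i) + "]"
            yield from walk_keys(v, path + s if path else s)
    else:
        yield path

paths = sorted(walk_keys(json.loads(SAMPLE)))
for p in paths:
    print(p)
```

With that sample, this prints a, b, c.d, c.e, c.f[0].x, c.f[0].y, c.f[1].x and c.f[1].y, one per line. Each list element keeps its index, so the same key under different elements appears once per element rather than once overall.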
Drawing on MTADD's guidance, I put together the following, which drops the list indices and de-duplicates the paths:
import json

json_file_path = "myfile.json"
json_data = json.load(open(json_file_path))

def walk_keys(obj, path=""):
    if isinstance(obj, dict):
        for k, v in obj.iteritems():
            for r in walk_keys(v, path + "." + k if path else k):
                yield r
    elif isinstance(obj, list):
        # List items inherit the parent's path unchanged (no index).
        for v in obj:
            for r in walk_keys(v, path):
                yield r
    else:
        yield path

# A set removes the paths repeated across list elements.
all_keys = list(set(walk_keys(json_data)))
print '\n'.join([str(x) for x in sorted(all_keys)])
The results match the expected output:
a
b
c.d
c.e
c.f.x
c.f.y
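For anyone on Python 3, the same idea might look like the sketch below. The function name unique_key_paths and the inlined sample are assumptions for illustration, not part of the code above. List items simply inherit their parent's path, and a set removes the repeats.

```python
import json

def unique_key_paths(obj, path=""):
    """Yield dot-notation paths to leaf values, ignoring list indices."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = path + "." + key if path else key
            yield from unique_key_paths(value, child)
    elif isinstance(obj, list):
        # Every element of a list shares the path of the list itself.
        for item in obj:
            yield from unique_key_paths(item, path)
    else:
        yield path

# Assumption: the question's sample, inlined so no file is needed.
sample = json.loads(
    '{"a": "one", "b": "two", "c": {"d": "four", "e": "five", '
    '"f": [{"x": "six", "y": "seven"}, {"x": "eight", "y": "nine"}]}}'
)

result = sorted(set(unique_key_paths(sample)))
print("\n".join(result))
```

One caveat with this approach: empty dicts and empty lists yield nothing, so a key whose value is {} or [] will not appear in the output.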