How to find a particular JSON value by key?

There is a JSON document like this:

{
  "P1": "ss",
  "Id": 1234,
  "P2": {
      "P1": "cccc"
  },
  "P3": [
      {
          "P1": "aaa"
      }
  ]
}

How can I find all of P1's values without iterating over the whole JSON?

PS: P1 can be anywhere in the JSON.

If no method can do this, can you tell me how to iterate through the JSON?

As I said in my other answer, I don't think there is a way of finding all values associated with the "P1" key without iterating over the whole structure. However, I've come up with an even better way to do it, which occurred to me while looking at @Mike Brennan's answer to another JSON-related question, How to get string objects instead of Unicode from JSON?

The basic idea is to use the object_hook parameter that json.loads() accepts just to watch what is being decoded and check for the sought-after value.

Note: This will only work if the representation is of a JSON object (i.e. something enclosed in curly braces {}), as in your sample.

from __future__ import print_function
import json

def find_values(id, json_repr):
    results = []

    def _decode_dict(a_dict):
        try:
            results.append(a_dict[id])
        except KeyError:
            pass
        return a_dict

    json.loads(json_repr, object_hook=_decode_dict) # Return value ignored.
    return results

json_repr = '{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}'
print(find_values('P1', json_repr))

(Python 3) output:

['cccc', 'aaa', 'ss']
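
The order of that output follows from when the hook fires: json.loads() calls object_hook each time it finishes decoding an object, so nested objects are reported before the object that contains them. A minimal sketch that records this order (the names `seen` and `record` are illustrative and not part of the answer above):

```python
import json

seen = []  # sorted key lists, in the order the hook is called

def record(a_dict):
    # Called once per decoded JSON object, innermost objects first.
    seen.append(sorted(a_dict))
    return a_dict

json_repr = '{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}'
json.loads(json_repr, object_hook=record)
print(seen)
```

The two inner objects appear first and the outermost object, with all four keys, appears last.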

I had the same issue just the other day. I wound up just searching through the entire object, accounting for both lists and dicts. The following snippet lets you search for the first occurrence of multiple keys.

import json

def deep_search(needles, haystack):
    """Return a dict mapping each needle to the value found for it."""
    found = {}
    if not isinstance(needles, list):
        needles = [needles]

    if isinstance(haystack, dict):
        for needle in needles:
            if needle in haystack:
                found[needle] = haystack[needle]
            else:
                # Not at this level: look inside every nested value.
                for value in haystack.values():
                    found.update(deep_search(needle, value))
    elif isinstance(haystack, list):
        for node in haystack:
            found.update(deep_search(needles, node))
    return found

# The sample JSON from the question.
json_string = '{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}'
print(deep_search(["P1", "P3"], json.loads(json_string)))

It returns a dict whose keys are the keys searched for. The haystack is expected to be a Python object already, so you have to call json.loads before passing it to deep_search.

Any comments on optimization are welcome!

My approach to this problem would be different.

Since JSON doesn't support depth-first search directly, convert the JSON to a Python object, feed it to an XML decoder, and then extract the node you want to search for.

from xml.dom.minidom import parseString
import json
try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3
def bar(somejson, key):
    def val(node):
        # Searches for the next Element Node containing Value
        e = node.nextSibling
        while e and e.nodeType != e.ELEMENT_NODE:
            e = e.nextSibling
        return (e.getElementsByTagName('string')[0].firstChild.nodeValue if e 
                else None)
    # parse the JSON as XML
    foo_dom = parseString(xmlrpclib.dumps((json.loads(somejson),)))
    # and then search all the name tags which are P1's
    # and use the val user function to get the value
    return [val(node) for node in foo_dom.getElementsByTagName('name') 
            if node.firstChild.nodeValue in key]

bar(foo, 'P1')
[u'cccc', u'aaa', u'ss']
bar(foo, ('P1','P2'))
[u'cccc', u'cccc', u'aaa', u'ss']

Using json to convert the JSON to Python objects and then going through them recursively works best. This example does include going through lists.

import json

def get_all(myjson, key):
    # Accept either a JSON string or an already-decoded object.
    if isinstance(myjson, str):
        myjson = json.loads(myjson)
    if isinstance(myjson, dict):
        for jsonkey, value in myjson.items():
            if isinstance(value, (list, dict)):
                get_all(value, key)  # descend into containers
            elif jsonkey == key:
                print(value)
    elif isinstance(myjson, list):
        for item in myjson:
            if isinstance(item, (list, dict)):
                get_all(item, key)

Converting the JSON to Python and recursively searching is by far the easiest:

import json

def findall(v, k):
    if isinstance(v, dict):
        for k1 in v:
            if k1 == k:
                print(v[k1])
            findall(v[k1], k)

findall(json.loads(a), 'P1')

(where a is the string)

The example code ignores arrays. Adding that is left as an exercise.
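
One possible way to do that exercise is sketched below. It is an illustration rather than the answer author's code: the name `findall_with_arrays` is made up here, and it returns a list instead of printing so the result can be reused.

```python
import json

def findall_with_arrays(v, k):
    """Collect every value stored under key k, descending into dicts and lists."""
    results = []
    if isinstance(v, dict):
        for k1, child in v.items():
            if k1 == k:
                results.append(child)
            # Keep descending, since a match may itself contain more matches.
            results.extend(findall_with_arrays(child, k))
    elif isinstance(v, list):
        for item in v:
            results.extend(findall_with_arrays(item, k))
    return results

a = '{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}'
print(findall_with_arrays(json.loads(a), 'P1'))
```

With the sample from the question this also picks up the "aaa" value that sits inside the P3 array.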

Bearing in mind that JSON is simply a string, using regular expressions with look-ahead and look-behind can accomplish this task very quickly.

Typically, the JSON would have been extracted from a request to an external API, so code showing how that would work is included but commented out.

import re
#import requests
#import json

#r1 = requests.get( ... url to some api ...)
#JSON = r1.text  # keep the raw JSON text; str(json.loads(...)) would yield a Python repr with single quotes
JSON = """
 {
  "P1": "ss",
  "Id": 1234,
  "P2": {
      "P1": "cccc"
  },
  "P3": [
     {
          "P1": "aaa"
     }
  ]
 }
"""
# Raw string, so the backslash in the character class is passed to the regex engine unchanged.
rex1 = re.compile(r'(?<="P1": ")[a-zA-Z_\- ]+(?=")')
rex2 = rex1.findall(JSON)
print(rex2)

#['ss', 'cccc', 'aaa']
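
Python's re module requires look-behind patterns to have a fixed width, so the pattern above matches only when there is exactly one space after the colon. A sketch of a variant that uses a capture group instead and tolerates any whitespace around the colon is shown here; the name `PATTERN` is illustrative, and it still assumes the values are plain strings with no escaped quotes.

```python
import re

# Capture the text between the quotes that follow "P1" and a colon.
PATTERN = re.compile(r'"P1"\s*:\s*"([^"\\]*)"')

# Compact JSON with no space after the colons.
compact = '{"P1":"ss","Id":1234,"P2":{"P1":"cccc"},"P3":[{"P1":"aaa"}]}'
print(PATTERN.findall(compact))
```

When the pattern has a single capture group, findall returns only the captured values, so the result has the same shape as with the look-around version.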

I don't think there's any way of finding all values associated with P1 without iterating over the whole structure. Here's a recursive way to do it that first deserializes the JSON object into an equivalent Python object. To simplify things, most of the work is done via a recursive private nested function.

import json

try:
    STRING_TYPE = basestring
except NameError:
    STRING_TYPE = str  # Python 3

def find_values(id, obj):
    results = []

    def _find_values(id, obj):
        try:
            for key, value in obj.items():  # dict?
                if key == id:
                    results.append(value)
                elif not isinstance(value, STRING_TYPE):
                    _find_values(id, value)
        except AttributeError:
            pass

        try:
            for item in obj:  # iterable?
                if not isinstance(item, STRING_TYPE):
                    _find_values(id, item)
        except TypeError:
            pass

    if not isinstance(obj, STRING_TYPE):
        _find_values(id, obj)
    return results

json_repr = '{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}'

obj = json.loads(json_repr)
print(find_values('P1', obj))

You could also use a generator to search the object after json.load().

Code example from my answer here: https://stackoverflow.com/a/39016088/5250939

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for k, v in json_input.items():  # .items() works on Python 2 and 3
            if k == lookup_key:
                yield v
            else:
                for child_val in item_generator(v, lookup_key):
                    yield child_val
    elif isinstance(json_input, list):
        for item in json_input:
            for item_val in item_generator(item, lookup_key):
                yield item_val
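
A usage sketch for the generator, assuming Python 3, where `yield from` can stand in for the inner loops. The function is restated here only so the example is self-contained; the variable names `data`, `values`, and `first` are illustrative.

```python
import json

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            else:
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)

data = json.loads('{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}')

# Collect every match...
values = list(item_generator(data, 'P1'))
print(values)

# ...or stop at the first one without walking the rest of the structure.
first = next(item_generator(data, 'P1'), None)
print(first)
```

Because the generator is lazy, the second call stops as soon as one match is produced.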

The question is old, but no answer covered it 100%, so this is my solution.

What it does:

  • recursive algorithm;
  • list search;
  • object search;
  • returns all the results it finds in the tree;
  • returns the id of the parent in the key.

Suggestions:

  • study depth-first search and breadth-first search;
  • if your JSON is too big, recursion may be a problem; look into an explicit-stack algorithm.

class Tools:
    @staticmethod
    def search_into_json_myversion(jsondata, searchkey, parentkeyname: str = None) -> list:
        found = []

        if type(jsondata) is list:
            for element in jsondata:
                val = Tools.search_into_json_myversion(element, searchkey, parentkeyname=parentkeyname)
                if len(val) != 0:
                    found = found + val
        elif type(jsondata) is dict:
            if searchkey in jsondata.keys():
                # Prefix the key with its parent's name, e.g. "P2->P1".
                pathkey = parentkeyname + '->' + searchkey if parentkeyname is not None else searchkey
                found.append({pathkey: jsondata[searchkey]})
            else:
                for key, value in jsondata.items():
                    val = Tools.search_into_json_myversion(value, searchkey, parentkeyname=key)
                    if len(val) != 0:
                        found = found + val

        return found
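
The suggestion about recursion on very large or deeply nested inputs can be sketched with an explicit stack in place of recursive calls. This is a minimal illustration, not part of the answer above: the name `find_values_iterative` is made up, it returns plain values without the parent path, and results come out in stack order rather than document order.

```python
import json

def find_values_iterative(obj, key):
    """Depth-first search for key using an explicit stack instead of recursion."""
    results = []
    stack = [obj]
    while stack:
        current = stack.pop()
        if isinstance(current, dict):
            for k, v in current.items():
                if k == key:
                    results.append(v)
                stack.append(v)  # scalars are popped later and ignored
        elif isinstance(current, list):
            stack.extend(current)
    return results

data = json.loads('{"P1": "ss", "Id": 1234, "P2": {"P1": "cccc"}, "P3": [{"P1": "aaa"}]}')
print(sorted(find_values_iterative(data, 'P1')))
```

Since nothing is called recursively, the nesting depth of the input is limited by available memory rather than by Python's recursion limit.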
