
Get a reference to a Python object while iterating over a complex dictionary

I would appreciate some help with this small problem. I want to iterate over a complex Python data structure (dict, list, tuple, str, bytes, ...) and replace every bytes object (byte string) with a base64-encoded string. This is necessary to convert the original data structure to JSON (e.g. json.dumps(complex_data_structure)), since JSON does not support binary data. My code already walks the structure correctly, but there is one Python-specific problem. Here is my code:

import sys
import json
import base64


def iter_object(obj):
    if type(obj) is tuple:
        iter_tuple(obj)
    elif type(obj) is dict:
        iter_dict(obj)
    elif type(obj) is list:
        iter_list(obj)
    else: # non iterable types except of string and bytes etc.
        if type(obj) is bytes:
            # THE PROBLEM IS THE COPY OF OBJ!
            obj = base64.b64encode(obj).decode("ascii")
        else:
            pass # we don't care about other data types


def iter_tuple(obj_tuple):
    for t in obj_tuple:
        iter_object(t)


def iter_list(obj_list):
    for l in obj_list:
        iter_object(l)


def iter_dict(obj_dict):
    for k, v in obj_dict.items():
        iter_object(v)


def main():

    test_dict = {
        "foo": [1, 3, 4, 5, 6, 7],
        "bar": 1,
        "baz": (1, 2),
        "blub": {
            "bla": b"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41",
            "ble": {
                "blu": [
                    1, 3, b"\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42",
                    (1, [b"\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43"])
                ]
            }
        }
    }

    iter_object(test_dict)

    print(json.dumps(test_dict))

    return 0


if __name__ == "__main__":
    sys.exit(main())

The problem is the line obj = base64.b64encode(obj).decode("ascii"), because it works on a copy rather than a reference (to put it in C++ terms). Here is my question: is there a workaround to make the above code work?

Thank you very much!

Works on a copy? No. What is happening is that base64.b64encode(...) returns a new object instead of changing obj in place, and the assignment only rebinds the local name obj to that new object. The container that held the original value still refers to it. An in-place change is not possible here anyway, because byte strings are immutable. Python's argument passing does not map onto the C++ notions of pass by value or pass by reference. Variables are not boxes that hold objects; they are names bound to objects. An object can be either:

  1. Mutable - list, set, dict
  2. Immutable - tuple, str, bytes
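The difference between rebinding a name and mutating an object can be seen in a minimal sketch. The function names rebind and mutate are made up for illustration and do not appear in the question:

```python
def rebind(obj):
    # Assignment only points the local name `obj` at a new str object;
    # the object the caller passed in is left untouched.
    obj = "replaced"


def mutate(items):
    # Index assignment changes the list object itself,
    # and the caller holds a reference to that same list.
    items[0] = "replaced"


data = b"AAAA"
rebind(data)
print(data)    # b'AAAA'

values = [b"AAAA"]
mutate(values)
print(values)  # ['replaced']
```

This is the same thing that happens in iter_object: the new base64 string is bound to the local name only, so the enclosing dict or list never sees it.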

So a function that acts on an immutable object has to return another object. This is less wasteful than it sounds: only the replaced objects are newly created, and everything else is shared by reference. It is also the de facto approach in languages like Haskell.
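Applied to the question, one possible workaround is to have the traversal return a converted structure rather than trying to modify values in place. The following is a sketch only; the name encode_bytes and the small sample data are assumptions, not part of the original code:

```python
import base64
import json


def encode_bytes(obj):
    """Return a copy of obj in which every bytes value is a base64 string."""
    if isinstance(obj, bytes):
        return base64.b64encode(obj).decode("ascii")
    if isinstance(obj, dict):
        # Only values are converted; bytes used as dict keys are not handled.
        return {key: encode_bytes(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [encode_bytes(item) for item in obj]
    if isinstance(obj, tuple):
        # Tuples are immutable, so a new tuple has to be built.
        return tuple(encode_bytes(item) for item in obj)
    return obj  # every other type is passed through unchanged


sample = {"bla": b"AAA", "blu": [1, (2, [b"BBB"])]}
converted = encode_bytes(sample)
print(json.dumps(converted))  # {"bla": "QUFB", "blu": [1, [2, ["QkJC"]]]}
```

The caller uses the return value (converted) and the original sample stays unchanged. This also covers bytes nested inside tuples, which an in-place approach could not replace.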

I found a solution to my problem:

import sys
import json
import base64

class BinaryToBase64Encoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, bytes):
            return base64.b64encode(o).decode("ascii")
        return super(BinaryToBase64Encoder, self).default(o)


def main():

    test_dict = {
        "foo": [1, 3, 4, 5, 6, 7],
        "bar": 1,
        "baz": (1, 2),
        "blub": {
            "bla": b"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41",
            "ble": {
                "blu": [
                    1, 3, b"\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42\x42",
                    (1, [b"\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43"])
                ]
            }
        }
    }

    print(json.dumps(test_dict, cls=BinaryToBase64Encoder))

    return 0


if __name__ == "__main__":
    sys.exit(main())

The JSON output is:

{
    "foo": [1, 3, 4, 5, 6, 7],
    "baz": [1, 2],
    "bar": 1,
    "blub": {
        "ble": {
            "blu": [
                1,
                3,
                "QkJCQkJCQkJCQkJCQkJCQkJCQkI=",
                [
                    1,
                    ["Q0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0M="]
                ]
            ]
        },
        "bla": "QUFBQUFBQUFBQUFBQUFBQUFBQUE="
    }
}
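To check that nothing is lost, the encoded strings can be decoded again on the reading side. A small round-trip sketch, assuming the reader already knows that the "bla" field holds base64 data (the JSON itself does not record which strings were originally bytes):

```python
import base64
import json


class BinaryToBase64Encoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(o, bytes):
            return base64.b64encode(o).decode("ascii")
        return super().default(o)


original = b"\x41" * 20
text = json.dumps({"bla": original}, cls=BinaryToBase64Encoder)
print(text)  # {"bla": "QUFBQUFBQUFBQUFBQUFBQUFBQUE="}

# Reading side: parse the JSON, then decode the field known to be base64.
restored = base64.b64decode(json.loads(text)["bla"])
print(restored == original)  # True
```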
