Recursively convert Python object graph to dictionary

I'm trying to convert the data from a simple object graph into a dictionary. I don't need type information or methods, and I don't need to be able to convert it back to an object again.

I found this question about creating a dictionary from an object's fields, but it doesn't do it recursively.

Being relatively new to Python, I'm concerned that my solution may be ugly, or unpythonic, or broken in some obscure way, or just plain old NIH.

My first attempt appeared to work until I tried it with lists and dictionaries, and it seemed easier just to check if the object passed had an internal dictionary, and if not, to just treat it as a value (rather than doing all that isinstance checking). My previous attempts also didn't recurse into lists of objects:

def todict(obj):
    if hasattr(obj, "__iter__"):
        return [todict(v) for v in obj]
    elif hasattr(obj, "__dict__"):
        return dict([(key, todict(value)) 
            for key, value in obj.__dict__.iteritems() 
            if not callable(value) and not key.startswith('_')])
    else:
        return obj

This seems to work better and doesn't require exceptions, but I'm still not sure whether there are cases I'm not aware of where it falls down.

Any suggestions would be much appreciated.

An amalgamation of my own attempt and clues derived from Anurag Uniyal's and Lennart Regebro's answers works best for me:

def todict(obj, classkey=None):
    if isinstance(obj, dict):
        data = {}
        for (k, v) in obj.items():
            data[k] = todict(v, classkey)
        return data
    elif hasattr(obj, "_ast"):
        return todict(obj._ast())
    elif hasattr(obj, "__iter__") and not isinstance(obj, str):
        return [todict(v, classkey) for v in obj]
    elif hasattr(obj, "__dict__"):
        data = dict([(key, todict(value, classkey)) 
            for key, value in obj.__dict__.items() 
            if not callable(value) and not key.startswith('_')])
        if classkey is not None and hasattr(obj, "__class__"):
            data[classkey] = obj.__class__.__name__
        return data
    else:
        return obj

One line of code to convert an object to JSON recursively.

import json

def get_json(obj):
  return json.loads(
    json.dumps(obj, default=lambda o: getattr(o, '__dict__', str(o)))
  )

obj = SomeClass()
print("Json = ", get_json(obj))
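As a quick check of the round trip above, here is a minimal sketch with two made-up classes (Engine and Car are hypothetical and not part of the answer). Anything json cannot serialize natively is replaced by its __dict__, or by str(o) if it has none.

```python
import json

def get_json(obj):
    return json.loads(
        json.dumps(obj, default=lambda o: getattr(o, '__dict__', str(o)))
    )

class Engine:  # hypothetical example class
    def __init__(self):
        self.cylinders = 4

class Car:  # hypothetical example class
    def __init__(self):
        self.name = "demo"
        self.engine = Engine()
        self.tags = {"a", "b"}  # a set has no __dict__, so it falls back to str()

result = get_json(Car())
print(result["engine"])  # {'cylinders': 4}
```

Note that the JSON round trip also turns tuples into lists and non-string dictionary keys into strings, so it is not a lossless conversion.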

I don't know what the purpose of checking for basestring or object is. Also, __dict__ will not contain any callables unless you have attributes pointing to such callables, but in that case, isn't that part of the object?

So instead of checking for various types and values, let todict convert the object, and if it raises an exception, use the original value.

todict will only raise an exception if obj doesn't have __dict__, e.g.

class A(object):
    def __init__(self):
        self.a1 = 1

class B(object):
    def __init__(self):
        self.b1 = 1
        self.b2 = 2
        self.o1 = A()

    def func1(self):
        pass

def todict(obj):
    data = {}
    for key, value in obj.__dict__.items():
        try:
            data[key] = todict(value)
        except AttributeError:
            # value has no __dict__ (e.g. an int), so keep it as-is
            data[key] = value
    return data

b = B()
print(todict(b))

It prints {'b1': 1, 'b2': 2, 'o1': {'a1': 1}}. There may be some other cases to consider, but it may be a good start.

Special case: if an object uses __slots__, then you will not be able to get __dict__, e.g.

class A(object):
    __slots__ = ["a1"]
    def __init__(self):
        self.a1 = 1

A fix for the slots case can be to use dir() instead of directly using __dict__.
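A minimal sketch of that dir()-based idea follows. It is an illustration under assumptions, not the answerer's code: it skips names starting with an underscore and anything callable, and it treats only list, tuple, and set as sequences. Keep in mind that dir() also reports class-level attributes and properties, which may or may not be what you want.

```python
def todict(obj):
    """Sketch: collect attributes via dir() so __slots__ classes work too."""
    if isinstance(obj, (str, bytes, int, float, bool, type(None))):
        return obj
    if isinstance(obj, dict):
        return {k: todict(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [todict(v) for v in obj]
    data = {}
    for name in dir(obj):
        if name.startswith('_'):
            continue  # skip private and dunder names
        value = getattr(obj, name)
        if callable(value):
            continue  # skip methods
        data[name] = todict(value)
    return data

class A(object):
    __slots__ = ["a1"]
    def __init__(self):
        self.a1 = 1

class B(object):
    def __init__(self):
        self.b1 = 1
        self.o1 = A()

print(todict(B()))  # {'b1': 1, 'o1': {'a1': 1}}
```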

A slow but easy way to do this is to use jsonpickle to convert the object to a JSON string and then json.loads to convert it back to a Python dictionary:

data = json.loads(jsonpickle.encode(obj, unpicklable=False))

I realize that this answer is a few years too late, but I thought it might be worth sharing since it's a Python 3.3+ compatible modification to the original solution by @Shabbyrobe that has generally worked well for me:

try:
  # Python 3.3+ (collections.Iterable was removed in Python 3.10)
  from collections.abc import Iterable
except ImportError:
  # Python 2.7
  from collections import Iterable

try:
  # Python 2.7
  basestring
except NameError:
  # Python 3.3+
  basestring = str

def todict(obj):
  """
  Recursively convert a Python object graph to sequences (lists)
  and mappings (dicts) of primitives (bool, int, float, string, ...)
  """
  if isinstance(obj, basestring):
    return obj
  elif isinstance(obj, dict):
    return dict((key, todict(val)) for key, val in obj.items())
  elif isinstance(obj, Iterable):
    return [todict(val) for val in obj]
  elif hasattr(obj, '__dict__'):
    return todict(vars(obj))
  elif hasattr(obj, '__slots__'):
    return todict(dict((name, getattr(obj, name)) for name in getattr(obj, '__slots__')))
  return obj

If you're not interested in callable attributes, for example, they can be stripped in the dictionary comprehension:

elif isinstance(obj, dict):
  return dict((key, todict(val)) for key, val in obj.items() if not callable(val))

In Python there are many ways of making objects behave slightly differently, like metaclasses and whatnot, and a class can override __getattr__ and thereby have "magical" attributes you can't see through __dict__, etc. In short, it's unlikely that you are going to get a 100% complete picture in the generic case with whatever method you use.

Therefore, the answer is: if it works for you in the use case you have now, then the code is correct. ;-)

To make somewhat more generic code you could do something like this:

import types

def todict(obj):
    # Functions, methods, None and plain values have no further info of interest.
    # Strings must stop here too: iterating a string yields strings forever.
    if obj is None or isinstance(obj, (str, bytes, int, float, bool,
                                       types.FunctionType, types.MethodType,
                                       types.BuiltinFunctionType)):
        return obj

    # If it's a dictionary, recurse over its values. This check comes before
    # the generic iterable check, because iterating a dict yields only its keys.
    if isinstance(obj, dict):
        return {key: todict(value) for key, value in obj.items()}

    try:  # If it's an iterable, return all the contents
        return [todict(x) for x in iter(obj)]
    except TypeError:
        pass

    # It's neither a list nor a dict, so it's a normal object.
    # Get everything from dir and __dict__. That should be most things we can get hold of.
    attrs = set(dir(obj))
    try:
        attrs.update(obj.__dict__.keys())
    except AttributeError:
        pass

    result = {}
    for attr in attrs:
        if attr.startswith('__'):
            # Skip dunder names such as __class__, which would recurse without end.
            continue
        result[attr] = todict(getattr(obj, attr, None))
    return result

Something like that. That code is untested, though. This still doesn't cover the case when you override __getattr__, and I'm sure there are many more cases that it doesn't cover and that may not be coverable. :)
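To make the __getattr__ caveat concrete, here is a small sketch with a hypothetical class (Magic is invented for illustration). Its "magical" attributes can be read normally, yet they show up in neither vars() nor dir(), so no generic converter based on those can discover them.

```python
class Magic:  # hypothetical class for illustration
    def __init__(self):
        self.real_attr = 1

    def __getattr__(self, name):
        # Called only when normal lookup fails; invents a value on the fly.
        if name.startswith('_'):
            raise AttributeError(name)
        return "computed:" + name

m = Magic()
print(m.anything)            # computed:anything
print(vars(m))               # {'real_attr': 1}
print('anything' in dir(m))  # False
```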

Thanks @AnuragUniyal: You made my day! This is the variant of the code that's working for me:

# noinspection PyProtectedMember
def object_to_dict(obj):
    data = {}
    if getattr(obj, '__dict__', None):
        for key, value in obj.__dict__.items():
            try:
                data[key] = object_to_dict(value)
            except AttributeError:
                data[key] = value
        return data
    else:
        return obj

A little update to Shabbyrobe's answer to make it work for namedtuples:

def obj2dict(obj, classkey=None):
    if isinstance(obj, dict):
        data = {}
        for (k, v) in obj.items():
            data[k] = obj2dict(v, classkey)
        return data
    elif hasattr(obj, "_asdict"):
        # namedtuple: must be checked before the generic __iter__ branch
        return obj2dict(obj._asdict(), classkey)
    elif hasattr(obj, "_ast"):
        return obj2dict(obj._ast(), classkey)
    elif hasattr(obj, "__iter__") and not isinstance(obj, str):
        return [obj2dict(v, classkey) for v in obj]
    elif hasattr(obj, "__dict__"):
        data = dict([(key, obj2dict(value, classkey))
                     for key, value in obj.__dict__.items()
                     if not callable(value) and not key.startswith('_')])
        if classkey is not None and hasattr(obj, "__class__"):
            data[classkey] = obj.__class__.__name__
        return data
    else:
        return obj

def list_object_to_dict(lst):
    return [object_to_dict(item) for item in lst]

def object_to_dict(obj):
    # Values without a __dict__ (str, int, None, ...) are returned unchanged.
    if not hasattr(obj, '__dict__'):
        return obj
    # Work on a copy so the original object's attributes are not overwritten.
    data = dict(vars(obj))
    for k, v in data.items():
        if isinstance(v, list):
            data[k] = list_object_to_dict(v)
        elif not isinstance(v, (dict, str, int, float)):
            data[k] = object_to_dict(v)
    return data

Looked at all solutions, and @hbristow's answer was closest to what I was looking for. Added enum.Enum handling, since this was causing a "RecursionError: maximum recursion depth exceeded" error, and reordered the checks so that objects with __slots__ take precedence over objects defining __dict__.

import enum
from collections.abc import Iterable

def todict(obj):
  """
  Recursively convert a Python object graph to sequences (lists)
  and mappings (dicts) of primitives (bool, int, float, string, ...)
  """
  if isinstance(obj, str):
    return obj
  elif isinstance(obj, enum.Enum):
    return str(obj)
  elif isinstance(obj, dict):
    return dict((key, todict(val)) for key, val in obj.items())
  elif isinstance(obj, Iterable):
    return [todict(val) for val in obj]
  elif hasattr(obj, '__slots__'):
    return todict(dict((name, getattr(obj, name)) for name in getattr(obj, '__slots__')))
  elif hasattr(obj, '__dict__'):
    return todict(vars(obj))
  return obj

No custom implementation is required. The jsons library can be used.

import jsons

object_dict = jsons.dump(object_instance)

I'd comment on the accepted answer but my rep is not high enough... The accepted answer is great, but add another elif just after the if to support serializing namedtuples to dict properly too:

    elif hasattr(obj, "_asdict"):
        return todict(obj._asdict())
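A short sketch of why the extra branch has to come before the generic __iter__ check (Point is a hypothetical example type): a namedtuple is iterable, so without the _asdict branch it would be flattened into a plain list and the field names would be lost.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(1, 2)

print(hasattr(p, "__iter__"))  # True, so the generic iterable branch would match
print([v for v in p])          # [1, 2], the field names are lost
print(dict(p._asdict()))       # {'x': 1, 'y': 2}
```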

Well. I added depth limiting to @Shabbyrobe's answer. I thought it might be worthwhile for objects which loop back on themselves.

import sys

def todict(obj, limit=sys.getrecursionlimit(), classkey=None):
    if isinstance(obj, dict):
        if limit >= 1:
            data = {}
            for (k, v) in obj.items():
                data[k] = todict(v, limit - 1, classkey)
            return data
        else:
            # Depth exhausted: return a marker string instead of recursing
            return 'class:' + obj.__class__.__name__
    elif hasattr(obj, "_ast"):
        return todict(obj._ast(), limit - 1, classkey) if limit >= 1 else 'class:' + obj.__class__.__name__
    elif hasattr(obj, "__iter__") and not isinstance(obj, str):
        return [todict(v, limit - 1, classkey) for v in obj] if limit >= 1 else 'class:' + obj.__class__.__name__
    elif hasattr(obj, "__dict__"):
        if limit >= 1:
            data = dict([(key, todict(value, limit - 1, classkey))
                for key, value in obj.__dict__.items()
                if not callable(value) and not key.startswith('_')])
            if classkey is not None and hasattr(obj, "__class__"):
                data[classkey] = obj.__class__.__name__
            return data
        else:
            return 'class:' + obj.__class__.__name__
    else:
        return obj
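A depth limit stops runaway recursion but does not recognize a loop as such. As an alternative technique (not what the answer above does), here is a hedged sketch that tracks the ids of the objects currently being visited and emits a placeholder when it meets one again. The _seen parameter, the 'cycle:' marker, and the Node class are all assumptions made for illustration.

```python
def todict(obj, _seen=None):
    """Sketch: recursive conversion that stops at objects already being visited."""
    if _seen is None:
        _seen = set()
    if isinstance(obj, (str, bytes, int, float, bool, type(None))):
        return obj
    if id(obj) in _seen:
        return 'cycle:' + obj.__class__.__name__  # placeholder marker (assumption)
    _seen.add(id(obj))
    try:
        if isinstance(obj, dict):
            return {k: todict(v, _seen) for k, v in obj.items()}
        if hasattr(obj, "__iter__"):
            return [todict(v, _seen) for v in obj]
        if hasattr(obj, "__dict__"):
            return {k: todict(v, _seen) for k, v in vars(obj).items()
                    if not callable(v) and not k.startswith('_')}
        return obj
    finally:
        # Only ancestors count as "seen", so shared (non-cyclic) objects still expand.
        _seen.discard(id(obj))

class Node:  # hypothetical example class
    def __init__(self, name):
        self.name = name
        self.other = None

a, b = Node("a"), Node("b")
a.other, b.other = b, a  # a -> b -> a

print(todict(a))  # {'name': 'a', 'other': {'name': 'b', 'other': 'cycle:Node'}}
```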

The previous answers do not work when a class field is a class instance. Use this:

from dataclasses import dataclass, field

# Sentinel meaning "no argument passed"; None cannot be used because
# None is itself a value that may need converting.
_SELF = object()

@dataclass
class BaseNumber:
    number: str = ''
    probability: float = 0.

@dataclass
class ContainerInfo:
    type: str = ''
    height: int = 0
    width: str = ''
    length: str = ''

@dataclass
class AdditionalNumber:
    number: str = ''
    prob: float = 0.
    # A dataclass instance is not allowed as a plain default on Python 3.11+
    info: ContainerInfo = field(default_factory=ContainerInfo)

@dataclass
class ContainerData:
    # Unannotated, so these are class attributes rather than dataclass fields
    container_number = BaseNumber()
    container_type = AdditionalNumber()
    errors: list = field(default_factory=list)

    def todict(self, obj=_SELF):
        if obj is _SELF:
            obj = self

        if isinstance(obj, dict):
            data = {}
            for (k, v) in obj.items():
                data[k] = self.todict(v)
            return data
        elif hasattr(obj, "_ast"):
            return self.todict(obj._ast())
        elif hasattr(obj, "__iter__") and not isinstance(obj, str):
            return [self.todict(v) for v in obj]
        elif hasattr(obj, "__dict__"):
            # dir() also picks up class-level attributes such as container_number
            data = dict([(key, self.todict(value))
                for key, value in {name: getattr(obj, name) for name in dir(obj)}.items()
                if not callable(value) and not key.startswith('_')
            ])
            return data
        else:
            return obj
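If every nested object is declared as an annotated dataclass field, the standard library's dataclasses.asdict already recurses into nested dataclasses, lists, and dicts. The sketch below uses simplified hypothetical classes (Info, Number, Data) rather than the ones from the answer. Note that asdict only sees annotated fields, so unannotated class attributes like container_number above would be skipped.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Info:  # hypothetical example classes
    type: str = ''
    height: int = 0

@dataclass
class Number:
    number: str = ''
    info: Info = field(default_factory=Info)

@dataclass
class Data:
    container: Number = field(default_factory=Number)
    errors: list = field(default_factory=list)

print(asdict(Data()))
# {'container': {'number': '', 'info': {'type': '', 'height': 0}}, 'errors': []}
```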

This solution works well, except for one problem with non-ASCII characters such as "\ס\פ\י\י\ד\ר\מ\ן: \מ\י\מ\ד \ה\ע\כ\ב\י\ש". How can these be handled cleanly?
