
Getting the class path or namespace of a class in Python, even if it is nested

I'm currently writing a serialization module in Python that can serialize user-defined classes. In order to do this, I need to get the full namespace of the object and write it to a file. I can then use that string to recreate the object.

For example, assume that we have the following class structure in a file named A.py:

class B:
    class C:
        pass

Now, with the assumption that my_klass_string is the string "A::B::C":

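klasses = my_klass_string.split("::")
# The first component names the top-level object (here the module A,
# which must already have been imported into this namespace).
if klasses[0] in globals():
    klass = globals()[klasses[0]]
else:
    raise TypeError("No class defined: %s" % klasses[0])
# Walk the remaining components, descending one nesting level each time.
for klass_string in klasses[1:]:
    if klass_string in klass.__dict__:
        klass = klass.__dict__[klass_string]
    else:
        raise TypeError("No class defined: %s" % klass_string)
# Create the instance without calling __init__.
klass_obj = klass.__new__(klass)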

I can create an instance of the class C even though it lies under class B in the module A. The above code is equivalent to executing klass_obj = A.B.C.__new__(A.B.C).

Note: I'm using __new__() here because I'm reconstituting a serialized object and I don't want to initialize it, as I don't know what parameters the class's __init__ method takes. I want to create the object without calling __init__ and then assign attributes to it later.
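A minimal sketch of that reconstruction step, written for Python 3. The Point class and the restore helper are hypothetical names used only for illustration, and the state dictionary stands in for whatever the serialized file would contain.

```python
class Point:
    def __init__(self, x, y):
        # Never runs during restore(), so its parameters need not be known.
        self.x = x
        self.y = y


def restore(klass, state):
    """Create an instance without calling __init__, then fill in attributes."""
    obj = klass.__new__(klass)
    for name, value in state.items():
        setattr(obj, name, value)
    return obj


p = restore(Point, {"x": 1, "y": 2})
print(p.x, p.y)
```

This only works as-is for classes whose instances accept arbitrary attributes; classes using __slots__ or a custom __new__ would likely need extra care.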

Anyway, I can create an object of class A.B.C from a string. But how do I go the other way? How do I get a string that describes the full path to the class from an instance of that class, even if the class is nested?

You can't, in any reasonable non-crazy way. I guess you could find the class name and the module, and then for each class name verify that it exists in the module, and if not, go through all classes that do exist in the module in a hierarchical way until you find it.
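The hierarchical search described above might look roughly like this in Python 3. It is only a sketch: find_class_path is a made-up helper, and the module A is simulated with types.ModuleType so the example is self-contained.

```python
import inspect
import types


def find_class_path(target, namespace, prefix="", seen=None):
    """Depth-first search of namespace for target; returns e.g. 'B.C' or None."""
    seen = set() if seen is None else seen
    for name, value in list(vars(namespace).items()):
        if not inspect.isclass(value) or id(value) in seen:
            continue
        seen.add(id(value))
        if value is target:
            return prefix + name
        # Descend only into classes defined in the same module as the target.
        if value.__module__ == target.__module__:
            found = find_class_path(target, value, prefix + name + ".", seen)
            if found is not None:
                return found
    return None


class B:
    class C:
        pass


# Stand-in for the module A from the question.
A = types.ModuleType("A")
A.B = B

x = B.C()
print(find_class_path(type(x), A))  # B.C
```

If the same class is reachable under several names, the search simply reports whichever it meets first.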

But since there is no reason to ever have a class hierarchy like that, it's a non-problem. :-)

Also, I know you don't want to hear this at this point in your work, but:

Cross-platform serialization is an interesting subject, but doing it with objects like this is unlikely to be very useful, as the target system must have the exact same object hierarchy installed. You must therefore have two systems written in two different languages that are exactly equivalent. That's almost impossible and unlikely to be worth the trouble.

You would, for example, not be able to use any object from Python's standard library, as those don't exist in Ruby. The end result is that you must make your own object hierarchy that in the end uses only basic types like strings and numbers. And in that case, your objects have just become containers for basic primitives, and then you can just as well serialize everything with JSON or XML anyway.
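To illustrate the JSON route, here is a small sketch for an object whose attributes are all basic primitives. The Account class, the to_json helper, and the "type"/"state" layout are assumptions made up for this example, not an established format.

```python
import json


class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


def to_json(obj):
    # Only works when every attribute is a JSON-compatible primitive.
    payload = {"type": type(obj).__name__, "state": vars(obj)}
    return json.dumps(payload, sort_keys=True)


text = to_json(Account("alice", 42))
print(text)
```

Any language with a JSON parser can read the result back without needing an equivalent class hierarchy.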

You cannot get the "full path to the class given an instance of the class", for the reason that there is no such thing in Python. For instance, building on your example:

>>> class B(object):
...     class C(object):
...             pass
... 
>>> D = B.C
>>> x = D()
>>> isinstance(x, B.C)
True

What should the "class path" of x be? D or B.C? Both are equally valid, and thus Python does not give you any means of telling one from the other.

Indeed, even Python's pickle module has trouble pickling the object x:

>>> import pickle
>>> t = open('/tmp/x.pickle', 'w+b')
>>> pickle.dump(x, t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  ...
  File "/usr/lib/python2.6/pickle.py", line 748, in save_global
   (obj, module, name))
  pickle.PicklingError: Can't pickle <class '__main__.C'>: it's not found as __main__.C

So, in general, I see no other option than adding an attribute to all your classes (say, _class_path), which your serialization code would look up to record the class name in the serialized format:

class A(object):
  _class_path = 'mymodule.A'
  class B(object):
    _class_path = 'mymodule.A.B'
    ...

You can even do this automatically with some metaclass magic (but also read the other comments in the same SO post for caveats that may apply if you do the D = B.C above).
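One possible shape for that metaclass, sketched in Python 3 syntax. PathMeta and _set_path are hypothetical names. Since a nested class is created before the class that encloses it, the enclosing class rewrites the nested paths once it exists.

```python
class PathMeta(type):
    """Hypothetical metaclass that records a dotted _class_path on each class."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._set_path(cls.__module__ + "." + name)

    def _set_path(cls, path):
        cls._class_path = path
        # Fix up classes nested directly in this one, then recurse into them.
        for attr, value in list(vars(cls).items()):
            if isinstance(value, PathMeta):
                value._set_path(path + "." + attr)


class A(metaclass=PathMeta):
    class B(metaclass=PathMeta):
        pass


print(A._class_path)
print(A.B._class_path)
```

With an alias such as D = B inside A, the same class is visited under both names and the last one wins, which is exactly the ambiguity discussed above.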

That said, if you can limit your serialization code to (1) instances of new-style classes that (2) are defined at the top level of a module, then you can just copy what pickle does (function save_global at lines 730--768 in pickle.py from Python 2.6).

The idea is that every new-style class defines the attributes __name__ and __module__, which are strings that hold the class name (as found in the sources) and the module name (as found in sys.modules); by saving these you can later import the module and get the class object back:

import sys
__import__(module_name)  # make sure the module is loaded into sys.modules
class_obj = getattr(sys.modules[module_name], class_name)
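As a round-trip sketch in Python 3, the two directions could be wrapped like this. class_to_ref and ref_to_class are names invented for this example, and Fraction from the standard library stands in for a user-defined top-level class.

```python
import importlib
from fractions import Fraction


def class_to_ref(cls):
    """Return the (module name, class name) pair identifying a top-level class."""
    return cls.__module__, cls.__name__


def ref_to_class(module_name, class_name):
    """Import the module and look the class up by name."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


ref = class_to_ref(Fraction)
print(ref)  # ('fractions', 'Fraction')
print(ref_to_class(*ref) is Fraction)  # True
```

For a nested class, __name__ holds only the innermost name, so this lookup would fail or find the wrong object.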

I'm currently writing a serialization module in Python that can serialize user-defined classes.

Don't. The standard library already includes one. Depending on how you count, actually, it includes at least two (pickle and shelve).
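For reference, a pickle round trip in Python 3 looks like this. Fraction is used only because it is an importable top-level class; the last line checks that the pickle stores the class by module and name rather than by value.

```python
import pickle
from fractions import Fraction

original = Fraction(3, 4)
data = pickle.dumps(original)
restored = pickle.loads(data)

print(restored == original)  # True
print(b"fractions" in data and b"Fraction" in data)
```

Because the class is stored by reference, unpickling needs the same module to be importable on the receiving side.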

There are two ways of doing this.

Solution 1

The first one goes via the garbage collector.

B -> __dict__ -> C

This is the code:

>>> class B(object):
    class C(object):
        pass

>>> import gc
>>> gc.get_referrers(B.C) # last element in the list
[<attribute '__dict__' of 'C' objects>, <attribute '__weakref__' of 'C' objects>, (<class '__main__.C'>, <type 'object'>), {'__dict__': <attribute '__dict__' of 'B' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'B' objects>, 'C': <class '__main__.C'>, '__doc__': None}] 

>>> gc.get_referrers(gc.get_referrers(B.C)[-1]) # first element in this list
[<class '__main__.B'>, [<attribute '__dict__' of 'C' objects>, <attribute '__weakref__' of 'C' objects>, (<class '__main__.C'>, <type 'object'>), {'__dict__': <attribute '__dict__' of 'B' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'B' objects>, 'C': <class '__main__.C'>, '__doc__': None}]]

>>> gc.get_referrers(gc.get_referrers(B.C)[-1])[0]
<class '__main__.B'>

Algorithm:

  1. Among the referrers of C, search for a class dictionary with the same __module__ as C.
  2. Take the class that owns this dictionary; C is reachable as its 'C' attribute.
  3. If this class is itself nested, repeat from step 1 recursively.
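The steps above can be sketched in Python 3 as follows. This relies on how CPython's garbage collector reports class dictionaries, so treat it as implementation-specific; enclosing_class and nested_path are hypothetical helper names.

```python
import gc


def enclosing_class(cls):
    """Return the class whose namespace holds cls, or None (CPython-specific)."""
    for referrer in gc.get_referrers(cls):
        # Step 1: a class dictionary from the same module that stores cls.
        if not isinstance(referrer, dict):
            continue
        if referrer.get("__module__") != cls.__module__:
            continue
        if referrer.get(cls.__name__) is not cls:
            continue
        # Step 2: the owner of that dictionary is the enclosing class.
        for owner in gc.get_referrers(referrer):
            if isinstance(owner, type) and vars(owner).get(cls.__name__) is cls:
                return owner
    return None


def nested_path(cls):
    # Step 3: repeat until no enclosing class is found.
    parts = [cls.__name__]
    outer = enclosing_class(cls)
    while outer is not None:
        parts.append(outer.__name__)
        outer = enclosing_class(outer)
    return ".".join(reversed(parts))


class B(object):
    class C(object):
        class D(object):
            pass


print(nested_path(B.C.D))
```

An alias stored under a different attribute name is not found by this sketch, because it looks the class up by its own __name__.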

Solution 2

Use the source file. Use inspect to get the lines of the class and scan upwards for the classes that nest it.
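A rough sketch of the source-based idea in Python 3. Note that it swaps the upward line scan for the ast module, which gives the nesting directly; class_paths and the SOURCE string are made up for this example.

```python
import ast

SOURCE = '''
class B:
    class C:
        pass

class E:
    pass
'''


def class_paths(source):
    """Map the line number of each class statement to its dotted path."""
    paths = {}

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                path = prefix + child.name
                paths[child.lineno] = path
                visit(child, path + ".")
            else:
                # Keep looking inside functions, if-blocks, and so on.
                visit(child, prefix)

    visit(ast.parse(source), "")
    return paths


print(class_paths(SOURCE))  # {2: 'B', 3: 'B.C', 6: 'E'}
```

In real use, the starting line that inspect reports for a class could presumably be looked up in this mapping, provided the source file is available.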

Note: I know of no clean way in Python 2, but Python 3 provides one (the __qualname__ attribute, available since Python 3.3).
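A short Python 3 sketch of both directions using __qualname__. The helpers qualified_name and resolve are illustrative names, and a SimpleNamespace stands in for the imported module.

```python
import types
from functools import reduce


class B:
    class C:
        pass


def qualified_name(obj):
    """Return 'module.Outer.Inner' for the class of obj (Python 3.3+)."""
    cls = type(obj)
    return cls.__module__ + "." + cls.__qualname__


def resolve(root, dotted):
    """Follow a dotted attribute path such as 'B.C' starting at root."""
    return reduce(getattr, dotted.split("."), root)


x = B.C()
print(type(x).__qualname__)  # B.C

# Stand-in for the module that defines B.
namespace = types.SimpleNamespace(B=B)
print(resolve(namespace, type(x).__qualname__) is B.C)  # True
```

The alias case stays unambiguous here: after D = B.C, D.__qualname__ is still 'B.C', because the name is fixed when the class is defined.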
