Populating a Factory Using Metaclasses in Python

Question

Registering classes is obviously a major use case for metaclasses in Python. In this case, I've got a serialization module that currently uses dynamic imports to create classes, and I'd prefer to replace that with a factory pattern.

So basically, it does this:

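import importlib

data = ...  # generic object built from the serialized data
moduleName = data.getModule()
className = data.getClass()
# import_module returns the named submodule itself; __import__("a.b") would return the top-level package "a"
aModule = importlib.import_module(moduleName)
aClass = getattr(aModule, className)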

But I want it to do this:

data = ...  # generic object built from the serialized data
classKey = data.getFactoryKey()
aClass = factory.getClass(classKey)
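
For reference, here is a minimal sketch of what a metaclass-populated factory could look like (Python 3 syntax). Only `factory.getClass` comes from the snippet above; `Factory`, `RegisteringMeta`, `Serializable`, `Point`, and the `factory_key` attribute are hypothetical names chosen for illustration.

```python
class Factory:
    """Maps string keys to classes."""

    def __init__(self):
        self._classes = {}

    def register(self, key, cls):
        self._classes[key] = cls

    def getClass(self, key):
        return self._classes[key]


factory = Factory()


class RegisteringMeta(type):
    """Metaclass that registers each class it creates, if the class declares a key."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # factory_key is a hypothetical attribute; classes that do not
        # define it in their own body (such as the base class) are skipped.
        key = namespace.get("factory_key")
        if key is not None:
            factory.register(key, cls)


class Serializable(metaclass=RegisteringMeta):
    """Base class; defines no factory_key, so it is not registered."""


class Point(Serializable):
    factory_key = "point"


aClass = factory.getClass("point")
```

Registration runs when the `class` statement executes, which is to say when the defining module is first imported.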

However, there's a hitch: if I make the factory rely on metaclasses, the factory only learns that a class exists after its module has been imported (i.e., classes are registered at module import time). So to populate the factory, I'd have to either:

manually import all related modules (which would really defeat the purpose of having metaclasses automatically register things...) or
automatically import everything in the whole project (which strikes me as incredibly clunky and ham-fisted).

Out of these options, just registering the classes directly into a factory seems like the best one. Has anyone found a better solution that I'm just not seeing? One option might be to automatically generate the imports required in the factory module by traversing the project files, but unless you do that with a commit hook, you run the risk of your factory getting out of date.
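
One possible reading of "registering directly" is an explicit table in the factory module that maps each key to a dotted path and imports lazily on first lookup. This is only a sketch under that assumption: `CLASS_PATHS`, `get_class`, and the keys are hypothetical, and standard-library classes stand in for project classes so the example runs on its own.

```python
import importlib

# Hypothetical key -> "module:ClassName" table. Standard-library classes
# stand in for the project's serializable classes.
CLASS_PATHS = {
    "ordered": "collections:OrderedDict",
    "fraction": "fractions:Fraction",
}

_cache = {}


def get_class(key):
    """Return the class registered under key, importing its module on first use."""
    if key not in _cache:
        module_name, _, class_name = CLASS_PATHS[key].partition(":")
        module = importlib.import_module(module_name)
        _cache[key] = getattr(module, class_name)
    return _cache[key]


fraction_class = get_class("fraction")
```

The trade-off is the one described above: the table has to be maintained by hand (or generated), but no module is imported until its key is actually requested.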

Update:

I have posted a self-answer to close this off. If anyone knows a good way to traverse all Python modules across nested subpackages in a way that will never hit a cycle, I will gladly accept that answer rather than this one. The main problem I see happening is:

\A.py (import Sub.S2)
\Sub\S1.py (import A)
\Sub\S2.py
\Sub\S3.py (import Sub.S2)

When you try to import S3, it first needs to import Main (otherwise it won't know what a Sub is). At that point, it tries to import A. While there, the __init__.py is called and tries to register A. At this point, A tries to import S1. Since the __init__.py in Sub is hit, it tries to import S1, S2, and S3. However, S1 wants to import A (which does not yet exist, as it is in the process of being imported)! So that import fails. You can switch how the traversal occurs (i.e., depth first rather than breadth first), but you hit the same issues.

Any insight on a good traversal approach for this would be very helpful. A two-stage approach can probably solve it (i.e., traverse to collect all module references, then import them as a flat batch). However, I am not quite sure of the best way to handle the final stage (i.e., knowing when you are done traversing so that you can then import everything).

My big restriction is that I do not want to have a super-package to deal with (i.e., an extra directory above Sub and A). If I had that, it could kick off the traversal, but every import would need to be written relative to it for no good reason (i.e., all imports would be longer by an extra directory). Thus far, adding a special function call to sitecustomize.py seems like my only option (I set the root directory for the package development in that file anyway).
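
A rough sketch of the two-stage idea, assuming the project root is on `sys.path` and every `.py` file under it is meant to be importable. `discover_module_names` and `import_all` are hypothetical helper names, not part of any library. Stage 1 only reads the file system, so "done traversing" is simply the point where the function returns; stage 2 then imports the flat list. The demo rebuilds the layout from the question in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def discover_module_names(root):
    """Stage 1: list dotted module names under root without importing anything."""
    root = Path(root)
    names = []
    for path in sorted(root.rglob("*.py")):
        parts = list(path.relative_to(root).with_suffix("").parts)
        if parts[-1] == "__init__":
            parts.pop()  # a package is named after its directory
        if parts:  # skips an __init__.py sitting directly in root
            names.append(".".join(parts))
    return names


def import_all(names):
    """Stage 2: import every discovered module as one flat batch."""
    importlib.invalidate_caches()  # pick up files created after startup
    return {name: importlib.import_module(name) for name in names}


# Demo: recreate the layout from the question in a temporary directory.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "Sub").mkdir()
(demo_root / "A.py").write_text("import Sub.S2\n")
(demo_root / "Sub" / "__init__.py").write_text("")
(demo_root / "Sub" / "S1.py").write_text("import A\n")
(demo_root / "Sub" / "S2.py").write_text("")
(demo_root / "Sub" / "S3.py").write_text("import Sub.S2\n")

sys.path.insert(0, str(demo_root))  # the root itself must be importable
module_names = discover_module_names(demo_root)
modules = import_all(module_names)
```

In this layout the `__init__.py` in Sub is empty, so no package import triggers further traversal, and the plain `import A` in S1 resolves from `sys.modules` once A has started loading. A cycle could still fail if a module reads attributes of a partially initialized module at import time.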

Answer 1

The solution I found was to do all imports on the package relative to a particular base directory, and to put a special function in the __init__.py of every package that might contain modules with classes I'd want registered. So basically, if you import any module, it first has to import the base directory, which then proceeds to walk every package (i.e., folder) that has a similar __init__.py file.
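
As an illustration only (this is not the exact code I use), such an `__init__.py` helper might look roughly like the sketch below. `import_submodules` is a hypothetical name; `pkgutil.iter_modules` and `importlib.import_module` are standard-library calls. The demo runs it against the standard-library package `email.mime` so that it is self-contained.

```python
import importlib
import pkgutil


def import_submodules(package_name):
    """Import every direct submodule of a package and return them by dotted name."""
    package = importlib.import_module(package_name)
    imported = {}
    # iter_modules lists the modules found on the package's search path
    # without importing them; the prefix makes each name fully qualified.
    for info in pkgutil.iter_modules(package.__path__, prefix=package_name + "."):
        imported[info.name] = importlib.import_module(info.name)
    return imported


# Demo on a standard-library package standing in for a project package.
mime_modules = import_submodules("email.mime")
```

Inside a package, the call would presumably be `import_submodules(__name__)` at the bottom of its `__init__.py`. If each subpackage's `__init__.py` does the same, importing the base package walks the whole tree, since importing a subpackage runs its own `__init__.py`.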

The downside of this approach is that the same modules are sometimes imported multiple times, which is annoying if anyone leaves code with side effects in a module import. However, that's bad either way. Unfortunately, some major packages (cough, cough: Flask) have serious complaints with IDLE if you do this (IDLE just restarts, rather than doing anything). The other downside is that because modules import each other, sometimes it attempts to import a module that it is already in the process of importing (an easily caught error, but one I'm still trying to stamp out). It's not ideal, but it does get the job done. Additional details on the more specific issue are attached, and if anyone can offer a better answer, I will gladly accept it.

Answer 1
0 2014-05-03 02:19:40
