Architecture/Design of a pipeline-based system. How to improve this code?

I have a pipeline-based application that analyzes text in different languages (say, English and Chinese). My goal is to have a system that can work in both languages, in a transparent way. NOTE: This question is long because it has many simple code snippets.

The pipeline is composed of three components (let's call them A, B, and C), and I've created them in the following way so that the components are not tightly coupled:

public class Pipeline {
    private A componentA;
    private B componentB;
    private C componentC;

    // I really just need the language attribute of Locale,
    // but I use it because it's useful to load language specific ResourceBundles.
    public Pipeline(Locale locale) {
        componentA = new A();
        componentB = new B();
        componentC = new C();
    }

    public Output runPipeline(Input input) {
        Language lang = LanguageIdentifier.identify(input);
        // lang is identified here but not yet used to select the components
        ResultOfA resultA = componentA.doSomething(input);
        ResultOfB resultB = componentB.doSomethingElse(resultA); // uses result of A
        return componentC.doFinal(resultA, resultB); // uses result of A and B
    }
}

Now, every component of the pipeline has something inside which is language specific. For example, in order to analyze Chinese text, I need one lib, and for analyzing English text, I need a different lib.

Moreover, some tasks can be done in one language and cannot be done in the other. One solution to this problem is to make every pipeline component abstract (to implement some common methods), and then have a concrete language-specific implementation. Taking component A as an example, I'd have the following:

public abstract class A {
    private CommonClass x;  // common to all languages
    private AnotherCommonClass y; // common to all languages

    abstract SomeTemporaryResult getTemp(Input input); // language specific
    abstract AnotherTemporaryResult getAnotherTemp(Input input); // language specific

    public ResultOfA doSomething(Input input) {
        // template method
        SomeTemporaryResult t = getTemp(input); // language specific
        AnotherTemporaryResult tt = getAnotherTemp(input); // language specific
        return new ResultOfA(t, tt, x.get(), y.get());
    }
}

public class EnglishA extends A {
    private EnglishSpecificClass something;
    // implementation of the abstract methods ... 
}

In addition, since each pipeline component is very heavy and I need to reuse them, I thought of creating a factory that caches the component for further use, using a map that uses the language as the key, like so (the other components would work in the same manner):

public enum AFactory {
    SINGLETON;

    // this map will only have one or two keys, is there anything more efficient that I can use, instead of HashMap?
    private final Map<String, A> cache = new HashMap<>();

    public A getA(Locale locale) {
        // lookup by locale.language, and insert if it doesn't exist, et cetera
        return cache.get(locale.getLanguage());
    }
}

So, my question is: What do you think of this design? How can it be improved? I need the "transparency" because the language can be changed dynamically, based on the text that is being analyzed. As you can see from the runPipeline method, I first identify the language of the Input, and then, based on this, I need to change the pipeline components to the identified language. So, instead of invoking the components directly, maybe I should get them from the factory, like so:

public Output runPipeline(Input input) {
    Language lang = LanguageIdentifier.identify(input);
    // assumes each factory's lookup method accepts the identified Language
    ResultOfA resultA = AFactory.SINGLETON.getA(lang).doSomething(input);
    ResultOfB resultB = BFactory.SINGLETON.getB(lang).doSomethingElse(resultA);
    return CFactory.SINGLETON.getC(lang).doFinal(resultA, resultB);
}

Thank you for reading this far. I very much appreciate every suggestion that you can make on this question.

I like the basic design. If the classes are simple enough, I might consider consolidating the A/B/C factories into a single class, as it seems there could be some shared behavior at that level. I'm assuming that these are really more complex than they appear, though, and that's why that is undesirable.

The basic approach of using Factories to reduce coupling between components is sound, imo.
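
A minimal sketch of what such a consolidation could look like, assuming the three components can be bundled per language. `ComponentFactory`, `Components`, and the stub `A`/`B`/`C` interfaces are hypothetical names used for illustration only; the real components are presumably much heavier.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for the heavy, language-specific components.
interface A { String name(); }
interface B { String name(); }
interface C { String name(); }

// One bundle of components per language.
final class Components {
    final A a;
    final B b;
    final C c;

    Components(A a, B b, C c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
}

// A single factory standing in for AFactory/BFactory/CFactory (illustrative only).
enum ComponentFactory {
    SINGLETON;

    private final Map<String, Components> cache = new ConcurrentHashMap<>();

    // Builds the bundle on the first request for a language, then reuses it.
    Components get(String language) {
        return cache.computeIfAbsent(language, ComponentFactory::create);
    }

    private static Components create(String language) {
        switch (language) {
            case "en":
            case "zh":
                // Real code would construct EnglishA, ChineseA, and so on here.
                return new Components(
                        () -> language + "-A",
                        () -> language + "-B",
                        () -> language + "-C");
            default:
                throw new IllegalArgumentException("Unsupported language: " + language);
        }
    }
}
```

With this shape, the lookup-and-insert logic lives in one place instead of being repeated in three factories.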

The factory idea is good, as is the idea, if feasible, to encapsulate the A, B, & C components into single classes for each language. One thing that I would urge you to consider is to use Interface inheritance instead of Class inheritance. You could then incorporate an engine that would do the runPipeline process for you. This is similar to the Builder/Director pattern. The steps in this process would be as follows:

  1. get input
  2. use factory method to get correct interface (English/Chinese)
  3. pass interface into your engine
  4. runPipeline and get result
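
The steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: inputs and results are plain strings, and `LanguagePipeline`, `EnglishPipeline`, `ChinesePipeline`, `LanguagePipelines`, and `PipelineEngine` are made-up names, not anything from the question.

```java
import java.util.Locale;

// Hypothetical interface covering the three language-specific steps.
interface LanguagePipeline {
    String stepA(String input);
    String stepB(String resultA);
    String stepC(String resultA, String resultB);
}

// Illustrative English implementation.
final class EnglishPipeline implements LanguagePipeline {
    public String stepA(String input) { return "en:" + input.trim(); }
    public String stepB(String resultA) { return resultA.toUpperCase(Locale.ROOT); }
    public String stepC(String resultA, String resultB) { return resultA + "|" + resultB; }
}

// Illustrative Chinese implementation behind the same interface.
final class ChinesePipeline implements LanguagePipeline {
    public String stepA(String input) { return "zh:" + input.trim(); }
    public String stepB(String resultA) {
        return String.valueOf(resultA.codePointCount(0, resultA.length()));
    }
    public String stepC(String resultA, String resultB) { return resultA + "|" + resultB; }
}

// Step 2: a factory method picks the implementation for a language code.
final class LanguagePipelines {
    static LanguagePipeline forLanguage(String language) {
        switch (language) {
            case "en": return new EnglishPipeline();
            case "zh": return new ChinesePipeline();
            default: throw new IllegalArgumentException("Unsupported language: " + language);
        }
    }
}

// Steps 3 and 4: the engine runs the fixed sequence against whichever interface it is given.
final class PipelineEngine {
    String run(LanguagePipeline pipeline, String input) {
        String resultA = pipeline.stepA(input);
        String resultB = pipeline.stepB(resultA);          // uses result of A
        return pipeline.stepC(resultA, resultB);           // uses result of A and B
    }
}
```

The engine never mentions a concrete language, so adding a third language would mean adding one implementation and one factory case.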

On the extends vs implements topic, Allen Holub goes a bit over the top to explain the preference for Interfaces.


Follow-up to your comments:

My interpretation of the application of the Builder pattern here would be that you have a Factory that would return a PipelineBuilder. The PipelineBuilder in my design is one that encompasses A, B, & C, but you could have separate builders for each if you like. This builder is then given to your PipelineEngine, which uses the Builder to generate your results.

As this makes use of a Factory to provide the Builders, your idea above for a Factory remains intact, complete with its caching mechanism.

With regard to your choice of abstract extension, you do have the choice of giving your PipelineEngine ownership of the heavy objects. However, if you do go the abstract way, note that the shared fields that you have declared are private and therefore would not be available to your subclasses.
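
A small sketch of the visibility point, assuming a subclass actually needs the shared object (in the question's template method only the base class touches it, which works fine with `private`). The string-based signatures and the `common()` accessor are hypothetical simplifications.

```java
// Hypothetical shared dependency, standing in for the question's CommonClass.
final class CommonClass {
    String get() { return "common"; }
}

abstract class A {
    // Private: invisible to subclasses, so EnglishA cannot read this field directly.
    private final CommonClass x = new CommonClass();

    // A protected accessor exposes the shared object to subclasses
    // without letting them replace it.
    protected CommonClass common() { return x; }

    abstract String getTemp(String input); // language specific

    // Template method: the base class can still use its own private field.
    public String doSomething(String input) {
        return getTemp(input) + "+" + x.get();
    }
}

final class EnglishA extends A {
    @Override
    String getTemp(String input) {
        // Writing x.get() here would not compile; the accessor is required.
        return "en(" + input + "," + common().get() + ")";
    }
}
```

Making the field `protected` would also work, at the cost of letting subclasses depend on the field itself rather than on a method.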

If I'm not mistaken, what you are calling a factory is actually a very nice form of dependency injection. You are selecting an object instance that is best able to meet the needs of your parameters and returning it.

If I'm right about that, you might want to look into DI platforms. They do what you did (which is pretty simple, right?) and then add a few more abilities that you may not need now but may find helpful later.

I'm just suggesting you look at what problems are solved now. DI is so easy to do yourself that you hardly need any other tools, but they might have found situations you haven't considered yet. Google finds many great-looking links right off the bat.

From what I've seen of DI, it's likely that you'll want to move the entire creation of your "Pipe" into the factory, having it do the linking for you and just handing you what you need to solve a specific problem, but now I'm really reaching: my knowledge of DI is just a little better than my knowledge of your code (in other words, I'm pulling most of this out of my butt).
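
As a rough, hand-rolled illustration of that idea (no DI framework involved), the pipeline could receive its three steps through its constructor, with a factory doing the wiring per language. Everything here is an assumption for the sake of the sketch: the steps are modeled as plain functions on strings, and `Pipeline` and `PipelineFactory` are hypothetical shapes rather than the question's actual classes.

```java
import java.util.Locale;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// The pipeline receives its collaborators through the constructor (constructor injection)
// and knows nothing about languages.
final class Pipeline {
    private final Function<String, String> stepA;
    private final Function<String, String> stepB;
    private final BinaryOperator<String> stepC;

    Pipeline(Function<String, String> stepA,
             Function<String, String> stepB,
             BinaryOperator<String> stepC) {
        this.stepA = stepA;
        this.stepB = stepB;
        this.stepC = stepC;
    }

    String run(String input) {
        String resultA = stepA.apply(input);
        String resultB = stepB.apply(resultA);     // uses result of A
        return stepC.apply(resultA, resultB);      // uses result of A and B
    }
}

// Hand-rolled "injector": hands back a fully wired pipeline for a language.
final class PipelineFactory {
    static Pipeline forLanguage(String language) {
        switch (language) {
            case "en":
                return new Pipeline(
                        s -> s.toLowerCase(Locale.ROOT),
                        s -> s + "!",
                        (a, b) -> a + "/" + b);
            case "zh":
                return new Pipeline(
                        s -> s.trim(),
                        s -> "[" + s + "]",
                        (a, b) -> a + "/" + b);
            default:
                throw new IllegalArgumentException("Unsupported language: " + language);
        }
    }
}
```

The caller then only asks for `PipelineFactory.forLanguage(lang).run(text)` and never touches the individual components.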
