
What design pattern is appropriate for this situation?

I have 2D hydraulic data: multi-gigabyte text files containing depth and velocity information for each point in a grid, broken up into time steps. Each timestep contains a depth/velocity value for every point in the grid, so you could follow one point through each timestep and see how its depth/velocity changes. I want to read in this data one timestep at a time, calculating various things - the maximum depth a grid cell achieves, the maximum velocity, the number of the first timestep where water is more than 2 feet deep, etc. The result of each of these calculations will be a grid - max depth at each point, etc.

So far, this sounds like the Decorator pattern. However, I'm not sure how to get the results out of the various calculations - each calculation produces a different grid. I would have to keep a reference to each decorator after I create it in order to extract the results from it, or else add a getResults() method that returns a map of different results, etc., neither of which sounds ideal.

Another option is the Strategy pattern. Each calculation is a different algorithm that operates on a time step (current depth/velocity) and the results of previous rounds (max depth so far, max velocity so far, etc.). However, these previous results are different for each computation - which means either the algorithm classes become stateful, or it becomes the caller's job to keep track of previous results and feed them in. I also dislike the Strategy pattern because looping over the timesteps becomes the caller's responsibility - I'd like to just give the "calculator" an iterator over the timesteps (fetching them from the disk as needed) and have it produce the results it needs.

Additional constraints:

  • Input is large and being read from disk, so iterating exactly once, by time step, is the only practical method
  • Grids are large, so calculations should be done in place as much as possible

If I understand your problem correctly, you have a set of grid points, each with many timesteps, and each timestep has a depth and a velocity. You now have gigabytes of this data.

I would suggest doing one pass over the data and storing the parsed data in an RDBMS, then running queries or stored procedures on it. This way, at least the application will not run out of memory.
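To illustrate the idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `samples`, its columns, and the tiny hard-coded timesteps are assumptions made up for the example; the real input format is not shown in the question, and a real run would use a database file on disk rather than an in-memory one.

```python
import sqlite3

def load_timesteps(conn, timesteps):
    """Single pass over the timesteps, inserting one row per (step, cell)."""
    # Hypothetical schema: one row per grid cell per timestep.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples "
        "(step INTEGER, cell INTEGER, depth REAL, velocity REAL)"
    )
    for step, (depths, velocities) in enumerate(timesteps):
        rows = [
            (step, cell, d, v)
            for cell, (d, v) in enumerate(zip(depths, velocities))
        ]
        conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", rows)
    conn.commit()

# Made-up data: 3 timesteps, 3 grid cells, (depths, velocities) per timestep.
demo_timesteps = [
    ([0.5, 1.0, 0.0], [0.1, 0.2, 0.0]),
    ([2.5, 1.5, 0.0], [0.4, 0.1, 0.0]),
    ([1.0, 3.0, 0.0], [0.3, 0.5, 0.0]),
]

conn = sqlite3.connect(":memory:")  # in-memory only for the demo
load_timesteps(conn, demo_timesteps)

# Maximum depth reached at each cell.
max_depth = conn.execute(
    "SELECT cell, MAX(depth) FROM samples GROUP BY cell ORDER BY cell"
).fetchall()

# First timestep at which each cell is more than 2 feet deep
# (cells that never exceed the threshold are simply absent).
first_deep = conn.execute(
    "SELECT cell, MIN(step) FROM samples WHERE depth > 2.0 "
    "GROUP BY cell ORDER BY cell"
).fetchall()
```

Note that one row per cell per timestep can be a very large table for multi-gigabyte input, so whether this is practical depends on the grid size and the number of timesteps.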

First, maybe I have not understood the issue well and my answer misses the point, in which case I apologize for taking your time.

At first sight I would think of an approach that's more akin to the "strategy pattern", in combination with a data-oriented base, something like the following pseudo-code:

foreach timeStep
  readGridData
  foreach activeCalculator in activeCalculators
    useCalculatorPointerListToAccessSpecificStoredDataNeededForNewCalculation
    performCalculationOnFreshGridData
    updateUpdatableData
    presentUpdatedResultsToUser
    storeGridResultsInDataPool(OfResultBaseClassType)
    discardNoLongerNeededStoredGridResults
  next calculator
next timeStep
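A rough Python sketch of that loop might look like the following. All class and function names here (`Calculator`, `MaxDepth`, `MaxVelocity`, `FirstStepDeeperThan`, `run`) are hypothetical, the timesteps are made-up in-memory data standing in for the file reader, and plain lists are used only to keep the example self-contained; for real grids an array library would likely be a better fit. Each calculator is stateful and updates its own result grid in place, and the driver consumes the timestep iterator exactly once.

```python
from typing import Dict, Iterable, List, Tuple

# One timestep: (depths, velocities), each with one value per grid cell.
Timestep = Tuple[List[float], List[float]]

class Calculator:
    """Base class: owns one result grid and updates it in place."""
    def __init__(self, n_cells: int, initial) -> None:
        self.grid: list = [initial] * n_cells

    def update(self, step: int, depths: List[float], velocities: List[float]) -> None:
        raise NotImplementedError

class MaxDepth(Calculator):
    def __init__(self, n_cells: int) -> None:
        super().__init__(n_cells, float("-inf"))

    def update(self, step, depths, velocities):
        for i, d in enumerate(depths):
            if d > self.grid[i]:
                self.grid[i] = d  # overwrite in place, no new grid

class MaxVelocity(Calculator):
    def __init__(self, n_cells: int) -> None:
        super().__init__(n_cells, float("-inf"))

    def update(self, step, depths, velocities):
        for i, v in enumerate(velocities):
            if v > self.grid[i]:
                self.grid[i] = v

class FirstStepDeeperThan(Calculator):
    """Index of the first timestep where depth exceeds a threshold; -1 if never."""
    def __init__(self, n_cells: int, threshold: float) -> None:
        super().__init__(n_cells, -1)
        self.threshold = threshold

    def update(self, step, depths, velocities):
        for i, d in enumerate(depths):
            if self.grid[i] < 0 and d > self.threshold:
                self.grid[i] = step

def run(timesteps: Iterable[Timestep], calculators: Dict[str, Calculator]) -> Dict[str, list]:
    """Single pass: every calculator sees each timestep once, in order."""
    for step, (depths, velocities) in enumerate(timesteps):
        for calc in calculators.values():
            calc.update(step, depths, velocities)
    return {name: calc.grid for name, calc in calculators.items()}

def demo_timesteps():
    # Stand-in for a generator that reads one timestep at a time from disk.
    yield [0.5, 1.0, 0.0], [0.1, 0.2, 0.0]
    yield [2.5, 1.5, 0.0], [0.4, 0.1, 0.0]
    yield [1.0, 3.0, 0.0], [0.3, 0.5, 0.0]

results = run(demo_timesteps(), {
    "max_depth": MaxDepth(3),
    "max_velocity": MaxVelocity(3),
    "first_deep": FirstStepDeeperThan(3, threshold=2.0),
})
```

Keying the calculators by name is one way to get each result grid back out without holding separate references to every calculator, though it is only one possible arrangement.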

Again, sorry if this is off the point.
