Design suggestion for expression tree evaluation with time-series data

Original question, 2010-05-05 03:01:47. Tags: c#, design-patterns, expression-trees

I have a (C#) genetic program that uses financial time-series data. It currently works, but I want to redesign the architecture to make it more robust. My main goals are:

present the time-series data to the expression trees sequentially.
allow expression trees to access previous data rows when needed.
optimize the performance of data access while evaluating the expression trees.
keep a common interface so various types of data can be used.

Here are the possible approaches I've thought about:

I can evaluate the expression tree by passing a data row into the root node and letting each child node use the same data row.
I can evaluate the expression tree by passing in the data row index and letting each node get the data row from a shared DataSet (currently I'm passing the row index and going to multiple synchronized arrays to get the data).
Hybrid: an immutable data set is accessible by all of the expression trees, and each expression tree is evaluated by passing in a data row.

The benefit of the first approach is that the data row is passed into the expression tree and no further query is made against the data set (which should improve performance in a multithreaded environment). The drawback is that the expression tree does not have access to the rest of the data (in case some of the functions need to do calculations using previous data rows).

The benefit of the second approach is that the expression trees can access any data up to the latest data row, but unless I specify what that row is, I'll have to iterate through the rows and figure out which one is the last one.
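To illustrate the second approach, here is a minimal sketch in which the index passed to the tree is itself the "latest row", so no node has to scan for it. All names (`ColumnStore`, `INode`, `FieldNode`, `SubtractNode`, `IndexDemo`) and the sample values are hypothetical, and the column store stands in for the synchronized arrays mentioned above rather than for a real `DataSet`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shared, read-only column store: one array per named field.
public sealed class ColumnStore
{
    private readonly Dictionary<string, double[]> _columns;

    public ColumnStore(Dictionary<string, double[]> columns)
    {
        _columns = columns;
    }

    // Value of `column` at row (index - lag). Returns NaN when that row does
    // not exist or when a negative lag would look past the current row.
    public double Get(string column, int index, int lag = 0)
    {
        int i = index - lag;
        double[] values = _columns[column];
        return (lag >= 0 && i >= 0 && i < values.Length) ? values[i] : double.NaN;
    }
}

// Every node receives the shared data and the index of the current row.
public interface INode
{
    double Evaluate(ColumnStore data, int index);
}

// Terminal node: reads one field, optionally `lag` rows back.
public sealed class FieldNode : INode
{
    private readonly string _column;
    private readonly int _lag;

    public FieldNode(string column, int lag = 0) { _column = column; _lag = lag; }

    public double Evaluate(ColumnStore data, int index) => data.Get(_column, index, _lag);
}

// Function node: left - right, both evaluated at the same row index.
public sealed class SubtractNode : INode
{
    private readonly INode _left, _right;

    public SubtractNode(INode left, INode right) { _left = left; _right = right; }

    public double Evaluate(ColumnStore data, int index) =>
        _left.Evaluate(data, index) - _right.Evaluate(data, index);
}

public static class IndexDemo
{
    // Evaluates close[t] - close[t-1] at row `index` on made-up sample data.
    public static double Momentum(int index)
    {
        var data = new ColumnStore(new Dictionary<string, double[]>
        {
            ["close"] = new[] { 10.0, 11.0, 13.0 }
        });
        INode tree = new SubtractNode(new FieldNode("close"), new FieldNode("close", 1));
        return tree.Evaluate(data, index);
    }

    public static void Main()
    {
        Console.WriteLine(Momentum(2));
    }
}
```

Because a node can only ask for `index - lag` with a non-negative lag, the tree never sees rows later than the one being evaluated.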

The benefit of the hybrid is that it should generally perform better and still provide access to the earlier data. It supports two basic "views" of data: the latest row and the previous rows.
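One possible shape for the hybrid, as a sketch only: a shared data set that is never modified after construction, plus a small row cursor that is passed into the tree and exposes both views (the latest row, and earlier rows by lag). The names `TimeSeries`, `RowView`, `IExpr`, `ChangeExpr`, and `HybridDemo` are assumptions made for illustration, as are the sample values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data set shared by all trees. It is immutable by convention:
// nothing writes to the arrays after construction, so concurrent reads are safe.
public sealed class TimeSeries
{
    private readonly IReadOnlyDictionary<string, double[]> _columns;
    public int Count { get; }

    public TimeSeries(IReadOnlyDictionary<string, double[]> columns, int count)
    {
        _columns = columns;
        Count = count;
    }

    internal double ValueAt(string column, int index) => _columns[column][index];

    // Presents the rows sequentially, oldest first.
    public IEnumerable<RowView> Rows()
    {
        for (int i = 0; i < Count; i++) yield return new RowView(this, i);
    }
}

// Lightweight cursor: the "latest row" view plus bounded access to earlier rows.
public readonly struct RowView
{
    private readonly TimeSeries _series;
    public int Index { get; }

    public RowView(TimeSeries series, int index) { _series = series; Index = index; }

    // Latest-row view: a field of the row being evaluated.
    public double this[string column] => _series.ValueAt(column, Index);

    // Previous-rows view: succeeds only when a row `lag` steps back exists.
    public bool TryPrevious(int lag, out RowView previous)
    {
        bool ok = lag >= 0 && Index - lag >= 0;
        previous = ok ? new RowView(_series, Index - lag) : default;
        return ok;
    }
}

public interface IExpr
{
    double Evaluate(RowView row);
}

// Example node: column[t] - column[t - lag], or NaN when there is not enough history.
public sealed class ChangeExpr : IExpr
{
    private readonly string _column;
    private readonly int _lag;

    public ChangeExpr(string column, int lag) { _column = column; _lag = lag; }

    public double Evaluate(RowView row) =>
        row.TryPrevious(_lag, out RowView prev) ? row[_column] - prev[_column] : double.NaN;
}

public static class HybridDemo
{
    // Evaluates one tree over every row of a small made-up series.
    public static List<double> Run()
    {
        var series = new TimeSeries(
            new Dictionary<string, double[]> { ["close"] = new[] { 10.0, 11.0, 13.0 } }, 3);
        IExpr tree = new ChangeExpr("close", 1);
        var results = new List<double>();
        foreach (RowView row in series.Rows()) results.Add(tree.Evaluate(row));
        return results;
    }

    public static void Main() => Console.WriteLine(string.Join(", ", Run()));
}
```

Passing a small struct rather than a bare index keeps the node signature independent of how the data is stored, which is one way to approach the common-interface goal; whether it is actually faster than a `DataSet` lookup would need to be measured.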

Do you know of any design patterns, or do you have any tips that can help me build this type of system? Should I use a DataSet to hold and present the data, or are there more efficient ways to present rows of data while maintaining a simple interface?

FYI: All of my code is written in C#.

1 Answer

Most of what you describe is about operations, which should not be the starting point for an OO design. I suggest you create a RowObject class that maps to each row of the data table, and another class, RowObjectManager, that contains a collection of RowObject instances and related operations such as calling the algorithm. This is much like the Facade pattern, and you can encapsulate the algorithm in another class and call it through dependency injection, which decouples it from the RowObjectManager class.

Then you should pass the OBJECT, rather than properties of the object such as an index, to the algorithm, and the algorithm can return the result to the caller.
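A rough sketch of what this answer describes might look like the following. Only the names `RowObject` and `RowObjectManager` come from the answer; the fields on `RowObject`, the `IRowAlgorithm` interface, the `AverageCloseAlgorithm` example, and the `Add` method are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One row of the data table. The fields shown here are hypothetical.
public sealed class RowObject
{
    public DateTime Time { get; }
    public double Close { get; }

    public RowObject(DateTime time, double close) { Time = time; Close = close; }
}

// The algorithm sits behind an interface so it can be swapped
// without touching the manager.
public interface IRowAlgorithm
{
    double Evaluate(RowObject current, IReadOnlyList<RowObject> history);
}

// Example algorithm: mean close over the current row and all earlier rows.
public sealed class AverageCloseAlgorithm : IRowAlgorithm
{
    public double Evaluate(RowObject current, IReadOnlyList<RowObject> history) =>
        (history.Sum(r => r.Close) + current.Close) / (history.Count + 1);
}

// Facade-like manager: owns the rows and delegates to the injected algorithm.
public sealed class RowObjectManager
{
    private readonly List<RowObject> _rows = new List<RowObject>();
    private readonly IRowAlgorithm _algorithm;

    // Constructor injection keeps the manager decoupled from any one algorithm.
    public RowObjectManager(IRowAlgorithm algorithm) { _algorithm = algorithm; }

    // Passes the row object itself (not an index) to the algorithm,
    // stores the row, and returns the result to the caller.
    public double Add(RowObject row)
    {
        double result = _algorithm.Evaluate(row, _rows);
        _rows.Add(row);
        return result;
    }
}

public static class ManagerDemo
{
    // Feeds two made-up rows through the manager and collects the results.
    public static List<double> Run()
    {
        var manager = new RowObjectManager(new AverageCloseAlgorithm());
        var results = new List<double>();
        results.Add(manager.Add(new RowObject(new DateTime(2010, 5, 3), 10.0)));
        results.Add(manager.Add(new RowObject(new DateTime(2010, 5, 4), 20.0)));
        return results;
    }

    public static void Main() => Console.WriteLine(string.Join(", ", Run()));
}
```

In a genetic program, the expression tree evaluator would presumably take the place of `AverageCloseAlgorithm`, with the manager deciding which rows it is allowed to see.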