OLTP Application Reading Data Warehouse Data Design

We are just beginning to put together a data warehouse for our reporting requirements, bringing disparate data sources together.

Reviewing the potential uses of the data once it is all together, we have found some scenarios where some of our transactional processing systems could usefully reference this data. Obviously the data would be out of date and optimised for reads; however, in some scenarios this is fine for the application's purposes, and it would reduce the load on core servers.

My question is this: is it considered bad design for a transactional system to access the data stored in a data warehouse? The primary purpose of our warehouse is reporting, which makes me question whether we should allow other, non-reporting systems to read the data. My instincts guide me away from allowing applications to read and display the data. Are there any good reasons to listen to them?

There's nothing fundamentally wrong with making the warehouse a hub for application data consumers as well as analytical data consumers. Here are some points to think about, though.

You'll need a technical solution that supports the required level of availability, transaction isolation and consistency for both workloads. For example, can you ensure that the application won't starve analytical queries of resources, and vice versa? Can you make data available to the applications in a consistent and timely manner even during warehouse loads? It's unwise to assume that you'll always be able to load the warehouse out of hours, even if you think you can do that today.

Make sure your warehouse is well normalized (meaning at least Boyce-Codd / 5th Normal Form, or something close to it). That's good advice for any warehouse, but perhaps especially so if you need to support non-analytical queries.

Do your apps need to update the warehouse? If so, you need to consider how that integrates with the rest of the ETL process.

Consider whether to give the app a data mart of its own. That may well be the safest option to start with.
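If the application does get its own mart, it may help to make the tolerated staleness explicit rather than implicit. The sketch below is purely illustrative: `choose_source`, the `"mart"`/`"oltp"` labels and the idea of falling back to the transactional system when the mart is too old are all assumptions, not something the question prescribes.

```python
from datetime import datetime, timedelta, timezone


def choose_source(last_load_at, max_staleness, now=None):
    """Return "mart" if the mart's last load is recent enough, else "oltp".

    last_load_at  -- timezone-aware time of the mart's last successful load
    max_staleness -- timedelta the application is willing to tolerate
    now           -- injectable clock value, mainly for testing
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - last_load_at
    return "mart" if age <= max_staleness else "oltp"
```

For example, a screen that tolerates day-old data could call `choose_source(last_load_at, timedelta(hours=24))` and route its query accordingly.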

There is nothing wrong with having your OLTP systems access DW data. In fact, as systems evolve, you will see the line between transactional and informational systems blur.

I also wouldn't worry too much about data structures, so long as you come up with something that works. 3NF might be the answer, but accessing highly summarized data from a multidimensional database might also be a good solution, depending on the problem you are trying to solve.

One last thing to consider is the type of data you are trying to get out of the data warehouse. Is it summarized transactions (e.g. average sale amount) or more like shared dimensional data (e.g. customer name and address)? If the latter, you might want to consider combining a master data management strategy with your data warehouse strategy.

One more thing: try to figure out why you are hesitant to share data between these databases. Is it something you can put your finger on, or is it really just because you've been trained by our industry to think that they need to be separate? Remember, in the end, our jobs are not really to build data warehouses and business intelligence systems; they are to solve business problems in reliable, pragmatic, cost-effective ways.
