What's the best way to save large monthly data backups in SQL?
I work on a program that stores information about network connections across my university, and I have been asked to create a report that shows the status changes of these connections over time. I was thinking about adding another table that holds the current connection information along with the date the data was added, so that when the report is run it just grabs the data for that date. However, I'm worried that the report might get slow after a couple of months, as this would add about 50,000 rows every month. Is there a better way to do this? We use Microsoft SQL Server.
It depends on the reason you are holding historical data for facts.
If the reason is:
Keep it in the same table, with FromDate and ToDate columns, which will remove the need to join the active and historical data tables later on.
I'll highlight the Slowly Changing Dimension (SCD) type 2 approach, which tracks data history by maintaining multiple versions of each record and uses either the EndDate or a flag to identify the active record. This method allows tracking any number of historical records: each time a new record is inserted, the older ones are populated with an EndDate.
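As a rough illustration of what such a table might look like, here is a minimal sketch. Only ID, FromDate, EndDate and IsActive come from the answer (EndDate is assumed to play the role of the ToDate column mentioned above); the ConnectionStatus column and all data types are assumptions made for the example.

```sql
-- Hypothetical SCD type 2 layout: one row per version of a connection.
-- ConnectionStatus and the data types are illustrative assumptions.
CREATE TABLE ActiveTable (
    ID               INT         NOT NULL,  -- connection identifier
    ConnectionStatus VARCHAR(50) NOT NULL,  -- assumed payload column
    FromDate         DATETIME    NOT NULL,  -- when this version became valid
    EndDate          DATETIME    NULL,      -- NULL while the version is active
    IsActive         BIT         NOT NULL   -- 1 = current version, 0 = history
);
```

Because the same ID appears once per version, ID alone cannot serve as the primary key in this layout.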
Step 1: For re-loaded facts, UPDATE IsActive = 0 on the record whose history is to be preserved, and populate EndDate with the current date.
-- Close out the currently active version of every re-loaded row.
MERGE ActiveTable AS T
USING DataToBeLoaded AS D
    ON T.ID = D.ID
   AND T.IsActive = 1          -- current active entry
WHEN MATCHED THEN
    UPDATE SET T.IsActive = 0,
               T.EndDate  = GETDATE();
Step 2: Insert the latest data into the ActiveTable with IsActive = 1 and FromDate as the current date.
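Step 2 could be sketched roughly as follows. The table names ActiveTable and DataToBeLoaded are taken from the Step 1 statement; the ConnectionStatus column is a hypothetical payload column, so substitute whatever columns your staging data actually carries.

```sql
-- Insert each freshly loaded row as the new active version.
-- ConnectionStatus is an assumed column, used only for illustration.
INSERT INTO ActiveTable (ID, ConnectionStatus, FromDate, EndDate, IsActive)
SELECT D.ID,
       D.ConnectionStatus,
       GETDATE(),   -- FromDate: this version is valid from now on
       NULL,        -- EndDate: open-ended until a later load closes it
       1            -- IsActive: marks the current version
FROM DataToBeLoaded AS D;
```

This has to run after Step 1. If the order were reversed, the MERGE would match the rows just inserted and deactivate them as well.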
Disclaimer: The SCD 2 approach could make your data warehouse huge. However, I don't believe it would affect performance much for your scenario.
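For the report itself, a point-in-time lookup against this layout might look like the sketch below. The @AsOf variable, the ConnectionStatus column and the index name are assumptions for illustration. Whether the index actually helps depends on your data and query plans, so treat it as something to test rather than a recommendation.

```sql
-- Hypothetical report date; '20240331' is the language-neutral yyyymmdd form.
DECLARE @AsOf DATETIME = '20240331';

-- Return the version of each connection that was valid at @AsOf.
SELECT ID, ConnectionStatus, FromDate, EndDate
FROM ActiveTable
WHERE FromDate <= @AsOf
  AND (EndDate IS NULL OR EndDate > @AsOf);

-- Optional: an index on the date range columns may keep this lookup cheap
-- as history accumulates (assumed name, untested against real data).
CREATE INDEX IX_ActiveTable_FromDate_EndDate
    ON ActiveTable (FromDate, EndDate)
    INCLUDE (ID, ConnectionStatus);
```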