What's the best way to save large monthly data backups in SQL?

I work on a program that stores information about network connections across my university, and I have been asked to create a report that shows how the status of these connections changes over time. I was thinking of adding another table that holds the current connection information together with the date the data was added, so that when the report is run it simply reads the data for that date. I'm worried that the report might get slow after a couple of months, since this would add about 50,000 rows every month. Is there a better way to do this? We use Microsoft SQL Server.

It depends on why you are keeping historical data for these facts.

If the reason is:

  • Reporting: you could keep the history in the same table by adding two date columns, FromDate and ToDate, which removes the need to join the active and historical data tables later on.
  • Reference only: it makes sense to keep the history in a separate table, since storing it alongside the active rows may degrade the performance of the indexes on your active table.

I'll highlight the Slowly Changing Dimension (SCD) type 2 approach, which tracks data history by keeping multiple versions of each record and uses either an EndDate or a flag to identify the active one. It can track any number of historical versions: each time a new record is inserted, the previous version is stamped with an EndDate.
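As a rough sketch of what such a table might look like, assuming a single status column: the names RowID and ConnectionStatus are hypothetical and only there for illustration, and EndDate plays the role of the ToDate column mentioned earlier.

```sql
-- Hypothetical layout for an SCD type 2 table.
-- RowID and ConnectionStatus are illustrative names, not from the original schema.
CREATE TABLE ActiveTable (
    RowID            INT IDENTITY(1,1) PRIMARY KEY, -- surrogate key: one row per version
    ID               INT         NOT NULL,          -- connection identifier, repeated across versions
    ConnectionStatus VARCHAR(50) NOT NULL,          -- the tracked attribute
    FromDate         DATETIME    NOT NULL,          -- when this version became current
    EndDate          DATETIME    NULL,              -- NULL while this version is still current
    IsActive         BIT         NOT NULL DEFAULT 1 -- 1 = current version, 0 = history
);
```

The surrogate key is needed because ID alone is no longer unique once several versions of the same connection are stored.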

Step 1: For re-loaded facts, set IsActive = 0 on the current record so that it is preserved as history, and populate its EndDate with the current date.

MERGE ActiveTable AS T
USING DataToBeLoaded AS D
    ON T.ID = D.ID
   AND T.IsActive = 1          -- match only the current active entry
WHEN MATCHED THEN
    -- close the active version so it is kept as history
    UPDATE SET T.IsActive = 0,
               T.EndDate  = GETDATE();

Step 2: Insert the latest data into ActiveTable with IsActive = 1 and FromDate set to the current date.
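Step 2 could be sketched roughly as follows. This assumes DataToBeLoaded carries an ID plus a status column; ConnectionStatus is a hypothetical name, so substitute whatever columns your tables actually have.

```sql
-- Sketch of Step 2: add the freshly loaded rows as the new current versions.
-- ConnectionStatus is an assumed column name, not taken from the original schema.
INSERT INTO ActiveTable (ID, ConnectionStatus, FromDate, EndDate, IsActive)
SELECT D.ID,
       D.ConnectionStatus,
       GETDATE(),   -- FromDate: this version is current from now on
       NULL,        -- EndDate stays empty until a later load closes this version
       1            -- IsActive
FROM DataToBeLoaded AS D;
```

Running Step 1 and Step 2 inside one transaction would keep a connection from being left without an active row if the load fails halfway through.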

Disclaimer: The SCD type 2 approach above could make your data warehouse huge. However, I don't believe it would affect performance much in your scenario.
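For the report itself, a point-in-time query against this kind of table might look like the sketch below. The ConnectionStatus column, the index name, and the sample date are all assumptions, and whether the index actually helps depends on your data, so check it against the real query plan.

```sql
-- Hypothetical supporting index for date-range lookups.
CREATE INDEX IX_ActiveTable_FromDate_EndDate
    ON ActiveTable (FromDate, EndDate)
    INCLUDE (ID, ConnectionStatus);

-- Example report date (illustrative value, unambiguous YYYYMMDD literal).
DECLARE @AsOf DATETIME = '20240131';

-- Return the version of each connection that was current at @AsOf.
SELECT T.ID, T.ConnectionStatus
FROM ActiveTable AS T
WHERE T.FromDate <= @AsOf
  AND (T.EndDate IS NULL OR T.EndDate > @AsOf);
```

Because each version carries its own validity range, the report needs no join between an active table and a history table.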
