"Streaming" read of over 10 million rows from a table in SQL Server
What is the best strategy for reading millions of records from a table (in SQL Server 2012, BI instance) in a streaming fashion, the way SQL Server Management Studio does?
I need to cache these records locally (in a C# console application) for further processing.
Update - Sample code that works with SqlDataReader
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

namespace ReadMillionsOfRows
{
    class Program
    {
        // Blocks Main until the async read has finished.
        static ManualResetEvent done = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Process();
            done.WaitOne();
        }

        public static async Task Process()
        {
            string connString = @"Server=;Database=;User Id=;Password=;Asynchronous Processing=True";
            string sql = "Select * from tab_abc";

            using (SqlConnection conn = new SqlConnection(connString))
            {
                await conn.OpenAsync();
                using (SqlCommand comm = new SqlCommand(sql, conn))
                {
                    comm.CommandType = CommandType.Text;
                    // The reader streams rows forward-only; only the current row is held in memory.
                    using (SqlDataReader reader = await comm.ExecuteReaderAsync())
                    {
                        while (await reader.ReadAsync())
                        {
                            //process it here
                        }
                    }
                }
            }
            done.Set();
        }
    }
}
Use a SqlDataReader; it is forward-only and fast. It only holds a reference to a record while that record is in the scope of being read.
That depends on what your cache looks like. If you're going to store everything in memory and a DataSet is appropriate as a cache, just read everything into the DataSet. If not, use the SqlDataReader as suggested above, reading the records one by one and storing them in your big cache.
Do note, however, that there's already a very popular caching mechanism for large database tables: your database. With the proper index configuration, the database can probably outperform your cache.
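As a minimal sketch of the DataSet-as-cache idea: the table and rows below are hypothetical stand-ins for data that would normally be loaded from SQL Server (for example with SqlDataAdapter.Fill); setting a primary key gives the DataTable an internal index for fast keyed lookups.

```csharp
using System;
using System.Data;

class DataTableCacheDemo
{
    // Builds a small DataTable standing in for rows read from SQL Server.
    public static DataTable BuildCache()
    {
        var table = new DataTable("tab_abc");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        // A primary key lets DataTable maintain an index for Rows.Find.
        table.PrimaryKey = new[] { table.Columns["Id"] };

        table.Rows.Add(1, "alpha");
        table.Rows.Add(2, "beta");
        return table;
    }

    static void Main()
    {
        DataTable cache = BuildCache();
        DataRow row = cache.Rows.Find(2); // indexed lookup by primary key
        Console.WriteLine(row["Name"]);   // beta
    }
}
```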
You can use Entity Framework and paginate the select using Take and Skip to fetch the rows in buffers. If you need in-memory caching for such a large dataset, I would suggest using GC.GetTotalMemory to test whether there is any free memory left.
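The Skip/Take paging loop can be sketched as follows. The in-memory Source sequence and the 2 GB threshold are illustrative assumptions; with Entity Framework the page query would instead run against an ordered DbSet (e.g. context.Rows.OrderBy(r => r.Id).Skip(skip).Take(PageSize).ToList()).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class PagedFetchDemo
{
    const int PageSize = 3;

    // Hypothetical stand-in for an ordered EF query over the table.
    static readonly IEnumerable<int> Source = Enumerable.Range(1, 10);

    public static List<List<int>> FetchAllPages()
    {
        var pages = new List<List<int>>();
        int skip = 0;
        while (true)
        {
            // Fetch the next buffer of rows; empty page means we are done.
            var page = Source.Skip(skip).Take(PageSize).ToList();
            if (page.Count == 0)
                break;
            pages.Add(page);
            skip += PageSize;

            // Safety valve: stop caching if the managed heap grows too large
            // (2 GB is an arbitrary example threshold).
            if (GC.GetTotalMemory(false) > 2L * 1024 * 1024 * 1024)
                break;
        }
        return pages;
    }

    static void Main()
    {
        var pages = FetchAllPages();
        Console.WriteLine(pages.Count);    // 4 buffers: 3 + 3 + 3 + 1 rows
        Console.WriteLine(pages[3][0]);    // 10
    }
}
```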