What is the fastest way to select a whole table in SQL Server?

Question

I am writing an app that reads a whole table, does some processing, then writes the resulting data to another table. I am using the SqlBulkCopy class (the .NET version of "bcp in"), which does the insert very fast. But I cannot find any efficient way to select the data in the first place. There is no .NET equivalent of "bcp out", which seems strange to me.

Currently I'm using select * from table_name. For perspective, it takes 2.5 seconds to select 6,000 rows, and only 600 ms to bulk insert the same number of rows.

I would expect that selecting data should always be faster than inserting. What is the fastest way to select all rows and columns from a table?

Answers to questions:

I timed my select at 2.5 seconds in two ways. The first was while running my application with a SQL trace; the second was running the same query in SSMS. Both returned about the same result.
I am reading data using SqlDataReader.
No other applications are using this database.
My current processing takes under 1 second, so a 2+ second read time is relatively large. But mostly I'm concerned with (interested in) performance when scaling this up to 100,000 rows and millions of rows.
SQL Server 2008 R2 and my application are both running on my dev machine.
Some of the data processing is set-based, so I need to have the whole table in memory. (To support much larger data sets, I know this step will probably need to be moved into SQL so that I only need to operate per row in memory.)

Here is my code:

DataTable staging = new DataTable();
using (SqlConnection dwConn = (SqlConnection)SqlConnectionManager.Instance.GetDefaultConnection())
{
    dwConn.Open();
    SqlCommand cmd = dwConn.CreateCommand();
    cmd.CommandText = "select * from staging_table";

    SqlDataReader reader = cmd.ExecuteReader();
    staging.Load(reader);
}

Answer 1

select * from table_name is the simplest, easiest and fastest way to read a whole table.

Let me explain why your results lead to the wrong conclusions.

Copying a whole table is an optimized operation that merely requires cloning the old binary data into the new table (at best it can amount to a file copy operation, depending on the storage mechanism).
Writing is buffered. The DBMS says the record was written, but the write is actually not yet done unless you work with transactions. Disk operations are generally delayed.
Querying a table also requires (unlike cloning) adapting data from the binary-stored layout/format to a driver-dependent format that is ultimately readable by your client. This takes time.

Answer 2

It all depends on your hardware, but it is likely that your network is the bottleneck here.

Apart from limiting your query to read only the columns you'd actually be using, doing a select is as fast as it will get. There is caching involved here: when you execute it twice in a row, the second time should be much faster because the data is cached in memory. Execute dbcc dropcleanbuffers to check the effect of caching.
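To see the caching effect from the application side, a rough sketch along these lines could time the same select with a cold and then a warm buffer cache. The class name, method names and connection string are assumptions made for illustration; staging_table comes from the question. DBCC DROPCLEANBUFFERS needs sysadmin rights and empties the buffer cache for the whole instance, so this only makes sense on a dev machine like the one described. The CHECKPOINT beforehand writes dirty pages to disk so that they can be dropped too.

```csharp
using System;
using System.Data.SqlClient;
using System.Diagnostics;

static class CacheTiming
{
    // Times one full read of the table; the rows are drained and discarded.
    static long TimeSelectMs(SqlConnection conn)
    {
        var sw = Stopwatch.StartNew();
        using (var cmd = new SqlCommand("select * from staging_table", conn))
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { }
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // connectionString is a placeholder supplied by the caller.
    public static void Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Dev machines only: flush dirty pages, then empty the buffer cache.
            using (var flush = new SqlCommand("CHECKPOINT; DBCC DROPCLEANBUFFERS;", conn))
            {
                flush.ExecuteNonQuery();
            }

            Console.WriteLine("cold cache: {0} ms", TimeSelectMs(conn));
            Console.WriteLine("warm cache: {0} ms", TimeSelectMs(conn));
        }
    }
}
```

If the two numbers are close, disk reads are probably not what dominates the 2.5 seconds.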

If you want it to be as fast as possible, try to implement the processing code in T-SQL; that way it can operate directly on the data right there on the server.
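As a minimal sketch of that idea, assuming the processing can be expressed as a single set-based statement: the application sends one insert ... select and no rows travel to the client at all. The names target_table, col_a and col_b are hypothetical, and upper() is only a stand-in for the real processing step.

```csharp
using System.Data.SqlClient;

static class ServerSideProcessing
{
    // Runs the transformation inside SQL Server instead of in the application.
    // target_table, col_a and col_b are hypothetical names for illustration.
    public static int Run(string connectionString)
    {
        const string sql =
            "insert into target_table (col_a, col_b) " +
            "select col_a, upper(col_b) from staging_table;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            // Returns the row count reported by the server for the insert.
            return cmd.ExecuteNonQuery();
        }
    }
}
```

Whether this is practical depends on how much of the existing logic translates to T-SQL.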

Another good tip for speed tuning is to have the table that is being read on one disk (look at filegroups) and the table that is written to on another disk. That way one disk can do a continuous read and the other a continuous write. If both operations happen on the same disk, the heads of the disk keep going back and forth, which seriously degrades performance.

If the logic you're writing cannot be done in T-SQL, you could also have a look at SQL CLR.

Another tip: when you do select * from table, use a data reader if possible. That way you don't materialize the whole thing in memory first.
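A sketch of what reading row by row might look like, as opposed to loading everything into a DataTable first. The class name, the connection string parameter and the processRow callback are assumptions for illustration; this only fits the parts of the processing that can work one row at a time.

```csharp
using System;
using System.Data.SqlClient;

static class RowStreaming
{
    // Reads staging_table one row at a time and hands each row to processRow.
    // processRow is a hypothetical callback standing in for the per-row logic.
    public static void ReadAll(string connectionString, Action<object[]> processRow)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("select * from staging_table", conn))
        {
            conn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                // One buffer is reused for every row, so memory use stays flat.
                var row = new object[reader.FieldCount];
                while (reader.Read())
                {
                    reader.GetValues(row);
                    processRow(row);
                }
            }
        }
    }
}
```

Because the buffer is reused, a callback that needs to keep a row would have to copy it. SqlBulkCopy.WriteToServer also accepts an IDataReader, so rows can be written out without building a DataTable on the insert side either.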

GJ

Answer 3

It is generally a good idea to include the column names in the select list, but with today's RDBMSs it won't make much difference. You will only see a difference in this regard if you limit the columns selected. To answer the question: a select does indeed seem to be slower than inserting in the scenario you describe, and yes, select * from table_name is indeed the fastest way to read all rows and columns from a table.

Answer 1: score 11, accepted, 2011-03-10 13:35:20

Answer 2: score 2, 2011-03-10 14:02:18

Answer 3: score 1, 2011-03-10 13:49:16
