
What is the Fastest Way to Select a Whole Table in SQL Server?

I am writing an app that reads a whole table, does some processing, then writes the resulting data to another table. I am using the SqlBulkCopy class (the .NET version of "bcp in"), which does the insert very fast. But I cannot find any efficient way to select the data in the first place. There is no .NET equivalent of "bcp out", which seems strange to me.

Currently I'm using select * from table_name. For perspective, it takes 2.5 seconds to select 6,000 rows, and only 600 ms to bulk insert the same number of rows.

I would expect that selecting data should always be faster than inserting. What is the fastest way to select all rows and columns from a table?


Answers to questions:

  • I timed my select at 2.5 seconds in two ways. The first was while running my application with a SQL trace. The second was running the same query in SSMS. Both returned about the same result.
  • I am reading data using SqlDataReader.
  • No other applications are using this database.
  • My current processing takes under 1 second, so a 2+ second read time is relatively large. But mostly I'm concerned with (interested in) performance when scaling this up to 100,000 rows and to millions of rows.
  • SQL Server 2008 R2 and my application are both running on my dev machine.
  • Some of the data processing is set-based, so I need to have the whole table in memory. (To support much larger data sets, I know this step will probably need to be moved into SQL so that I only need to operate per row in memory.)

Here is my code:

DataTable staging = new DataTable();
using (SqlConnection dwConn = (SqlConnection)SqlConnectionManager.Instance.GetDefaultConnection())
{
    dwConn.Open();
    using (SqlCommand cmd = dwConn.CreateCommand())
    {
        cmd.CommandText = "select * from staging_table";

        // Dispose the reader as soon as the DataTable has been filled.
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            staging.Load(reader);
        }
    }
}

select * from table_name is the simplest, easiest and fastest way to read a whole table.

Let me explain why your results lead to the wrong conclusions.

  1. Copying a whole table is an optimized operation that merely requires cloning the old binary data into the new table (in the extreme case, depending on the storage mechanism, it can be a plain file copy).
  2. Writing is buffered. The DBMS reports that the record was written when it may not actually be on disk yet, unless you work with transactions. Disk operations are generally delayed.
  3. Querying a table, unlike cloning, also requires converting the data from the binary storage layout into a driver-dependent format that your client can read. This takes time.

It all depends on your hardware, but it is likely that your network is the bottleneck here.

Apart from limiting your query to read only the columns you will actually use, a plain select is as fast as it will get. Caching is involved here: when you execute the query twice in a row, the second run should be much faster because the data is already cached in memory. Execute dbcc dropcleanbuffers to check the effect of caching.
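To see how much of the read time is caching and how much is fetching rows, one option is to time the same read more than once. The following is a minimal sketch, not code from the original post: ReadTimer and TimeFullRead are hypothetical names, and the caller is assumed to supply a delegate that opens the reader, for example by executing a SqlCommand with an explicit column list instead of select *.

```csharp
using System;
using System.Data;
using System.Diagnostics;

// Hypothetical helper (not from the original post): measures how long it
// takes to open a reader and pull every row and column through it.
public static class ReadTimer
{
    // openReader is supplied by the caller, e.g. () => cmd.ExecuteReader()
    // for a SqlCommand. Returns the row count and the elapsed wall-clock time.
    public static (int Rows, TimeSpan Elapsed) TimeFullRead(Func<IDataReader> openReader)
    {
        var watch = Stopwatch.StartNew();
        int rows = 0;
        using (IDataReader reader = openReader())
        {
            var buffer = new object[reader.FieldCount];
            while (reader.Read())
            {
                // Fetch every column so the conversion cost is included.
                reader.GetValues(buffer);
                rows++;
            }
        }
        watch.Stop();
        return (rows, watch.Elapsed);
    }
}
```

Calling TimeFullRead twice in a row and comparing the two results gives a rough idea of the warm-cache effect; running dbcc dropcleanbuffers on a development server before a call gives a cold-cache number to compare against.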

If you want it to be as fast as possible, try to implement the processing in T-SQL. That way it can operate directly on the data, right there on the server.
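As a rough illustration of server-side processing, the sketch below sends a single set-based statement and lets SQL Server read, transform and write the rows without any of them travelling to the client. Everything specific in it is an assumption made for illustration: the class name ServerSideCopy, the table target_table, the columns id, amount and amount_with_tax, and the 1.2 multiplier that stands in for the real processing.

```csharp
using System.Data;

// Hypothetical example (not from the original post).
public static class ServerSideCopy
{
    // Invented table and column names; "amount * 1.2" is a placeholder
    // for whatever processing can be expressed in T-SQL.
    public const string Sql =
        "insert into target_table (id, amount_with_tax) " +
        "select id, amount * 1.2 from staging_table";

    // Expects an already opened connection, e.g. a SqlConnection.
    // Returns the number of rows the server reports as inserted.
    public static int Run(IDbConnection openConnection)
    {
        using (IDbCommand cmd = openConnection.CreateCommand())
        {
            cmd.CommandText = Sql;
            return cmd.ExecuteNonQuery();
        }
    }
}
```

The method takes the IDbConnection interface only to keep the sketch independent of a particular provider; passing the SqlConnection from the question works the same way.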

Another good tip for speed tuning is to have the table that is being read on one disk (look at filegroups) and the table that is being written to on another disk. That way one disk can do a continuous read and the other a continuous write. If both operations happen on the same disk, the disk heads keep moving back and forth, which seriously degrades performance.

If the logic you are writing cannot be done in T-SQL, you could also have a look at SQL CLR.

Another tip: when you do select * from a table, use a data reader if possible. That way you don't materialize the whole thing in memory first.
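For the parts of the processing that only need one row at a time, the data reader can be consumed directly instead of being loaded into a DataTable. A minimal sketch, assuming a hypothetical numeric column (the names StreamingExample, SumColumn and the column passed in are invented for illustration):

```csharp
using System;
using System.Data;

// Hypothetical example (not from the original post).
public static class StreamingExample
{
    // Sums one numeric column while reading rows one at a time, so only
    // the current row is held in memory. NULL values are skipped.
    public static decimal SumColumn(IDataReader reader, string columnName)
    {
        int ordinal = reader.GetOrdinal(columnName); // look up the index once
        decimal total = 0m;
        while (reader.Read())
        {
            if (!reader.IsDBNull(ordinal))
            {
                total += Convert.ToDecimal(reader.GetValue(ordinal));
            }
        }
        return total;
    }
}
```

With the code from the question, this would be called as StreamingExample.SumColumn(cmd.ExecuteReader(), "amount") in place of staging.Load(reader), with "amount" replaced by a real column name. Set-based steps that need every row at once would still require a DataTable or a move into SQL.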

GJ

It is generally good practice to list the column names in the select list, but with today's RDBMSs it won't make much difference. You will only see a difference if you limit the columns selected. To answer the question: in the scenario you describe, a select does seem to be slower than the insert, and yes, select * from table_name is indeed the fastest way to read all rows and columns from a table.

