
C#, SQL Server 2008: Streaming a large result set to the end user only works on some databases

I have a long-running query that returns a large data set. This query is called from a web service, and the results are converted to a CSV file for the end user. Previous versions took 10+ minutes to run and returned results to the end user only once the query had completed.

I rewrote the query so that it runs in a minute or so in most cases, and rewrote the way it is accessed so that results are streamed to the client as they arrive at the ASP.NET web service from the database server. I tested this against a local instance of SQL Server as well as a remote instance without issue.

Now, on the cusp of production deployment, it seems our production SQL Server machine does not send any results back to the web service until the query has finished executing. Additionally, I found another machine, a clone of the remote server that works, that is also not streaming results.

The version of SQL Server 2008 is identical on all machines. The production machine has a slightly different version of Windows Server installed (6.0 vs 6.1). The production server has 4 cores and several times the RAM of the other servers, which are single-core with 1 GB of RAM.

Is there any setting that could be causing this? Or is there any setting I can change that will prevent SQL Server from buffering the results?

Although I know this won't really affect the overall runtime, it will change the end-user perception greatly.

tl;dr: I need the results of a query to stream to the end user as the query runs. It works with some database machines, but not with others. All machines are running the same version of SQL Server.

The gist of what I am doing in C#:

// Stream rows to the client as they arrive from SQL Server.
using (var reader = cmd.ExecuteReader())
{
  Response.Write(getHeader());
  while (reader.Read())
  {
    Response.Write(getCSVForRow(reader));
    if (shouldFlush()) Response.Flush();
  }
}
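
The getCSVForRow helper is not shown in the question. A minimal sketch of what such a helper might look like, assuming RFC 4180-style quoting and invariant-culture formatting (the class and method names here are hypothetical, not the asker's code):

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Text;

static class CsvFormatting
{
    // Quotes a field only when it contains a comma, a double quote, or a line break.
    public static string EscapeField(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV line (terminated by CRLF) from the current row of a reader.
    public static string GetCsvForRow(IDataRecord row)
    {
        var line = new StringBuilder();
        for (int i = 0; i < row.FieldCount; i++)
        {
            if (i > 0) line.Append(',');
            // NULL columns become empty fields.
            string text = row.IsDBNull(i)
                ? string.Empty
                : Convert.ToString(row.GetValue(i), CultureInfo.InvariantCulture);
            line.Append(EscapeField(text));
        }
        return line.Append("\r\n").ToString();
    }
}
```

Because the helper only depends on IDataRecord, it can be exercised without a database, for example against a DataTableReader.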

Clarification based on the response below

There are 4 database servers: Local, Prod, QA1, and QA2. They are all running SQL Server 2008. They all have identical databases loaded on them (more or less; non-prod lags by 1 day).

The web service is hosted on my machine (though I have tested it remotely hosted as well).

The only change between tests is the connection string in the web.config.

QA2 is working (streaming), and it is a clone of QA1 (both are VMs). The only difference between QA1 and QA2 is an added database on QA2 that is not related to this query at all.

QA1 is not working.

All tests include the maximum-sized dataset in the result (we limit to 5k rows at this time). The browser displays a download dialog once the first flush happens. This is the desired result. We want users to know their download is processing, even if the download speed is low and at times drops to zero (such is the way with databases).

My flushing code is simple at this time: we flush every k rows, with k currently set to 20.
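
The "flush every k rows" rule can be sketched as a small counter. This is an illustration under assumptions, not the asker's actual shouldFlush(); the class name RowCountFlushPolicy is hypothetical:

```csharp
using System;

// Signals a flush once every `interval` rows (the question uses 20).
sealed class RowCountFlushPolicy
{
    private readonly int interval;
    private int rowsSinceFlush;

    public RowCountFlushPolicy(int interval)
    {
        if (interval < 1) throw new ArgumentOutOfRangeException("interval");
        this.interval = interval;
    }

    // Call once per row written; returns true on every interval-th call.
    public bool ShouldFlush()
    {
        rowsSinceFlush++;
        if (rowsSinceFlush < interval) return false;
        rowsSinceFlush = 0;
        return true;
    }
}
```

With a policy like this, rows written after the last full batch are not flushed by the loop itself, so a final Response.Flush() after the loop (or simply letting the response end) is needed to send the tail.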

The most perplexing part of this is that QA1 and QA2 behave differently. I did notice our production server is set to compatibility mode 2005 (90), whereas both the QA and local databases are set to 2008 (100). I doubt this matters. When I exec the sprocs through SSMS, I see similar behavior across all machines: results stream in immediately.
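
One way to confirm from the web service side whether a given server is holding rows back is to compare the time until the first row arrives with the total time. The helper below is a hedged sketch (ReaderTiming and Measure are hypothetical names); the stopwatch starts before the command executes so that any wait inside ExecuteReader is included:

```csharp
using System;
using System.Data;
using System.Diagnostics;

static class ReaderTiming
{
    // Executes the command via the supplied delegate, drains the reader, and
    // reports when the first row arrived versus when the last one did.
    public static void Measure(Func<IDataReader> execute,
                               out TimeSpan firstRow, out TimeSpan total, out int rows)
    {
        var clock = Stopwatch.StartNew();
        firstRow = TimeSpan.Zero;
        rows = 0;
        using (IDataReader reader = execute())
        {
            while (reader.Read())
            {
                if (rows == 0) firstRow = clock.Elapsed;
                rows++;
            }
        }
        total = clock.Elapsed;
    }
}
```

In the service this could be called as `ReaderTiming.Measure(() => cmd.ExecuteReader(), out first, out total, out count)`. If the first-row time is close to the total on QA1 and Prod but much smaller on QA2, the rows are being delayed before they reach the reader rather than in the ASP.NET response.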

Is there any connection string setting that could disable the streaming?

Everything I know says that what you're doing should work; both the DataReader and Response.Write()/Response.Flush() act in a "streaming" fashion, so the client should receive rows as soon as there are rows to get. Response does include a buffer, but you're pushing that buffer to the client whenever shouldFlush() returns true, which minimizes its use.

I'd check that the web service is configured to respond correctly to Flush() calls on the response. Make sure the production environment is not a Win2008 Server Core installation; Windows Server 2008 does not support Response.Flush() in certain Server Core roles. I'd also check that the conditions evaluated in shouldFlush() return true when you expect them to in the production environment (you may be checking the app config for a value, or looking at IIS settings; I dunno).

In your test, I'd try a much larger set of sample data; it may be that the production environment is exposing problems that are also present in the test environments, but with a smaller set of test data and a high-speed Ethernet backbone, the problem isn't noticeable compared to returning hundreds of thousands of rows over DSL. You can verify that it is working in a streaming fashion by inserting a Thread.Sleep(250) call after each Flush(); this will slow down execution of the service and let you watch the response get fed to your client at about four flushes per second (four rows per second if you flush after every row).
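
The Thread.Sleep check can be sketched as a throttled write loop. This is an illustration only; ThrottledStreaming and its parameters are hypothetical, and the writer and flush action are passed in so the loop can be tried outside ASP.NET:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

static class ThrottledStreaming
{
    // Writes each line, then flushes and pauses after every flushEvery lines.
    // Returns the number of flushes performed.
    public static int Write(IEnumerable<string> lines, TextWriter output, Action flush,
                            int flushEvery, int pauseMilliseconds)
    {
        int written = 0, flushes = 0;
        foreach (string line in lines)
        {
            output.Write(line);
            written++;
            if (written % flushEvery == 0)
            {
                flush();
                flushes++;
                // Debug aid only: slows the response so streaming is visible.
                Thread.Sleep(pauseMilliseconds);
            }
        }
        return flushes;
    }
}
```

Inside the service, the call might look like `ThrottledStreaming.Write(csvLines, Response.Output, Response.Flush, 20, 250)`, where csvLines is an assumed sequence of already formatted rows. If the client then sees the download grow in steps, the ASP.NET side is streaming and the delay lies upstream.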

Lastly, make sure that the client you're using in the production environment is set up to display CSV files in a fashion that allows for streaming. This basically means that a web browser acting as the client should not be configured to pass the file off to a third-party app. A web browser can easily display a text stream passed over HTTP; that's what it does, really. However, if it sees the stream as a CSV file, and it's configured to hand CSV files over to Excel to open, the browser will cache the whole file before invoking the third-party app.

  1. Put a new task that builds this huge CSV file in a task table.
  2. Run the procedure to process this task.
  3. Wait for the result to appear in your task table with SqlDependency.
  4. Return the result to the client.
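
Step 3 above could be sketched roughly as follows. The table dbo.CsvTasks and its columns TaskId and Status are hypothetical, and the sketch assumes Service Broker is enabled on the database and that the query follows the query notification rules (two-part table name, explicit column list):

```csharp
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;

static class CsvTaskWaiter
{
    // Returns true if a data-change notification for the task row arrives before the timeout.
    public static bool WaitForChange(string connectionString, int taskId, TimeSpan timeout)
    {
        var changed = new TaskCompletionSource<bool>();
        SqlDependency.Start(connectionString);
        try
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "SELECT Status FROM dbo.CsvTasks WHERE TaskId = @id", conn))
            {
                cmd.Parameters.AddWithValue("@id", taskId);
                var dependency = new SqlDependency(cmd);
                // OnChange also fires for subscription errors, so check the type.
                dependency.OnChange += (sender, e) =>
                    changed.TrySetResult(e.Type == SqlNotificationType.Change);
                conn.Open();
                // The notification is registered when the command executes.
                using (var reader = cmd.ExecuteReader()) { while (reader.Read()) { } }
            }
            return changed.Task.Wait(timeout) && changed.Task.Result;
        }
        finally
        {
            SqlDependency.Stop(connectionString);
        }
    }
}
```

A fuller version would also inspect the Status value read while registering, since the task may already have finished before the subscription was in place. Note that this approach returns the file only once it is complete, so it trades the streaming behavior for a simpler request.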
