
How do I improve performance when querying on a column that changes frequently in SQL Azure using LINQ to SQL

I have a SQL Azure database, and one of the tables contains over 400k objects. One of the columns in this table is a count of the number of times that the object has been downloaded.

I have several queries that include this particular column (call it timesdownloaded), sorted descending, in order to find the results.

Here's an example query in LINQ to SQL (I'm writing all this in C# .NET):

var query = from t in db.tablename
    where t.textcolumn.StartsWith(searchfield)
    orderby t.timesdownloaded descending
    select t.textcolumn;

// grab the first 5
var items = query.Take(5);

This query is called perhaps 90 times per minute on average.

Objects are downloaded perhaps 10 times per minute on average, so the timesdownloaded column is updated that frequently.

As you can imagine, any index involving the timesdownloaded column gets over 30% fragmented in a matter of hours. I have implemented an index maintenance plan that checks these indexes every few hours and rebuilds them when necessary. This helps, but of course it adds spikes in query response times whenever the indexes are rebuilt, which I would like to avoid or minimize.

I have tried a variety of indexing schemes.

The best performing indexes are covering indexes that include both the textcolumn and timesdownloaded columns. Right after these indexes are rebuilt, the queries are amazingly quick, of course.

However, these indexes fragment badly, and I end up with pretty frequent delay spikes due to rebuilding indexes and other factors that I don't understand.

I have also tried simply not indexing the timesdownloaded column. This seems to perform more consistently overall, though slower of course. And when I check the SQL query execution plan, SQL seems pretty inconsistent in how it tries to optimize this query. Of course it ends up with a lot of logical reads, as it has to fetch the timesdownloaded column from the table and not from an organized index. So this isn't optimal.

What I'm trying to figure out is whether I am fundamentally missing something in how I have configured or manage this database.

I'm no SQL expert, and I've yet to find a good answer for how to do this.

I've seen some suggestions that stored procedures could help, but I don't understand why and haven't tried to get those going with LINQ just yet.

As commented below, I have considered caching but haven't taken that step yet either.

For some context, this query is part of a search suggestion feature, so it is called frequently with many different search terms.

Any suggestions would be appreciated!

Based on the comments to my question and further testing, I ended up using an Azure Table to cache my results. This is working really well: I get a lot of hits off of my cache and make many fewer SQL queries. The overall performance of my API is much better now.
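For anyone curious what that caching looks like in shape, here is a minimal cache-aside sketch. It is not my actual code: an in-memory dictionary stands in for the Azure Table, and the names SuggestionCache, Get, the five-item limit wiring and the time-to-live parameter are all made up for illustration. The idea is just that each search prefix is looked up in the cache first, and the SQL query only runs on a miss or after the cached entry has expired.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical cache-aside wrapper around the suggestion query.
// Assumption: an in-memory dictionary stands in for the Azure Table.
public sealed class SuggestionCache
{
    // One cached result set per search prefix.
    private sealed class Entry
    {
        public readonly IReadOnlyList<string> Items;
        public readonly DateTime ExpiresAt;

        public Entry(IReadOnlyList<string> items, DateTime expiresAt)
        {
            Items = items;
            ExpiresAt = expiresAt;
        }
    }

    private readonly ConcurrentDictionary<string, Entry> _entries =
        new ConcurrentDictionary<string, Entry>();
    private readonly Func<string, IEnumerable<string>> _query;
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _clock;
    private int _misses;

    // Number of times the underlying query had to run.
    public int Misses { get { return _misses; } }

    public SuggestionCache(Func<string, IEnumerable<string>> query, TimeSpan ttl)
        : this(query, ttl, () => DateTime.UtcNow)
    {
    }

    // The clock is injectable so that expiry can be exercised without waiting.
    public SuggestionCache(Func<string, IEnumerable<string>> query, TimeSpan ttl,
                           Func<DateTime> clock)
    {
        _query = query;
        _ttl = ttl;
        _clock = clock;
    }

    public IReadOnlyList<string> Get(string prefix)
    {
        DateTime now = _clock();

        // Cache hit: return the stored suggestions if they have not expired.
        Entry entry;
        if (_entries.TryGetValue(prefix, out entry) && entry.ExpiresAt > now)
        {
            return entry.Items;
        }

        // Cache miss: run the (expensive) query, keep the top 5, store the result.
        Interlocked.Increment(ref _misses);
        List<string> items = _query(prefix).Take(5).ToList();
        _entries[prefix] = new Entry(items, now + _ttl);
        return items;
    }
}
```

The query delegate would wrap the LINQ to SQL query from the question, for example something like new SuggestionCache(prefix => RunSuggestionQuery(prefix), TimeSpan.FromMinutes(5)), where RunSuggestionQuery is a placeholder. The trade-off is that suggestions can be ordered by a slightly stale timesdownloaded value until the entry expires, which seems acceptable for a search suggestion feature.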

I did try using Azure In Role Caching, but that method doesn't appear to work well for my needs. It ended up using too much memory (no matter how I configured it, which I don't understand), swapped to disk like crazy, and brought my little Small instances to their knees. I don't want to pay more at the moment, so Tables it is.

Thanks for the suggestions!

