
very slow count with 7 million rows

I have more than 7 million rows in a table, and

SELECT COUNT(*) FROM MyTable where MyColumn like '%some string%'

returns a count of 20,000 and takes more than 13 seconds.

The table has a NONCLUSTERED INDEX on MyColumn.

Is there any way to improve speed?

Leading-wildcard searches cannot be optimised in T-SQL: the pattern gives the optimiser no prefix to seek on, so the index on MyColumn can at best be scanned in full.

Look at SQL Server's full-text search.

You can try full-text search, or a text search engine such as Lucene.

Try a binary collation first: the complex Unicode comparison rules are then replaced by a simple byte comparison. Note that this also makes the match case- and accent-sensitive.

SELECT COUNT(*) 
FROM MyTable 
WHERE MyColumn COLLATE Latin1_General_BIN2 LIKE '%some string%'

Also, have a look at the chapter titled 'Build your own index' in SQL Server MVP Deep Dives, written by Erland Sommarskog.

The basic idea is to restrict the user: the search string must be at least three contiguous characters long. You then extract every three-letter sequence from the MyColumn field and store these fragments in a separate table, together with the MyTable.id they belong to. When searching for a string, you split it into three-letter fragments as well and look up which record ids contain them. This finds the matching strings a lot quicker. That is the strategy in a nutshell.

The book describes implementation details and ways to optimise this further.
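
The fragment strategy described above can be sketched roughly as follows. This is only an illustration, not the book's implementation: it uses SQLite through Python's sqlite3 module so that it runs self-contained, and the names MyTableFragments, fragments, build and count_matches are made up for the example. It also assumes case-sensitive matching to keep things short. The table layout and the shape of the lookup query should carry over to SQL Server, but that part is untested here.

```python
import sqlite3

FRAGMENT_LEN = 3  # minimum search length imposed on the user


def fragments(text):
    """Return the set of all contiguous three-character fragments of text."""
    return {text[i:i + FRAGMENT_LEN] for i in range(len(text) - FRAGMENT_LEN + 1)}


def build(conn, rows):
    """Create MyTable plus a hypothetical fragment table and fill both.

    rows is a list of (id, MyColumn) tuples.
    """
    conn.execute(
        "CREATE TABLE MyTable (id INTEGER PRIMARY KEY, MyColumn TEXT NOT NULL)"
    )
    # The composite primary key doubles as the index used for fragment lookups.
    conn.execute(
        "CREATE TABLE MyTableFragments ("
        " fragment TEXT NOT NULL,"
        " id INTEGER NOT NULL,"
        " PRIMARY KEY (fragment, id))"
    )
    conn.executemany("INSERT INTO MyTable (id, MyColumn) VALUES (?, ?)", rows)
    conn.executemany(
        "INSERT INTO MyTableFragments (fragment, id) VALUES (?, ?)",
        [(frag, row_id) for row_id, text in rows for frag in fragments(text)],
    )


def count_matches(conn, needle):
    """Count rows whose MyColumn contains needle (case-sensitive)."""
    frags = sorted(fragments(needle))
    if not frags:
        raise ValueError("search string must be at least 3 characters long")
    placeholders = ", ".join("?" for _ in frags)
    # Step 1: keep only ids that contain every fragment of the needle.
    # Step 2: re-check the candidates with a real substring test, because
    # the fragments may occur in a row without being adjacent.
    sql = f"""
        SELECT COUNT(*)
        FROM MyTable AS t
        WHERE t.id IN (
            SELECT id
            FROM MyTableFragments
            WHERE fragment IN ({placeholders})
            GROUP BY id
            HAVING COUNT(*) = ?
        )
        AND instr(t.MyColumn, ?) > 0
    """
    params = frags + [len(frags), needle]
    return conn.execute(sql, params).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build(conn, [(1, "the quick brown fox"), (2, "quick silver")])
    print(count_matches(conn, "quick"))
```

The final substring check only runs against the candidate rows, which is where the speed-up comes from. The price is the extra fragment table, which has to be kept in sync whenever MyColumn changes.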
