SQL Server - full-text search

So let's say I have two databases, one for production and another for development.

When we copied the development database, the full-text catalog did not get copied properly, so we decided to create the catalog ourselves. We matched all the tables and indexes and created the database, and the search feature seems to be working okay too (but it has not been fully tested yet).

However, the former catalog had a lot more files in its folder than the one we manually created. Is that fine? I thought they would have exactly the same number of files (though the sizes may vary).

First... when using full-text search, I would suggest that you don't manually try to create what the wizard does for you. I have to wonder whether you are missing more than just some data. Why not just recreate the indexes?

Second... I suggest that you don't use the freetext feature of SQL Server unless you have no other choice. I used to be a big believer in freetext, but I was shown an example that compared creating and searching a Lucene (.NET) index with creating and searching an index in SQL Server. In that comparison, creating the SQL Server index was considerably slower than creating the Lucene index, and it was harder to maintain. Searching the SQL Server index also gave considerably less accurate (poorer) results than Lucene. Lucene is like having your own personal Google for searching data.

How? Index your data (only the data you need to search) in Lucene, and include the primary key (PK) of each record you index for use later. Then search the index using your language and the Lucene (.NET) API (many articles have been written on this topic). Make sure your search results return the PK. Once you have identified the records you are interested in, you can fetch the rest of the data and/or any related data based on the PK that was returned.
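The index-then-look-up pattern above can be sketched roughly as follows. This is only an illustration in Python: a toy in-memory inverted index stands in for Lucene, `sqlite3` stands in for SQL Server, and the `articles` table, its columns, and all function names are assumptions made up for the example. None of it is the Lucene (.NET) API.

```python
import re
import sqlite3
from collections import defaultdict


def build_database():
    """Create a hypothetical sample table (names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles "
        "(id INTEGER PRIMARY KEY, title TEXT, body TEXT, author TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles VALUES (?, ?, ?, ?)",
        [
            (1, "Full-text catalogs", "Rebuilding a full-text catalog after a restore", "alice"),
            (2, "Lucene basics", "Building a search index with Lucene", "bob"),
            (3, "Backup notes", "Restore a database and rebuild each index", "carol"),
        ],
    )
    return conn


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(conn):
    """Index only the searchable columns, keeping each row's primary key."""
    index = defaultdict(set)
    for pk, title, body in conn.execute("SELECT id, title, body FROM articles"):
        for term in tokenize(title + " " + body):
            index[term].add(pk)
    return index


def search(index, query):
    """Return the primary keys of rows that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


def fetch_rows(conn, pks):
    """Fetch the remaining columns for the primary keys the search returned."""
    if not pks:
        return []
    placeholders = ",".join("?" * len(pks))
    sql = (
        "SELECT id, title, author FROM articles "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, pks).fetchall()
```

The search step returns only primary keys; everything else (here the `author` column, which was never indexed) comes from the database in a second step, which keeps the index small.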

Gotchas? Updating the index is also much quicker and easier. However, you have to roll your own code for creating the index, updating the index, and searching the index. SUPER EASY to do... but still... there are no wizards or one-handed coding here! Also, the index lives on the file system. If the file is open and being searched and you try to open it again for another search, you will obviously have some issues... so you need to build some form of infrastructure around opening and reading these indexes.
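One possible shape for that infrastructure is to open the index once and let every search share the same open copy, swapping in a fresh copy after updates. The sketch below is a Python illustration under that assumption; `SharedIndex` and its `loader` callable are hypothetical names, and the dictionary-based index is a stand-in for a real Lucene index on disk.

```python
import threading


class SharedIndex:
    """Keeps one loaded snapshot of the index that all searches share."""

    def __init__(self, loader):
        # loader is a hypothetical callable that reads the index from disk
        # and returns a dict mapping each term to a set of primary keys.
        self._loader = loader
        self._lock = threading.Lock()
        self._snapshot = None
        self.load_count = 0  # how many times the index was actually opened

    def _get_snapshot(self):
        # Only the first caller opens the index; later callers reuse it.
        with self._lock:
            if self._snapshot is None:
                self._snapshot = self._loader()
                self.load_count += 1
            return self._snapshot

    def search(self, term):
        """Return the sorted primary keys stored under a single term."""
        snapshot = self._get_snapshot()
        return sorted(snapshot.get(term.lower(), ()))

    def refresh(self):
        """Reload the index after an update and swap it in atomically."""
        new_snapshot = self._loader()  # built outside the lock
        with self._lock:
            self._snapshot = new_snapshot
            self.load_count += 1
```

Searches that are already running keep using the snapshot they started with, so a refresh never pulls the index out from under them.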

How does this help in SQL Server? You can easily wrap your Lucene search in a CLR function or procedure, install it in the database, and then use it as though it were native to your T-SQL queries.


 