
Full text search not giving desired result

Below are the commands I ran:

create database fu;

create table table_name( name varchar(10));

insert into table_name values('karan');

insert into table_name values('nitin');    

insert into table_name values('orip');

insert into table_name values('karan orip');

insert into table_name values('karan nitin');

alter table table_name add fulltext(name); -- fulltext

select * from table_name where match(name) against('karan');

Now, the above query returns an empty set. Why is that?

Also, if I do

select * from table_name where match(name) against('karan' in boolean mode);

The above statement gives me exactly the result I expect.

You seem to be using the MyISAM storage engine. It has a limitation: a word found in at least 50% of all rows is treated as a stopword.

Your search word 'karan' is found in 3 of 5 rows, so it is over this threshold. Boolean mode searches do not apply the 50% threshold, which is why your second query returns the rows you expect.
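To check both conditions on your own table, something along these lines may help. It is only a rough sketch: it assumes the database from the question is selected (`use fu;`), and `LIKE` merely approximates how the full-text parser splits words.

```sql
-- Which storage engine does the table use? (MyISAM applies the 50% rule.)
SELECT ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'table_name';

-- Rough share of rows containing the search term.
-- LIKE is only an approximation of full-text word matching.
SELECT SUM(name LIKE '%karan%') AS rows_with_term,
       COUNT(*)                 AS total_rows
FROM table_name;
```

If `rows_with_term` is at least half of `total_rows` on a MyISAM table, a natural language search for that term will come back empty.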

MyISAM Limitation
For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results for search indexes on MyISAM tables. For example, although the word “MySQL” is present in every row of the articles table shown earlier, a search for the word in a MyISAM search index produces no results:

[...]

The search result is empty because the word “MySQL” is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.

You can get around this issue by using the InnoDB engine if you're on MySQL 5.6 or newer.

The 50% threshold can surprise you when you first try full-text searching to see how it works, and makes InnoDB tables more suited to experimentation with full-text searches.

(from the MySQL manual, Natural Language Full-Text Searches)
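A minimal sketch of the InnoDB workaround, assuming MySQL 5.6 or newer and the table from the question. Rebuilding the table takes time on large data sets, so treat this as something to try on a test copy first.

```sql
-- Convert the table to InnoDB; the existing FULLTEXT index is rebuilt with it.
ALTER TABLE table_name ENGINE = InnoDB;

-- The natural language search from the question, unchanged.
-- InnoDB full-text indexes do not apply the 50% threshold.
SELECT * FROM table_name WHERE MATCH(name) AGAINST('karan');
```

If you have to stay on MyISAM, the boolean mode query from the question remains the simpler option.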

Because of limitations and performance issues in MySQL's built-in full-text search indexes, I would suggest using an external full-text engine such as Sphinx or Lucene/Solr. Both will give you much more speed and far better functionality and relevance. This becomes mandatory if you plan to search a large amount of data: in that case a MySQL full-text search can take seconds to complete, while external systems based on an inverted index can search gigabytes of data within milliseconds.

Solr is written in Java and requires a JVM, so it could be a good choice if you already use Java in your application. Sphinx is written in C++, runs as a daemon, and supports the MySQL protocol, so it could be a bit easier to work with. You can get a sense of how to use Sphinx here: http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/ Sphinx also supports snippets (result highlighting), which could be useful.

In any case, when using an external search engine you may still want to query MySQL to fetch metadata for the documents it finds.
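That two-step pattern might look roughly like the following with Sphinx. Everything named here is hypothetical: the index `docs_index`, the `documents` table and its columns, and the example ids are placeholders, and the first statement is SphinxQL sent to the Sphinx daemon over the MySQL protocol rather than to MySQL itself.

```sql
-- Step 1: run against the Sphinx daemon (searchd) over the MySQL protocol.
-- `docs_index` is a hypothetical index name.
SELECT id FROM docs_index WHERE MATCH('karan') LIMIT 20;

-- Step 2: run against MySQL, using the ids returned by step 1.
-- `documents`, its columns, and the ids 3, 7, 12 are hypothetical.
SELECT id, title, created_at
FROM documents
WHERE id IN (3, 7, 12);
```

The search engine handles matching and ranking, while MySQL stays the source of truth for the rows themselves.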
