SQL text search with wildcards

I have to do a text search on two fields of my all-InnoDB MySQL database. In the past I used:

Field1 LIKE '%text%' OR Field2 LIKE '%text%'

But my data has since grown a lot, and the LIKE search has become too slow. So I started looking for another, better solution.

So far I have tried:

  • Fulltext indices (MATCH(...) AGAINST) -> don't support leading wildcards

  • CONTAINS -> not available in MySQL (only Transact-SQL)

  • Reverse fulltext indices -> work with leading wildcards, but not if the match is in the middle

  • REGEXP -> not really faster than LIKE

I used MyISAM "shadow tables" for the fulltext indices.
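For reference, the fulltext attempt described above might look roughly like the sketch below. The table name docs_ft, its id column and the index name are assumptions made for illustration; only Field1 and Field2 come from the question.

```sql
-- Hypothetical MyISAM shadow table; docs_ft, id and ft_fields are assumed names.
CREATE TABLE docs_ft (
  id     INT UNSIGNED NOT NULL PRIMARY KEY,
  Field1 TEXT,
  Field2 TEXT,
  FULLTEXT KEY ft_fields (Field1, Field2)
) ENGINE=MyISAM;

-- Trailing wildcard: in boolean mode, 'text*' matches words that START with "text".
SELECT id
FROM docs_ft
WHERE MATCH(Field1, Field2) AGAINST('text*' IN BOOLEAN MODE);

-- There is no equivalent for a leading wildcard: the truncation operator
-- is only appended to a word, so "ends with" or "contains" cannot be
-- expressed this way.
```

Fulltext search is word-based, so even the trailing-wildcard form is not a drop-in replacement for LIKE '%text%', which matches anywhere inside a word.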

I don't want to use external search engines like Sphinx, Lucene or se

So my question: is there something I forgot? I am thinking of a trick to get both wildcards to work, like a CONTEXT index in Oracle. Or I don't know... that's why I ask ;)

The leading percent sign ('%') in your pattern causes a table scan to be performed: an ordinary index on the column can only be used when the pattern starts with a fixed prefix. Hence the performance degradation as your data volume increased.
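One way to see this for yourself is to compare the query plans with EXPLAIN. The sketch below assumes a hypothetical table docs with VARCHAR columns and ordinary indexes; the table, column types and index names are illustrative, not taken from the question.

```sql
-- Hypothetical table; docs, id, idx_field1 and idx_field2 are assumed names.
CREATE TABLE docs (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  Field1 VARCHAR(255),
  Field2 VARCHAR(255),
  KEY idx_field1 (Field1),
  KEY idx_field2 (Field2)
) ENGINE=InnoDB;

-- Leading wildcard: the index on Field1 cannot narrow the search,
-- so the plan typically reports a full scan (type = ALL).
EXPLAIN SELECT * FROM docs WHERE Field1 LIKE '%text%';

-- Fixed prefix: the optimizer can turn the pattern into a range
-- on idx_field1 (typically type = range on a table with enough rows).
EXPLAIN SELECT * FROM docs WHERE Field1 LIKE 'text%';
```

The exact plan depends on the MySQL version, table size and statistics, so treat the comments as what is usually observed rather than guaranteed output.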

Note that this also indicates that your data is no longer "relational", due to the non-atomic nature of the fields; your DB is hence no longer a relational database, notwithstanding that you are still using an RDBMS as the data storage engine.

The true solution is to restructure the data table by converting Field1 and Field2 back to a set of relational, meaning atomic, fields. This may be beyond any reasonable scope for the project at hand, I know; but it is the solution.
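What such a restructuring looks like depends entirely on what Field1 and Field2 actually contain, which the question does not say. As a purely hypothetical sketch, suppose Field1 packs several discrete terms into one string; each term could then be moved into its own row of a child table (doc_terms, doc_id and term are assumed names):

```sql
-- Hypothetical child table: one row per atomic term that used to be
-- embedded in Field1. All names here are illustrative assumptions.
CREATE TABLE doc_terms (
  doc_id INT UNSIGNED NOT NULL,
  term   VARCHAR(100) NOT NULL,
  PRIMARY KEY (doc_id, term),
  KEY idx_term (term)
) ENGINE=InnoDB;

-- The former LIKE '%text%' search becomes an exact match
-- that can be answered from idx_term.
SELECT doc_id
FROM doc_terms
WHERE term = 'text';
```

This only helps if the searches are really for whole, identifiable values; arbitrary substring matching inside free text is a different problem.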
