Is Full Text search the answer?

OK, I have a MySQL database that looks something like this:

ID - an int and the unique ID of the record

Title - The name of the item

Description - The item's description

I want to search both the title and the description for keywords. Currently I'm using:

SELECT * FROM item WHERE title LIKE '%key%'

This works, as there's not much in the database. However, searching for “this key” doesn't find “this that key”. I want to improve the site's search engine, and maybe even add some kind of ranking system to it (but that's a long way off).

So, to the question: I've heard about something called “full-text search”. It is (as far as I can tell) a staple of database design, but being a newbie to this subject I know nothing about it, so…

1) Do you think it would be useful?

And an additional question…

2) What can I read about database design / search engine design that will point me in the right direction?

If it's relevant, the site is currently written in straight PHP (i.e. without a framework), though the thought of converting it to Ruby on Rails has crossed my mind.

Update

Thanks all, I'll go for full-text search. And for anyone finding this later, I found a good tutorial on full-text search as well.

The problem with the '%keyword%' type of search is that there is no way to run it efficiently on a regular table, even if you create an index on that column. Think about how you would look that string up in the phone book. There is actually no way to optimize it - you have to scan the entire phone book - and that is what MySQL does: a full table scan.

If you change that search to 'keyword%' and use an index, you can get very fast searching. It sounds like this is not what you want, though.
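To make the difference concrete, here is a minimal sketch. It assumes the `item` table from the question, that `title` is a VARCHAR column, and an ordinary index whose name (`idx_item_title`) is made up for illustration. `EXPLAIN` shows which plan MySQL picks; the exact output depends on your data and MySQL version.

```sql
-- Hypothetical index name; the table and column come from the question.
-- Assumes title is VARCHAR (a TEXT column would need a prefix length).
CREATE INDEX idx_item_title ON item (title);

-- Leading wildcard: the index cannot narrow the search,
-- so expect a full scan of the table or of the index.
EXPLAIN SELECT id, title FROM item WHERE title LIKE '%keyword%';

-- Prefix search: the index can be used as a range scan
-- over the entries that start with 'keyword'.
EXPLAIN SELECT id, title FROM item WHERE title LIKE 'keyword%';
```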

So with that in mind, I have used fulltext indexing/searching quite a bit, and here are a few pros and cons:

Pros

  • Very fast
  • Returns results sorted by relevance (by default, although you can use any sorting)
  • Stop words can be used.

Cons

  • Only works with MyISAM tables in MySQL versions before 5.6 (InnoDB supports FULLTEXT indexes from 5.6 onward)
  • Words that are too short are ignored (the MyISAM default minimum is 4 characters)
  • Requires different SQL in the WHERE clause, so you will need to modify existing queries.
  • Does not match partial strings (for example, 'word' does not match 'keyword', only 'word')
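As a rough illustration of the points above, this is what the changed SQL might look like. It is only a sketch: it assumes the `item` table from the question with text columns `title` and `description`, and the index name `ft_item_title_desc` and the search terms are invented for the example.

```sql
-- Assumes title and description are CHAR, VARCHAR, or TEXT columns.
-- ft_item_title_desc is a hypothetical index name.
ALTER TABLE item ADD FULLTEXT INDEX ft_item_title_desc (title, description);

-- The column list in MATCH() must be the same as in the FULLTEXT index.
-- MATCH ... AGAINST returns a relevance value that can be used for ranking.
SELECT id, title,
       MATCH(title, description) AGAINST('database search') AS score
FROM item
WHERE MATCH(title, description) AGAINST('database search')
ORDER BY score DESC;

-- Shows the minimum word length that MyISAM fulltext indexes will store.
SHOW VARIABLES LIKE 'ft_min_word_len';
```

One thing to watch when trying this on a tiny test table: with MyISAM in natural-language mode, a word that appears in half or more of the rows is treated like a stop word, so a query can come back empty even though the rows contain the term.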

Here is some good documentation on full-text searching.

Another option is to use a searching system such as Sphinx. It can be extremely fast and flexible. It is optimized for searching and integrates well with MySQL.

I would guess that MySQL fulltext is sufficient for your needs, but it's worth noting that the built-in support doesn't scale very well. For average-size documents it starts to become unusable at table sizes as small as a few hundred thousand rows. If you think this might become a problem later on, you should probably look into Sphinx now. It's becoming the de facto standard for MySQL users, even though I personally prefer to implement my own solution using Java Lucene. :)

Also, I'd like to mention that full-text search is fundamentally different from the standard LIKE '%keyword%' search. Unlike the LIKE search, full-text indexing allows you to search for several keywords that don't have to appear right next to each other. Standard search engines such as Google are full-text search engines, for example.
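A small sketch of that difference, assuming the `item` table from the question already has a FULLTEXT index on (`title`, `description`); the search terms are just placeholders:

```sql
-- Boolean mode: '+' means the word must be present in the row.
-- The two words can appear anywhere in title or description,
-- in any order, and do not need to be adjacent.
SELECT id, title
FROM item
WHERE MATCH(title, description)
      AGAINST('+database +search' IN BOOLEAN MODE);

-- By contrast, LIKE only matches the exact substring 'database search',
-- so a title such as 'database driven search' is not found.
SELECT id, title
FROM item
WHERE title LIKE '%database search%';
```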

You might also consider Zend_Lucene. It's slightly easier to integrate than Sphinx, because it is pure PHP.
