
MySQL full-text search relevance across multiple tables

I have been tasked with creating a site-wide search feature. The search needs to look at articles, events and page content.

I've used MATCH()/AGAINST() in MySQL before and know how to get the relevance of a result. As far as I know, though, the relevance is specific to each search (the contents, the number of rows, etc.), so the relevance of results from the articles table won't match the relevance of results from the events table.

Is there any way to unify the relevance so that results from all three tables have comparable relevance?

Yes, you can unify them very well using a search engine such as Apache Lucene and Solr.

http://lucene.apache.org/solr/

If you need to do it only in MySQL, you can do this with a UNION. You'll probably want to suppress any zero-relevance results.

You'll need to decide how much weight a match should carry depending on which table it comes from.

For example, suppose you want articles to be most important, events to be of medium importance, and pages to be least important. You can use multipliers like this:

set @articles_multiplier=3;
set @events_multiplier=2;
set @pages_multiplier=1;

Here's a working example you can try that demonstrates some of these techniques:

Create sample data:

create database d;
use d;

create table articles (id int primary key, content text) ENGINE = MYISAM;
create table events (id int primary key, content text) ENGINE = MYISAM;
create table pages (id int primary key, content text) ENGINE = MYISAM;

insert into articles values
(1, 'Lorem ipsum dolor sit amet'),
(2, 'consectetur adipisicing elit'),
(3, 'sed do eiusmod tempor incididunt');

insert into events values
(1, 'Ut enim ad minim veniam'),
(2, 'quis nostrud exercitation ullamco'),
(3, 'laboris nisi ut aliquip');

insert into pages values
(1, 'Duis aute irure dolor in reprehenderit'),
(2, 'in voluptate velit esse cillum'),
(3, 'dolore eu fugiat nulla pariatur.');

Make it searchable:

ALTER TABLE articles ADD FULLTEXT(content);
ALTER TABLE events ADD FULLTEXT(content);
ALTER TABLE pages ADD FULLTEXT(content);

Use a UNION to search all these tables:

set @target='dolor';

-- UNION ALL skips duplicate elimination; rows already differ by table_name.
SELECT * FROM (
  SELECT
    'articles' AS table_name,
    id,
    @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM articles
  UNION ALL
  SELECT
    'events' AS table_name,
    id,
    @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM events
  UNION ALL
  SELECT
    'pages' AS table_name,
    id,
    @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM pages
) AS sitewide
WHERE relevance > 0        -- drop rows that did not match at all
ORDER BY relevance DESC;   -- best weighted matches first

The result:

+------------+----+------------------+
| table_name | id | relevance        |
+------------+----+------------------+
| articles   |  1 | 1.98799377679825 |
| pages      |  3 | 0.65545331108093 |
+------------+----+------------------+
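
The multipliers weight the tables against each other, but the raw scores still come from three separate indexes. If you would rather put every table on the same 0 to 1 scale first, one option is to divide each score by the best score in its own table for that search. This is not part of the example above, just a sketch of a different technique; it assumes the same sample tables and @target, and max_score is a name introduced here for illustration.

```sql
-- Sketch: scale each table's scores to 0..1 by dividing by that table's top score.
-- Assumes @target is already set; max_score is an illustrative alias.
SELECT table_name, id, relevance
FROM (
  SELECT 'articles' AS table_name, a.id,
         (MATCH(a.content) AGAINST (@target)) / m.max_score AS relevance
  FROM articles AS a
  CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS max_score
              FROM articles) AS m
  WHERE MATCH(a.content) AGAINST (@target)   -- keep only matching rows
  UNION ALL
  SELECT 'events' AS table_name, e.id,
         (MATCH(e.content) AGAINST (@target)) / m.max_score AS relevance
  FROM events AS e
  CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS max_score
              FROM events) AS m
  WHERE MATCH(e.content) AGAINST (@target)
  UNION ALL
  SELECT 'pages' AS table_name, p.id,
         (MATCH(p.content) AGAINST (@target)) / m.max_score AS relevance
  FROM pages AS p
  CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS max_score
              FROM pages) AS m
  WHERE MATCH(p.content) AGAINST (@target)
) AS sitewide
ORDER BY relevance DESC;
```

The trade-off is that the best hit in every table scores exactly 1, so a weak best match in one table ties with a strong best match in another. You can still multiply each branch by a per-table weight if some tables should rank higher.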

(Sorry, I wanted to leave this as a comment on the answer above, but I don't have enough reputation to comment.)

Be aware that a UNION in a subquery is very poorly optimized. A frequent case is when you paginate the results using "LIMIT @page * 10, 10" in the parent query: MySQL must then fetch all the results from the subqueries before it can evaluate the parent query.
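
One way to limit that cost is to filter, sort and cap each branch before the UNION. The sketch below assumes the sample tables plus the @target and multiplier variables from the answer above. For a page starting at offset 20 with 10 rows per page, no branch can contribute more than its own top 30 rows, so each branch is capped at offset + page size. The offset and page size are illustrative literals, because LIMIT does not accept user variables in a plain statement.

```sql
-- Page 3 at 10 rows per page: offset 20, so each branch keeps 20 + 10 = 30 rows.
SELECT table_name, id, relevance
FROM (
  (SELECT 'articles' AS table_name, id,
          @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM articles
   WHERE MATCH(content) AGAINST (@target)   -- keep only matching rows
   ORDER BY relevance DESC
   LIMIT 30)
  UNION ALL
  (SELECT 'events' AS table_name, id,
          @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM events
   WHERE MATCH(content) AGAINST (@target)
   ORDER BY relevance DESC
   LIMIT 30)
  UNION ALL
  (SELECT 'pages' AS table_name, id,
          @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM pages
   WHERE MATCH(content) AGAINST (@target)
   ORDER BY relevance DESC
   LIMIT 30)
) AS sitewide
ORDER BY relevance DESC
LIMIT 20, 10;
```

The parent query still merges and sorts the branch results, but it now sees at most 90 rows instead of every match. Deep pages remain expensive, since the per-branch cap grows with the offset.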
