
MySQL index for normal column and full text column

I'm trying to speed up the query below:

My table has around 4 million records.

EXPLAIN SELECT  * FROM chrecords WHERE  company_number = 'test'  OR MATCH (company_name,registered_office_address_address_line_1,registered_office_address_address_line_2) AGAINST('test') LIMIT 0, 10;
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
| id   | select_type | table     | type | possible_keys    | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | chrecords | ALL  | i_company_number | NULL | NULL    | NULL | 2208348 | Using where |
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

I've created two indexes using the statements below:

ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_number` (`company_number`);

ALTER TABLE `chapp`.`chrecords` ADD FULLTEXT(
    `company_name`,
    `registered_office_address_address_line_1`,
    `registered_office_address_address_line_2`
);

How can I "combine" the two indexes, however? The query above takes 15+ seconds to execute (only using one index).

The entire table definition:

CREATE TABLE `chapp`.`chrecords` (
  `id` INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  `company_name` VARCHAR(100) NULL,
  `company_number` VARCHAR(100) NULL,
  `registered_office_care_of` VARCHAR(100) NULL,
  `registered_office_po_box` VARCHAR(100) NULL,
  `registered_office_address_address_line_1` VARCHAR(100) NULL,
  `registered_office_address_address_line_2` VARCHAR(100) NULL,
  `registered_office_locality` VARCHAR(100) NULL,
  `registered_office_region` VARCHAR(100) NULL,
  `registered_office_country` VARCHAR(100) NULL,
  `registered_office_postal_code` VARCHAR(100) NULL
  );

ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_name` (`company_name`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_number` (`company_number`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_registered_office_address_address_line_1` (`registered_office_address_address_line_1`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_registered_office_address_address_line_2` (`registered_office_address_address_line_2`);

ALTER TABLE `chapp`.`chrecords` ADD FULLTEXT(
    `company_name`,
    `registered_office_address_address_line_1`,
    `registered_office_address_address_line_2`
);

Try using a UNION rather than OR.

  SELECT *
    FROM (
       -- Branch 1: can use the i_company_number index
       SELECT *
         FROM chrecords
        WHERE company_number = 'test'
    ) a
  UNION
    (
       -- Branch 2: can use the FULLTEXT index
       SELECT *
         FROM chrecords
        WHERE MATCH (company_name,
                     registered_office_address_address_line_1,
                     registered_office_address_address_line_2)
              AGAINST('test')
        LIMIT 0, 10
    )

If this helps, it's because MySQL struggles to use more than one index in a single subquery. Splitting the query gives the query planner two separate queries, each of which can be served by its own index.

You can run EXPLAIN on each of the subqueries separately to understand their performance. UNION just puts their results together and eliminates duplicates. If you want to keep the duplicates, use UNION ALL.
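The difference between UNION and UNION ALL can be seen on a toy table. The following is a minimal sketch, not the asker's setup: it uses Python's built-in sqlite3 module instead of MySQL, a hypothetical three-row miniature of chrecords, and LIKE as a stand-in for MATCH ... AGAINST (which SQLite does not provide). Only the deduplication behaviour carries over; nothing here says anything about MySQL index usage or timing.

```python
import sqlite3

# Hypothetical miniature of the chrecords table (assumption: only two of
# the real columns are needed to show the effect).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chrecords ("
    " id INTEGER PRIMARY KEY,"
    " company_name TEXT,"
    " company_number TEXT)"
)
conn.executemany(
    "INSERT INTO chrecords (company_name, company_number) VALUES (?, ?)",
    [
        ("test ltd", "test"),      # id 1: matches both branches
        ("another test", "0001"),  # id 2: matches the name branch only
        ("unrelated", "0002"),     # id 3: matches neither branch
    ],
)

# LIKE is a stand-in for MySQL's MATCH ... AGAINST in this sketch.
branches = (
    "SELECT id FROM chrecords WHERE company_number = 'test' "
    "{op} "
    "SELECT id FROM chrecords WHERE company_name LIKE '%test%'"
)

union_ids = sorted(r[0] for r in conn.execute(branches.format(op="UNION")))
union_all_ids = sorted(r[0] for r in conn.execute(branches.format(op="UNION ALL")))

print(union_ids)      # the row matching both branches is listed once
print(union_all_ids)  # the row matching both branches is listed twice
```

In this sketch the row that satisfies both conditions is the only one affected by the choice of operator, which is exactly the case the OR in the original query was covering.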

Please note that lots of single-column indexes on MySQL tables are generally harmful to performance. You should refrain from creating indexes unless they're constructed to help specific queries.

    (
        SELECT  *
            FROM  chrecords
            WHERE  company_number = 'test' 
            ORDER BY something
            LIMIT 10
    )
    UNION DISTINCT
    (
        SELECT  *
            FROM  chrecords
            WHERE  MATCH (company_name, registered_office_address_address_line_1,
                                        registered_office_address_address_line_2)
                   AGAINST('test')
            ORDER BY something
            LIMIT 10
    ) 
    ORDER BY something
    LIMIT 10

Notes:

  • No need for an outer SELECT.
  • Explicitly say DISTINCT (the default) or ALL (which is faster) so that it is clear you weighed the need for deduplication against speed.
  • A LIMIT without an ORDER BY is not very meaningful.
  • However, if you just want some rows to look at, you can remove the ORDER BYs.
  • Yes, the ORDER BY and LIMIT need to be repeated outside so that you get the ordering correct and limit the result to 10 rows.
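To illustrate why the ORDER BY and LIMIT are repeated on the outside, here is a small sketch using Python's sqlite3 module rather than MySQL. The five sample rows are invented, LIKE stands in for MATCH ... AGAINST, company_name stands in for the "something" sort column, and the per-branch limit is 2 instead of 10 to keep the data small. SQLite needs each limited branch wrapped in a subquery, which differs from the parenthesised MySQL form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chrecords ("
    " id INTEGER PRIMARY KEY,"
    " company_name TEXT,"
    " company_number TEXT)"
)
# Invented sample data (assumption, not from the original table).
conn.executemany(
    "INSERT INTO chrecords (company_name, company_number) VALUES (?, ?)",
    [
        ("delta test", "test"),
        ("alpha", "test"),
        ("bravo test", "0003"),
        ("charlie test", "0004"),
        ("echo", "0005"),
    ],
)

# Each branch is sorted and limited on its own (limit 2 in this sketch).
limited_union = (
    "SELECT id, company_name FROM ("
    "  SELECT id, company_name FROM chrecords"
    "  WHERE company_number = 'test'"
    "  ORDER BY company_name LIMIT 2) "
    "UNION "
    "SELECT id, company_name FROM ("
    "  SELECT id, company_name FROM chrecords"
    "  WHERE company_name LIKE '%test%'"  # stand-in for MATCH ... AGAINST
    "  ORDER BY company_name LIMIT 2)"
)

# Without the outer clauses, the union can hold up to 2 + 2 rows.
unbounded_count = len(conn.execute(limited_union).fetchall())

# Repeating ORDER BY and LIMIT outside restores the order and the row cap.
names = [
    row[1]
    for row in conn.execute(limited_union + " ORDER BY company_name LIMIT 2")
]

print(unbounded_count)
print(names)
```

The inner limits only bound how much each branch contributes; the outer pair decides which of those candidate rows are finally returned and in what order.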

If you need an OFFSET, the inner queries need a full count, say LIMIT 50 for 5 pages, and then the outer query needs to skip to the 5th page: LIMIT 40,10.
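The arithmetic behind those two LIMIT clauses can be written down as a small helper. This is only a sketch: the function name paged_union_limits and its parameters are hypothetical, and pages are assumed to be numbered from 1.

```python
from typing import Tuple


def paged_union_limits(page: int, page_size: int) -> Tuple[int, int, int]:
    """Return (inner_limit, outer_offset, outer_count) for a 1-based page.

    Hypothetical helper: each UNION branch must be allowed to supply every
    row up to the end of the requested page, and the outer query then skips
    the earlier pages.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    inner_limit = page * page_size         # rows each branch may contribute
    outer_offset = (page - 1) * page_size  # rows skipped in the combined result
    return inner_limit, outer_offset, page_size


inner_limit, outer_offset, outer_count = paged_union_limits(page=5, page_size=10)
print(f"inner branches: LIMIT {inner_limit}")
print(f"outer query:    LIMIT {outer_offset},{outer_count}")
```

For page 5 with 10 rows per page this reproduces the LIMIT 50 inside and LIMIT 40,10 outside mentioned above. The inner limit grows with the page number, so deep pages make each branch do more work.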
