How to make the following query more efficient for a lookup with many conditions

Question

First, I need pagination over the lookup below, whose results are restricted by several WHERE conditions.

SELECT SQL_CALC_FOUND_ROWS
    a.uid, b.NAME
FROM
    `profiles` AS a FORCE INDEX(profiles_country_city_gender_index)
JOIN `users` AS b
ON b.id = a.uid
AND a.country = 'INDONESIA'
AND a.gender = 0
JOIN (
    SELECT
        a.uid
    FROM
       profile_details AS a
    JOIN profile_details AS kids ON kids.uid = a.uid
    AND kids.kids_pref = 1
    JOIN profile_details AS current ON current.uid = a.uid
    AND current.current_relationship = 1
    JOIN profile_details AS smoking ON smoking.uid = a.uid
    AND smoking.smoking_pref = 1
    ) AS e ON e.uid = a.uid
AND ( TIMESTAMPDIFF( YEAR, a.birth_date, NOW()) BETWEEN 25 AND 35 )
LIMIT 33;

All tables here have a one-to-one relationship with the users table. They are:

profiles
profile_details

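The users table uses the id column as its primary key, and the other tables use uid as a foreign key. At first I had no problem with the query/design above, but once the tables grew to 300K rows the query reports OK, Time: 0.726000s to return results, which is too slow for me.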

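I tried using count(*) to count the rows under the conditions above and got roughly the same timing. I need a faster way to get the row count for the lookup conditions, so that the pagination system responds within an acceptable wait time.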

As you can see in the query, I am using:

FORCE INDEX(profiles_country_city_gender_index)

I don't think it helps much, because this scope still produces a large number of rows:

AND a.country = 'INDONESIA' 
AND a.gender = 0

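The result is 148,801 rows when the scope is limited to this country and gender = 0. If I also filter by city, query time is not a problem because the result set is much smaller, but it will become a problem again someday when there are more rows.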

For anyone who may ask for the query's EXPLAIN:

Explain SELECT SQL_CALC_FOUND_ROWS
        a.uid, 
        b.NAME ...

Results:

| select_type | table   | type   | possible_keys                      | key                                | key_len | ref              | rows   | filtered  | Extra                              |
+-------------+---------+--------+------------------------------------+------------------------------------+---------+------------------+--------+-----------+------------------------------------+
| SIMPLE      | a       | ref    | profiles_country_city_gender_index | profiles_country_city_gender_index | 242     | const            | 148801 | 10.00     | Using index condition; Using where |
| SIMPLE      | a       | ref    | profile_details_uid_foreign        | profile_details_uid_foreign        | 3       | restfulapi.a.uid | 1      | 100.00    | Using index                        |
| SIMPLE      | kids    | ref    | profile_details_uid_foreign        | profile_details_uid_foreign        | 3       | restfulapi.a.uid | 1      | 10.00     | Using where                        |
| SIMPLE      | current | ref    | profile_details_uid_foreign        | profile_details_uid_foreign        | 3       | restfulapi.a.uid | 1      | 10.00     | Using where                        |
| SIMPLE      | smoking | ref    | profile_details_uid_foreign        | profile_details_uid_foreign        | 3       | restfulapi.a.uid | 1      | 10.00     | Using where                        |
| SIMPLE      | b       | eq_ref | PRIMARY                            | PRIMARY                            | 3       | restfulapi.a.uid | 1      | 100.00    |                                    |

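As you can see in the EXPLAIN output, there is no table scan, no "Using temporary", and no range access, only "Using index condition". I imagine that if the country scope returned at least 1 million rows, the time would scale up proportionally from the 300K case, which would be terrible :(.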

Below are the table definitions, in case they help analyze the problem:

CREATE TABLE `profile_details` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `uid` mediumint(8) unsigned NOT NULL,
  `intents` tinyint(4) NOT NULL DEFAULT '3',
  `height` smallint(6) DEFAULT NULL,
  `body_type` tinyint(4) NOT NULL DEFAULT '5',
  `kids_pref` tinyint(4) NOT NULL DEFAULT '1',
  `drinking_pref` tinyint(4) NOT NULL DEFAULT '2',
  `living_with` tinyint(4) NOT NULL DEFAULT '0',
  `current_relationship` tinyint(4) NOT NULL DEFAULT '1',
  `sexual_pref` tinyint(4) NOT NULL DEFAULT '1',
  `smoking_pref` tinyint(4) NOT NULL DEFAULT '0',
  `status_online` tinyint(4) NOT NULL DEFAULT '0',
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `profile_details_uid_foreign` (`uid`),
  KEY `idx_multipart` (`intents`,`body_type`,`kids_pref`,`drinking_pref`,`living_with`,`current_relationship`,`sexual_pref`,`smoking_pref`),
  CONSTRAINT `profile_details_uid_foreign` FOREIGN KEY (`uid`) REFERENCES `users` (`id`)
)

CREATE TABLE `profiles` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `uid` mediumint(8) unsigned NOT NULL,
  `birth_date` date NOT NULL,
  `gender` tinyint(4) NOT NULL DEFAULT '0',
  `country` varchar(60) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'ID',
  `city` varchar(60) COLLATE utf8mb4_unicode_ci DEFAULT 'Makassar',
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  `latitude` double NOT NULL DEFAULT '0',
  `longitude` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `profiles_uid_foreign` (`uid`),
  KEY `profiles_birth_date_index` (`birth_date`),
  KEY `profiles_latitude_longitude_index` (`latitude`,`longitude`),
  KEY `profiles_country_city_gender_index` (`country`,`city`,`gender`),
  KEY `idx_country_gender_birthdate` (`country`,`gender`,`birth_date`),
  KEY `idx_country_city_gender_birthdate` (`country`,`city`,`gender`,`birth_date`),
  CONSTRAINT `profiles_uid_foreign` FOREIGN KEY (`uid`) REFERENCES `users` (`id`)
)

How can I find a solution? Do I need to redesign the tables to get an ideal system? Maybe that is the last resort.

Edit

I am trying what you suggested earlier. First, I added an index on three columns:

CREATE INDEX profiles_country_gender_birth_date_index on `profiles`(country,gender,birth_date);

Then I tried a SELECT COUNT(*) without the JOIN to profile_details:

SELECT
    COUNT(*)
FROM
    `profiles` AS a
    FORCE INDEX ( profiles_country_gender_birth_date_index )
    JOIN `users` AS b ON b.id = a.uid
    AND a.country = 'INDONESIA'
    AND a.gender = 1
    AND a.birth_date BETWEEN NOW() - INTERVAL 35 YEAR
                         AND NOW() - INTERVAL 25 YEAR

The timing is unstable, varying between 0.35 and 0.7 seconds, and I don't know why. Below is the EXPLAIN query plan in JSON format, in case it helps find the culprit.

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "114747.38"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "a",
          "access_type": "range",
          "possible_keys": [
            "profiles_country_gender_birth_date_index"
          ],
          "key": "profiles_country_gender_birth_date_index",
          "used_key_parts": [
            "country",
            "gender",
            "birth_date"
          ],
          "key_length": "246",
          "rows_examined_per_scan": 94066,
          "rows_produced_per_join": 32961,
          "filtered": "100.00",
          "index_condition": "((`restfulapi`.`a`.`gender` = 1) and (`restfulapi`.`a`.`country` = 'INDONESIA') and (`restfulapi`.`a`.`birth_date` between <cache>((now() - interval 35 year)) and <cache>((now() - interval 25 year))))",
          "cost_info": {
            "read_cost": "15858.00",
            "eval_cost": "6592.23",
            "prefix_cost": "75194.00",
            "data_read_per_join": "16M"
          },
          "used_columns": [
            "uid",
            "birth_date",
            "gender",
            "country"
          ]
        }
      },
      {
        "table": {
          "table_name": "b",
          "access_type": "eq_ref",
          "possible_keys": [
            "PRIMARY"
          ],
          "key": "PRIMARY",
          "used_key_parts": [
            "id"
          ],
          "key_length": "3",
          "ref": [
            "restfulapi.a.uid"
          ],
          "rows_examined_per_scan": 1,
          "rows_produced_per_join": 32961,
          "filtered": "100.00",
          "using_index": true,
          "cost_info": {
            "read_cost": "32961.15",
            "eval_cost": "6592.23",
            "prefix_cost": "114747.38",
            "data_read_per_join": "89M"
          },
          "used_columns": [
            "id"
          ]
        }
      }
    ]
  }
}

Answer 1

INDEX(country, gender, birth_date)  -- in this order

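And change the test on birth_date to be "sargable", that is, compare the bare column instead of wrapping it in a function. Change: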

AND  ( TIMESTAMPDIFF( YEAR, a.birth_date, NOW()) BETWEEN 25 AND 35 )

to

AND a.birth_date BETWEEN NOW() - INTERVAL 35 YEAR
                     AND NOW() - INTERVAL 25 YEAR

That way the optimizer can use the birth_date part of the index.

LIMIT 33 -- Which 33 rows do you care about? Perhaps you need an ORDER BY?

When a plain JOIN profile_details ... is workable, don't use JOIN ( SELECT ... profile_details ... ).

SQL_CALC_FOUND_ROWS costs something. Remove it to see how fast the query runs, then decide whether it is worth keeping.
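
One way to act on this is to run the page query without SQL_CALC_FOUND_ROWS and fetch the total with a separate COUNT(*). The following is only a sketch, built from the sargable date range and the single profile_details join recommended in this answer. Whether it actually beats SQL_CALC_FOUND_ROWS on this data would have to be measured. (The MySQL manual lists SQL_CALC_FOUND_ROWS and FOUND_ROWS() as deprecated as of 8.0.17.)

```sql
-- Total for the pager, computed separately from the page of rows.
-- Assumption: the join to `users` can be dropped here, because the foreign key
-- in the table definition guarantees every profiles.uid has a matching users.id.
SELECT COUNT(*)
FROM `profiles` AS a
JOIN profile_details AS d ON d.uid = a.uid
WHERE a.country = 'INDONESIA'
  AND a.gender = 0
  AND a.birth_date BETWEEN NOW() - INTERVAL 35 YEAR
                       AND NOW() - INTERVAL 25 YEAR
  AND d.kids_pref = 1
  AND d.current_relationship = 1
  AND d.smoking_pref = 1;
```

If an exact total is not essential, it may be cheaper still to run this count less often than the page query, for example once per search rather than once per page.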

I don't think you need to JOIN profile_details multiple times, especially since it is 1:1 with profiles.

Here is what I mean:

Instead of JOIN ( SELECT ... ), just

JOIN  profile_details AS d  USING(uid)

Then add these to the WHERE clause:

AND  d.kids_pref = 1
AND  d.current_relationship = 1
AND  d.smoking_pref = 1
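
Putting the suggestions so far together, the rewritten query might look like the sketch below. It assumes an index on (country, gender, birth_date) exists, such as idx_country_gender_birthdate from the table definitions. The ORDER BY a.uid is my assumption of a stable paging order, not something the question specifies, and the sketch has not been run against the real data.

```sql
-- Sketch only: one join per table, sargable date range, no FORCE INDEX.
SELECT
    a.uid,
    b.NAME
FROM `profiles` AS a
JOIN `users` AS b
    ON b.id = a.uid
JOIN profile_details AS d
    ON d.uid = a.uid              -- same effect as USING(uid) here
WHERE a.country = 'INDONESIA'     -- "=" tests match the leading index columns
  AND a.gender = 0
  AND a.birth_date BETWEEN NOW() - INTERVAL 35 YEAR
                       AND NOW() - INTERVAL 25 YEAR   -- the single range
  AND d.kids_pref = 1
  AND d.current_relationship = 1
  AND d.smoking_pref = 1
ORDER BY a.uid                    -- assumption: any stable order for paging
LIMIT 33;
```

Note that the date range is not an exact drop-in for the TIMESTAMPDIFF test. TIMESTAMPDIFF counts whole years, so it also matches people who are 35 but not yet 36. Widening the lower bound to INTERVAL 36 YEAR would be closer if that matters.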

Avoiding filesort

INDEX(country, gender,   -- Tested with '='
      birth_date,        -- Tested as a "range"
      uid)               -- For the ORDER BY -- Useless!

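When building an index, include columns in this order:

1. All columns tested with "column = constant".
2. One range (such as BETWEEN). If this is the same column as the ORDER BY, the "filesort" may be avoided.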

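If there is no "range" in the WHERE, then:

1. All columns tested with "column = constant".
2. The ORDER BY columns, assuming they are all DESC or all ASC (or, in MySQL 8.0, match the INDEX definition). This may avoid the "filesort".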

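But an index cannot handle both a "range" and a different "ORDER BY". Consider the following. You have a list of people with last names and first names. The query is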

SELECT ...
    WHERE last_name LIKE 'Ja%'   -- a "range"
    ORDER BY first_name;

INDEX(last_name, first_name) would help with the WHERE, but it leaves the first_names jumbled. And vice versa.
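
To see the trade-off, one could build a small table and compare the two candidate indexes with EXPLAIN. The people table and both index names below are hypothetical, made up for illustration. The plan the optimizer actually picks depends on the data and the MySQL version, so the comments describe what to look for, not guaranteed output.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE people (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    last_name  VARCHAR(60)  NOT NULL,
    first_name VARCHAR(60)  NOT NULL,
    PRIMARY KEY (id),
    KEY idx_last_first (last_name, first_name),  -- serves the WHERE range
    KEY idx_first_last (first_name, last_name)   -- already in ORDER BY order
);

-- Range on last_name via idx_last_first: rows come back ordered by last_name,
-- so look for "Using filesort" in Extra for the ORDER BY first_name.
EXPLAIN
SELECT id, last_name, first_name
FROM people FORCE INDEX (idx_last_first)
WHERE last_name LIKE 'Ja%'
ORDER BY first_name;

-- Walking idx_first_last can skip the sort, but the LIKE no longer narrows
-- the scan: every index entry is checked against 'Ja%'.
EXPLAIN
SELECT id, last_name, first_name
FROM people FORCE INDEX (idx_first_last)
WHERE last_name LIKE 'Ja%'
ORDER BY first_name;
```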

(This is a simplification; for more details, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql .)

Solution 1
0 Accepted 2020-06-12 23:16:40
