MySQL: optimizing a query to avoid a full table scan
Consider the following table:
_____________________
|   sentence_word   |
|---------|---------|
| sent_id | word_id |
|---------|---------|
|       1 |       1 |
|       1 |       2 |
|     ... |     ... |
|       2 |       4 |
|       2 |       1 |
|     ... |     ... |
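With this table structure I want to store the words that make up each sentence. Now I want to find out which words occur in the same sentence as a specific word. The result should look like this: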
_____________________
| word_id | counted |
|---------|---------|
|       5 |    1000 |
|       7 |     800 |
|       3 |     600 |
|       1 |     400 |
|       2 |     100 |
|     ... |     ... |
The query looks like this:
SELECT
    word_id,
    COUNT(*) AS counted
FROM sentence_word
WHERE sentence_word.sent_id IN (
        SELECT sent_id
        FROM sentence_word
        WHERE word_id = [desired word]
    )
    AND word_id != [desired word]
GROUP BY word_id
ORDER BY counted DESC;
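The query works correctly, but it always scans the whole table. I created an index for sent_id and word_id. Do you have any ideas how to optimize it so that it does not need to scan the whole table every time?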
You could try a self-join like this:
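-- sw2 finds the sentences containing the desired word;
-- sw1 lists the other words in those sentences.
-- Assumes each (sent_id, word_id) pair is stored at most once.
SELECT
    sw1.word_id,
    COUNT(*) AS counted
FROM sentence_word sw1
JOIN sentence_word sw2 ON (
    sw1.sent_id = sw2.sent_id
    AND sw2.word_id = [your word id]
)
WHERE sw1.word_id != [your word id]
GROUP BY sw1.word_id
ORDER BY counted DESC;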
Or even better:
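-- Same join, with the exclusion of the desired word moved into the ON clause.
-- Assumes each (sent_id, word_id) pair is stored at most once.
SELECT
    sw1.word_id,
    COUNT(*) AS counted
FROM sentence_word sw1
JOIN sentence_word sw2 ON (
    sw1.sent_id = sw2.sent_id
    AND sw2.word_id = [your word id]
    AND sw2.word_id != sw1.word_id
)
GROUP BY sw1.word_id
ORDER BY counted DESC;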
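A self-join can only avoid the full scan if MySQL can use an index for both steps: finding the sentences that contain the word, then finding the other words in those sentences. Here is a minimal sketch of how that could be set up and checked, assuming the table has only single-column indexes so far. The index names idx_word_sent and idx_sent_word are hypothetical, 42 is a placeholder for the desired word id, and the query is the self-join written to return the per-word counts the question asks for.

```sql
-- Hypothetical index names; assumes sentence_word(sent_id, word_id) as in the question.
-- Supports the lookup "which sentences contain word 42?"
ALTER TABLE sentence_word ADD INDEX idx_word_sent (word_id, sent_id);
-- Supports the lookup "which words appear in each of those sentences?"
ALTER TABLE sentence_word ADD INDEX idx_sent_word (sent_id, word_id);

-- 42 is a placeholder for the desired word id.
EXPLAIN
SELECT
    sw1.word_id,
    COUNT(*) AS counted
FROM sentence_word sw1
JOIN sentence_word sw2
    ON sw1.sent_id = sw2.sent_id
    AND sw2.word_id = 42
WHERE sw1.word_id != 42
GROUP BY sw1.word_id
ORDER BY counted DESC;
```

In the EXPLAIN output, a `type` of `ref` with one of these indexes named in the `key` column, instead of `type` `ALL`, would suggest the full scan is gone. Whether the optimizer actually picks that plan depends on the data distribution and the MySQL version, so it is worth checking on the real table.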