
Inner Join query is so slow

I'm writing a query that is supposed to retrieve words from the "Words" table if they are contained in sentences in the "Sentences" table.

For example, the query should output "hello" if it finds at least one sentence that contains the word "hello".

So far I have written this query:

SELECT DISTINCT words.word
FROM sentences
INNER JOIN words
  ON sentences.sentence LIKE CONCAT('% ', words.word, ' %')

The issue with this query is that it is extremely slow: it ran for more than 8 hours without returning any results. The words table has around 250k rows and the sentences table around 1M rows. Can anyone help with a faster solution?

From the information you shared, the data is not extremely large, but the sentences table goes through a full table scan for each distinct word in the words table. With about 250k words and 1M sentences, that is up to roughly 250,000 × 1,000,000 = 2.5 × 10^11 LIKE comparisons.
DISTINCT also adds processing overhead.
It would be prudent to index the words and sentences tables (probably the words on their initial letter, and the sentences on the same) and to partition them on the same criteria.

Then run the same query as above.

This approach might reduce how much data is joined per key match.
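
A related way to shrink the work per key match, which is a different technique from the indexing and partitioning described above, is to split each sentence into tokens once and then join on equality, so that an index on the token column can be used instead of a LIKE scan. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module and an in-memory database purely so that it can be run as-is, and the `sentence_tokens` table, the `idx_tokens_token` index, and the `words_found_in_sentences` function are hypothetical names that do not appear in the question. Unlike the original `'% word %'` pattern, it also matches a word at the very start or end of a sentence, and SQLite's default text comparison is case-sensitive, so results may differ from the original query on real data.

```python
import sqlite3


def words_found_in_sentences(words, sentences):
    """Return, sorted, the words that occur as a space-separated token in any sentence."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
    # Hypothetical helper table: one row per distinct token per sentence.
    conn.execute("CREATE TABLE sentence_tokens (token TEXT)")

    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        [(w,) for w in words],
    )

    # Split every sentence once, up front, instead of pattern-matching
    # each sentence against each word. Empty tokens (from repeated
    # spaces) are skipped; set() drops repeats within one sentence.
    tokens = [(t,) for s in sentences for t in set(s.split(" ")) if t]
    conn.executemany("INSERT INTO sentence_tokens (token) VALUES (?)", tokens)

    # The index lets the equality lookup below avoid a full scan per word.
    conn.execute("CREATE INDEX idx_tokens_token ON sentence_tokens (token)")

    rows = conn.execute(
        "SELECT w.word FROM words AS w "
        "WHERE EXISTS (SELECT 1 FROM sentence_tokens AS t WHERE t.token = w.word) "
        "ORDER BY w.word"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

For example, calling `words_found_in_sentences(["hello", "world", "absent"], ["say hello to them", "the world is big", "helloworld"])` returns `["hello", "world"]`: "absent" never appears, and "helloworld" is a single token that equals neither word. Using EXISTS rather than a join with DISTINCT stops at the first matching token for each word, which also avoids the de-duplication step.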
