
SQL: Understanding the OR operator in a WHERE clause

I have tables named Movie, Genre and Keyword, from which I have created a view named 'genkeyword'. The view 'genkeyword' has a lot of tuples, so it can be accessed at DB Fiddle.

I have the following query:

SELECT title, 
       year, 
       Count(DISTINCT genre)   AS genre_freq, 
       Count(DISTINCT keyword) AS keyword_freq 
FROM   genkeyword 
WHERE  ( genre IN (SELECT genre 
                   FROM   genkeyword 
                   WHERE  title = 'Harry Potter and the Deathly Hallows') 
          OR keyword IN (SELECT keyword 
                         FROM   genkeyword 
                         WHERE  title = 'Harry Potter and the Deathly Hallows') ) 
       AND title <> 'Harry Potter and the Deathly Hallows' 
GROUP  BY title, 
          year 
ORDER  BY genre_freq DESC, 
          keyword_freq DESC; 

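I intend this query to return the genre and keyword frequencies for each movie that shares genres and keywords with Harry Potter. The output should be: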

title                      |      genre_freq    |    keyword_freq
Cinderella                        2                        2
The Shape of Water                2                        1
How to Train Your Dragon          2                        0
Enchanted                         1                        3

I know the query is incorrect, because I get the following output:

    title                      |      genre_freq    |    keyword_freq
    The Shape of Water                4                  3       
    Enchanted                         3                  4
    Cinderella                        2                  5
    How to Train Your Dragon          2                  3              

However, I would like to clarify my understanding of how the query works.

In the 'where' clause of my query:

where (genre in (select genre from genkeyword where title='Harry Potter') or 
keyword in (select keyword from genkeyword where title='Harry Potter')) 

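Am I right in saying that two result sets are generated, one containing all tuples whose genre is one of Harry Potter's genres (call it R1), and another containing all tuples whose keyword is one of Harry Potter's keywords (call it R2)?

A genre/keyword is counted if the tuple under consideration contains a genre from the genre result set R1 or a keyword from the keyword result set R2. I am not sure how count(distinct genre) and count(distinct keyword) work in this case. If the tuple contains a genre in R1, is only the genre counted, or is the keyword counted as well? The same goes for a tuple that contains a keyword in R2: are both the genre and the keyword counted?

I don't understand why my query returns the wrong genre_freq and keyword_freq values. That is because I don't fully understand how the genre and keyword frequencies are computed behind the scenes. Any insights are appreciated.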

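One of the best-asked questions I have seen on SO so far.

To answer your question: the OR clause keeps every row that matches the genre condition, the keyword condition, or both, which basically stacks the results of the keyword part and the genre part on top of each other. SQL works in rows (or records), so you should always think in terms of rows.

First it selects all rows whose genre is one that Harry Potter also has. Then it selects all rows whose keyword matches. Then it performs the counts over every row that survived, and COUNT(DISTINCT ...) counts all genres and keywords found in those rows, not only the matching ones. Obviously the result is too high, because you also get all records that do not have a matching genre but do have an overlapping keyword. You also get all rows that have an overlapping genre but a non-overlapping keyword.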

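To count the records correctly, simply change OR to AND. This selects only the records that have a matching genre as well as a matching keyword. Counting these yields the correct result for every movie that has at least one of each. Note that a movie with matching genres but no matching keyword fails the AND filter entirely, so a row with a count of 0 (like How to Train Your Dragon in the desired output) cannot be produced this way.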
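The difference between the two filters can be seen on a toy table. The following is a minimal sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in for MySQL; the titles 'Ref', 'MovieA' and 'MovieB' and all of their rows are made up for illustration and are not the data from the fiddle. 'Ref' plays the role of Harry Potter. The desired counts here would be MovieA = (1, 1) and MovieB = (2, 0).

```python
import sqlite3

# Hypothetical toy rows shaped like the genkeyword view:
# one row per (genre, keyword) combination of each movie.
ROWS = [
    ("Ref", 2010, "fantasy", "magic"),
    ("Ref", 2010, "fantasy", "school"),
    ("Ref", 2010, "adventure", "magic"),
    ("Ref", 2010, "adventure", "school"),
    ("MovieA", 2015, "fantasy", "magic"),
    ("MovieA", 2015, "fantasy", "prince"),
    ("MovieA", 2015, "romance", "magic"),
    ("MovieA", 2015, "romance", "prince"),
    ("MovieB", 2012, "adventure", "dragon"),
    ("MovieB", 2012, "fantasy", "dragon"),
    ("MovieB", 2012, "comedy", "dragon"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genkeyword (title TEXT, year INTEGER, genre TEXT, keyword TEXT)"
)
conn.executemany("INSERT INTO genkeyword VALUES (?, ?, ?, ?)", ROWS)

# Same shape as the query in the question; {op} is replaced by OR or AND.
QUERY = """
SELECT title,
       COUNT(DISTINCT genre)   AS genre_freq,
       COUNT(DISTINCT keyword) AS keyword_freq
FROM   genkeyword
WHERE  (genre IN (SELECT genre FROM genkeyword WHERE title = 'Ref')
        {op} keyword IN (SELECT keyword FROM genkeyword WHERE title = 'Ref'))
       AND title <> 'Ref'
GROUP  BY title
ORDER  BY title
"""


def run(op):
    """Run the query with the given boolean operator between the two IN tests."""
    return conn.execute(QUERY.format(op=op)).fetchall()


or_result = run("OR")
and_result = run("AND")

print(or_result)   # [('MovieA', 2, 2), ('MovieB', 2, 1)]
print(and_result)  # [('MovieA', 1, 1)]
```

With OR, the row (romance, magic) of MovieA survives because of its keyword, and its non-matching genre 'romance' is then counted too. With AND, MovieA comes out right, but MovieB disappears because none of its rows has a matching keyword.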

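As Imre_G said, this is a well-asked question, and his explanation of what goes wrong is spot on. You are basically selecting genres and keywords you don't want and then counting them, because they share a common element.

Here is one way to fix it, perhaps not the best, but the simplest: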

SELECT COALESCE(a.title, b.title)  AS title,
       COALESCE(a.year, b.year)    AS year,
       COALESCE(a.genre_freq, 0)   AS genre_freq,
       COALESCE(b.keyword_freq, 0) AS keyword_freq  -- 0 when no keyword matches
FROM   (SELECT title,
               year,
               COUNT(DISTINCT genre) AS genre_freq
        FROM   genkeyword
        WHERE  genre IN (SELECT genre
                         FROM   genkeyword
                         WHERE  title = 'Harry Potter and the Deathly Hallows')
               AND title <> 'Harry Potter and the Deathly Hallows'
        GROUP  BY title, year) a
       LEFT JOIN (SELECT title,
                         year,
                         COUNT(DISTINCT keyword) AS keyword_freq
                  FROM   genkeyword
                  WHERE  keyword IN (SELECT keyword
                                     FROM   genkeyword
                                     WHERE  title = 'Harry Potter and the Deathly Hallows')
                         AND title <> 'Harry Potter and the Deathly Hallows'
                  GROUP  BY title, year) b
              ON b.title = a.title
                 AND b.year = a.year;

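As it stands, this solution only works for movies that have at least one matching genre, because the LEFT JOIN starts from the genre counts. The proper solution is to replace LEFT JOIN with FULL OUTER JOIN, but MySQL for some reason doesn't support FULL OUTER JOIN. There is a workaround for that too, but it is long and involves a lot of UNIONs ;(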

How to do a FULL OUTER JOIN in MySQL?
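One common shape of that workaround is a LEFT JOIN followed by a UNION ALL with the unmatched rows from the other side. The following is a rough sketch under several assumptions: it runs on SQLite through Python's sqlite3 module rather than on MySQL, it uses common table expressions (which MySQL only supports from version 8.0), it groups by title only for brevity, and the movies 'Ref', 'MovieA', 'MovieB' and 'MovieC' are invented toy data with 'Ref' standing in for Harry Potter.

```python
import sqlite3

# Hypothetical toy rows shaped like the genkeyword view.
# MovieC shares a keyword with Ref but no genre.
ROWS = [
    ("Ref", 2010, "fantasy", "magic"),
    ("Ref", 2010, "fantasy", "school"),
    ("Ref", 2010, "adventure", "magic"),
    ("Ref", 2010, "adventure", "school"),
    ("MovieA", 2015, "fantasy", "magic"),
    ("MovieA", 2015, "fantasy", "prince"),
    ("MovieA", 2015, "romance", "magic"),
    ("MovieA", 2015, "romance", "prince"),
    ("MovieB", 2012, "adventure", "dragon"),
    ("MovieB", 2012, "fantasy", "dragon"),
    ("MovieB", 2012, "comedy", "dragon"),
    ("MovieC", 2018, "horror", "magic"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genkeyword (title TEXT, year INTEGER, genre TEXT, keyword TEXT)"
)
conn.executemany("INSERT INTO genkeyword VALUES (?, ?, ?, ?)", ROWS)

# Emulated FULL OUTER JOIN of the two per-movie counts:
#   part 1: every genre count, with its keyword count if there is one
#   part 2: keyword counts that have no genre count at all
QUERY = """
WITH g AS (
    SELECT title, COUNT(DISTINCT genre) AS genre_freq
    FROM   genkeyword
    WHERE  genre IN (SELECT genre FROM genkeyword WHERE title = 'Ref')
           AND title <> 'Ref'
    GROUP  BY title
),
k AS (
    SELECT title, COUNT(DISTINCT keyword) AS keyword_freq
    FROM   genkeyword
    WHERE  keyword IN (SELECT keyword FROM genkeyword WHERE title = 'Ref')
           AND title <> 'Ref'
    GROUP  BY title
)
SELECT g.title AS title,
       g.genre_freq AS genre_freq,
       COALESCE(k.keyword_freq, 0) AS keyword_freq
FROM   g LEFT JOIN k ON k.title = g.title
UNION ALL
SELECT k.title, 0, k.keyword_freq
FROM   k LEFT JOIN g ON g.title = k.title
WHERE  g.title IS NULL
ORDER  BY title
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [('MovieA', 1, 1), ('MovieB', 2, 0), ('MovieC', 0, 1)]
```

MovieB (genre match only) comes from the first part with its keyword count filled in as 0, and MovieC (keyword match only) comes from the second part, which a plain LEFT JOIN from the genre side would have lost.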

You could reverse your logic and drive the genres and keywords from a subquery before summing:

select title, year,
       sum(case when src = 'g' then 1 else 0 end) as genre,
       sum(case when src = 'k' then 1 else 0 end) as keyword
from
(
-- one row per distinct genre shared with the reference movie
select 'g' as src, g1.title, g1.year, g1.genre as val
from genkeyword g
join genkeyword g1 on g1.genre = g.genre
where g.title = 'Harry Potter and the Deathly Hallows' and g1.title <> 'Harry Potter and the Deathly Hallows'
union
-- one row per distinct keyword shared with the reference movie
select 'k' as src, g1.title, g1.year, g1.keyword
from genkeyword g
join genkeyword g1 on g1.keyword = g.keyword
where g.title = 'Harry Potter and the Deathly Hallows' and g1.title <> 'Harry Potter and the Deathly Hallows'
) s
group by title, year;

+--------------------------+------+-------+---------+
| title                    | year | genre | keyword |
+--------------------------+------+-------+---------+
| Cinderella               | 2015 |     2 |       2 |
| Enchanted                | 2007 |     1 |       3 |
| How to Train Your Dragon | 2010 |     2 |       0 |
| The Shape of Water       | 2017 |     2 |       1 |
+--------------------------+------+-------+---------+
4 rows in set (0.10 sec)
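A related single-pass variant, which is not taken from any of the answers here, is to leave the WHERE clause alone and decide inside each COUNT which values qualify. The sketch below assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, groups by title only for brevity, and uses invented toy movies ('Ref' stands in for Harry Potter). COUNT(DISTINCT CASE ... END) skips the NULLs produced by non-matching rows, so each counter only sees its own matches.

```python
import sqlite3

# Hypothetical toy rows shaped like the genkeyword view.
# MovieC shares only a keyword with Ref; MovieD shares nothing.
ROWS = [
    ("Ref", 2010, "fantasy", "magic"),
    ("Ref", 2010, "fantasy", "school"),
    ("Ref", 2010, "adventure", "magic"),
    ("Ref", 2010, "adventure", "school"),
    ("MovieA", 2015, "fantasy", "magic"),
    ("MovieA", 2015, "fantasy", "prince"),
    ("MovieA", 2015, "romance", "magic"),
    ("MovieA", 2015, "romance", "prince"),
    ("MovieB", 2012, "adventure", "dragon"),
    ("MovieB", 2012, "fantasy", "dragon"),
    ("MovieB", 2012, "comedy", "dragon"),
    ("MovieC", 2018, "horror", "magic"),
    ("MovieD", 2019, "drama", "ghost"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genkeyword (title TEXT, year INTEGER, genre TEXT, keyword TEXT)"
)
conn.executemany("INSERT INTO genkeyword VALUES (?, ?, ?, ?)", ROWS)

# The CASE expressions yield NULL for non-matching values, and
# COUNT(DISTINCT ...) ignores NULLs, so only shared values are counted.
# The outer query drops movies that share nothing with Ref.
QUERY = """
SELECT title, genre_freq, keyword_freq
FROM (
    SELECT title,
           COUNT(DISTINCT CASE
                     WHEN genre IN (SELECT genre FROM genkeyword
                                    WHERE title = 'Ref')
                     THEN genre END) AS genre_freq,
           COUNT(DISTINCT CASE
                     WHEN keyword IN (SELECT keyword FROM genkeyword
                                      WHERE title = 'Ref')
                     THEN keyword END) AS keyword_freq
    FROM   genkeyword
    WHERE  title <> 'Ref'
    GROUP  BY title
) AS counted
WHERE  genre_freq > 0 OR keyword_freq > 0
ORDER  BY genre_freq DESC, keyword_freq DESC, title
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [('MovieB', 2, 0), ('MovieA', 1, 1), ('MovieC', 0, 1)]
```

Because no row is filtered out before counting, movies with a zero in one column are kept, and no join or UNION is needed. Whether this performs acceptably on the real view is untested here.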

Try this query.
I did not use any of the views you created, but you can use them if you like.

MySQL

SET @tmpMovieid = (SELECT DISTINCT id 
                   FROM Movie 
                   WHERE title = 'Harry Potter and the Deathly Hallows');

SELECT id,
       title,
       IFNULL(Max(CASE WHEN coltype = 'genre' THEN col end),   0) AS genre_freq,
       IFNULL(Max(CASE WHEN coltype = 'Keyword' THEN col end), 0) AS keyword_freq

FROM   (SELECT id,
               title,
               Count(g.genre) AS col,
               'genre'        AS colType
        FROM   Movie m
               INNER JOIN Genre g ON m.id = g.Movie_id
        WHERE  g.genre IN (SELECT DISTINCT genre
                           FROM   Genre
                           WHERE  Movie_id = @tmpMovieid)
        GROUP  BY id, title

        UNION ALL

        SELECT id,
               title,
               Count(k.keyword) AS col,
               'Keyword'        AS colType
        FROM   Movie m
               INNER JOIN Keyword k ON m.id = k.Movie_id
        WHERE  k.keyword IN (SELECT DISTINCT keyword
                             FROM   Keyword
                             WHERE  Movie_id = @tmpMovieid)
        GROUP  BY id, title) tmp

WHERE  id <> @tmpMovieid
GROUP  BY id, title
ORDER  BY genre_freq DESC, keyword_freq DESC;

Online demo: https://www.db-fiddle.com/f/s1xLQ6r4Zwi5hVjCsdcwV8/0


SQL Server
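Note: since you have used 'text' as the data type for some columns, some operations need a cast. But then again, since you are using MySQL, you won't need this. I wrote it anyway to show you the difference, and for fun.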

DECLARE @tmpMovieID INT;
SET @tmpMovieID = (SELECT DISTINCT id
                   FROM   movie
                   WHERE  Cast(title AS NVARCHAR(MAX)) = 'Harry Potter and the Deathly Hallows');

-- COALESCE keeps movies that match only on keyword (NULL on the genre side)
SELECT COALESCE(tmpGenre.id, tmpKeyword.id)       AS id,
       COALESCE(tmpGenre.title, tmpKeyword.title) AS title,
       ISNULL(tmpGenre.genre, 0)                  AS genre,
       ISNULL(tmpKeyword.keyword, 0)              AS keyword

FROM   (SELECT id,
               Cast(title AS NVARCHAR(MAX))          AS title,
               Count(Cast(g.genre AS NVARCHAR(MAX))) AS genre
        FROM   movie m
               INNER JOIN genre g ON m.id = g.movie_id
        WHERE  Cast(g.genre AS NVARCHAR(MAX)) IN (SELECT DISTINCT Cast(genre AS NVARCHAR(MAX))
                                                 FROM   genre
                                                 WHERE  movie_id = @tmpMovieID)
        GROUP  BY id, Cast(title AS NVARCHAR(MAX))) tmpGenre

       FULL OUTER JOIN (SELECT id,
                               Cast(title AS NVARCHAR(MAX))            AS title,
                               Count(Cast(k.keyword AS NVARCHAR(MAX))) AS Keyword
                        FROM   movie m
                               INNER JOIN keyword k ON m.id = k.movie_id
                        WHERE  Cast(k.keyword AS NVARCHAR(MAX)) IN
                               (SELECT DISTINCT Cast(keyword AS NVARCHAR(MAX))
                                FROM   keyword
                                WHERE  movie_id = @tmpMovieID)
                        GROUP  BY id, Cast(title AS NVARCHAR(MAX))) tmpKeyword

                    ON tmpGenre.id = tmpKeyword.id
WHERE  COALESCE(tmpGenre.id, tmpKeyword.id) <> @tmpMovieID
ORDER  BY ISNULL(tmpGenre.genre, 0) DESC, ISNULL(tmpKeyword.keyword, 0) DESC;

Online demo: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=a1ee14e1e08b7e55eff2e8e94f89a287&hide=1


Result

+------+---------------------------+-------------+--------------+
| id   |          title            | genre_freq  | keyword_freq |
+------+---------------------------+-------------+--------------+
| 407  | Cinderella                |          2  |            2 |
| 826  | The Shape of Water        |          2  |            1 |
| 523  | How to Train Your Dragon  |          2  |            0 |
| 799  | Enchanted                 |          1  |            3 |
+------+---------------------------+-------------+--------------+

By the way, thank you for asking a clear question and providing the table schema, sample data and desired output.
