SQL query for large table size

I need help finding similar values in a SQL database. The table structure is as follows:

    id         |        item_id_nm |      height |    width |     length |     weight
    ----------------------------------------------------------------------------------
    1          |       00000000001 |      1.0    |     1.0  |        1.0 |         1.0
    2          |       00000000001 |      1.1    |     1.0  |        0.9 |         1.1
    3          |       00000000001 |      2.0    |     1.0  |        1.0 |         1.0
    4          |       00000000002 |      1.0    |     1.0  |        1.0 |         1.0
    5          |       00000000002 |      1.0    |     1.1  |        1.1 |         1.0
    6          |       00000000002 |      1.0    |     1.0  |        1.0 |         2.0

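id obviously cannot repeat; item_id_nm can repeat (and in practice can occur many times, i.e. more than twice).

How would you write the SQL to find repeated item_id_nm values, but only where the height, width, length, or weight values differ by more than 30%?

I know it needs to walk through the table, but how do I do the check? Thanks for your help.

Edit: adding an example of the 30% difference. id = 3 has a height of 2.0, roughly 200% of the 1.0 (or 1.1) height of ids 1 and 2. Sorry for being unclear: any one of height, width, length, or weight may differ by 30%. If any one of them differs by 30%, the row is treated as a duplicate of the other.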

This should give you the rows that differ from the per-item average by more than 30%:

SELECT t1.*
FROM tbl t1
INNER JOIN (
    SELECT
         item_id_nm,
        AVG(width) awidth, AVG(height) aheight, 
        AVG(length) alength, AVG(weight) aweight
    FROM tbl
    GROUP BY item_id_nm ) t2
USING (item_id_nm)
WHERE 
    width > awidth * 1.3 OR width < awidth * 0.7
    OR height > aheight * 1.3 OR height < aheight * 0.7
    OR length > alength * 1.3 OR length < alength * 0.7
    OR weight > aweight * 1.3 OR weight < aweight * 0.7
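
To see what the average-based query returns on the six sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database with Python's standard `sqlite3` module. SQLite and the column types are assumptions made only for this check; the table name `tbl` follows the answer, and only `t1.id` is selected to keep the output short.

```python
import sqlite3

# Sample rows copied from the question:
# (id, item_id_nm, height, width, length, weight)
ROWS = [
    (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
    (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
    (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
    (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
    (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
    (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
]

conn = sqlite3.connect(":memory:")
# Column types are assumed; item_id_nm is TEXT so leading zeros survive.
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, item_id_nm TEXT, "
    "height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Same logic as the answer: compare each row with its item's averages.
QUERY = """
SELECT t1.id
FROM tbl t1
INNER JOIN (
    SELECT item_id_nm,
           AVG(width) awidth, AVG(height) aheight,
           AVG(length) alength, AVG(weight) aweight
    FROM tbl
    GROUP BY item_id_nm
) t2 USING (item_id_nm)
WHERE width > awidth * 1.3 OR width < awidth * 0.7
   OR height > aheight * 1.3 OR height < aheight * 0.7
   OR length > alength * 1.3 OR length < alength * 0.7
   OR weight > aweight * 1.3 OR weight < aweight * 0.7
ORDER BY t1.id
"""

flagged_ids = [row[0] for row in conn.execute(QUERY)]
print(flagged_ids)  # [3, 6]
```

Only the outlier rows (ids 3 and 6) come back. Ids 1 and 2 are not flagged even though their height differs from id 3 by more than 30%, because they stay within 30% of the group average.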

This should give you the pairs of rows that differ by more than 30%:

SELECT t1.*,t2.*
FROM tbl t1
INNER JOIN tbl t2
USING (item_id_nm)
WHERE 
     (t1.width > t2.width * 1.3 OR t1.width < t2.width * 0.7)
    OR (t1.height > t2.height * 1.3 OR t1.height < t2.height * 0.7)
    OR (t1.length > t2.length * 1.3 OR t1.length < t2.length * 0.7)
    OR (t1.weight > t2.weight * 1.3 OR t1.weight < t2.weight * 0.7)
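
The pairwise self-join can be checked the same way. The sketch below again assumes SQLite via Python's `sqlite3` and the sample rows from the question; it selects only the two ids and adds an `ORDER BY` so the output is stable.

```python
import sqlite3

# Sample rows copied from the question:
# (id, item_id_nm, height, width, length, weight)
ROWS = [
    (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
    (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
    (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
    (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
    (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
    (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
]

conn = sqlite3.connect(":memory:")
# Column types are assumed for this check.
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, item_id_nm TEXT, "
    "height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Same comparison as the answer: every row against every row of the same item.
QUERY = """
SELECT t1.id, t2.id
FROM tbl t1
INNER JOIN tbl t2 USING (item_id_nm)
WHERE (t1.width > t2.width * 1.3 OR t1.width < t2.width * 0.7)
   OR (t1.height > t2.height * 1.3 OR t1.height < t2.height * 0.7)
   OR (t1.length > t2.length * 1.3 OR t1.length < t2.length * 0.7)
   OR (t1.weight > t2.weight * 1.3 OR t1.weight < t2.weight * 0.7)
ORDER BY t1.id, t2.id
"""

pairs = conn.execute(QUERY).fetchall()
print(pairs)
```

On this data every matching pair shows up in both orders, for example (1, 3) and (3, 1), so the result has eight rows for four distinct pairs. A row never matches itself, since a value is neither above 1.3 times nor below 0.7 times itself for the positive values used here.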

I think you could use something like this:

SELECT item_id_nm
FROM yourtable
GROUP BY item_id_nm
HAVING
  MIN(height)*1.3 < MAX(height) OR
  MIN(width)*1.3 < MAX(width) OR
  MIN(length)*1.3 < MAX(length) OR
  MIN(weight)*1.3 < MAX(weight)

SELECT
    *
FROM
    TableName
WHERE
   (height > 1.3 * width OR height < 0.7 * width) OR
   (length > 1.3 * width OR length < 0.7 * width)
GROUP BY
    item_id_nm
HAVING
    COUNT(item_id_nm) > 1

I would use:

SELECT s1.id AS id1, s2.id AS id2
, s1.height AS h1, s2.height as h2
, s1.width as width1, s2.width as width2
, s1.length as l1, s2.length as l2
, s1.weight as weight1, s2.weight as weight2
FROM stack s1
INNER JOIN stack s2
ON s1.item_id_nm = s2.item_id_nm
WHERE s1.id != s2.id
AND s1.id < s2.id
AND (abs(100-((s2.height*100)/s1.height)) > 30
OR abs(100-((s2.width*100)/s1.width)) > 30
OR abs(100-((s2.length*100)/s1.length)) > 30
OR abs(100-((s2.weight*100)/s1.weight)) > 30)

Tested with PostgreSQL ( http://sqlfiddle.com/#!12/e5f25/15 ). This code does not return duplicate rows.
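
As a rough cross-check outside PostgreSQL, the same query can be run on the question's sample rows in an in-memory SQLite database through Python's `sqlite3` module. SQLite and the column types are assumptions for this sketch; the table name `stack` follows the answer, and only the two id columns are selected.

```python
import sqlite3

# Sample rows copied from the question:
# (id, item_id_nm, height, width, length, weight)
ROWS = [
    (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
    (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
    (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
    (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
    (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
    (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
]

conn = sqlite3.connect(":memory:")
# REAL columns (assumed) keep the percentage arithmetic in floating point.
conn.execute(
    "CREATE TABLE stack (id INTEGER PRIMARY KEY, item_id_nm TEXT, "
    "height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany("INSERT INTO stack VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# s1.id < s2.id keeps each pair once; the percentage is relative to s1.
QUERY = """
SELECT s1.id AS id1, s2.id AS id2
FROM stack s1
INNER JOIN stack s2 ON s1.item_id_nm = s2.item_id_nm
WHERE s1.id < s2.id
  AND (abs(100 - ((s2.height * 100) / s1.height)) > 30
    OR abs(100 - ((s2.width * 100) / s1.width)) > 30
    OR abs(100 - ((s2.length * 100) / s1.length)) > 30
    OR abs(100 - ((s2.weight * 100) / s1.weight)) > 30)
ORDER BY id1, id2
"""

pairs = conn.execute(QUERY).fetchall()
print(pairs)  # [(1, 3), (2, 3), (4, 6), (5, 6)]
```

Each differing pair appears once. Two things to keep in mind with this form: the difference is measured relative to the row with the lower id, so the test is not symmetric, and a zero in any `s1` dimension would make the division fail in PostgreSQL.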
