
How to select distinct rows where 2 columns should match on multiple rows?

I have a table like this:

id | user_id | param_id | param_value
 1 |       1 |       44 | google
 2 |       1 |       45 | adTest
 3 |       1 |       46 | Campaign
 4 |       1 |       47 | null
 5 |       1 |       48 | null
 6 |       2 |       44 | google
 7 |       2 |       45 | adAnotherTest
 8 |       2 |       46 | Campaign2
 9 |       2 |       47 | null
10 |       2 |       48 | null

I want to fetch all the user_ids that have both a row with (param_id = 44 AND param_value = 'google') and a row with (param_id = 45 AND param_value = 'adTest'). So these conditions should return only user_id = 1 and not user_id = 2. Both users have google at param_id 44, but only user 1 has param_value adTest at param_id = 45.

The problem is that more params could be added in the future, so I need a dynamic query. Here is what I have tried:

SELECT DISTINCT up.user_id
FROM user_params AS up
LEFT JOIN user_params AS upp ON up.id = upp.id
WHERE up.param_id IN (?,?)
  AND upp.param_value IN (?,?)

You can group by user and test each condition in the HAVING clause:

SELECT DISTINCT up.user_id
FROM user_params AS up
LEFT JOIN user_params AS upp ON up.id = upp.id
GROUP BY up.user_id
-- columns are qualified with up. because both aliases expose param_id and param_value
HAVING SUM(up.param_id = 44 AND up.param_value = 'google') >= 1
   AND SUM(up.param_id = 45 AND up.param_value = 'adTest') >= 1
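Since the question asks for something dynamic, here is a minimal sketch of how the HAVING approach could be generated for any number of (param_id, param_value) pairs. The helper names `sample_db` and `find_users` are made up for illustration, the self-join is left out, and it runs against an in-memory SQLite copy of the sample table only so that it is self-contained. SQLite, like MySQL, evaluates a comparison to 0 or 1, so `SUM(...)` counts matches. With a MySQL driver the placeholder is typically `%s` rather than `?`.

```python
import sqlite3

# Sample rows from the question: (id, user_id, param_id, param_value)
ROWS = [
    (1, 1, 44, "google"), (2, 1, 45, "adTest"), (3, 1, 46, "Campaign"),
    (4, 1, 47, None), (5, 1, 48, None),
    (6, 2, 44, "google"), (7, 2, 45, "adAnotherTest"), (8, 2, 46, "Campaign2"),
    (9, 2, 47, None), (10, 2, 48, None),
]


def sample_db():
    """Build an in-memory copy of the table from the question (illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_params ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, "
        "param_id INTEGER, param_value TEXT)"
    )
    conn.executemany("INSERT INTO user_params VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_users(conn, conditions):
    """Return user_ids matching every (param_id, param_value) pair."""
    if not conditions:
        raise ValueError("need at least one condition")
    # One HAVING test per pair; values are bound, never interpolated.
    tests = " AND ".join(
        ["SUM(param_id = ? AND param_value = ?) >= 1"] * len(conditions)
    )
    sql = (
        "SELECT user_id FROM user_params "
        f"GROUP BY user_id HAVING {tests} ORDER BY user_id"
    )
    flat = [value for pair in conditions for value in pair]
    return [row[0] for row in conn.execute(sql, flat)]


print(find_users(sample_db(), [(44, "google"), (45, "adTest")]))  # [1]
```

Only the number of placeholders depends on the input, so adding a new param later means passing one more pair, not editing the SQL by hand.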

Another way:

SELECT  -- DISTINCT 
    up1.user_id 
FROM 
    user_params AS up1
  JOIN
    user_params AS up2 
      ON up1.user_id = up2.user_id
WHERE
    up1.param_id = 44 AND up1.param_value = 'google'
  AND 
    up2.param_id = 45 AND up2.param_value = 'adTest' ;

You do not need the DISTINCT if there is a UNIQUE constraint on (user_id, param_id).

For efficiency, add an index on (param_id, param_value, user_id).
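The join version can also be generated for N conditions, with one alias per condition. The following is only a sketch under assumptions: the names `sample_db`, `find_users_join` and `idx_param_lookup` are hypothetical, and SQLite stands in for MySQL so the example can run on its own. The table is created with the UNIQUE constraint and the index suggested above, which is why the query has no DISTINCT.

```python
import sqlite3

# Sample rows from the question: (id, user_id, param_id, param_value)
ROWS = [
    (1, 1, 44, "google"), (2, 1, 45, "adTest"), (3, 1, 46, "Campaign"),
    (4, 1, 47, None), (5, 1, 48, None),
    (6, 2, 44, "google"), (7, 2, 45, "adAnotherTest"), (8, 2, 46, "Campaign2"),
    (9, 2, 47, None), (10, 2, 48, None),
]


def sample_db():
    """Sample table with the suggested UNIQUE constraint and index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_params ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, "
        "param_id INTEGER, param_value TEXT, "
        "UNIQUE (user_id, param_id))"
    )
    conn.execute(
        "CREATE INDEX idx_param_lookup "
        "ON user_params (param_id, param_value, user_id)"
    )
    conn.executemany("INSERT INTO user_params VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_users_join(conn, conditions):
    """Self-join user_params once per (param_id, param_value) pair."""
    n = len(conditions)
    if n < 1:
        raise ValueError("need at least one condition")
    # Aliases up1..upN are built from integers only, so nothing
    # user-supplied is ever interpolated into the SQL text.
    joins = "".join(
        f" JOIN user_params AS up{i} ON up{i}.user_id = up1.user_id"
        for i in range(2, n + 1)
    )
    where = " AND ".join(
        f"up{i}.param_id = ? AND up{i}.param_value = ?"
        for i in range(1, n + 1)
    )
    sql = (
        f"SELECT up1.user_id FROM user_params AS up1{joins} "
        f"WHERE {where} ORDER BY up1.user_id"
    )
    flat = [value for pair in conditions for value in pair]
    return [row[0] for row in conn.execute(sql, flat)]


print(find_users_join(sample_db(), [(44, "google"), (45, "adTest")]))  # [1]
```

Each extra condition adds one more join, so for a long list of params the grouped variants may be easier to manage.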


The problem you are dealing with is called "Relational Division", and there is a great answer by @Erwin Brandstetter here: How to filter SQL results in a has-many-through relation, with many ways to write such a query, along with performance tests.

The tests were done in Postgres, so some of the queries do not even run in MySQL, but at least half of them do, and efficiency would be similar for many of them.
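For illustration, one common counting formulation of relational division filters the rows down to the wanted pairs and then keeps the users that matched all of them. This sketch is not taken from the linked answer; the names `sample_db` and `find_users_count` are hypothetical, SQLite is used only so the example is self-contained, and it assumes each param_id appears at most once in the list of conditions.

```python
import sqlite3

# Sample rows from the question: (id, user_id, param_id, param_value)
ROWS = [
    (1, 1, 44, "google"), (2, 1, 45, "adTest"), (3, 1, 46, "Campaign"),
    (4, 1, 47, None), (5, 1, 48, None),
    (6, 2, 44, "google"), (7, 2, 45, "adAnotherTest"), (8, 2, 46, "Campaign2"),
    (9, 2, 47, None), (10, 2, 48, None),
]


def sample_db():
    """Build an in-memory copy of the table from the question (illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_params ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, "
        "param_id INTEGER, param_value TEXT)"
    )
    conn.executemany("INSERT INTO user_params VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_users_count(conn, conditions):
    """Keep users whose rows cover every requested (param_id, param_value)."""
    n = len(conditions)
    if n < 1:
        raise ValueError("need at least one condition")
    # Assumption: each param_id is requested at most once.
    matches = " OR ".join(["(param_id = ? AND param_value = ?)"] * n)
    sql = (
        "SELECT user_id FROM user_params "
        f"WHERE {matches} "
        "GROUP BY user_id "
        # A user qualifies only if all n distinct params were matched.
        "HAVING COUNT(DISTINCT param_id) = ? "
        "ORDER BY user_id"
    )
    flat = [value for pair in conditions for value in pair]
    return [row[0] for row in conn.execute(sql, flat + [n])]


print(find_users_count(sample_db(), [(44, "google"), (45, "adTest")]))  # [1]
```

The WHERE clause discards non-matching rows before grouping, so the HAVING clause only has to compare a count with the number of conditions.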

If you want to optimize, this should give the same results without the need for a LEFT JOIN or table scans (thanks to juergen d for the HAVING part):

SELECT
 user_id

FROM 
 user_params

WHERE
  param_id IN(44, 45)
 AND
  param_value IN('google', 'adTest')

GROUP BY
 user_id 

HAVING 
    sum(param_id = 44 AND param_value = 'google') >= 1
  AND
    sum(param_id = 45 AND param_value = 'adTest') >= 1
;

See http://sqlfiddle.com/#!2/17b65/4 for a demo.
