I have two tables, users and projects, with a one-to-many relationship between the two. The projects table has a status field holding the status of each of the user's projects. status can be one of:
launched, confirm, staffed, overdue, complete, failed, ended
I want to sort users into two categories: users with a project in the launched phase, and users whose projects are in any status other than launched.
I am using the following query:
SELECT DISTINCT(u.*), CASE
WHEN p.status = 'LAUNCHED' THEN 1
ELSE 2
END as user_category
FROM users u
LEFT JOIN projects p ON p.user_id = u.id
WHERE (LOWER(u.username) like '%%%'
OR LOWER(u.personal_intro) like '%%%'
OR LOWER(u.location) like '%%%'
OR u.account_status != 'DELETED'
AND system_role=10 AND u.account_status ='ACTIVE')
ORDER BY set_order, u.page_hits DESC
LIMIT 10
OFFSET 0
I am getting duplicate records in the following scenario: if a user has projects with status launched as well as overdue, complete or failed, that user is returned twice, because both branches of the CASE are satisfied for that user.
Please suggest a query where a user that has any project in launched status gets user_category set to 1. The same user should not be repeated with user_category 2.
The query is probably not doing what you think it does, for a number of reasons:
There is DISTINCT and there is DISTINCT ON (col1, col2). DISTINCT (u.*) is no different from DISTINCT u.*. The parentheses are just noise.
AND binds before OR according to operator precedence. I suspect you want parentheses around the conditions OR'ed together? Or do you need it the way it is? Either way, you don't need parentheses around the whole WHERE clause.
Your expression LOWER(u.username) LIKE '%%%' doesn't make any sense. Every non-null string qualifies, so it can be replaced with u.username IS NOT NULL. I suspect you want something different?
Postgres is case sensitive in string handling. You write of status being 'launched' etc. but use 'LAUNCHED' in your query. Which is it?
A couple of table qualifications are missing from the question, making it ambiguous for the reader. I filled them in as I saw fit.
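The precedence and LIKE points above are easy to check on toy data. Here is a minimal sketch in Python, using the standard sqlite3 module as a stand-in for Postgres (both treat AND/OR precedence and the % wildcard the same way for this purpose). The cut-down users table and the three sample rows are invented for illustration and are not the asker's schema.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, username TEXT, "
    "account_status TEXT, system_role INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "ACTIVE", 10),
        (2, "bob", "DELETED", 10),
        (3, None, "ACTIVE", 10),  # NULL username
    ],
)

# '%%%' is just three wildcards: every non-NULL username matches.
like_ids = [r[0] for r in conn.execute(
    "SELECT id FROM users WHERE LOWER(username) LIKE '%%%' ORDER BY id")]

# No parentheses: evaluated as a OR (b AND c), so the DELETED user
# still qualifies through the LIKE branch.
unparenthesized = [r[0] for r in conn.execute(
    "SELECT id FROM users "
    "WHERE LOWER(username) LIKE '%%%' "
    "OR system_role = 10 AND account_status = 'ACTIVE' ORDER BY id")]

# Parentheses around the OR'ed part: (a OR b) AND c.
parenthesized = [r[0] for r in conn.execute(
    "SELECT id FROM users "
    "WHERE (LOWER(username) LIKE '%%%' OR system_role = 10) "
    "AND account_status = 'ACTIVE' ORDER BY id")]

print(like_ids)         # [1, 2]
print(unparenthesized)  # [1, 2, 3]
print(parenthesized)    # [1, 3]
```

Only the parenthesized form keeps the DELETED user out, which is presumably what the filter was meant to do.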
Everything put together, it might work like this:
SELECT DISTINCT ON (u.set_order, u.page_hits, u.id)
u.*
, CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END AS user_category
FROM users u
LEFT JOIN projects p ON p.user_id = u.id
WHERE LOWER(u.username) LIKE '%%%' -- ???
OR LOWER(u.personal_intro) LIKE '%%%'
OR LOWER(u.location) LIKE '%%%'
OR u.account_status != 'DELETED' -- with original logic
AND u.system_role = 10
AND u.account_status = 'ACTIVE'
ORDER BY u.set_order, u.page_hits DESC, u.id, user_category
LIMIT 10
Detailed explanation in this related question:
Two EXISTS semi-joins instead of the DISTINCT ON and CASE might be faster:
SELECT u.*
, CASE WHEN EXISTS (
SELECT FROM projects p
WHERE p.user_id = u.id AND p.status = 'LAUNCHED')
THEN 1 ELSE 2 END AS user_category
FROM users u
WHERE
( LOWER(u.username) LIKE '%%%' -- ???
OR LOWER(u.personal_intro) LIKE '%%%'
OR LOWER(u.location) LIKE '%%%'
OR u.account_status != 'DELETED' -- with alternative logic?
)
AND u.system_role = 10 -- assuming it comes from users ???
AND u.account_status = 'ACTIVE'
AND EXISTS (SELECT 1 FROM projects p WHERE p.user_id = u.id) -- excludes users without any project, unlike the LEFT JOIN above
ORDER BY u.set_order, u.page_hits DESC
LIMIT 10;
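To see the duplicate and the EXISTS fix side by side, here is a minimal sketch in Python with the standard sqlite3 module standing in for Postgres. The tables are cut down to hypothetical toy columns and the sample rows are invented. SELECT 1 is used inside EXISTS because SQLite, unlike Postgres, does not accept an empty select list.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    status TEXT
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO projects (user_id, status) VALUES
    (1, 'LAUNCHED'), (1, 'OVERDUE'), (1, 'FAILED'),  -- mixed statuses
    (2, 'COMPLETE');                                 -- carol has no projects
""")

# Join + DISTINCT: the CASE is evaluated per project row, so a user with
# mixed statuses produces two distinct (id, category) pairs.
joined_rows = conn.execute("""
    SELECT DISTINCT u.id,
           CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END AS user_category
    FROM users u
    LEFT JOIN projects p ON p.user_id = u.id
    ORDER BY u.id, user_category
""").fetchall()

# EXISTS: one row per user, category 1 if any launched project exists.
exists_rows = conn.execute("""
    SELECT u.id,
           CASE WHEN EXISTS (
               SELECT 1 FROM projects p
               WHERE p.user_id = u.id AND p.status = 'LAUNCHED')
           THEN 1 ELSE 2 END AS user_category
    FROM users u
    ORDER BY u.id
""").fetchall()

print(joined_rows)  # [(1, 1), (1, 2), (2, 2), (3, 2)]
print(exists_rows)  # [(1, 1), (2, 2), (3, 2)]
```

Because the EXISTS version never joins to projects, there are no extra rows to collapse afterwards. This sketch omits the second EXISTS from the WHERE clause, so the user without projects still shows up with category 2.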
You can use MIN() on your CASE result, and dropping the DISTINCT seems like a wise choice:
SELECT u.*, MIN(CASE
WHEN p.status = 'LAUNCHED' THEN 1
ELSE 2
END) as user_category
...
GROUP BY <list all columns in the users table>
...
Since "launched" gives a 1, using MIN() will not only force a single result but will also give preference to "launched" over the other states.
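As a rough check of the MIN() approach, here is a minimal sketch in Python with the standard sqlite3 module standing in for Postgres. The two-column users table and the sample rows are invented for illustration. The GROUP BY lists every selected users column, as described above; in Postgres, grouping by u.id alone should also be accepted if id is the primary key of users, which is an assumption about the asker's schema.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    status TEXT
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO projects (user_id, status) VALUES
    (1, 'OVERDUE'), (1, 'LAUNCHED'),
    (2, 'FAILED'), (2, 'COMPLETE');
""")

# MIN() over the per-project CASE values: any launched project yields 1,
# which wins over the 2 produced by every other status (or by no project).
rows = conn.execute("""
    SELECT u.id, u.username,
           MIN(CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END)
               AS user_category
    FROM users u
    LEFT JOIN projects p ON p.user_id = u.id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

print(rows)  # [(1, 'alice', 1), (2, 'bob', 2), (3, 'carol', 2)]
```

Each user appears once, and a user without any project falls into category 2 because the CASE sees a NULL status.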