I have a huge log table and I need to fetch some data for usage statistics. Let's say we have a log table:
| user_id | action |
| 12345 | app: IOs |
| 12345 | app_version: 2018 |
| 12346 | app: Android |
| 12346 | app_version: 2019 |
| 12347 | app: Windows |
| 12347 | app_version: 2019 |
Is there a way to fetch the IDs of all users who use old (2018) mobile apps?
I found a way to do it, but it is not efficient:
SELECT user_id
FROM log
WHERE action LIKE '%2018%'
  AND user_id IN (SELECT DISTINCT user_id
                  FROM log
                  WHERE action LIKE '%IOs%' OR action LIKE '%Android%')
GROUP BY user_id
This query took about half an hour on production.
In the end I want a list of user IDs, obtained as efficiently as possible, since I will also join another table to get the users' emails. What options do I have?
You can use aggregation:
SELECT l.user_id
FROM log l
WHERE l.action LIKE '%2018%' OR
      l.action LIKE '%IOs%' OR
      l.action LIKE '%Android%'
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND         -- at least one 2018
       SUM(l.action LIKE '%2018%') <> COUNT(*);    -- at least one other
Unfortunately, the LIKE comparisons require scanning the log table. The only way around this would be to use a full text index.
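As a rough sketch of the full text index idea, assuming MySQL (which the boolean SUM() syntax in the query above suggests) and a hypothetical index name ft_log_action, the LIKE filter in the WHERE clause could be replaced by a MATCH ... AGAINST search. Whether short tokens such as IOs are indexed depends on the minimum token size (3 by default for InnoDB, 4 for MyISAM), so this should be checked against the real data before relying on it.

```sql
-- Assumption: MySQL, and `action` is a CHAR/VARCHAR/TEXT column.
-- `ft_log_action` is a hypothetical index name.
ALTER TABLE log ADD FULLTEXT INDEX ft_log_action (action);

-- In boolean mode, words without operators are optional, so a row
-- matches if it contains at least one of the three terms.
SELECT l.user_id
FROM log l
WHERE MATCH(l.action) AGAINST('2018 IOs Android' IN BOOLEAN MODE)
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND         -- at least one 2018
       SUM(l.action LIKE '%2018%') <> COUNT(*);    -- at least one other
```

Building the index on a huge table takes time and disk space, so it is probably worth trying on a copy of the table first.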
You can simplify the logic to:
SELECT l.user_id
FROM log l
WHERE l.action REGEXP '2018|IOs|Android'
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND         -- at least one 2018
       SUM(l.action LIKE '%2018%') <> COUNT(*);    -- at least one other
I'm not sure whether one REGEXP is (marginally) faster than three LIKEs or not.
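Since the question mentions joining another table to get emails, the aggregation query can be used as a derived table. This is only a sketch: the users table and its user_id and email columns are hypothetical names, not something given in the question.

```sql
-- Hypothetical table: users(user_id, email). Adjust names to the real schema.
SELECT u.user_id, u.email
FROM users u
JOIN (
    SELECT l.user_id
    FROM log l
    WHERE l.action REGEXP '2018|IOs|Android'
    GROUP BY l.user_id
    HAVING SUM(l.action LIKE '%2018%') > 0 AND        -- at least one 2018
           SUM(l.action LIKE '%2018%') <> COUNT(*)    -- at least one other
) old_mobile ON old_mobile.user_id = u.user_id;
```

Because the derived table already returns one row per user, the join should not duplicate emails, provided users.user_id is unique.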
You can use EXISTS:
SELECT l.*
FROM log l
WHERE EXISTS (SELECT 1 FROM log l1 WHERE l1.user_id = l.user_id AND l1.action LIKE '%2018%');
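Note that the query above returns every log row for users who have a 2018 entry, without checking the app type. A possible variant that returns only the user IDs and also requires an iOS or Android row might look like the sketch below. The index name idx_log_user_id is hypothetical, and the index only helps if user_id is not already indexed.

```sql
-- Assumption: no index on log.user_id exists yet (hypothetical index name).
-- It cannot speed up the leading-wildcard LIKEs, but it lets the
-- correlated subquery look up a user's rows without a full scan.
CREATE INDEX idx_log_user_id ON log (user_id);

SELECT DISTINCT l.user_id
FROM log l
WHERE l.action LIKE '%2018%'
  AND EXISTS (SELECT 1
              FROM log l1
              WHERE l1.user_id = l.user_id
                AND (l1.action LIKE '%IOs%' OR l1.action LIKE '%Android%'));
```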
Here is my solution with a LEFT JOIN. I understand that you have a big logging table, so this might not be the best one. I also added a few more records for testing.
Basically, I use the LEFT JOIN to put each user's app row and version row side by side, so that I can simply filter with WHERE.
SQL fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=9db538e59b3d265e4e8d8559762e79d4
WITH log_table AS (
SELECT *
FROM (VALUES (12345, 'app: iOS'),
(12345, 'app_version: 2018'),
(12346, 'app: Android'),
(12346, 'app_version: 2019'),
(12347, 'app: Windows'),
(12347, 'app_version: 2019'),
(12348, 'app: iOS'),
(12348, 'app_version: 2019'),
(12349, 'app: Android'),
(12349, 'app_version: 2018'),
(12350, 'app: Windows'),
(12350, 'app_version: 2018')
) v(user_id, action)
)
SELECT L.user_id
FROM log_table AS L
LEFT JOIN log_table AS L2
    ON L.user_id = L2.user_id
WHERE (L.action LIKE '%iOS%' OR L.action LIKE '%Android%')
  AND L2.action LIKE '%2018%'
The result (only users who have an iOS or Android app and a 2018 version):
user_id
12345
12349