I am trying to write a query in MySQL that will output the most frequently occurring pair of values. I have the following table:
This table contains users' music streaming activity on a given day. I want to find out which pair of artists was played together by the most users on a specific day. The answer should be (Pink Floyd, Queen) because 3 users listened to both artists on the same day. How can I achieve this?
I've started by joining the table onto itself using this code:
WITH temp AS (
SELECT person_id, artist_name, COUNT(*) AS times_played FROM users WHERE date_played = '2020-10-01' GROUP BY 1, 2)
SELECT a.person_id, a.artist_name, b.artist_name FROM temp a JOIN temp b
ON a.person_id = b.person_id AND a.artist_name != b.artist_name;
The result is the following:
I am not sure how to proceed from this point, so any help would be appreciated!
Below is the code to create the table in MySQL:
create table users
(
person_id int,
artist_name varchar(255),
date_played date
);
insert into users
(person_id, artist_name, date_played)
values
(1, 'Pink Floyd', '2020-10-01'),
(1, 'Led Zeppelin', '2020-10-01'),
(1, 'Queen', '2020-10-01'),
(1, 'Pink Floyd', '2020-10-01'),
(2, 'Journey', '2020-10-01'),
(2, 'Pink Floyd', '2020-10-01'),
(2, 'Queen', '2020-10-01'),
(2, 'Pink Floyd', '2020-10-01'),
(3, 'Pink Floyd', '2020-10-01'),
(3, 'Aerosmith', '2020-10-01'),
(3, 'Queen', '2020-10-01'),
(4, 'Pink Floyd', '2020-10-01'),
(4, 'Led Zeppelin', '2020-10-01');
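Here's how I solved my question, thanks to the trick I found in the code provided by Tim Biegeleisen in this post (u1.artist_name < u2.artist_name), which keeps each pair only once, in alphabetical order, instead of also returning it in the reverse order:
WITH temp AS (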
SELECT
person_id,
artist_name
FROM users
WHERE date_played = '2020-10-01'
GROUP BY 1,2
)
SELECT *
FROM (
SELECT
u1.artist_name AS artist1,
u2.artist_name AS artist2,
COUNT(*) AS times_played,
RANK() OVER (ORDER BY COUNT(*) DESC) Rnk
FROM temp u1
JOIN temp u2
ON u1.artist_name < u2.artist_name AND u1.person_id = u2.person_id
GROUP BY 1, 2
) sub
WHERE Rnk = 1;
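To see what the ranking step is choosing from, it can help to list every pair with its count first. The following is only a sketch against the sample table from the question, assuming MySQL 8.0 or later for the CTE; the name daily_plays and the alias listeners are made up for illustration.

```sql
-- Sketch: list every artist pair for one day with the number of users
-- who played both. daily_plays and listeners are illustrative names.
WITH daily_plays AS (
    -- DISTINCT collapses repeated plays of the same artist by the same user
    SELECT DISTINCT person_id, artist_name
    FROM users
    WHERE date_played = '2020-10-01'
)
SELECT
    p1.artist_name AS artist1,
    p2.artist_name AS artist2,
    COUNT(*) AS listeners
FROM daily_plays p1
JOIN daily_plays p2
    ON p1.person_id = p2.person_id
    -- "<" keeps (Pink Floyd, Queen) and drops the mirrored (Queen, Pink Floyd)
    AND p1.artist_name < p2.artist_name
GROUP BY p1.artist_name, p2.artist_name
ORDER BY listeners DESC;
```

On the sample data, (Pink Floyd, Queen) should come out on top with 3 listeners, followed by (Led Zeppelin, Pink Floyd) with 2. Appending LIMIT 1 would also return the top pair, but it would silently drop ties, which is the reason to prefer RANK() when several pairs can share first place.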
We can try handling this requirement using a self join along with the RANK() analytic function:
WITH cte AS (
SELECT
u1.artist_name AS artist1,
u2.artist_name AS artist2,
RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM users u1
INNER JOIN users u2
ON u1.artist_name < u2.artist_name AND u1.person_id = u2.person_id
WHERE
u1.date_played = u2.date_played
GROUP BY
u1.artist_name,
u2.artist_name
)
SELECT
artist1,
artist2
FROM cte
WHERE rnk = 1;
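One thing to keep in mind with the query above: it joins the raw rows, so a user who played the same artist twice on one day contributes more than one row per pair, and it ranks across all dates rather than a single day. On the sample data the top pair is still (Pink Floyd, Queen), but the underlying counts are inflated. A possible variant, offered as a sketch and assuming MySQL 8.0 or later, counts distinct users and filters on one date; the alias listeners and the derived table name ranked are illustrative.

```sql
-- Sketch: count each user at most once per pair and restrict to one day.
-- listeners and ranked are illustrative names, not part of the original answer.
SELECT artist1, artist2, listeners
FROM (
    SELECT
        u1.artist_name AS artist1,
        u2.artist_name AS artist2,
        -- DISTINCT ignores repeated plays by the same user
        COUNT(DISTINCT u1.person_id) AS listeners,
        RANK() OVER (ORDER BY COUNT(DISTINCT u1.person_id) DESC) AS rnk
    FROM users u1
    INNER JOIN users u2
        ON u1.person_id = u2.person_id
        AND u1.date_played = u2.date_played
        AND u1.artist_name < u2.artist_name
    WHERE u1.date_played = '2020-10-01'
    GROUP BY u1.artist_name, u2.artist_name
) ranked
WHERE rnk = 1;
```

With the sample rows this should return the single pair (Pink Floyd, Queen) with 3 listeners. If ties are possible in real data, every tied pair is returned, since RANK() assigns them all rank 1.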