
SQL Query With Max Value from Child Table

Three pertinent tables: tracks (music tracks), users, and follows.

The follows table is a many-to-many relationship relating users (followers) to users (followees).

I'm looking for this as a final result: <track_id>, <user_id>, <most popular followee>

The first two columns are simple and result from a relationship between tracks and users. The third is my problem. I can join with the follows table and get all of the followees that each user follows, but how do I get only the followee that has the highest number of followers?

Here are the tables with their pertinent columns:

tracks: id, user_id (fk to users.id), song_title
users: id
follows: followee_id (fk to users.id), follower_id (fk to users.id)

Here's some sample data:

TRACKS
1, 1, Some song title

USERS
1
2
3
4

FOLLOWS
2, 1
3, 1
4, 1 
3, 4
4, 2
4, 3

DESIRED RESULT
1, 1, 4

For the desired result, the 3rd field is 4 because, as you can see in the FOLLOWS table, user 4 has the most followers.

I and a few great minds around me are still scratching our heads.

So I threw this into LINQPad because I'm better with LINQ.

Tracks
    .Where(t => t.TrackId == 1)
    .Select(t => new { 
        TrackId = t.TrackId,
        UserId = t.UserId, 
        MostPopularFolloweeId = Followers
            .GroupBy(f => f.FolloweeId)
            .OrderByDescending(g => g.Count())
            .FirstOrDefault()
            .Key
    });

The resulting SQL query was the following (@p0 being the track id):

-- Region Parameters
DECLARE @p0 Int = 1
-- EndRegion
SELECT [t0].[TrackId], [t0].[UserId], (
    SELECT [t3].[FolloweeId]
    FROM (
        SELECT TOP (1) [t2].[FolloweeId]
        FROM (
            SELECT COUNT(*) AS [value], [t1].[FolloweeId]
            FROM [Followers] AS [t1]
            GROUP BY [t1].[FolloweeId]
            ) AS [t2]
        ORDER BY [t2].[value] DESC
        ) AS [t3]
    ) AS [MostPopularFolloweeId]
FROM [Tracks] AS [t0]
WHERE [t0].[TrackId] = @p0

That outputs the expected response, and should be a start to a cleaner query.
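Note that the generated query ranks followees across the whole table. If the result should be limited to the users that the track's owner follows, a correlated subquery may be a cleaner starting point. Here is a minimal sketch, assuming the table and column names from the question and using an in-memory SQLite database as a stand-in for the real server (so it uses LIMIT 1 where T-SQL would use TOP (1)); the alias names are mine:

```python
import sqlite3

# In-memory SQLite database loaded with the sample data from the question.
# SQLite is an assumption made here so the sketch can be run anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY);
CREATE TABLE tracks  (id INTEGER PRIMARY KEY, user_id INTEGER, song_title TEXT);
CREATE TABLE follows (followee_id INTEGER, follower_id INTEGER);
INSERT INTO users   VALUES (1), (2), (3), (4);
INSERT INTO tracks  VALUES (1, 1, 'Some song title');
INSERT INTO follows VALUES (2, 1), (3, 1), (4, 1), (3, 4), (4, 2), (4, 3);
""")

QUERY = """
SELECT t.id, t.user_id,
       (SELECT f.followee_id
        FROM follows AS f                      -- users the track's owner follows
        JOIN follows AS pop                    -- all followers of each of those users
          ON pop.followee_id = f.followee_id
        WHERE f.follower_id = t.user_id
        GROUP BY f.followee_id
        ORDER BY COUNT(*) DESC, f.followee_id  -- most followers first, lowest id on ties
        LIMIT 1) AS most_popular_followee
FROM tracks AS t
WHERE t.id = ?
"""

rows = conn.execute(QUERY, (1,)).fetchall()
print(rows)  # [(1, 1, 4)]
```

The tie-break on followee_id is an arbitrary choice to make the result deterministic; the question does not say what should happen when two followees have the same number of followers.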

This sounds like an aggregation query with row_number(). I'm a little confused on how all the joins come together:

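select t.*
from (select t.id, t.user_id, f.followee_id, count(*) as cnt,
             row_number() over (partition by t.id order by count(*) desc) as seqnum
      from tracks t join
           follows f              -- the followees of the track's user
           on f.follower_id = t.user_id join
           follows p              -- every follower of each of those followees
           on p.followee_id = f.followee_id
      group by t.id, t.user_id, f.followee_id
     ) t
where seqnum = 1;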
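To check the row_number() approach against the sample data, here is a rough sketch that runs it in an in-memory SQLite database (an assumption; window functions need SQLite 3.25 or later). It joins follows twice, once to find the users the track's owner follows and once to count each of those users' followers, and it computes the counts in an inner subquery before ranking them. The second track, owned by user 4, is a hypothetical row added only to show that the ranking is done per track:

```python
import sqlite3

# Sample data from the question, plus one hypothetical extra track (id 2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY);
CREATE TABLE tracks  (id INTEGER PRIMARY KEY, user_id INTEGER, song_title TEXT);
CREATE TABLE follows (followee_id INTEGER, follower_id INTEGER);
INSERT INTO users   VALUES (1), (2), (3), (4);
INSERT INTO tracks  VALUES (1, 1, 'Some song title'),
                           (2, 4, 'Another song title');  -- hypothetical row
INSERT INTO follows VALUES (2, 1), (3, 1), (4, 1), (3, 4), (4, 2), (4, 3);
""")

QUERY = """
SELECT id, user_id, followee_id
FROM (SELECT c.*,
             ROW_NUMBER() OVER (PARTITION BY c.id
                                ORDER BY c.cnt DESC, c.followee_id) AS seqnum
      FROM (SELECT t.id, t.user_id, f.followee_id, COUNT(*) AS cnt
            FROM tracks AS t
            JOIN follows AS f   ON f.follower_id = t.user_id        -- owner's followees
            JOIN follows AS pop ON pop.followee_id = f.followee_id  -- their followers
            GROUP BY t.id, t.user_id, f.followee_id) AS c
     ) AS ranked
WHERE seqnum = 1
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 1, 4), (2, 4, 3)]
```

User 4 follows only user 3, so track 2 gets followee 3 even though user 4 has more followers overall.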

