
Complicated SQL Query involving multiple tables

In this query, I have to list pairs of players, with their playerID and playerName, who play for exactly the same teams. If one player plays for 3 teams, the other has to play for exactly the same 3 teams: no fewer, no more. If two players currently do not play for any team, they should also be included. The query should return (playerID1, playerName1, playerID2, playerName2) with no repetition: if player 1's info comes before player 2's, there should not be another tuple with player 2's info coming before player 1's.

For example, if player A plays for the Yankees and the Red Sox, and player B plays for the Yankees, Red Sox, and Dodgers, I should not get them. They both have to play for the Yankees and the Red Sox and no one else. Right now my query returns a pair whenever the two players share any team.

 player(playerID: integer, playerName: string)
 team(teamID: integer, teamName: string, sport: string)
 plays(playerID: integer, teamID: integer)

Right now the query I have is:

SELECT p1.playerID, p1.playerName, p2.playerID, p2.playerName
FROM player p1, player p2, plays
WHERE p1.teamID = p2.teamID AND teamID in.....

I am stuck on how to approach it after this. Any hints on how to approach this problem? Thanks for your time.

I think the easiest approach is to concatenate the teams together and just join on the results. Postgres provides the function string_agg() to aggregate strings:

select p1.playerId, p1.playerName, p2.playerId, p2.playerName
from (select pp.playerId, pp.playerName,
             -- '' for players with no team, so that two such players still compare equal
             coalesce(string_agg(cast(p.teamId as varchar(255)), ',' order by p.teamId), '') as teams
      from player pp left join
           plays p
           on p.playerId = pp.playerId
      group by pp.playerId, pp.playerName
     ) p1 join
     (select pp.playerId, pp.playerName,
             coalesce(string_agg(cast(p.teamId as varchar(255)), ',' order by p.teamId), '') as teams
      from player pp left join
           plays p
           on p.playerId = pp.playerId
      group by pp.playerId, pp.playerName
     ) p2
     -- p1.playerId < p2.playerId keeps each pair only once
     on p1.playerId < p2.playerId and p1.teams = p2.teams;
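
The idea of comparing concatenated team lists can also be sketched outside the database. The snippet below is only an illustration in plain Python, not part of the answer's SQL: the sample rows and the helper name `team_key` are made up. It builds one sorted, comma-joined key per player and pairs up players whose keys are identical.

```python
from itertools import combinations

# Hypothetical sample data (not from the question): players and (playerID, teamID) rows.
players = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}
plays = [(1, 10), (1, 20), (2, 20), (2, 10), (3, 10), (3, 20), (3, 30)]

def team_key(player_id):
    """Return the player's team IDs, sorted and joined into one comparable string."""
    teams = sorted(t for p, t in plays if p == player_id)
    return ",".join(str(t) for t in teams)

# One key per player; players without any team get the empty string.
keys = {pid: team_key(pid) for pid in players}

# Keep each pair once (lower ID first) when the two keys are identical.
pairs = [
    (a, players[a], b, players[b])
    for a, b in combinations(sorted(players), 2)
    if keys[a] == keys[b]
]
print(pairs)
```

Sorting before joining matters: without a fixed order, the same set of teams could produce two different strings. Players 4 and 5 have no teams, so both get the empty key and are paired with each other.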


You can do this without string_agg. The idea is to start with a list of all possible player combinations.

Then, join in the teams for the first player using left outer join, and join in the teams for the second using full outer join, matching on the team and on the second player's id from the driver subquery. The reason you need the driver subquery is to be sure that the id/name does not get lost in the full outer join:

select driver.playerId1, driver.playerId2
from (-- every pair of players, taken once (lower id first)
      select p1.playerId as playerId1, p1.playerName as playerName1,
             p2.playerId as playerId2, p2.playerName as playerName2
      from player p1 cross join
           player p2
      where p1.playerId < p2.playerId
     ) driver left outer join
     plays p1
     on p1.playerId = driver.playerId1 full outer join
     plays p2
     on p2.playerId = driver.playerId2 and
        p2.teamId = p1.teamId
group by driver.playerId1, driver.playerId2
having count(p1.playerId) = count(*) and
       count(p2.playerId) = count(*);

This joins two players on the team id (with the ordering ensuring that a pair is only considered once). It then says there is a match when all the rows for the two players have non-NULL team values. This is perhaps clearer with the equivalent having clause:

having sum(case when p1.playerid is null then 1 else 0 end) = 0 and
       sum(case when p2.playerid is null then 1 else 0 end) = 0;

The full outer join will produce NULL values when two players have teams that don't match. So, no NULL values mean that all the teams match.

This is an adaptation of my answer to a previous question of yours.

  1. Get all unique combinations of players using a triangular join:

     SELECT p1.playerID, p1.playerName, p2.playerID, p2.playerName
     FROM player p1
     INNER JOIN player p2 ON p1.playerID < p2.playerID

  2. Subtract the second player's team set from that of the first player and check if there are no rows in the result:

     NOT EXISTS (
       SELECT teamID FROM plays WHERE playerID = p1.playerID
       EXCEPT
       SELECT teamID FROM plays WHERE playerID = p2.playerID
     )

  3. Swap the sets, subtract and check again:

     NOT EXISTS (
       SELECT teamID FROM plays WHERE playerID = p2.playerID
       EXCEPT
       SELECT teamID FROM plays WHERE playerID = p1.playerID
     )

  4. Finally, apply both conditions to the result of the triangular join in Step 1:

     SELECT p1.playerID, p1.playerName, p2.playerID, p2.playerName
     FROM player p1
     INNER JOIN player p2 ON p1.playerID < p2.playerID
     WHERE NOT EXISTS (
             SELECT teamID FROM plays WHERE playerID = p1.playerID
             EXCEPT
             SELECT teamID FROM plays WHERE playerID = p2.playerID
           )
       AND NOT EXISTS (
             SELECT teamID FROM plays WHERE playerID = p2.playerID
             EXCEPT
             SELECT teamID FROM plays WHERE playerID = p1.playerID
           )
     ;
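
To try the double-EXCEPT query on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than the asker's database, and the sample rows are invented for illustration; the `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (playerID INTEGER PRIMARY KEY, playerName TEXT);
    CREATE TABLE plays  (playerID INTEGER, teamID INTEGER);
    INSERT INTO player VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E');
    -- A and B play for teams 10 and 20; C also plays for 30; D and E play for no team.
    INSERT INTO plays VALUES (1,10),(1,20),(2,10),(2,20),(3,10),(3,20),(3,30);
""")

query = """
SELECT p1.playerID, p1.playerName, p2.playerID, p2.playerName
FROM player p1
INNER JOIN player p2 ON p1.playerID < p2.playerID
WHERE NOT EXISTS (
        SELECT teamID FROM plays WHERE playerID = p1.playerID
        EXCEPT
        SELECT teamID FROM plays WHERE playerID = p2.playerID
      )
  AND NOT EXISTS (
        SELECT teamID FROM plays WHERE playerID = p2.playerID
        EXCEPT
        SELECT teamID FROM plays WHERE playerID = p1.playerID
      )
ORDER BY p1.playerID, p2.playerID
"""

# Each row is (playerID1, playerName1, playerID2, playerName2).
rows = conn.execute(query).fetchall()
print(rows)
```

With this data the query returns the pairs (A, B) and (D, E). C is left out because of the extra team, and D and E match each other because both set differences are empty when neither player has any team.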

