Subquery has been wrong all this time what do I do?

Question

So I have the following table structure for a Sports Event system

TEAMS TABLE
team_id
game_id
team_name
team_logo

PLAYERS TABLE
player_id
team_id
player_name
player_mobile
player_email

So whenever a player submits a team registration, the details get saved in both tables. Events could be something like Cricket, Basketball, Netball, etc. Sometimes they don't fill in the players' details, and sometimes they resubmit their team, which means the same team name is submitted again.

So whenever I need to check the accurate details of the team list I have been using this:

SELECT team_id FROM `teams` WHERE `game_id`= 35 GROUP BY `team_name`

To get a list of the people in these teams, some of which share the same name, I was using this:

SELECT team_id, player_name FROM `player` WHERE team_id IN (SELECT team_id FROM `teams` WHERE `game_id`= 35 GROUP BY `team_name`) AND player_name IS NOT NULL AND player_name <> ''

The problem is that the top query gives me different results from the bottom one. What I need is a list of the current teams whenever I need it, with no duplicate teams. Then I need a list of the players of these teams.

Currently stumped :( Help me pls.

Answer 1

TL;DR

You can get the desired results with a JOIN and DISTINCT:

SELECT DISTINCT t.team_name, p.player_name
FROM teams AS t
    INNER JOIN Players AS p
        ON p.team_id = t.team_id;

FULL EXPLANATION

The following query is not deterministic, that is to say, you could run the same query on the same data multiple times and get different results:

SELECT team_id 
FROM `teams` 
WHERE `game_id`= 35 
GROUP BY `team_name`;

Many DBMS would not even allow this query to run. You have stated that some teams are duplicated, so consider the following dummy data:

team_id    team_name     game_id
------------------------------------
   1       The A-Team       35
   2       The A-Team       35
   3       The A-Team       35

When you group by team_name you end up with one group, so if we start with a valid query:

SELECT team_name
FROM `teams` 
WHERE `game_id`= 35 
GROUP BY `team_name`;

We would expect one result:

team_name
--------------
The A-Team

When you add team_id to the select list with no aggregate function, the query engine has to pick one value for team_id, but it has 3 different values to choose from, and none of them is more correct than any other. This is why anything in the select list must be contained in the group by (or be functionally dependent on something that is), or be part of an aggregate function.
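
One way to make that choice explicit is to wrap team_id in an aggregate. This is only a minimal sketch, and it assumes the lowest team_id per name is the registration you want to keep, which the question does not confirm:

```sql
-- One row per team name; MIN() makes the chosen team_id explicit and repeatable.
-- Assumption: the lowest team_id is the registration to keep (use MAX() for the highest).
SELECT MIN(team_id) AS team_id, team_name
FROM teams
WHERE game_id = 35
GROUP BY team_name;
```

With the dummy data above this returns team_id 1 for The A-Team on every run. Players saved under team_id 2 or 3 would still be missed if this were used in an IN subquery, so it fixes the non-determinism but not the duplicates themselves.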

The MySQL Docs state:

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment)
  FROM orders AS o, customers AS c
  WHERE o.custid = c.custid
  GROUP BY o.custid;

For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group.

This extension exists for a valid reason and can save some time. Consider the following query:

SELECT t.team_id, t.team_name, COUNT(p.player_id) AS Players
FROM teams AS t
    LEFT JOIN Players AS p
        ON p.team_id = t.team_id
GROUP BY t.team_id;

Here we can safely include team_name in the select list even though it is not in the group by: team_id is the primary key, so it is impossible to have two different values of team_name for a single team_id.
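
For comparison, here is a sketch of the same grouping written without relying on the MySQL extension, so it should also run on engines that enforce the standard rule. The alias player_count is my own naming, and counting p.player_id instead of * is a deliberate choice explained in the comments:

```sql
-- Every non-aggregated column in the select list is also named in GROUP BY.
-- COUNT(p.player_id) counts only matched player rows, so a team with
-- no players shows 0 (COUNT(*) would count the NULL-extended row as 1).
SELECT t.team_id, t.team_name, COUNT(p.player_id) AS player_count
FROM teams AS t
    LEFT JOIN Players AS p
        ON p.team_id = t.team_id
GROUP BY t.team_id, t.team_name;
```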

Anyway, I digress. The problem you are most likely having is that the value returned for team_id differs between your two queries, depending on the context of the query and the execution plan chosen.

You can get a distinct list of players and teams using DISTINCT:

SELECT DISTINCT t.team_name, p.player_name
FROM teams AS t
    INNER JOIN Players AS p
        ON p.team_id = t.team_id;

This is essentially a hack: while it removes duplicate records from the output, it does not resolve the underlying issues of duplicate records and a potentially sub-optimal data structure.
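
Applied to the case in the question, the same DISTINCT join can be restricted to one game and to non-blank player names. This is a sketch rather than a tested query; the filters are copied from the question, and the table names follow this answer (the question calls the second table `player`):

```sql
-- Distinct team/player pairs for game 35, skipping blank player names.
-- Joining first and de-duplicating on team_name afterwards means players
-- saved under any of the duplicate team_id values are still picked up.
SELECT DISTINCT t.team_name, p.player_name
FROM teams AS t
    INNER JOIN Players AS p
        ON p.team_id = t.team_id
WHERE t.game_id = 35
  AND p.player_name IS NOT NULL
  AND p.player_name <> ''
ORDER BY t.team_name, p.player_name;
```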

If it is not too late, I would reconsider your design and make a few changes. If team names are supposed to be unique, then make them unique with a unique constraint, so instead of working around duplicate entries, you prevent them completely.
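
As a rough sketch, the constraint could look like this on the current table. The constraint name is made up, and I am assuming a name only has to be unique within one game and that team_name is a VARCHAR column; use UNIQUE (team_name) alone if names must be unique across all games:

```sql
-- Hypothetical constraint name; assumes uniqueness per game, not globally.
-- The statement fails while duplicate (game_id, team_name) rows still exist,
-- so the existing duplicates have to be merged or deleted first.
ALTER TABLE teams
    ADD CONSTRAINT uq_teams_game_team_name UNIQUE (game_id, team_name);
```

After that, a resubmitted team is rejected with a duplicate-key error instead of silently creating a second row.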

You should probably be using junction tables for players and games, i.e. have your main tables:

Team (team_id, team_name, team_logo etc)
Game (game_id, game_name, etc)
Player (player_id, player_name, player_email, player_mobile etc)

Then tables to link them

Team_Game (team_id, game_id)
Team_Player (team_id, player_id)

This then allows one player to play for multiple teams, or one team to enter multiple events.
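
A minimal sketch of that layout as DDL follows. The column types, lengths and key choices are assumptions for illustration only, since the question does not give them:

```sql
-- Main tables (types and lengths are assumed, not taken from the question).
CREATE TABLE Team (
    team_id   INT AUTO_INCREMENT PRIMARY KEY,
    team_name VARCHAR(100) NOT NULL UNIQUE,
    team_logo VARCHAR(255)
);

CREATE TABLE Game (
    game_id   INT AUTO_INCREMENT PRIMARY KEY,
    game_name VARCHAR(100) NOT NULL
);

CREATE TABLE Player (
    player_id     INT AUTO_INCREMENT PRIMARY KEY,
    player_name   VARCHAR(100) NOT NULL,
    player_email  VARCHAR(255),
    player_mobile VARCHAR(30)
);

-- Junction tables: the composite primary key stops the same pair
-- from being linked twice.
CREATE TABLE Team_Game (
    team_id INT NOT NULL,
    game_id INT NOT NULL,
    PRIMARY KEY (team_id, game_id),
    FOREIGN KEY (team_id) REFERENCES Team (team_id),
    FOREIGN KEY (game_id) REFERENCES Game (game_id)
);

CREATE TABLE Team_Player (
    team_id   INT NOT NULL,
    player_id INT NOT NULL,
    PRIMARY KEY (team_id, player_id),
    FOREIGN KEY (team_id)   REFERENCES Team (team_id),
    FOREIGN KEY (player_id) REFERENCES Player (player_id)
);

-- Players of every team entered in game 35; no DISTINCT or GROUP BY needed
-- because the keys above already rule out duplicate rows.
SELECT t.team_name, p.player_name
FROM Team_Game AS tg
    INNER JOIN Team AS t         ON t.team_id = tg.team_id
    INNER JOIN Team_Player AS tp ON tp.team_id = t.team_id
    INNER JOIN Player AS p       ON p.player_id = tp.player_id
WHERE tg.game_id = 35;
```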

Answer 2

SELECT t.team_name, p.player_name
FROM player p
JOIN teams t
    ON t.team_id = p.team_id
WHERE t.game_id = 35 AND p.player_name IS NOT NULL AND p.player_name <> ''
GROUP BY t.team_name, p.player_name

You should add a unique constraint on the team_name column; that way you do not allow duplicate teams.

P.S. I did not test the query, but it should work.

2 answers

solution1
1 ACCEPTED 2019-06-06 21:15:24

solution2
0 2019-06-06 20:42:52
