
Duplicated rows with subquery in select statement

I'm learning databases with SQL Server, and the duplicated rows in this query confuse me:

SELECT t.tno, t.tname, classes = (SELECT COUNT(*) FROM teaching te WHERE te.tno = t.tno)
FROM teacher t, teaching t2
WHERE t.tno = t2.tno AND t2.Llanguage = 'English'

If I use the code above, I get a result with many duplicated rows, like this:

+------+----------+-------+
|tno   |tname     |classes|
+------+----------+-------+
|T01   |t1        |3      |
|T01   |t1        |3      |
|T03   |t3        |4      |
|T03   |t3        |4      |
|T03   |t3        |4      |
|T04   |t4        |3      |
|T05   |t5        |3      |
|T05   |t5        |3      |
+------+----------+-------+

However, if I don't use the subquery, the duplication does not happen. Could someone help me?

Update 10/14: the data inside teacher and teaching:

teacher:
+------+----------+---+----------+--------------------+
|tno   |tname     |sex|birthday  |title               |
+------+----------+---+----------+--------------------+
|T01   |t1        |m  |1980-06-10|lecturer            |
|T02   |t2        |f  |1970-03-14|professor           |
|T03   |t3        |m  |1973-04-20|associate professor |
|T04   |t4        |m  |1981-08-30|lecturer            |
|T05   |t5        |f  |1975-07-20| associate professor|
|T06   |t6        |m  |1980-09-19|lecturer            |
+------+----------+---+----------+--------------------+
teaching:
+------+------+----------+----+
|tno   |cno   |Llanguage |Year|
+------+------+----------+----+
|T01   |801   |English   |2018|
|T01   |803   |Bilingual |2018|
|T01   |804   |English   |2018|
|T02   |801   |Bilingual |2018|
|T02   |804   |Bilingual |2018|
|T03   |802   |English   |2018|
|T03   |804   |Bilingual |2018|
|T03   |805   |English   |2018|
|T03   |806   |English   |2018|
|T04   |802   |Bilingual |2018|
|T04   |803   |English   |2018|
|T04   |805   |Bilingual |2018|
|T05   |801   |Bilingual |2018|
|T05   |802   |English   |2018|
|T05   |803   |English   |2018|
|T06   |803   |Bilingual |2018|
|T06   |806   |Bilingual |2018|
+------+------+----------+----+

As for the join style, the slides my teacher uses are very outdated (maybe 10 years old?). In fact, the code was written a few weeks ago, and I have since gotten used to using inner join. However, I am still confused about this problem.

You are getting duplicates, because a teacher can teach multiple English classes. You show each teacher as often as they have an English class.
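To see where the repeat counts come from, it may help to count the English rows per teacher directly. This is only a sketch against the teaching table from the question; the alias english_rows is made up for illustration.

```sql
-- How many rows of teaching match the English filter for each teacher?
-- Each matching row produces one output row in the original join.
SELECT te.tno, COUNT(*) AS english_rows
FROM teaching te
WHERE te.Llanguage = 'English'
GROUP BY te.tno
ORDER BY te.tno;

-- With the sample data from the question this gives:
--   T01  2
--   T03  3
--   T04  1
--   T05  2
```

Those numbers line up with how often each teacher appears in the duplicated result (T01 twice, T03 three times, T04 once, T05 twice).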

What the query is supposed to do (or what I assume it is supposed to do): Show all teachers that teach at least one English class with their overall class count. The condition "that teach at least one English class" is a condition that you can check after aggregating the teachers' rows, so place it in the HAVING clause.

select t.tno, t.tname, count(t2.tno) as classes
from teacher t
inner join teaching t2 on t.tno = t2.tno
group by t.tno, t.tname
having count(case when t2.Llanguage = 'English' then 1 end) > 0

SQL Server does require t.tname in the GROUP BY clause, because every non-aggregated column in the select list must be listed there. Standard SQL doesn't when tno is the table's primary key, because t.tno then uniquely identifies one teacher, which includes their name.
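If you would rather keep the scalar subquery from the question, another way to avoid the duplicates is to test for an English class with EXISTS instead of joining to teaching. This is a sketch of that alternative, not the query above rewritten; it assumes the same table and column names as the question.

```sql
-- teacher is never joined to teaching here, so each teacher appears at most once.
SELECT t.tno,
       t.tname,
       (SELECT COUNT(*) FROM teaching te WHERE te.tno = t.tno) AS classes
FROM teacher t
WHERE EXISTS (
    -- Keep only teachers with at least one English class.
    SELECT 1
    FROM teaching t2
    WHERE t2.tno = t.tno
      AND t2.Llanguage = 'English'
);
```

With the sample data this should return one row each for T01, T03, T04 and T05, with the overall class counts 3, 4, 3 and 3.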

And as has been mentioned by others: If you are taught to use comma-separated joins, then better quit your class, tutorial or book.

UPDATE: You have edited your request and still seem to be confused with the result of the join. To illustrate what is happening in the join:

teacher:

+------+----------+---+----------+--------------------+
|tno   |tname     |sex|birthday  |title               |
+------+----------+---+----------+--------------------+
|T01   |t1        |m  |1980-06-10|lecturer            |
|T02   |t2        |f  |1970-03-14|professor           |
+------+----------+---+----------+--------------------+

teaching:

+------+------+----------+----+
|tno   |cno   |Llanguage |Year|
+------+------+----------+----+
|T01   |801   |English   |2018|
|T01   |803   |Bilingual |2018|
|T01   |804   |English   |2018|
|T02   |801   |Bilingual |2018|
|T02   |804   |Bilingual |2018|
+------+------+----------+----+

Teachers joined with English classes:

+------+----------+------+-----------+--------------------+-------+-------+-------------+--------+
|t.tno |t.tname   |t.sex |t.birthday |t.title             |t2.tno |t2.cno |t2.Llanguage |t2.Year |
+------+----------+------+-----------+--------------------+-------+-------+-------------+--------+
|T01   |t1        |m     |1980-06-10 |lecturer            |T01    |801    |English      |2018    |
|T01   |t1        |m     |1980-06-10 |lecturer            |T01    |804    |English      |2018    |
+------+----------+------+-----------+--------------------+-------+-------+-------------+--------+

of which you are showing the first two columns plus the teaching count for the tno:

+------+----------+--------+
|t.tno |t.tname   |classes |
+------+----------+--------+
|T01   |t1        |3       |
|T01   |t1        |3       |
+------+----------+--------+
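If you want to reproduce the illustration yourself, a small script along these lines should do. The column types are assumptions (the question only shows names and values), and the brackets around Year are just a precaution because YEAR is also a function name in SQL Server.

```sql
-- Assumed column types; only the names and values come from the question.
CREATE TABLE teacher (
    tno      CHAR(3)     NOT NULL PRIMARY KEY,
    tname    VARCHAR(10) NOT NULL,
    sex      CHAR(1)     NULL,
    birthday DATE        NULL,
    title    VARCHAR(20) NULL
);

CREATE TABLE teaching (
    tno       CHAR(3)     NOT NULL REFERENCES teacher (tno),
    cno       CHAR(3)     NOT NULL,
    Llanguage VARCHAR(10) NOT NULL,
    [Year]    INT         NOT NULL
);

-- The two teachers used in the illustration above.
INSERT INTO teacher (tno, tname, sex, birthday, title) VALUES
    ('T01', 't1', 'm', '1980-06-10', 'lecturer'),
    ('T02', 't2', 'f', '1970-03-14', 'professor');

INSERT INTO teaching (tno, cno, Llanguage, [Year]) VALUES
    ('T01', '801', 'English',   2018),
    ('T01', '803', 'Bilingual', 2018),
    ('T01', '804', 'English',   2018),
    ('T02', '801', 'Bilingual', 2018),
    ('T02', '804', 'Bilingual', 2018);

-- Step 1: the joined rows, one per English class (two rows for T01).
SELECT t.*, t2.*
FROM teacher t
INNER JOIN teaching t2 ON t.tno = t2.tno
WHERE t2.Llanguage = 'English';

-- Step 2: the original query keeps those two rows and attaches the
-- overall class count (3) to each of them.
SELECT t.tno, t.tname,
       classes = (SELECT COUNT(*) FROM teaching te WHERE te.tno = t.tno)
FROM teacher t
INNER JOIN teaching t2 ON t.tno = t2.tno
WHERE t2.Llanguage = 'English';
```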
  1. Use explicit joins. Stick to LEFT OUTER or INNER joins while you are learning. Try not to use inner joins unless your tables enforce constraints.
  2. Never put a subquery in the select list. On large tables you'll overwhelm the system.
  3. The GROUP BY gives you control over what defines a unique row and over the aggregations.

-- Note: because of the WHERE filter on t2, classes counts English classes only.
SELECT t.tno, t.tname, classes = COUNT(*)
FROM teacher t
LEFT OUTER JOIN teaching t2 ON t.tno = t2.tno
WHERE t2.Llanguage = 'English' AND t2.tno IS NOT NULL
GROUP BY t.tno, t.tname
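If the goal is the overall class count per teacher (as in the question's subquery) rather than the English-only count, one possible variation on the points above is to aggregate teaching first and join the result. This is a sketch under that assumption; the aliases c and english_classes are made up for illustration.

```sql
-- Aggregate teaching once per teacher, then join: at most one row per tno.
SELECT t.tno, t.tname, c.classes
FROM teacher t
INNER JOIN (
    SELECT tno,
           COUNT(*) AS classes,  -- all classes of the teacher
           SUM(CASE WHEN Llanguage = 'English' THEN 1 ELSE 0 END) AS english_classes
    FROM teaching
    GROUP BY tno
) c ON c.tno = t.tno
WHERE c.english_classes > 0;     -- keep teachers with at least one English class
```

With the sample data this should list T01, T03, T04 and T05 once each, with 3, 4, 3 and 3 classes.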

