
MySQL - unique rows, corresponding to one of 3 tables only

The following query pulls the expected data, but the left join to the lnk_cat_isrc table (and through it to the catalogue table) returns repeated rows whenever more than one catalogue item shares the same isrc from the isrc table:

SELECT 
                isrc.ISRC,
                isrc.Track_Name,
                isrc.ArtistName,
                isrc.TitleVersion,
                isrc.Track_Time,
                `isrc_performer`.`PerformerName` ,
                `performer_category`.`PerformerCategory` ,
                `isrc_performer`.`PerformerRole` ,
                `isrc`.`isrc_ID`,
                `isrc_performer`.`Perf_ID`

        FROM `isrc`

        LEFT JOIN `isrc_performer` ON (isrc.isrc_ID = isrc_performer.isrc_ID)
        LEFT JOIN `performer_category` ON (performer_category.PerfCat_ID = isrc_performer.PerfCat_ID)
        LEFT JOIN `lnk_cat_isrc` ON (lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
        LEFT JOIN `catalogue` ON (catalogue.ID = lnk_cat_isrc.cat_id) 
        ORDER BY   isrc_ID     desc LIMIT 0 , 10;

I cannot use GROUP BY on isrc, because the isrc_performer table can hold more than one performer per isrc.

So the relations are as follows: several items in the catalogue table can share the same items from the isrc table. In turn, each isrc can have more than one entry in the isrc_performer table.

What I want is to display all the corresponding data from isrc_performer for each isrc, without repeating it for each item from the catalogue table. I also want to display the remaining "empty" isrcs (those that have no data in the isrc_performer table).

Can you give me any ideas?

PS: Although I'm not pulling any data from the catalogue table itself, I use it to search by catalogue number when the user defines search criteria for the $where_condition variable, so I need to keep it in the query, e.g. $where_condition = "catalogue.Catalogue LIKE '%test%' OR ISRC LIKE '%test%' OR Track_Name LIKE '%test%' OR ArtistName LIKE '%test%' OR TitleVersion LIKE '%test%' OR PerformerName LIKE '%test%' OR PerformerCategory LIKE '%test%' OR PerformerRole LIKE '%test%'";

------UPD:

Trying to represent graphically the possible variations in the relations between these 3 tables:

cat1 - isrc1 - performer1
     - isrc2 - performer1
             - performer2
             - performer3

cat2 - isrc2 - performer1
             - performer2
             - performer3
     - isrc3 - performer2
             - performer4

cat3 - isrc4
     - isrc1 - performer1

UPD (pics added)

Here are screenshots. As you can see in picture 1, there are 9 rows with the same isrc number, but the same 3 performers (Jason, David, Paul) are repeated.

[Picture 1]

This is because 3 different catalogue items contain this exact isrc, which has 3 different performers, as shown in picture 2:

[Picture 2]

1 (isrc) * 3 (catalogue) * 3 (performers) = 9 rows in the output
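To make the multiplication concrete, here is a minimal sketch that reproduces it. It uses Python's built-in sqlite3 module as a stand-in for MySQL and a cut-down schema; only the table and column names come from the query above, while the ISRC and catalogue values are made up, so treat it as an illustration rather than the real data:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption: the join
# behaviour shown here is standard SQL and does not depend on the engine).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE isrc (isrc_ID INTEGER PRIMARY KEY, ISRC TEXT);
    CREATE TABLE isrc_performer (Perf_ID INTEGER PRIMARY KEY, isrc_ID INTEGER, PerformerName TEXT);
    CREATE TABLE catalogue (ID INTEGER PRIMARY KEY, Catalogue TEXT);
    CREATE TABLE lnk_cat_isrc (cat_id INTEGER, isrc_ID INTEGER);

    -- Hypothetical sample data: one isrc, three performers, three catalogue items.
    INSERT INTO isrc VALUES (1, 'ISRC-0001');
    INSERT INTO isrc_performer VALUES (1, 1, 'Jason'), (2, 1, 'David'), (3, 1, 'Paul');
    INSERT INTO catalogue VALUES (1, 'CAT-A'), (2, 'CAT-B'), (3, 'CAT-C');
    INSERT INTO lnk_cat_isrc VALUES (1, 1), (2, 1), (3, 1);
""")

rows = conn.execute("""
    SELECT isrc.ISRC, isrc_performer.PerformerName
    FROM isrc
    LEFT JOIN isrc_performer ON isrc.isrc_ID = isrc_performer.isrc_ID
    LEFT JOIN lnk_cat_isrc ON lnk_cat_isrc.isrc_ID = isrc.isrc_ID
    LEFT JOIN catalogue ON catalogue.ID = lnk_cat_isrc.cat_id
""").fetchall()

# Every performer row is paired with every catalogue link: 3 * 3 = 9 rows,
# but only 3 distinct (ISRC, performer) pairs.
print(len(rows))
print(len(set(rows)))
```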

All I want is for the Performers grid to display only 3 rows for this isrc, one per performer.

---Rearranged the answer to put the "best" option at the top. But is all of this for naught? Without any data from lnk_cat_isrc or catalogue being returned, why does filtering on catalogue make a difference? We return every isrc regardless of any filtering, because it's a left join.

So this raises the question: given the sample data, what are the expected results?

Possibly more elegant (though I'm not sure it would be faster): move away from EXISTS and simply use DISTINCT in a subquery, so that the catalogue subquery always returns 1 row per isrc. This solves the one-to-many problem while keeping the left join, and therefore keeps the isrc records that fall outside the catalogue filter. It returns all isrc information, performer information if it exists, performer category information if it exists, and catalogue information if, and only if, it matches the catalogue filters.

SELECT isrc.ISRC
    , isrc.Track_Name
    , isrc.ArtistName
    , isrc.TitleVersion
    , isrc.Track_Time
    ,`isrc_performer`.`PerformerName` 
    ,`performer_category`.`PerformerCategory` 
    ,`isrc_performer`.`PerformerRole` 
    ,`isrc`.`isrc_ID`
    ,`isrc_performer`.`Perf_ID`
FROM `isrc`
LEFT JOIN `isrc_performer` 
  ON isrc.isrc_ID = isrc_performer.isrc_ID
LEFT JOIN `performer_category` 
  ON performer_category.PerfCat_ID = isrc_performer.PerfCat_ID
LEFT JOIN (SELECT distinct lnk_cat_isrc.isrc_ID
           FROM `lnk_cat_isrc` 
           INNER JOIN `catalogue` 
             ON catalogue.ID = lnk_cat_isrc.cat_id
           WHERE...) DCat
   ON DCat.isrc_ID = isrc.isrc_ID
ORDER BY   isrc_ID     desc 
LIMIT 0 , 10;
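A rough way to check this behaviour, again using Python's sqlite3 as a stand-in for MySQL. The sample rows, the '%CAT%' search term used in place of the elided WHERE clause, and the extra in_filtered_catalogue column are all assumptions added for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE isrc (isrc_ID INTEGER PRIMARY KEY, ISRC TEXT);
    CREATE TABLE isrc_performer (Perf_ID INTEGER PRIMARY KEY, isrc_ID INTEGER, PerformerName TEXT);
    CREATE TABLE catalogue (ID INTEGER PRIMARY KEY, Catalogue TEXT);
    CREATE TABLE lnk_cat_isrc (cat_id INTEGER, isrc_ID INTEGER);

    -- Hypothetical data.
    -- isrc 1: three performers, linked to three catalogue items.
    -- isrc 2: no performers and no catalogue link (an "empty" isrc).
    INSERT INTO isrc VALUES (1, 'ISRC-0001'), (2, 'ISRC-0002');
    INSERT INTO isrc_performer VALUES (1, 1, 'Jason'), (2, 1, 'David'), (3, 1, 'Paul');
    INSERT INTO catalogue VALUES (1, 'CAT-A'), (2, 'CAT-B'), (3, 'CAT-C');
    INSERT INTO lnk_cat_isrc VALUES (1, 1), (2, 1), (3, 1);
""")

pattern = "%CAT%"  # hypothetical search term, bound as a parameter

rows = conn.execute("""
    SELECT isrc.isrc_ID,
           isrc_performer.PerformerName,
           DCat.isrc_ID IS NOT NULL AS in_filtered_catalogue
    FROM isrc
    LEFT JOIN isrc_performer ON isrc.isrc_ID = isrc_performer.isrc_ID
    LEFT JOIN (SELECT DISTINCT lnk_cat_isrc.isrc_ID
               FROM lnk_cat_isrc
               INNER JOIN catalogue ON catalogue.ID = lnk_cat_isrc.cat_id
               WHERE catalogue.Catalogue LIKE ?) DCat
      ON DCat.isrc_ID = isrc.isrc_ID
    ORDER BY isrc.isrc_ID DESC, isrc_performer.Perf_ID
""", (pattern,)).fetchall()

# The derived table has at most one row per isrc, so performers are not multiplied.
for row in rows:
    print(row)
```

In this sketch isrc 1 comes back once per performer, and the unlinked isrc 2 still comes back as well. That is the point raised at the top of this answer: with a left join, the catalogue filter only decides whether DCat matched, not which isrc rows are returned.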

As you pointed out, the join is causing the problem. So eliminate the join and use EXISTS instead. DISTINCT would also work, since you're not selecting any values from catalogue, though EXISTS should be faster.

Fast, but doesn't include all isrc records (not sure why; the OR NOT EXISTS should bring them back in):

SELECT isrc.ISRC
     , isrc.Track_Name
     ,isrc.ArtistName
     ,isrc.TitleVersion
     ,isrc.Track_Time
     ,`isrc_performer`.`PerformerName` 
     ,`performer_category`.`PerformerCategory` 
     ,`isrc_performer`.`PerformerRole` 
     ,`isrc`.`isrc_ID`
     ,`isrc_performer`.`Perf_ID`
    FROM `isrc`
    LEFT JOIN `isrc_performer` 
      ON (isrc.isrc_ID = isrc_performer.isrc_ID)
    LEFT JOIN `performer_category` 
      ON (performer_category.PerfCat_ID = isrc_performer.PerfCat_ID)
    WHERE EXISTS (SELECT * 
                  FROM  `lnk_cat_isrc` 
                  INNER JOIN `catalogue` 
                    ON catalogue.ID = lnk_cat_isrc.cat_id
                   -- AND your other criteria
                  WHERE (lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
                  ) 
     OR NOT EXISTS (SELECT * 
                    FROM `lnk_cat_isrc` 
                    WHERE lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
    ORDER BY isrc_ID desc 
    LIMIT 0 , 10
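One possible reason some isrc records go missing with this form can be sketched on toy data (Python's sqlite3 standing in for MySQL; the rows and the '%test%' pattern are hypothetical, and whether this matches the real data is an assumption). An isrc that is linked only to catalogue items that fail the filter satisfies neither EXISTS nor NOT EXISTS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE isrc (isrc_ID INTEGER PRIMARY KEY, ISRC TEXT);
    CREATE TABLE isrc_performer (Perf_ID INTEGER PRIMARY KEY, isrc_ID INTEGER, PerformerName TEXT);
    CREATE TABLE catalogue (ID INTEGER PRIMARY KEY, Catalogue TEXT);
    CREATE TABLE lnk_cat_isrc (cat_id INTEGER, isrc_ID INTEGER);

    -- Hypothetical data.
    -- isrc 1: two performers, linked to two matching catalogue items.
    -- isrc 2: linked only to a non-matching catalogue item.
    -- isrc 3: not linked to any catalogue item.
    INSERT INTO isrc VALUES (1, 'ISRC-0001'), (2, 'ISRC-0002'), (3, 'ISRC-0003');
    INSERT INTO isrc_performer VALUES (1, 1, 'Jason'), (2, 1, 'David');
    INSERT INTO catalogue VALUES (1, 'test-1'), (2, 'test-2'), (3, 'other');
    INSERT INTO lnk_cat_isrc VALUES (1, 1), (2, 1), (3, 2);
""")

rows = conn.execute("""
    SELECT isrc.isrc_ID, isrc_performer.PerformerName
    FROM isrc
    LEFT JOIN isrc_performer ON isrc.isrc_ID = isrc_performer.isrc_ID
    WHERE EXISTS (SELECT 1
                  FROM lnk_cat_isrc
                  INNER JOIN catalogue
                    ON catalogue.ID = lnk_cat_isrc.cat_id
                   AND catalogue.Catalogue LIKE ?
                  WHERE lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
       OR NOT EXISTS (SELECT 1
                      FROM lnk_cat_isrc
                      WHERE lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
    ORDER BY isrc.isrc_ID, isrc_performer.Perf_ID
""", ("%test%",)).fetchall()

# isrc 1 appears once per performer even though two catalogue items match;
# isrc 3 is kept by NOT EXISTS; isrc 2 is dropped by both conditions.
for row in rows:
    print(row)
```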

Or use SELECT DISTINCT: simple and straightforward, but slow:

 SELECT DISTINCT isrc.ISRC
     , isrc.Track_Name
     ,isrc.ArtistName
     ,isrc.TitleVersion
     ,isrc.Track_Time
     ,`isrc_performer`.`PerformerName` 
     ,`performer_category`.`PerformerCategory` 
     ,`isrc_performer`.`PerformerRole` 
     ,`isrc`.`isrc_ID`
     ,`isrc_performer`.`Perf_ID`
  FROM `isrc`
  LEFT JOIN `isrc_performer` 
    ON (isrc.isrc_ID = isrc_performer.isrc_ID)
  LEFT JOIN `performer_category` 
    ON (performer_category.PerfCat_ID = isrc_performer.PerfCat_ID)
  LEFT JOIN `lnk_cat_isrc` 
    ON (lnk_cat_isrc.isrc_ID = isrc.isrc_ID)
  LEFT JOIN `catalogue` 
    ON (catalogue.ID = lnk_cat_isrc.cat_id) 
   -- AND (other criteria on catalogue go here, because in a WHERE clause they would make your left joins behave like inner joins)
  ORDER BY isrc_ID desc 
  LIMIT 0 , 10;
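The remark about where the catalogue criteria go can be checked with a small sketch (Python's sqlite3 as a stand-in for MySQL; the data and the '%test%' pattern are made up for illustration). The same filter is applied once in the ON clause and once in a WHERE clause:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE isrc (isrc_ID INTEGER PRIMARY KEY, ISRC TEXT);
    CREATE TABLE catalogue (ID INTEGER PRIMARY KEY, Catalogue TEXT);
    CREATE TABLE lnk_cat_isrc (cat_id INTEGER, isrc_ID INTEGER);

    -- Hypothetical data: isrc 1 is in two matching catalogue items,
    -- isrc 2 only in a non-matching one, isrc 3 in none.
    INSERT INTO isrc VALUES (1, 'ISRC-0001'), (2, 'ISRC-0002'), (3, 'ISRC-0003');
    INSERT INTO catalogue VALUES (1, 'test-1'), (2, 'test-2'), (3, 'other');
    INSERT INTO lnk_cat_isrc VALUES (1, 1), (2, 1), (3, 2);
""")

pattern = ("%test%",)

# Filter in the ON clause: non-matching catalogue rows become NULLs,
# so every isrc survives and DISTINCT collapses the duplicates.
filter_in_on = conn.execute("""
    SELECT DISTINCT isrc.isrc_ID
    FROM isrc
    LEFT JOIN lnk_cat_isrc ON lnk_cat_isrc.isrc_ID = isrc.isrc_ID
    LEFT JOIN catalogue ON catalogue.ID = lnk_cat_isrc.cat_id
                       AND catalogue.Catalogue LIKE ?
    ORDER BY isrc.isrc_ID
""", pattern).fetchall()

# Filter in the WHERE clause: rows with a NULL catalogue are discarded,
# so the left joins effectively behave like inner joins.
filter_in_where = conn.execute("""
    SELECT DISTINCT isrc.isrc_ID
    FROM isrc
    LEFT JOIN lnk_cat_isrc ON lnk_cat_isrc.isrc_ID = isrc.isrc_ID
    LEFT JOIN catalogue ON catalogue.ID = lnk_cat_isrc.cat_id
    WHERE catalogue.Catalogue LIKE ?
    ORDER BY isrc.isrc_ID
""", pattern).fetchall()

print(filter_in_on)
print(filter_in_where)
```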
