简体   繁体   中英

SQL listagg and concat, remove duplicates

Hello

I'm currently working on a school projet, about the Tour de France, where I have to create a website using php, oracle etc., and I'm struggling with this part :


Here are the three table I am working with, each specifying :

  1. The runner (keyed on the runner id)
  2. The runner's nationality (keyed on the runner id and IOC code)
  3. The runner's participations (keyed on the runner id and the year)

在此处输入图片说明


I have these three tables and I wish to concatenate them in a certain format to deal with them with php without having to manually count each participation programmatically

val1data1|val2data1|val3data1;val1data2|val2data2|val3data2

It works sort of well but there is still one issue : If the runner have multiple participations, the nationality will be duplicated by the number of participation and the same goes for the participations if the runner ran under multiple nationality Here is an example of what's happening.

在此处输入图片说明

Here is my SQL Query :

SELECT N_COUREUR, NOM, PRENOM, ANNEE_PREM, ANNEE_NAISSANCE,
        LISTAGG(".concatAll('|','annee','n_equipe','n_sponsor','n_dossard','jeune','valide').",';') WITHIN GROUP (ORDER BY N_COUREUR,NOM, PRENOM, ANNEE_PREM, ANNEE_NAISSANCE) participations, 
        LISTAGG(".concatAll('|','code_cio','ANNEE_DEBUT','ANNEE_FIN').",';') WITHIN GROUP (ORDER BY N_COUREUR,NOM, PRENOM, ANNEE_PREM, ANNEE_NAISSANCE) nationalites 
        FROM TDF_COUREUR 
        LEFT JOIN tdf_parti_coureur using(n_coureur) 
        LEFT JOIN tdf_app_nation using (n_coureur) 
        GROUP BY (N_COUREUR,NOM, PRENOM, ANNEE_PREM, ANNEE_NAISSANCE)
        ORDER BY N_COUREUR;

/*returns the correct concatenation for readability : (CONCAT(val1,CONCAT(val2,val3)) etc*/
function concatAll($separator, ...$arr){
            $string="";
            $count = count($arr);
            for($i = 0; $i<$count-1;$i++){
                $string .= "CONCAT(".$arr[$i].",CONCAT('$separator',";
            }
            $string .= $arr[$count-1];
            for($i = 0; $i<2*($count-1);$i++){
                $string .= ")";
            }
            return $string;
        }

I tried to add CODE_CIO or ANNEE to either of the WITHIN GROUP but couldn't work it out, am I missusing WITHIN GROUP ?

A runner can have multiple nationalities and participate in multiple races. You join the two tables, although a particular nationality isn't linked to a particular race. Thus you are generating an undesired cartesian product. The problem is hence that you are joining entities that are not completely related.

Joining all involved tables and then aggregate the inflated intermediate result is a commom mistake. Instead aggregate first in order to get to the entities that you really want to join.

What you want to do is show a runner with their race list and their nationality list. Create these lists and then join them:

select
  c.n_coureur,
  c.nom,
  c.prenom,
  c.annee_prem,
  c.annee_naissance,
  pc.participations, 
  an.nationalites 
from tdf_coureur c
left join 
(
  select
    n_coureur,
    listagg(annee || '|' || n_equipe || '|' || n_sponsor || '|' || n_dossard || '|' || jeune || '|' || valide, ';') 
      within group (order by n_coureur, nom, prenom, annee_prem, annee_naissance) as participations
  from tdf_parti_coureur 
  group by n_coureur
) pc using(n_coureur)
left join 
(
  select
    n_coureur,
    listagg(code_cio || '|' || annee_debut || '|' || annee_fin, ';') 
      within group (order by n_coureur, nom, prenom, annee_prem, annee_naissance) as nationalites 
  from tdf_app_nation
  group by n_coureur
) an using(n_coureur)
order by c.n_coureur;

Thumbs up to Thorsten !

Here is the correct query (I just changed some small issues but it worked wonderfully !) :

select
    n_coureur,
    nom,
    prenom,
    annee_prem,
    annee_naissance,
    participations, 
    nationalites 
  from tdf_coureur c
  left join 
  (
    select
      n_coureur,
      listagg(annee || '|' || n_equipe || '|' || n_sponsor || '|' || n_dossard || '|' || jeune || '|' || valide, ';') 
        within group (order by n_coureur) as participations
    from tdf_parti_coureur 
    group by n_coureur
  ) pc using(n_coureur)
  left join 
  (
    select
      n_coureur,
      listagg(code_cio || '|' || annee_debut || '|' || annee_fin, ';') 
        within group (order by n_coureur) as nationalites 
    from tdf_app_nation
    group by n_coureur
  ) an using(n_coureur)
  order by n_coureur 

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM