
MySQL: Is there a way to select the first N records grouped by a column?

I am fairly new to MySQL, so I apologize if this is a dumb question. For simplicity, let's say I have two tables: patient and appointment. I want to get a given number of patients (50 in this example) and all of their appointments, with given filters applied.

Right now I am doing the following, which works but is quite slow:

SELECT
   t2.*
FROM
   (
      SELECT
         patient.patient_id
      FROM
         patient
         LEFT JOIN
            appointment
            ON appointment.patient_id = patient.patient_id
      /* a bunch of filters and other joins */
      GROUP BY
         patient.patient_id
      LIMIT 50
   )
   t1
   LEFT JOIN
      (
         SELECT
            patient.patient_id
         FROM
            patient
            LEFT JOIN
               appointment
               ON appointment.patient_id = patient.patient_id
         /* a bunch of filters and other joins */
      )
      t2
      ON t1.patient_id = t2.patient_id

I am wondering if there is a better way of doing this. It just feels like I am doing extra work since I am essentially doing the same thing in t1 and t2 and just grouping t1. Is it somehow possible to get rid of t1 and LIMIT t2 by the first 50 distinct patient_ids? Thoughts? Thank you!

Your code seems to be choosing the top 50 patients based on the values of patient_id. This relies on the implicit ordering of GROUP BY, which was deprecated and then removed in MySQL 8.0. You should have an explicit ORDER BY before the LIMIT.
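
As a rough sketch, the inner query that picks the patients could state its ordering explicitly. The aliases p and a are only for illustration, and sorting by patient_id is an assumption about which 50 patients you want:

```sql
-- Pick 50 patients deterministically: sort explicitly, then limit.
SELECT p.patient_id
FROM patient p
LEFT JOIN appointment a
       ON a.patient_id = p.patient_id
/* a bunch of filters and other joins */
GROUP BY p.patient_id
ORDER BY p.patient_id   -- assumed sort key; use whatever defines "first" for you
LIMIT 50;
```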

You can simplify your query by using DENSE_RANK(), which is available as a window function in MySQL 8.0 and later:

SELECT t.*
FROM (SELECT . . ,
             DENSE_RANK() OVER (ORDER BY p.patient_id) AS seqnum
      FROM patient p LEFT JOIN
           appointment a
           ON a.patient_id = p.patient_id
      /* a bunch of filters and other joins */
      -- no GROUP BY here: every appointment row is kept, and DENSE_RANK()
      -- gives all rows of the same patient the same number
     ) t
WHERE seqnum <= 50;
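
To see how this behaves, here is a small self-contained sketch for MySQL 8.0+ that you can run in a scratch schema. The columns name and visit_date and the sample rows are made up for illustration, and the limit is 2 instead of 50 so the result is easy to check by hand:

```sql
-- Hypothetical sample schema and data, for illustration only.
CREATE TABLE patient (
  patient_id INT PRIMARY KEY,
  name       VARCHAR(50)
);
CREATE TABLE appointment (
  appointment_id INT PRIMARY KEY,
  patient_id     INT,
  visit_date     DATE
);

INSERT INTO patient VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO appointment VALUES
  (10, 1, '2024-01-05'),
  (11, 1, '2024-02-09'),
  (12, 3, '2024-03-01');

-- Keep the first 2 patients (by patient_id) with all of their appointments.
SELECT t.patient_id, t.name, t.appointment_id, t.visit_date
FROM (SELECT p.patient_id, p.name, a.appointment_id, a.visit_date,
             -- rows of the same patient share one rank: 1, 1, 2, 3
             DENSE_RANK() OVER (ORDER BY p.patient_id) AS seqnum
      FROM patient p LEFT JOIN
           appointment a
           ON a.patient_id = p.patient_id
     ) t
WHERE t.seqnum <= 2
ORDER BY t.patient_id, t.appointment_id;
```

With this data the query returns three rows: both of Ann's appointments and a single row for Bob with NULL appointment columns. Cy has rank 3 and is filtered out.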

EDIT:

In older versions, you can use variables for the same effect:

SELECT t.*
FROM (SELECT t.*,
             -- same patient as the previous row: keep the number;
             -- new patient: remember it and move on to the next number
             (@rn := if(@p = patient_id, @rn,
                        if(@p := patient_id, @rn + 1, @rn + 1)
                       )
             ) as seqnum
      FROM (SELECT . . 
            FROM patient p LEFT JOIN
                 appointment a
                 ON a.patient_id = p.patient_id
            /* a bunch of filters and other joins */
            ORDER BY p.patient_id
           ) t CROSS JOIN
           (SELECT @p := '', @rn := 0) params
     ) t
WHERE seqnum <= 50;

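You can use a subquery to find the first 50 patients; a simple LEFT JOIN then does the rest. MySQL does not accept LIMIT inside an IN (...) subquery, so the subquery is joined as a derived table instead:

select
  p.*, a.*
from patient p
join (
    select patient_id from patient order by patient_id limit 50
  ) f on f.patient_id = p.patient_id
left join appointment a on a.patient_id = p.patient_id
where p.registration_date > '2019-01-01' -- example of extra filtering condition
-- a patient filter placed here can leave fewer than 50 patients;
-- move it into the subquery if you need 50 patients that match it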

Remember that if you want to filter on an appointment column, you need to place the condition in the ON clause rather than the WHERE clause; otherwise it will [most likely] defeat the LEFT JOIN, because rows with NULL appointment columns fail the WHERE test and the join then behaves like an inner join.
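
A minimal sketch of the difference, meant for a scratch schema. The visit_date column and the sample rows are assumptions made up for this example:

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE patient (patient_id INT PRIMARY KEY);
CREATE TABLE appointment (
  appointment_id INT PRIMARY KEY,
  patient_id     INT,
  visit_date     DATE
);
INSERT INTO patient VALUES (1), (2);
INSERT INTO appointment VALUES (10, 1, '2018-06-01'), (11, 2, '2019-06-01');

-- Filter in ON: patient 1 is kept, with a NULL appointment_id.
SELECT p.patient_id, a.appointment_id
FROM patient p
LEFT JOIN appointment a
       ON a.patient_id = p.patient_id
      AND a.visit_date >= '2019-01-01';

-- Filter in WHERE: patient 1 disappears, because the NULL visit_date
-- produced by the LEFT JOIN does not satisfy the comparison.
SELECT p.patient_id, a.appointment_id
FROM patient p
LEFT JOIN appointment a
       ON a.patient_id = p.patient_id
WHERE a.visit_date >= '2019-01-01';
```

The first query returns two rows (patient 1 with NULL, patient 2 with appointment 11); the second returns only the row for patient 2.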

Your query can be simplified as follows.

The basic requirement is to get 50 patient records along with their appointment details, and in this case an equijoin would help. An equijoin returns only the rows that have equal values in the specified columns. Be aware that it therefore drops patients who have no appointments, and that LIMIT 50 here caps the number of joined rows, not the number of distinct patients.

  SELECT
  t1.PATIENT_ID, t2.*
  from PATIENT t1, APPOINTMENT t2
  where t1.PATIENT_ID = t2.PATIENT_ID
  order by t1.PATIENT_ID
  LIMIT 50; -- at most 50 rows in total

