
MySQL Query - SELECT 2 rows with 2 specific DISTINCT column values

My goal is, for each PID, to select 2 records with test_sname values of 'want' and 'want2' that occur on the same entry_date. I do this for the first 5 entry_dates that include both test_snames.

This is my query for accomplishing this:

queryBuilder = """select PID, test_sname, test_value, units, ref_range, entry_date from labs
   where PID=%s and (test_sname='want' or test_sname='want2') and entry_date in
   (select entry_date from labs where PID=%s and test_sname in ('want', 'want2')
   group by entry_date having count(*) = 2)
   order by entry_date limit 10;""" % (pid, pid)

It works as expected when an entry_date has exactly two rows with a test_sname of 'want' or 'want2'.

PID      |test_sname  |test_value  |units    |entry_date
10000000 | want       |         343 | U/L     | 2008-01-01 01:01:01
10000000 | want2      |      984.34 |         | 2008-01-01 01:01:01
10000000 | NA1        |          56 | %       | 2008-01-01 01:01:01
10000000 | NA2        |         420 | mg/dL   | 2008-01-01 01:01:01
10000000 | NA2        |         420 | mg/dL   | 2008-01-02 01:01:01

10000000 | want       |         343 | U/L     | 2008-01-02 01:01:01
10000000 | want2      |      984.34 |         | 2008-01-02 01:01:01
10000000 | NA1        |          26 | %       | 2008-01-02 01:01:01
10000000 | NA2        |         410 | mg/dL   | 2008-01-02 01:01:01
10000000 | NA2        |         455 | mg/dL   | 2008-01-02 01:01:01

Results of the query (which are correct):

PID      |test_sname  |test_value  |units    |entry_date
10000000 | want       |         343 | U/L     | 2008-01-01 01:01:01
10000000 | want2      |      984.34 |         | 2008-01-01 01:01:01
10000000 | want       |         343 | U/L     | 2008-01-02 01:01:01
10000000 | want2      |      984.34 |         | 2008-01-02 01:01:01

The problem comes when, for instance, there are multiple rows with a test_sname of 'want' on the same entry_date. The count for that date is then 3, so the having count(*) = 2 condition no longer matches. There are no results for data like this.

PID      |test_sname  |test_value  |units    |entry_date
11111111 | want       |         343 | U/L     | 2009-10-26 07:25:00
11111111 | want2      |      984.34 |         | 2009-10-26 07:25:00
11111111 | want       |        189 | U/L     | 2009-10-26 07:25:00
11111111 | NA1        |         50 | %       | 2009-10-26 07:25:00
11111111 | NA2        |         40 | mg/dL   | 2009-10-26 07:25:00
11111111 | NA3        |      84.55 |         | 2009-10-26 07:25:00
11111111 | NA4        |        4.5 | thou/uL | 2009-10-26 07:25:00
11111111 | NA5        |       14.6 | g/dL    | 2009-10-26 07:25:00
11111111 | NA6        |       0.96 | mg/dL   | 2009-10-26 07:25:00

11111111 | want       |         343 | U/L     | 2009-10-30 07:25:00
11111111 | want2      |      984.34 |         | 2009-10-30 07:25:00
11111111 | want       |        189 | U/L     | 2009-10-30 07:25:00
11111111 | NA1        |          6 | %       | 2009-10-30 07:25:00
11111111 | NA2        |         40 | mg/dL   | 2009-10-30 07:25:00
11111111 | NA3        |      84.55 |         | 2009-10-30 07:25:00
11111111 | NA4        |        4.5 | thou/uL | 2009-10-30 07:25:00
11111111 | NA5        |       14.6 | g/dL    | 2009-10-30 07:25:00
11111111 | NA6        |       0.96 | mg/dL   | 2009-10-30 07:25:00
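The failure described above can be reproduced in a small sketch. This assumes Python's built-in sqlite3 as a stand-in for MySQL (so it runs without a server) and a cut-down labs table; the helper name qualifying_dates is hypothetical. It compares the original having count(*) = 2 with one possible alternative, COUNT(DISTINCT test_sname) = 2, which counts the names present on a date rather than the rows.

```python
import sqlite3

# sqlite3 stands in for MySQL here (an assumption so the sketch is runnable);
# the GROUP BY / HAVING behaviour shown is the same in both.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labs (PID INTEGER, test_sname TEXT, test_value REAL, entry_date TEXT)"
)
conn.executemany(
    "INSERT INTO labs VALUES (?, ?, ?, ?)",
    [
        (11111111, "want", 343, "2009-10-26 07:25:00"),
        (11111111, "want2", 984.34, "2009-10-26 07:25:00"),
        (11111111, "want", 189, "2009-10-26 07:25:00"),
        (11111111, "NA1", 50, "2009-10-26 07:25:00"),
    ],
)


def qualifying_dates(having_clause):
    """Return the entry_dates that pass the given HAVING condition."""
    # having_clause is a fixed string from this script, never user input.
    sql = (
        "SELECT entry_date FROM labs "
        "WHERE PID = ? AND test_sname IN ('want', 'want2') "
        "GROUP BY entry_date HAVING " + having_clause
    )
    return [row[0] for row in conn.execute(sql, (11111111,))]


# Three matching rows on the date, so the row count is 3 and the date is dropped.
original = qualifying_dates("COUNT(*) = 2")
# Two distinct names on the date, so the date is kept.
distinct = qualifying_dates("COUNT(DISTINCT test_sname) = 2")
print(original)
print(distinct)
```

Note that this only fixes the choice of dates. The outer query would still return all three 'want'/'want2' rows for such a date, so reducing them to exactly two rows needs a further step.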

As a restriction, I tried putting a limit 2 in the subquery (I know that by itself won't fix the problem), but it gave the error below. I thought I had the most recent version of MySQL, so apparently I can't use limit in this kind of subquery.

This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
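A commonly used workaround for this error is to wrap the LIMIT query in a derived table, so that the IN subquery itself no longer contains LIMIT. The sketch below shows the shape of such a query for "the first 5 qualifying dates". It is run against sqlite3 as a stand-in for MySQL (sqlite3 does not have this restriction, so the run only checks the logic), the alias first_dates is made up for illustration, and COUNT(DISTINCT test_sname) = 2 is an assumption used in place of count(*) = 2 so that duplicate rows do not disqualify a date.

```python
import sqlite3

# Placeholders are "?" for sqlite3; with mysql-connector they would be "%s",
# passed as cursor.execute(sql, (pid, pid)) rather than via string formatting.
FIRST_DATES_SQL = """
SELECT PID, test_sname, test_value, entry_date
FROM labs
WHERE PID = ?
  AND test_sname IN ('want', 'want2')
  AND entry_date IN (
        SELECT entry_date FROM (
            SELECT entry_date
            FROM labs
            WHERE PID = ? AND test_sname IN ('want', 'want2')
            GROUP BY entry_date
            HAVING COUNT(DISTINCT test_sname) = 2
            ORDER BY entry_date
            LIMIT 5
        ) AS first_dates
  )
ORDER BY entry_date
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labs (PID INTEGER, test_sname TEXT, test_value REAL, entry_date TEXT)"
)

# Seven made-up dates; the 7th has no 'want2' row, so it must not qualify.
rows_in = []
for day in range(1, 8):
    entry_date = "2009-10-{:02d} 07:25:00".format(day)
    rows_in.append((11111111, "want", 343, entry_date))
    if day != 7:
        rows_in.append((11111111, "want2", 984.34, entry_date))
conn.executemany("INSERT INTO labs VALUES (?, ?, ?, ?)", rows_in)

pid = 11111111
rows = conn.execute(FIRST_DATES_SQL, (pid, pid)).fetchall()
dates = sorted({row[3] for row in rows})
print(len(rows), dates)
```

With the LIMIT applied to dates inside the derived table, the outer limit 10 from the original query is no longer needed to cap the number of dates.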

I realize there are multiple ways to fix this. I could select ALL the values and then programmatically take what I need with Python, but I'm looking for a MySQL query solution run through the Python mySQL-connector. I wouldn't complain about a Python solution though.
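For reference, the "select everything and filter in Python" route could look roughly like the sketch below. The function name pick_pairs and the row layout (PID, test_sname, test_value, entry_date) are assumptions for illustration, and keeping the first row seen for each name on a date is also an assumption, since the question does not say which duplicate should win.

```python
from collections import OrderedDict


def pick_pairs(rows, names=("want", "want2"), max_dates=5):
    """Pick one row per name for the first max_dates dates that have all names.

    rows: iterable of (PID, test_sname, test_value, entry_date) tuples for a
    single PID, assumed to be already sorted by entry_date.
    """
    by_date = OrderedDict()  # keeps date order on Python 3.4 as well
    for row in rows:
        sname, entry_date = row[1], row[3]
        if sname in names:
            # setdefault keeps only the first row seen per name per date
            by_date.setdefault(entry_date, {}).setdefault(sname, row)

    picked = []
    dates_used = 0
    for entry_date, found in by_date.items():
        if len(found) == len(names):  # every wanted name occurs on this date
            picked.extend(found[name] for name in names)
            dates_used += 1
            if dates_used == max_dates:
                break
    return picked


sample = [
    (11111111, "want", 343, "2009-10-26 07:25:00"),
    (11111111, "want2", 984.34, "2009-10-26 07:25:00"),
    (11111111, "want", 189, "2009-10-26 07:25:00"),
    (11111111, "NA1", 50, "2009-10-26 07:25:00"),
    (11111111, "want", 343, "2009-10-30 07:25:00"),  # no 'want2' on this date
]
result = pick_pairs(sample)
print(result)
```

The trade-off is that every 'want'/'want2' row for the PID is transferred to the client before being filtered.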

I am using Python v3.4.4 with mySQL-connector v2.1.3 and MySQL server v5.7.11.

Thanks for your time!

Consider using a running count of your grouping via a subquery. Then, filter wherever RowNo is 1 or 2. This way, you would not need to pass a parameter, as all PIDs will be handled. The query below assumes the labs table has a unique identifier, ID:

SELECT * 
FROM
   (SELECT PID, test_sname, test_value, units, ref_range, entry_date,    
           (SELECT count(*) FROM labs sub
            WHERE sub.test_sname in ('want', 'want2')
            AND sub.PID = labs.PID
            AND sub.entry_date = labs.entry_date
            AND sub.ID <= labs.ID) As RowNo
    FROM labs
    WHERE test_sname in ('want', 'want2')
   ) As dT
WHERE dT.RowNo <= 2

#  PID     test_sname   test_value      units   ref_range              entry_date   RowNo
#  10000000      want           33        U/L        4-40     2008-01-01 01:01:01       1
#  10000000     want2        98.34                            2008-01-01 01:01:01       2
#  10000000      want           33        U/L        4-40     2008-01-02 01:01:01       1
#  10000000     want2        98.34                            2008-01-02 01:01:01       2
#  11111111      want           33        U/L        4-40     2009-10-26 07:25:00       1
#  11111111     want2        98.34                            2009-10-26 07:25:00       2
#  11111111      want           33        U/L        4-40     2009-10-30 07:25:00       1
#  11111111     want2        98.34                            2009-10-30 07:25:00       2
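One thing worth checking with the running-count approach is row order. RowNo <= 2 keeps the first two 'want'/'want2' rows by ID on each date, so if two 'want' rows happen to come before the 'want2' row, both kept rows are 'want'. The sketch below runs the same running-count idea on sqlite3 (a stand-in for MySQL, with made-up IDs) and compares it with a hypothetical variant that also matches on test_sname and keeps RowNo = 1, that is, the first row per name per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labs (ID INTEGER PRIMARY KEY, PID INTEGER, "
    "test_sname TEXT, test_value REAL, entry_date TEXT)"
)
# Made-up ordering: both 'want' rows have lower IDs than the 'want2' row.
conn.executemany(
    "INSERT INTO labs VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11111111, "want", 343, "2009-10-26 07:25:00"),
        (2, 11111111, "want", 189, "2009-10-26 07:25:00"),
        (3, 11111111, "want2", 984.34, "2009-10-26 07:25:00"),
        (4, 11111111, "NA1", 50, "2009-10-26 07:25:00"),
    ],
)

# {extra} and {limit} are filled with fixed strings below, never user input.
RUNNING_COUNT_SQL = """
SELECT ID, test_sname FROM (
    SELECT ID, test_sname,
           (SELECT COUNT(*) FROM labs sub
            WHERE sub.test_sname IN ('want', 'want2')
              AND sub.PID = labs.PID
              AND sub.entry_date = labs.entry_date
              {extra}
              AND sub.ID <= labs.ID) AS RowNo
    FROM labs
    WHERE test_sname IN ('want', 'want2')
) AS dT
WHERE dT.RowNo <= {limit}
ORDER BY ID
"""

# Running count over all 'want'/'want2' rows of the date, keeping RowNo <= 2.
as_posted = conn.execute(
    RUNNING_COUNT_SQL.format(extra="", limit=2)
).fetchall()

# Variant: count within each test_sname and keep only the first row per name.
per_name = conn.execute(
    RUNNING_COUNT_SQL.format(
        extra="AND sub.test_sname = labs.test_sname", limit=1
    )
).fetchall()

print(as_posted)
print(per_name)
```

The per-name variant does not by itself check that both names occur on a date, and neither form limits the result to the first 5 dates, so those conditions would still have to be added.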
