
Converting Previous Row Subquery into a Join in MySQL

I have policy information in a policy table. Each row represents the policy status at a certain time (the time is stored in an updated_on column). Each row belongs to a policy iteration (multiple policy rows can belong to a single policy iteration). I want to look at status changes from row to row within a policy iteration.

The policy table:

CREATE TABLE `policy` (
  `policy_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `policy_iteration_id` int(10) unsigned NOT NULL,
  `policy_status_id` tinyint(3) unsigned NOT NULL,
  `updated_on` datetime NOT NULL,
  PRIMARY KEY (`policy_id`),
  KEY `policy_iteration_idx` (`policy_iteration_id`),
  KEY `policy_status_updated_idx` (`policy_status_id`,`updated_on`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I want to be able to pass a date range, a "from" status, and a "to" status, and return the policy data for the "to" row. In pseudocode: within each policy iteration, find the rows that fall in the date range and have the "to" status, then look at the previous row in that same policy iteration to see whether it has the "from" status. If so, return the "to" row's information.

This is the query I came up with:

SELECT
    pto.policy_iteration_id,
    pto.policy_id,
    pto.updated_on
FROM
    policy AS pto
WHERE
    pto.updated_on >= $from_date AND
    pto.updated_on <= $to_date AND
    pto.policy_status_id = $to_status_id AND
    $from_status_id = 
    (SELECT
        pfrom.policy_status_id
    FROM
        policy AS pfrom
    WHERE
        pfrom.policy_iteration_id = pto.policy_iteration_id AND
        pfrom.policy_id < pto.policy_id
    ORDER BY
        pfrom.policy_id DESC
    LIMIT
        1);

This query works but is very inefficient, because the correlated subquery has to be executed for each candidate row. I'd like to make it more efficient by rewriting the subquery as one or more joins, but I can't figure out how.

Any help would be appreciated. Thanks!

UPDATE #1

To help explain what I'm trying to do, here is an example data set:

+-----------+---------------------+------------------+---------------------+
| policy_id | policy_iteration_id | policy_status_id | updated_on          |
+-----------+---------------------+------------------+---------------------+
|    323705 |               27230 |                6 | 2014-08-01 10:27:11 |
|    325028 |               27230 |                2 | 2014-08-01 17:12:28 |
|    323999 |               27591 |                2 | 2014-08-01 12:07:31 |
|    324008 |               27591 |                6 | 2014-08-01 12:10:23 |
|    325909 |               27591 |                2 | 2014-08-02 14:59:12 |
|    327116 |               29083 |                6 | 2014-08-04 12:09:16 | 
|    327142 |               29083 |                6 | 2014-08-04 12:19:00 |
|    328067 |               29083 |                2 | 2014-08-04 17:58:41 |
|    327740 |               29666 |                3 | 2014-08-04 16:16:55 |
|    327749 |               29666 |                3 | 2014-08-04 16:19:01 |
+-----------+---------------------+------------------+---------------------+
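For anyone who wants to reproduce this, the rows above could be loaded into the policy table from the question with an INSERT along these lines. This is only a sketch: it assumes the table is otherwise empty, so the explicit policy_id values do not collide with existing rows.

```sql
-- Sample rows copied from the example data set above.
-- Assumes the `policy` table defined earlier exists and is empty.
INSERT INTO policy
    (policy_id, policy_iteration_id, policy_status_id, updated_on)
VALUES
    (323705, 27230, 6, '2014-08-01 10:27:11'),
    (325028, 27230, 2, '2014-08-01 17:12:28'),
    (323999, 27591, 2, '2014-08-01 12:07:31'),
    (324008, 27591, 6, '2014-08-01 12:10:23'),
    (325909, 27591, 2, '2014-08-02 14:59:12'),
    (327116, 29083, 6, '2014-08-04 12:09:16'),
    (327142, 29083, 6, '2014-08-04 12:19:00'),
    (328067, 29083, 2, '2014-08-04 17:58:41'),
    (327740, 29666, 3, '2014-08-04 16:16:55'),
    (327749, 29666, 3, '2014-08-04 16:19:01');
```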

Now if I run the query where from_date = '2014-08-02 00:00:00', to_date = '2014-08-05 00:00:00', from_status = 6 and to_status = 2, the result should be:

+-----------+---------------------+------------------+---------------------+
| policy_id | policy_iteration_id | policy_status_id | updated_on          |
+-----------+---------------------+------------------+---------------------+
|    325909 |               27591 |                2 | 2014-08-02 14:59:12 |
|    328067 |               29083 |                2 | 2014-08-04 17:58:41 |
+-----------+---------------------+------------------+---------------------+

Each of those two rows has the selected "to_status" of 2, falls within the stated time period, and is immediately preceded (within its policy iteration) by a row with the "from_status" of 6.

I don't believe joining on a MAX policy id with a GROUP BY on policy_iteration_id will do the job, since that would return the most recent row of each iteration, not the row immediately before the row with the "to_status".

Any further help would be appreciated. Thanks!

You can use max(from.policy_id) where from.policy_id < to.policy_id to get the previous row for every "to" row as a set, instead of running a subquery per row.

select
    p.policy_iteration_id,
    p.policy_id,
    p.updated_on
from
    policy f
    inner join (
        -- for each candidate "to" row, find the id of the row just before it
        -- in the same policy iteration
        select
            p.policy_iteration_id,
            p.policy_id,
            p.updated_on,
            max(f.policy_id) as prev_policy_id
        from
            policy p
            inner join policy f
                on f.policy_iteration_id = p.policy_iteration_id and
                   f.policy_id < p.policy_id
        where
            p.updated_on >= $from_date and
            p.updated_on <= $to_date and
            p.policy_status_id = $to_status_id
        group by
            p.policy_iteration_id,
            p.policy_id,
            p.updated_on
    ) p
        on p.prev_policy_id = f.policy_id
where
    -- keep only "to" rows whose previous row has the "from" status
    f.policy_status_id = $from_status_id

In a database with window functions there are simpler ways of achieving this.
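As a rough sketch of what that could look like: MySQL 8.0 and later support window functions, so LAG() can fetch the previous row's status directly. This assumes MySQL 8.0+, that "previous" means the next-lower policy_id within the same policy_iteration_id (as in the question's subquery), and it reuses the same $-placeholders. The alias t and the column name prev_status_id are made up for illustration.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT
    t.policy_iteration_id,
    t.policy_id,
    t.updated_on
FROM (
    SELECT
        p.policy_iteration_id,
        p.policy_id,
        p.policy_status_id,
        p.updated_on,
        -- status of the previous row (by policy_id) in the same iteration;
        -- NULL for the first row of an iteration
        LAG(p.policy_status_id) OVER (
            PARTITION BY p.policy_iteration_id
            ORDER BY p.policy_id
        ) AS prev_status_id
    FROM
        policy AS p
) AS t
WHERE
    -- filter outside the derived table so that a previous row lying
    -- before $from_date is still visible to LAG()
    t.updated_on >= $from_date AND
    t.updated_on <= $to_date AND
    t.policy_status_id = $to_status_id AND
    t.prev_status_id = $from_status_id;
```

Note that the date filter has to stay in the outer query: applying it inside the derived table would hide previous rows that fall before the start of the range. The trade-off is that the window is computed over the whole table, so whether this is actually faster than the join version is something to check with EXPLAIN on real data.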

