简体   繁体   中英

MySQL: Matching records that have x consecutive dates available between two dates

Background / Application

I have a MySQL database containing a table of rentable properties and a table of bookings for these properties. There is also a search feature for finding available properties between two provided dates. When searching, the user can enter the start date, the amount of days they wish to stay, and a date flexibility of up to +/- 7 days. A booking can start on the same day as another booking ends (party 1 leaves in the morning, party 2 arrives in the evening).

I am having difficulty implementing the flexibility feature efficiently.

Schema

CREATE TABLE IF NOT EXISTS `property` (
    `id` bigint(20) NOT NULL AUTO_INCREMENT,
    `name` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE IF NOT EXISTS `property_booking` (
    `id` bigint(20) NOT NULL AUTO_INCREMENT,
    `property_id` bigint(20) DEFAULT NULL,
    `name` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
    `date_start` date DEFAULT NULL,
    `date_end` date DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Sample Data

INSERT INTO `property` (`name`) 
VALUES ('Property 1'), ('Property 2'), ('Property 3');

INSERT INTO `property_booking` (`property_id`,`name`,`date_start`,`date_end`) 
VALUES (1, 'Steve', '2011-03-01', '2011-03-08'), 
(2, 'Bob', '2011-03-13', '2011-03-20'), 
(3, 'Jim', '2011-03-16', '2011-03-23');

Sample Scenario

The user selects that they want to start their stay on 2011-03-10, they want to stay for 7 days, and they have a flexibility of +/- 2 days. I have compiled an image that visualises the data and parameters below. (Red: Booking 1, Green: Booking 2, Stripes: Booking 3, Blue: Date range (2011-03-10, + 7 days and +/- 2 days flexibility))

Expected Result

Property 1 (Bookings available throughout date range)
Property 3 (Bookings available starting on 2011-03-08 or 2011-03-09)

Current Method

My current query checks for overlap for all 7 day date ranges within the total searchable date range, like this:

SELECT p.`id`, p.`name` 
FROM `property` p 
WHERE (NOT (EXISTS (SELECT p2.`name` FROM `property_booking` p2 WHERE (p2.`property_id` = p.`id` AND '2011-03-10' < DATE_SUB(p2.`date_end`, INTERVAL 1 DAY) AND '2011-03-17' > DATE_ADD(p2.`date_start`, INTERVAL 1 DAY))))) 
OR (NOT (EXISTS (SELECT p3.`name` FROM `property_booking` p3 WHERE (p3.`property_id` = p.`id` AND '2011-03-11' < DATE_SUB(p3.`date_end`, INTERVAL 1 DAY) AND '2011-03-18' > DATE_ADD(p3.`date_start`, INTERVAL 1 DAY))))) 
OR (NOT (EXISTS (SELECT p4.`name` FROM `property_booking` p4 WHERE (p4.`property_id` = p.`id` AND '2011-03-09' < DATE_SUB(p4.`date_end`, INTERVAL 1 DAY) AND '2011-03-16' > DATE_ADD(p4.`date_start`, INTERVAL 1 DAY))))) 
OR (NOT (EXISTS (SELECT p5.`name` FROM `property_booking` p5 WHERE (p5.`property_id` = p.`id` AND '2011-03-12' < DATE_SUB(p5.`date_end`, INTERVAL 1 DAY) AND '2011-03-19' > DATE_ADD(p5.`date_start`, INTERVAL 1 DAY)))))
OR (NOT (EXISTS (SELECT p6.`name` FROM `property_booking` p6 WHERE (p6.`property_id` = p.`id` AND '2011-03-08' < DATE_SUB(p6.`date_end`, INTERVAL 1 DAY) AND '2011-03-15' > DATE_ADD(p6.`date_start`, INTERVAL 1 DAY)))));

On the sample dataset, it's reasonably quick, but on much larger datasets it's going to get pretty sluggish, even more so when you build the full +/- 7 day flexibility.

Does anyone have any suggestions as to how this query could be better written?

Ok, here's a tricky answer for a tricky question...

SELECT * FROM property AS p
LEFT JOIN  
(
  SELECT property_id, DATEDIFF(MAX(date_end),20110308) AS startblock, 
      DATEDIFF(20110319,MIN(date_start))-1 AS endblock
  FROM property_booking AS pb
  WHERE date_start < 20110319 || date_end >= 20110308 
  GROUP BY property_id
  HAVING LEAST(startblock,endblock) > 4
) AS p2 ON p.id = p2.property_id 
WHERE p2.property_id IS NULL;

The subquery selects all the properties that are not eligible. The LEFT JOIN with IS NULL basically works out the exclusion (negation on the ineligible properties)

  • 20110308 is the desired start date -2 days ( because +/-2 day flexibility)
  • 20110319 is the desired end date +2 days
  • The number 4 in the HAVING LEAST(startblock,endblock) > 4 in twice your +/- number (2*2)

It took me a while to work it out (but your question was interesting and I had time on my hand)

I've tested it with edge cases and it worked for all the test cases I've thrown at it...). The logic behind it is a bit odd but a good old pen and paper helped me work it out!

Edit

Unfortunately I realized that this will work for most cases but not all... (2 single day bookings at the very beginning and end of the lookup period makes a property unavailable even though it should be available).

The problem here is that you have to look up information that's not 'present' in the DB and reconstruct it from what data you have. Check out my comment on your question to see a better way to deal with the problem

I think this is what you're looking for:

   SELECT MAX( IF( (    b.date_start < '2011-03-08' + INTERVAL 7 DAY
                    AND b.date_end > '2011-03-08'), 1, 0)) AS is_booked,
          p.id,
          p.name
     FROM property p
LEFT JOIN property_booking b ON p.id = b.property_id
 GROUP BY p.id
   HAVING is_booked < 1

If you want to include the leeway, expand the MAX() aggregate to include the options:

   SELECT MAX( IF(    (    b.date_start < '2011-03-08' + INTERVAL 7 DAY
                       AND b.date_end > '2011-03-08')
                  AND (    b.date_start < '2011-03-08' + INTERVAL 7 DAY + INTERVAL 1 DAY
                       AND b.date_end > '2011-03-08' + INTERVAL 1 DAY)
                  AND (    b.date_start < '2011-03-08' + INTERVAL 7 DAY + INTERVAL 2 DAY
                       AND b.date_end > '2011-03-08' + INTERVAL 2 DAY), 1, 0)
             ) AS is_booked,
          p.id,
          p.name
     FROM property p
LEFT JOIN property_booking b ON p.id = b.property_id
 GROUP BY p.id
   HAVING is_booked < 1

If I'm understanding your question right, this GROUP BY query should cover it more efficiently than multiple subqueries.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM