
MySQL query design for booking database with limited amount of goods and heavy traffic

We're running a site for booking salmon fishing licenses. The site has no problem handling the traffic 364 days a year. The 365th day is when the license sale opens, and that's where the problem occurs. The servers are struggling more and more each year due to increased traffic, and we have to further optimize our booking query.

The licenses are divided into many different types (tbl_license_types), and each license type is connected to one or more fishing zones (tbl_zones).

Each license type can have a seasonal quota, which is a single value set as an integer field in tbl_license_types.

Each zone can have a daily quota, a weekly quota and a seasonal quota. The daily quota is the same for all days, and the seasonal quota is of course a single value; daily and seasonal are therefore integer fields in tbl_zones. The weekly quota, however, differs by week, so those values are stored in the separate table tbl_weekly_quotas.

Bookings can be for one or more consecutive dates, but are stored only as From_date and To_date in tbl_shopping_cart (and tbl_bookings). For each booking attempt made by a user, the quotas have to be checked against already accepted bookings in both tbl_shopping_cart and tbl_bookings.
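
For concreteness, a minimal sketch of the schema as described (only the table names come from the question; all column names and types are assumptions):

-- Minimal schema sketch; column names and types are assumptions.
CREATE TABLE tbl_license_types (
  license_type_id INT PRIMARY KEY,
  seasonal_quota  INT                -- quota for the whole season
);

CREATE TABLE tbl_zones (
  zone_id        INT PRIMARY KEY,
  daily_quota    INT,                -- same limit for every day
  seasonal_quota INT
);

CREATE TABLE tbl_weekly_quotas (
  zone_id      INT,
  week_no      INT,                  -- week of the season
  weekly_quota INT,
  PRIMARY KEY (zone_id, week_no)
);

-- tbl_shopping_cart is assumed to have the same shape; one row per
-- booked zone is a simplification used in the sketches below.
CREATE TABLE tbl_bookings (
  booking_id      INT AUTO_INCREMENT PRIMARY KEY,
  license_type_id INT,
  zone_id         INT,
  from_date       DATE,
  to_date         DATE,
  num_licenses    INT
);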

To be able to count/group by date, we use a view called view_season_calendar with a single column containing all the dates of the current season.
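
If the view itself ever shows up as a bottleneck, one common alternative (an assumption on my part, not necessarily better here) is a real, indexed calendar table populated once before the season opens; the season dates below are made up:

CREATE TABLE tbl_season_calendar (
  cal_date DATE PRIMARY KEY
);

-- Populate once per season (MySQL 8.0 recursive CTE; dates are examples).
INSERT INTO tbl_season_calendar (cal_date)
WITH RECURSIVE d (cal_date) AS (
  SELECT DATE '2020-05-01'
  UNION ALL
  SELECT cal_date + INTERVAL 1 DAY FROM d WHERE cal_date < DATE '2020-09-30'
)
SELECT cal_date FROM d;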

In the beginning we used a transaction in which we first ran a query to check the quotas, and if the quotas allowed it we would run a second query to insert the booking into tbl_bookings.

However, that produced a lot of deadlocks under relatively moderate traffic, so we redesigned it as a single query (pseudo-code):

INSERT INTO tbl_bookings (_booking_)
SELECT _booking_
WHERE _lowest_found_quota_ >= _requested_number_of_licenses_

where _lowest_found_quota_ is a ~330-line SELECT with multiple subqueries, joining the related tables multiple times in order to check all the quotas.
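
As a concrete illustration of the pattern, heavily simplified to a single quota check (all identifiers beyond the table names are assumptions, and @license_type, @zone, @from, @to and @n stand for the user's request):

INSERT INTO tbl_bookings (license_type_id, zone_id, from_date, to_date, num_licenses)
SELECT @license_type, @zone, @from, @to, @n
FROM tbl_license_types lt
WHERE lt.license_type_id = @license_type
  AND lt.seasonal_quota >= @n + (
        SELECT COALESCE(SUM(b.num_licenses), 0)
        FROM tbl_bookings b
        WHERE b.license_type_id = @license_type
      );
-- The real query repeats this idea for the daily, weekly and seasonal
-- zone quotas, and also counts tbl_shopping_cart, which is what makes
-- it ~330 lines long.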

Example: a user wants to book license type A, for zones 5 and 6, from 2020-05-19 to 2020-05-25. The system needs to:

  • count previous bookings of license type A against the license type A seasonal quota.
  • count previous bookings in zone 5 for each of the 7 dates against the zone 5 daily quota.
  • the same count for zone 6.
  • count previous bookings in zone 5 for each of the two weeks the dates fall in, against the zone 5 weekly quota.
  • the same count for zone 6.
  • count all previous bookings in zone 5 for the current season against the zone 5 seasonal quota.
  • the same count for zone 6.

If all are within quotas, insert the booking.
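
The per-date part of that check alone amounts to something like the following sketch (column names are assumptions; an empty result means zone 5's daily quota is not violated):

-- Dates in the requested range on which zone 5's daily quota would be
-- exceeded by adding @n licenses. The real check also counts rows in
-- tbl_shopping_cart the same way.
SELECT c.cal_date
FROM view_season_calendar c
JOIN tbl_zones z ON z.zone_id = 5
LEFT JOIN tbl_bookings b
       ON b.zone_id = 5
      AND c.cal_date BETWEEN b.from_date AND b.to_date
WHERE c.cal_date BETWEEN '2020-05-19' AND '2020-05-25'
GROUP BY c.cal_date, z.daily_quota
HAVING COALESCE(SUM(b.num_licenses), 0) + @n > z.daily_quota;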

As I said, this was working well earlier, but due to the higher traffic load we now need to optimize the query further. I have some thoughts on how to do this:

  1. Using isolation level READ UNCOMMITTED on each booking until the quotas for the requested zones/license type are nearly full, then falling back to the default REPEATABLE READ (see the sketch after this list). As long as a lot of the quota is left, the count doesn't need to be 100% correct. This should greatly reduce lock waits and deadlocks, right?
  2. Creating one or more views that keep a count of all bookings for each date, week, zone and license type, then using those views in the WHERE clauses of the insert.
  3. If doing no. 2, using READ UNCOMMITTED in the views. If the views report a relevant quota as nearly full, cancel the INSERT and retry with the design we're using today. (Hopefully traffic levels will have come down before the quotas fill up.)
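
Regarding item 1, the isolation level can be changed for just the next transaction, so the switch is cheap; deciding when a quota is "nearly full" would be application code:

-- Applies only to the next transaction in this session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
-- ... fast, approximate quota check + insert ...
COMMIT;

-- Once the application judges a quota to be nearly full:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
-- ... exact check + insert ...
COMMIT;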

I would greatly appreciate thoughts on how the query can be made as efficient as possible.

  • See if you can upgrade the AWS instance starting a day before the season opens, then downgrade after the rush. It's a small price to pay for what might be a plenty big performance boost.

  • Rather than the long complex query for counting, decrement some counters as you go; see the sketch after this list. (This may or may not help, so play around with the idea.)

  • Your web server has some limit on the number of connections it will handle; lower that limit rather than letting 2K users get into MySQL and stumble over each other. Think of what a grocery store is like when the aisles are so crowded that no one can finish shopping!

  • Be sure to use "transactions", but don't let them be too broad. If they encompass too many things, the traffic will degenerate to single file (and/or transactions will abort with deadlocks).

  • Do as much as you can outside of transactions -- such as collecting and checking user names/addresses, etc. If you do this after issuing the license, be ready to undo the license if something goes wrong. (This should be done in code, not via ROLLBACK.)
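
One way to realize the counter idea from the second bullet above (a sketch; tbl_quota_remaining is a hypothetical table seeded from the quotas before the sale opens, not something the site already has). The availability check and the decrement collapse into one atomic UPDATE:

CREATE TABLE tbl_quota_remaining (
  zone_id   INT,
  cal_date  DATE,
  remaining INT,
  PRIMARY KEY (zone_id, cal_date)
);

-- One statement per zone: decrements only the dates with enough left.
UPDATE tbl_quota_remaining
SET remaining = remaining - @n
WHERE zone_id = 5
  AND cal_date BETWEEN '2020-05-19' AND '2020-05-25'
  AND remaining >= @n;
-- If ROW_COUNT() equals the number of dates requested, every day fits;
-- otherwise undo (in code, per the bullet above).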

(More)

  • VIEWs are syntactic sugar; they do not provide any performance benefit or isolation. OTOH, if you make "materialized" views, there might be something useful there.

  • A long "History list" is a potential performance problem (especially CPU). This can happen when lots of connections are in the middle of a transaction at the same time -- each needs to hang onto its 'snapshot' of the dataset.

  • Wherever possible, terminate transactions as soon as possible -- even if you turn around and start a new one. An example in data warehousing is to do the 'normalization' before starting the main transaction. (This example probably does not apply to your app.)

  • Ponder having a background task computing the quotas. The hope is that the regular tasks can run faster by not having the computation inside their transactions.

  • A technique used in the reservation industry (and this sounds somewhat like your item 1): plow ahead with minimal locking. At the last moment, run the shortest possible transaction to make the reservation and verify that the room (plane seat, etc.) is still available. See the sketch after this list.

  • If the whole task can be split into (1) read some stuff, then (2) write (and re-read to verify that the thing is still available), then: if the read step is heavier than the write step, add more Slaves ('replicas') and use them for the read step, reserving the Master for the write step. Note that replicas are easy (and cheap) to add for a brief period and then toss.
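
A sketch of that last-moment pattern, reusing the hypothetical tbl_quota_remaining table from the earlier sketch (everything before this point runs without locks):

START TRANSACTION;

-- Lock only the handful of rows this booking touches, and re-check them.
SELECT cal_date, remaining
FROM tbl_quota_remaining
WHERE zone_id = 5
  AND cal_date BETWEEN '2020-05-19' AND '2020-05-25'
FOR UPDATE;

-- Application code: if any row has remaining < @n, ROLLBACK here.
-- Otherwise claim the quota and write the booking:
UPDATE tbl_quota_remaining
SET remaining = remaining - @n
WHERE zone_id = 5
  AND cal_date BETWEEN '2020-05-19' AND '2020-05-25';

INSERT INTO tbl_bookings (license_type_id, zone_id, from_date, to_date, num_licenses)
VALUES (@license_type, 5, '2020-05-19', '2020-05-25', @n);

COMMIT;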

Rate Per Second = RPS

Suggestions to consider for your AWS Parameter group:

innodb_lru_scan_depth=100              # from 1024, to conserve ~90% of the CPU cycles this function uses every second
innodb_flush_neighbors=2               # from 1, to reduce the time required to lower the innodb_buffer_pool_pages_dirty count when busy
read_rnd_buffer_size=128K              # from 512K, to reduce the handler_read_rnd_next RPS of 3,227
innodb_io_capacity=1900                # from 200, to use more of your SSD IOPS
log_slow_verbosity=query_plan,explain  # to make the slow query log more useful
have_symlink=NO                        # from YES, for some protection from ransomware

You will find these changes allow transactions to complete more quickly. For additional assistance, see my profile (Network profile) for contact info and free downloadable Utility Scripts to assist with performance tuning. On our FAQ page you will find "Q. How can I find JOINS or QUERIES not using indexes?" to help reduce your select_scan rate per hour (RPhr) of 1,173. Com_rollback averages 1 every ~2,700 seconds, which is usually correctable with a consistent read order in maintenance queries.
