
Many (many) SQL JOINs vs Multiple queries

I'm here to ask a question that many of you have probably already asked yourselves. I am creating a PHP website, and everything was running smoothly until I decided to populate my database with some test data (real data; once the application is used for real, there will be even more of it). Most things still work fine, but one particular (and really important) feature now takes three to four seconds to execute, and most of this time is spent in the MySQL server.

Here's the deal: I'm building an application for a school, and it needs to have all the schedules and lessons for every day, every person, every room, every class. The structure of the database is done, the indexes are created, etc. The problem is that since all this data is relational (and can be spread across many tables), one query to get it all might look like this:

SELECT field1, field2, etc
FROM schedules AS su
LEFT JOIN schedules_lessons AS sul
    ON sul.ID_SCHEDULE = su.ID
LEFT JOIN schedules_lessons_teachers AS sult
    ON sult.ID_LESSON = sul.ID
LEFT JOIN users AS u
    ON u.ID = sult.ID_TEACHER
LEFT JOIN schedules_periods AS sup
    ON sup.ID_SCHEDULE = su.ID
LEFT JOIN schedules_periods AS sulp
    ON sulp.ID_SCHEDULE = sul.ID_SCHEDULE AND sulp.period = sul.period
LEFT JOIN schools AS s
    ON s.ID = su.ID_SCHOOL
LEFT JOIN schools_buildings AS sb
    ON sb.ID_SCHOOL = s.ID
LEFT JOIN schools_rooms AS sr
    ON sr.ID = sul.ID_ROOM
LEFT JOIN schools_classes AS sc
    ON sc.ID = sul.ID_CLASS

Yeah, that's a lot of joins, I know. My question is: how should I find the best balance between the number of joins and the number of queries? I feel like this could be improved a lot, but I'm not sure how to achieve it.

Most of the tables will have fewer than 200 records; only the lessons table can have many more. Its minimum is around 5k records, and the maximum could be something like 30k or more.

If you need all of this information and the tables are properly indexed, then your join query should be a very reasonable way to extract the data. You can check whether the indexes are being used by adding EXPLAIN before the query.
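To illustrate the index check, here is a minimal sketch that uses Python's built-in sqlite3 module as an in-memory stand-in for the MySQL server. The table and column names come from the question; the index name idx_lessons_schedule and the toy data are assumptions. SQLite spells the command EXPLAIN QUERY PLAN, whereas in MySQL you would put plain EXPLAIN in front of the SELECT, and the output format differs between the two.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL schema; the table and column
# names are borrowed from the question, the index name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules_lessons ("
    "ID INTEGER PRIMARY KEY, ID_SCHEDULE INTEGER, period INTEGER)"
)
conn.executemany(
    "INSERT INTO schedules_lessons (ID_SCHEDULE, period) VALUES (?, ?)",
    [(i % 50, i % 8) for i in range(5000)],
)

QUERY = "SELECT ID FROM schedules_lessons WHERE ID_SCHEDULE = 7"


def plan(sql):
    """Return the planner's description of how it will run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no index on ID_SCHEDULE yet: full table scan
conn.execute(
    "CREATE INDEX idx_lessons_schedule ON schedules_lessons (ID_SCHEDULE)"
)
plan_after = plan(QUERY)   # the planner can now search through the index

print(plan_before)
print(plan_after)
```

The first plan should report a scan of the whole table and the second should name the index. If a join column in the real query shows up as a full scan in MySQL's EXPLAIN output, that column is a likely candidate for a missing index.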

When you say "most of [the] time is spent in MySQL server", are you taking into account that returning thousands of rows takes time? You might try running the same query, but replacing the select . . . with select count(*) to see what the underlying query performance is. Another way would be to add order by <something> limit 1 to the existing query -- the order by has to fully process the query before returning a result.
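The count(*) comparison can be sketched as follows, again with Python's sqlite3 module standing in for MySQL and with made-up toy data (one schedule, 8 periods, 30 lessons). Only the sul and sup joins from the question are reproduced. Note that in this toy data, joining schedules_periods on ID_SCHEDULE alone pairs every lesson with every period of the schedule; whether the real data behaves the same way is an assumption worth checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedules (ID INTEGER PRIMARY KEY);
    CREATE TABLE schedules_lessons (
        ID INTEGER PRIMARY KEY, ID_SCHEDULE INTEGER, period INTEGER);
    CREATE TABLE schedules_periods (
        ID INTEGER PRIMARY KEY, ID_SCHEDULE INTEGER, period INTEGER);
""")
conn.execute("INSERT INTO schedules (ID) VALUES (1)")
# Toy data: one schedule with 8 periods and 30 lessons.
conn.executemany(
    "INSERT INTO schedules_periods (ID_SCHEDULE, period) VALUES (1, ?)",
    [(p,) for p in range(8)],
)
conn.executemany(
    "INSERT INTO schedules_lessons (ID_SCHEDULE, period) VALUES (1, ?)",
    [(i % 8,) for i in range(30)],
)

# Same join shape as the `sul` and `sup` joins in the question.
JOINS = """
    FROM schedules AS su
    LEFT JOIN schedules_lessons AS sul ON sul.ID_SCHEDULE = su.ID
    LEFT JOIN schedules_periods AS sup ON sup.ID_SCHEDULE = su.ID
"""

# Full result: every row is built and handed to the client.
rows = conn.execute("SELECT sul.ID, sup.period " + JOINS).fetchall()
# Count only: the joins still run, but a single row comes back.
(row_count,) = conn.execute("SELECT COUNT(*) " + JOINS).fetchone()

print(len(rows), row_count)  # prints "240 240"
```

Here 30 lessons turn into 240 result rows. If the count(*) version of the real query is fast while the full query is slow, the time is probably going into building and transferring rows rather than into the joins themselves.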

Finally, if this only started to be a problem, what has changed since it worked the way you want it to?

I'm not a database expert, but maybe it makes sense to only query the information from the database you currently need in your app or web page. This should be possible in a reasonably short time, I guess. The rest can then be queried from the database when it's actually needed.

Please note that the database server has to combine the rows from all the joined tables, and the intermediate result can grow large. If your server has too little memory, it might have difficulty processing it. (Although that is probably not the case in your scenario...)

As much as possible you should let the database handle the joins and avoid making more queries than necessary. In theory this should be optimal. Your query seems fine provided all the join fields are indexed. The stated volumes are nothing spectacular and response times should be fine (once again provided all indexes are created). Bear in mind that you should rarely if ever have queries that return many records (an exception being reports of course) - in the application you should control this with pagination.
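A minimal sketch of the pagination idea, assuming a hypothetical page size of 20 and using Python's sqlite3 module with toy data in place of the real MySQL database. The LIMIT ... OFFSET ... clause is written the same way in MySQL; the fetch_page helper is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules_lessons ("
    "ID INTEGER PRIMARY KEY, ID_SCHEDULE INTEGER, period INTEGER)"
)
# Toy data: 95 lessons that all belong to schedule 1.
conn.executemany(
    "INSERT INTO schedules_lessons (ID_SCHEDULE, period) VALUES (?, ?)",
    [(1, i % 8) for i in range(95)],
)

PAGE_SIZE = 20  # hypothetical page size


def fetch_page(page):
    """Return one page of lessons; `page` is numbered from 1."""
    offset = (page - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT ID, period FROM schedules_lessons "
        "WHERE ID_SCHEDULE = ? "
        "ORDER BY ID "           # a stable order keeps pages consistent
        "LIMIT ? OFFSET ?",
        (1, PAGE_SIZE, offset),
    ).fetchall()


first_page = fetch_page(1)
last_page = fetch_page(5)
print(len(first_page), len(last_page))  # prints "20 15"
```

The application then only ever receives one page of rows per request, no matter how large the lessons table grows.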

