Decreasing queries in MySQL with many one-to-many relationships (ORM)

I'm currently designing an application using PHP and MySQL, built on the Kohana framework. I'm making use of the built-in ORM and it has proved to be extremely useful. Everything works fine, but I'm very concerned about the number of queries being run on certain pages.

Setting
For example, there's a page on which you can view a category full of sections, which are in turn full of products. This is listed out in tabular format. Each product has (possibly) many attributes, flags, tier pricing breaks. This must all be represented in the table.

How many queries?
As far as queries are concerned: the category must query all the sections within it, and those sections must query all the products they contain. Not too bad, but each product must then query all its product attributes, tier pricing, and flags. So adding more products to a category multiplies the number of queries (since I'm currently relying primarily on the ORM). Having a few hundred products in a section will result in several hundred queries. Small queries, but that is still not good.

So far...
All the keys are indexed. I can pull all of the information with a single query (see edit below); however, as you can imagine, this results in a lot of redundant data spread across multiple rows for each product, one for every extra attribute, flag, etc.

I'm not opposed to ditching the ORM for the displaying part of the application and going with query building or even raw SQL.

The solution for this could actually be quite simple and I'm just ignorant of it right now, which would be a relief honestly. Or maybe it's not. I'm not sure. If any of my explanation is not adequate to understand the problem, just ask and I'll try to give a better example. (Edit: better example given, see below.)

Although, a side note that may have some relevance: while I always want the application designed as efficiently as possible, this isn't a site that's going to be hit dozens or hundreds of times a day. It's more of an administrative application, which probably won't be in use by more than a few individuals at once. I can't foresee too much reloading, as most of the editing of data on the page is done through AJAX. So, should I care as much if this page runs a couple hundred queries (fluctuating with how many products are in the currently viewed section) each time it is loaded? Just a side thought; even so, if it is possible to solve the main problem described above, I would prefer that.

Thank you very much!

EDIT
Based on a couple of answers, it seems I didn't explain myself adequately. So, let me post an example so you can see what's going on. Before the example, though, I should also make two clarifications: (1) there are also a couple of many-to-many relationships, and (2) you could possibly liken what I'm looking for to a crosstab query.

Let's simplify and say we have 3 main tables:

products (product_id, product_name, product_date_added)
product_attributes (product_attribute_id, product_id, value)
notifications (notification_id, notification_label)

And 1 pivot table: product_notifications (notification_id, product_id)

We're going to list all the products in a table. It's simple enough in the ORM to fetch all the products. So for each product we list the product_name and product_date_added. However, we also need to list out all of the product's attributes. There are 0 or more of these per product. We also have to show what notifications a product has, of which there are 0 or more as well. So at the moment, how it works is basically:

foreach ($products->find_all() as $product) // given that $products is an ORM object
{
   echo $product->product_id; // let's just pretend these are surrounded by HTML
   echo $product->product_name;
   foreach ($product->product_attributes->find_all() as $attribute) // one query per product
   {
       echo $attribute->value;
   }
   foreach ($product->notifications->find_all() as $notification) // another query per product
   {
       echo $notification->notification_label;
   }
}

This is oversimplified of course, but this is the principle I'm talking about. This works great already. However, as you can see, for each product it must query all of its attributes to get the appropriate collection of rows. The find_all() function will return the results of a query along the lines of: SELECT product_attributes.* FROM product_attributes WHERE product_id = '#', and similarly for the notifications. And it makes these queries for each product.
So the total number of queries is a few times the number of products in the database. Although this works well, it does not scale well, as it may potentially result in hundreds of queries.

If I perform a query to grab all the data in one query, along the lines of:

SELECT p.*, pa.*, n.*
FROM products p
LEFT JOIN product_attributes pa ON pa.product_id = p.product_id
LEFT JOIN product_notifications pn ON pn.product_id = p.product_id
LEFT JOIN notifications n ON n.notification_id = pn.notification_id

(Again, oversimplified.) This gets the data, but for each attribute and notification a product has, an extra row with redundant information will be returned.

For example, if I have two products in the database, one with 1 attribute and 1 notification and the other with 2 attributes and 2 notifications, it will return:

product_id, product_name, product_date_added, product_attribute_id, value, notification_id, notification_label
1, My Product, 10/10/10, 1, Color: Red, 1, Add This Product
2, Busy Product, 10/11/10, 2, Color: Blue, 1, Add This Product
2, Busy Product, 10/11/10, 2, Color: Blue, 2, Update This Product
2, Busy Product, 10/11/10, 3, Style: New, 1, Add This Product
2, Busy Product, 10/11/10, 3, Style: New, 2, Update This Product

Needless to say, that's a lot of redundant information. The number of rows returned per product is the number of attributes it has times the number of notifications it has.
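The redundancy in those rows can be folded away in PHP after the single query has run. A minimal sketch, assuming the column names from the example result above; `fold_product_rows` is a hypothetical helper, not part of Kohana's ORM, and the rows are hard-coded here instead of coming from MySQL:

```php
<?php
// Rows as the joined query in the example would return them
// (product_date_added omitted for brevity).
$rows = [
    ['product_id' => 1, 'product_name' => 'My Product',   'product_attribute_id' => 1, 'value' => 'Color: Red',  'notification_id' => 1, 'notification_label' => 'Add This Product'],
    ['product_id' => 2, 'product_name' => 'Busy Product', 'product_attribute_id' => 2, 'value' => 'Color: Blue', 'notification_id' => 1, 'notification_label' => 'Add This Product'],
    ['product_id' => 2, 'product_name' => 'Busy Product', 'product_attribute_id' => 2, 'value' => 'Color: Blue', 'notification_id' => 2, 'notification_label' => 'Update This Product'],
    ['product_id' => 2, 'product_name' => 'Busy Product', 'product_attribute_id' => 3, 'value' => 'Style: New',  'notification_id' => 1, 'notification_label' => 'Add This Product'],
    ['product_id' => 2, 'product_name' => 'Busy Product', 'product_attribute_id' => 3, 'value' => 'Style: New',  'notification_id' => 2, 'notification_label' => 'Update This Product'],
];

/**
 * Hypothetical helper: collapse joined rows into one entry per product.
 * Children are keyed by their own id, so repeated rows simply overwrite
 * the same slot instead of producing duplicates.
 */
function fold_product_rows(array $rows): array
{
    $products = [];
    foreach ($rows as $row) {
        $id = $row['product_id'];
        if (!isset($products[$id])) {
            $products[$id] = [
                'product_name'  => $row['product_name'],
                'attributes'    => [],
                'notifications' => [],
            ];
        }
        // A LEFT JOIN yields NULL ids for products without children; skip those.
        if ($row['product_attribute_id'] !== null) {
            $products[$id]['attributes'][$row['product_attribute_id']] = $row['value'];
        }
        if ($row['notification_id'] !== null) {
            $products[$id]['notifications'][$row['notification_id']] = $row['notification_label'];
        }
    }
    return $products;
}

$products = fold_product_rows($rows);
```

This keeps the query count at one, but the database still sends every redundant row over the wire, so it is probably only reasonable while the attribute and notification counts per product stay small.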

The ORM (or just creating the new queries in the loop in general) consolidates all of the information in each row into its own object, allowing the data to be handled more logically. That's the rock. Fetching the information in one query eliminates the need for possibly hundreds of queries, but creates lots of redundant data in rows and therefore does not return the (one/many)-to-many relationship data in succinct sets. That's the hard place.

Sorry it's so long, trying to be thorough, haha, thanks!

An interesting alternative is to handle your reads and your writes with completely separate models (Command Query Separation). Sophisticated object models (and ORMs) are great for modeling complex business behavior, but are lousy as interfaces for querying and displaying information to users. You mentioned that you weren't opposed to ditching the ORM for rendering displays. Well, that's exactly what many software architects nowadays suggest. Write a totally different interface (with its own optimized queries) for reading and reporting on data. The "read" model could query the same database that you use with your ORM-backed "write" model, or it could be a separate one that is denormalized and optimized for the reports/screens you need to generate.

Check out these two presentations. It may sound like overkill (and it may be if your performance requirements are very low), but it's amazing how this technique makes so many problems just go away.
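As a rough illustration of a separate read-side query, the sketch below returns one flat, display-ready row per product by aggregating the child rows in correlated subqueries, which avoids the attributes-times-notifications row multiplication. Everything here is an assumption for illustration: `product_listing` is a hypothetical function, and an in-memory SQLite database (via the pdo_sqlite extension) stands in for MySQL so the example is self-contained. In MySQL the aggregate would be written `GROUP_CONCAT(pa.value SEPARATOR '; ')`, and its output is capped by the `group_concat_max_len` setting.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; table and column names
// follow the simplified schema from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT)');
$pdo->exec('CREATE TABLE product_attributes (product_attribute_id INTEGER PRIMARY KEY, product_id INTEGER, value TEXT)');
$pdo->exec('CREATE TABLE notifications (notification_id INTEGER PRIMARY KEY, notification_label TEXT)');
$pdo->exec('CREATE TABLE product_notifications (notification_id INTEGER, product_id INTEGER)');
$pdo->exec("INSERT INTO products VALUES (1, 'My Product'), (2, 'Busy Product')");
$pdo->exec("INSERT INTO product_attributes VALUES (1, 1, 'Color: Red'), (2, 2, 'Color: Blue'), (3, 2, 'Style: New')");
$pdo->exec("INSERT INTO notifications VALUES (1, 'Add This Product'), (2, 'Update This Product')");
$pdo->exec('INSERT INTO product_notifications VALUES (1, 1), (1, 2), (2, 2)');

/**
 * Hypothetical read-side query: one row per product, with child values
 * already joined into display strings. No ORM objects are built.
 */
function product_listing(PDO $pdo): array
{
    $sql = "
        SELECT p.product_id   AS product_id,
               p.product_name AS product_name,
               (SELECT group_concat(pa.value, '; ')
                  FROM product_attributes pa
                 WHERE pa.product_id = p.product_id) AS attributes,
               (SELECT group_concat(n.notification_label, '; ')
                  FROM product_notifications pn
                  JOIN notifications n ON n.notification_id = pn.notification_id
                 WHERE pn.product_id = p.product_id) AS notifications
          FROM products p
         ORDER BY p.product_id";

    return $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

$listing = product_listing($pdo);
```

Each element of `$listing` maps directly onto one table row in the view. The trade-off is that the child values arrive as pre-joined strings, so this shape suits display better than editing.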

A good ORM should handle this for you. If you feel you must do it manually, you can do this.

Fetch all the categories you need in a single query and store the primary key IDs in a PHP array.

Run a query similar to this:

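// category_id is an assumed foreign key column name; IDs are cast to int before being interpolated
mysql_query('SELECT yourListOfFieldsHere FROM Products WHERE category_id IN ('.implode(',', array_map('intval', $categoryIDs)).')');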

This should give you all the products that you need in a single query. Then use PHP to map these to the correct categories and display accordingly.
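Applied to the products, attributes, and notifications from the question, the same idea might look like the sketch below: one query for the parents, then one `IN (...)` query per child relationship, with the grouping done in PHP. This is only an illustration under stated assumptions: `fetch_grouped` is a hypothetical helper, PDO with prepared statements is swapped in for `mysql_query`, and an in-memory SQLite database (via the pdo_sqlite extension) stands in for MySQL so the code is self-contained.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; schema follows the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT)');
$pdo->exec('CREATE TABLE product_attributes (product_attribute_id INTEGER PRIMARY KEY, product_id INTEGER, value TEXT)');
$pdo->exec('CREATE TABLE notifications (notification_id INTEGER PRIMARY KEY, notification_label TEXT)');
$pdo->exec('CREATE TABLE product_notifications (notification_id INTEGER, product_id INTEGER)');
$pdo->exec("INSERT INTO products VALUES (1, 'My Product'), (2, 'Busy Product')");
$pdo->exec("INSERT INTO product_attributes VALUES (1, 1, 'Color: Red'), (2, 2, 'Color: Blue'), (3, 2, 'Style: New')");
$pdo->exec("INSERT INTO notifications VALUES (1, 'Add This Product'), (2, 'Update This Product')");
$pdo->exec('INSERT INTO product_notifications VALUES (1, 1), (1, 2), (2, 2)');

/**
 * Hypothetical helper: run one IN (...) query for all parent ids and group
 * the chosen column by product_id. $sqlTemplate must contain a single %s
 * where the placeholder list goes and must select a product_id column.
 */
function fetch_grouped(PDO $pdo, string $sqlTemplate, array $ids, string $column): array
{
    if (!$ids) {
        return []; // an empty IN () list is invalid SQL, so skip the query
    }
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    $stmt = $pdo->prepare(sprintf($sqlTemplate, $placeholders));
    foreach (array_values($ids) as $i => $id) {
        $stmt->bindValue($i + 1, $id, PDO::PARAM_INT); // positional params are 1-based
    }
    $stmt->execute();

    $grouped = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $grouped[(int) $row['product_id']][] = $row[$column];
    }
    return $grouped;
}

// Query 1: the parent rows.
$products = [];
$parentRows = $pdo->query('SELECT product_id, product_name FROM products ORDER BY product_id')
                  ->fetchAll(PDO::FETCH_ASSOC);
foreach ($parentRows as $row) {
    $products[(int) $row['product_id']] = ['product_name' => $row['product_name']];
}
$ids = array_keys($products);

// Query 2: all attributes for those products.
$attributes = fetch_grouped($pdo,
    'SELECT product_id, value FROM product_attributes
      WHERE product_id IN (%s) ORDER BY product_attribute_id',
    $ids, 'value');

// Query 3: all notifications, resolved through the pivot table.
$notifications = fetch_grouped($pdo,
    'SELECT pn.product_id AS product_id, n.notification_label AS notification_label
       FROM product_notifications pn
       JOIN notifications n ON n.notification_id = pn.notification_id
      WHERE pn.product_id IN (%s) ORDER BY n.notification_id',
    $ids, 'notification_label');

// Stitch the children onto their parents in PHP.
foreach ($ids as $id) {
    $products[$id]['attributes']    = $attributes[$id] ?? [];
    $products[$id]['notifications'] = $notifications[$id] ?? [];
}
```

The query count here is fixed at three no matter how many products are listed, and no row is returned more than once, which sidesteps both the per-product queries and the redundant joined rows.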
