Performant MySQL in-row text fields

For the sake of argument, let's say that I am trying to represent a very simple filesystem within a MySQL table. Please note that this is not exactly what I am doing; it just makes for an easy basis for the question, so don't bother telling me about better ways to store files. The schema for the table is as follows:

varchar path
varchar filename
blob content
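
For concreteness, the schema above could be written roughly as follows. This is only a sketch: the table name `files` and the `VARCHAR(255)` lengths are assumptions, since the question gives neither.

```sql
-- Single-table layout from the question.
-- Table name and column lengths are assumed for illustration.
CREATE TABLE files (
    path     VARCHAR(255) NOT NULL,
    filename VARCHAR(255) NOT NULL,
    -- BLOB holds up to 65,535 bytes; MEDIUMBLOB or LONGBLOB
    -- would be needed for larger content.
    content  BLOB
);
```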

The trouble with the schema above is that it performs badly whenever a query does not actually need the content field, since the content field could be quite large. For example, if I want to execute a query that lists all of the files within a given path, the MySQL engine (in order to read the filename field) will read every row that matches the WHERE clause into memory. That means the content, which is unnecessary for the query, still needs to be loaded into memory, which hurts performance.
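
The listing query in question might look like the one below. The table name `files` and the path value `'/docs'` are placeholders chosen for illustration, not taken from a real system.

```sql
-- List the files in one directory. Only filename is requested,
-- yet the concern is that each matching row, including its
-- content BLOB, is still read in order to produce it.
SELECT filename
FROM files
WHERE path = '/docs';
```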

The typical solution to this problem is to move the content into a separate table which is always accessed directly by id. The trouble with this approach is that it adds complexity to the insertion and selection. It is no longer immediately obvious that the content is directly attached to a single row.

So, my question (finally!) is this. Is there a way to leave the blob within the schema but cause MySQL to only grab it when it is specifically requested? I'm wondering if there are alternate storage engines or modifiers that can be placed on the column. Thanks!

The short answer is: not really (at least, not that I've ever seen). The table data is stored in a specific way on disk and in memory, and accessing it will always incur the penalty of your BLOB content.

One approach that will help speed things up, which you may or may not already have in place, is to index path, filename, or both if you base a lot of queries on them. Of course, the more data you insert, the longer the queries will take, regardless of index optimizations.
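
As a sketch of the indexing suggestion, assuming an InnoDB table named `files` with `path` and `filename` columns (the table name, index name, and path value are made up for illustration):

```sql
-- Composite index over the two columns used for lookups.
CREATE INDEX idx_files_path_filename ON files (path, filename);

-- Check how MySQL plans the listing query. When every column
-- the query needs is part of the index, the Extra column of
-- EXPLAIN shows "Using index", meaning the result is served
-- from the index without reading the full rows.
EXPLAIN
SELECT filename
FROM files
WHERE path = '/docs';
```

Whether the optimizer actually picks the index depends on the data and the query, so it is worth confirming with `EXPLAIN` against a realistic table rather than taking it for granted.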

Personally, I would recommend going with the solution that you're anxious to avoid. It is an often-used approach and really doesn't add much complexity. It takes a single additional INSERT statement, and you can SELECT the data using a JOIN or a second SELECT statement.

Regarding the statement "it is no longer immediately obvious that the content is directly attached to a single row": you're the one designing the system, so it should be very obvious to you that the content is attached to a row in another table. If you name your tables and columns clearly, it should (hopefully) be obvious to others who work on your system as well. For example, files (with id, path and filename) and file_contents (with file_id, content).
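
A minimal sketch of that two-table layout follows, using the table and column names suggested above. The column types, the foreign key, and the sample values are assumptions added for illustration, and the foreign key assumes InnoDB.

```sql
-- Metadata only: cheap to scan when listing a directory.
CREATE TABLE files (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    path     VARCHAR(255) NOT NULL,
    filename VARCHAR(255) NOT NULL
);

-- One content row per file, keyed by the file's id.
CREATE TABLE file_contents (
    file_id INT UNSIGNED NOT NULL PRIMARY KEY,
    content BLOB NOT NULL,
    FOREIGN KEY (file_id) REFERENCES files (id) ON DELETE CASCADE
);

-- Inserting a file is the usual INSERT plus one more.
START TRANSACTION;
INSERT INTO files (path, filename) VALUES ('/docs', 'readme.txt');
-- LAST_INSERT_ID() returns the AUTO_INCREMENT id generated above.
INSERT INTO file_contents (file_id, content)
VALUES (LAST_INSERT_ID(), 'hello');
COMMIT;

-- Listing a directory never touches file_contents.
SELECT filename FROM files WHERE path = '/docs';

-- Fetching the content when it is needed uses a JOIN.
SELECT f.path, f.filename, c.content
FROM files AS f
JOIN file_contents AS c ON c.file_id = f.id
WHERE f.path = '/docs' AND f.filename = 'readme.txt';
```

Making file_id the primary key of file_contents keeps the relationship strictly one-to-one, which addresses the worry that the content is no longer visibly tied to a single row.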


 