
User uploaded files in PHP: Storing in database VS storing in file system

I want to store files uploaded by (potentially untrusted) random users. My main concern is security. It seems to me that storing the files in a MySQL database is the most secure option, since they cannot be accessed or executed unless my script loads them out of the database, which it will only do for authorized users.

However, I've read many times that storing files in the file system is the best approach, mostly without any real explanation. The only drawback of the database approach that I've found is performance. Is it actually that much slower, or are there other drawbacks I'm unaware of?

It's incorrect to think a database is intrinsically more secure than files on disk. A database, after all, is files on disk. It's also typically a lot easier to break into your MySQL server than it is to access the machine via shell: MySQL uses passwords, whereas the shell, if properly configured, uses only SSH keys.

The other concern is that as you load more and more binary data into your database, it becomes considerably more expensive to back up properly. MySQL doesn't do differential backups very well, while files on disk are trivial to replicate quickly and efficiently with a tool like rsync.

File systems, not surprisingly, are very good at storing large amounts of arbitrary binary data. Relational databases are not. Additionally, a lot of work has been done at the operating system level to make serving files from disk as efficient as possible.

Here's what the computer has to do to fetch a file from disk and send it to the network:

  1. Open the file.
  2. Make a system call like sendfile.
  3. The kernel handles reading from disk, writing to the network device.
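
The three steps above can be sketched in a few lines. This is a minimal Python illustration, not how any particular web server is implemented; the function name send_file is made up for the example. It relies on socket.sendfile(), which uses the sendfile system call where the platform supports it and otherwise falls back to ordinary send() calls.

```python
import socket


def send_file(conn: socket.socket, path: str) -> int:
    """Send the file at `path` over a connected stream socket.

    Returns the number of bytes sent. `send_file` is a hypothetical helper
    written for this illustration.
    """
    # Step 1: open the file.
    with open(path, "rb") as f:
        # Steps 2-3: socket.sendfile() calls os.sendfile() when available,
        # so the kernel moves the bytes from the file to the socket.
        # If sendfile is not usable, it falls back to plain send() calls.
        return conn.sendfile(f)
```

Note that the file contents never have to be loaded into a buffer owned by the application when the kernel path is used.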

Here's what you have to do to send it from a database like MySQL:

  1. Open a MySQL connection and authenticate.
  2. Compose a command like SELECT file FROM tablename WHERE id=?
  3. Encode that with the MySQL binary protocol and send it over the network connection to the MySQL server. This could be local or remote, and in the remote case even more overhead is involved.
  4. The server receives the command and decodes it, first unpacking the command.
  5. The server has to parse the command and interpret it.
  6. The server has to open the table in question as well as the index file, looking for the location of the data there.
  7. Once found, the data has to be decoded from the MySQL row format, then re-encoded for the MySQL result format.
  8. That data is transmitted back over the wire to the client.
  9. The client must receive and decode the result set.
  10. The client must extract the relevant binary information.
  11. The client must copy that data to another buffer to send it back out the network connection.
  12. The kernel needs to transfer that data from user-space to kernel space and feed it to the network driver.

That's considerably more work and involves a multitude of mandatory copies due to crossing the user-space/kernel-space boundary many times.
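
For comparison, here is a rough sketch of the client side of the database path. It is an assumption-laden stand-in: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, which means the connection, authentication, and wire-protocol steps are not represented at all. The table and column names come from the query above; the function name send_blob is made up for the example.

```python
import socket
import sqlite3


def send_blob(conn: socket.socket, db: sqlite3.Connection, file_id: int) -> int:
    """Fetch a stored file from the database and send it over `conn`.

    Returns the number of bytes sent. sqlite3 stands in for MySQL here,
    so there is no network round trip to a database server.
    """
    # Compose and run the query; the engine parses it, locates the row,
    # and decodes the stored value into a result.
    row = db.execute(
        "SELECT file FROM tablename WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    # The entire BLOB is now held in a user-space bytes object.
    data = row[0]
    # sendall() copies that data from user space into the kernel's
    # socket buffer before it reaches the network driver.
    conn.sendall(data)
    return len(data)
```

Even in this simplified form the whole file is materialized in application memory before it is sent, which is the copy the file-system path can avoid.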

If you want a document store, look at something like Riak instead of an RDBMS like MySQL.

The general experience of web site operations people is that storing uploaded files on a file system is

  • faster, and
  • scales up better

than storing them in a database column.

Why? Web servers and web caching proxy servers are designed to deliver files to users from file systems. A cluster of web servers can do this very efficiently, as can just one server.

Delivering files from the DBMS turns it into a bottleneck: you typically have multiple web servers but only one DBMS. In addition, BLOB data is slower to process than ordinary data.

This technique is so widespread that its security issues have known solutions. They are real issues, but solved ones. The biggest is how easily an attacker can guess the path names of private files.
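
One common way to make stored path names hard to guess is to ignore the name the user supplied and store each upload under a random one. The sketch below is a minimal Python illustration under some assumptions that are not stated above: upload_dir is a directory outside the web root, the application records the returned name (and the original filename, if needed) somewhere like a database row, and files are only served through a script that checks authorization. The function name store_upload is made up for the example.

```python
import secrets
from pathlib import Path


def store_upload(upload_dir: Path, data: bytes) -> str:
    """Write `data` under a random name in `upload_dir` and return that name.

    `upload_dir` is assumed to exist and to sit outside the web root.
    """
    # 16 random bytes rendered as 32 hex characters: 128 bits of
    # randomness, and no user-controlled characters in the path.
    name = secrets.token_hex(16)
    target = upload_dir / name
    # Mode "xb" refuses to overwrite an existing file.
    with open(target, "xb") as f:
        f.write(data)
    return name
```

Because the stored name carries no extension and nothing chosen by the uploader, it also avoids handing the web server a file name it might treat as executable.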


 