简体   繁体   中英

Improve performance when importing data to MySQL?

I'm using Django to build a website with a MySQL (MyISAM) backend.

The database data is imported from a number of XML files that an external script process and output as a JSON-file. Whenever a new JSON file differ from the old one, I need to wipe the old MySQL-db and recreate it using manage.py loaddata, (at least that's the easy way to do it, I guess I could check the differences between the JSON files and apply those to the database, but I haven't figured out a good solution for this (I'm neither a very good coder nor a web developer)).

Anyway, the JSON file is around 10 Mb, and ends up being about 21,000 rows of SQL (It's not expected to grow significantly). There are 7 tables, and they all look something like this:

class Subnetwork(models.Model):
   SubNetwork = models.CharField(max_length=50)
   NetworkElement = models.CharField(max_length=50)
   subNetworkId = models.IntegerField()
   longName = models.CharField(max_length=50)
   shortName = models.CharField(max_length=50)
   suffix = models.CharField(max_length=50)

It takes up to a minute (sometimes only 30 seconds) to import it into MySQL. I don't know if this is to be expected from a file of this size? What can I do (if anything) to improve perfomance?

For what it's worth, here's some profiler output https://gist.github.com/1287847

There are a couple of solutions, same decent than others, but here is a workaround to keep your system's "downtime" minimal, without needing to write a db synchronize mechanism (which would probably be a better solution in most times).:

  • Create a custom settings_build.py file, with from settings import * that chooses a random name for a new db (probably with the date in the db name), creates it by calling mysqladmin, and update the name into DATABASES .
  • Create a custom django management command (let's call it builddb ) by either cloning the loaddata command or calling it, and on successful result, it should write the db name to a dbname text file with one line and executes a shell command that reloads your django ( apache / gunicorn /?) server.
  • Modify your settings.py to load the database name from the text file.

And now run your build process like this:

./manage.py builddb --settings=settings_build

我通过将处理后的XML文件而不是json导出到csv来解决它,然后使用了一个名为mysqlimport的单独脚本来进行导入。

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM