
How do I manage a set of MySQL tables in a production Rails app that are periodically recreated?

I have a production Rails app that serves data from a set of tables built from CSV files via a MySQL LOAD DATA LOCAL INFILE import, driven by a Ruby script. The tables are consistently named and the schema does not change. The script drops and recreates the tables and schema, then loads the data.
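For reference, here is a minimal sketch of what such a rebuild script can look like using the mysql2 gem; the table name, columns, CSV layout, and connection settings are placeholders, not the app's actual schema:

require "mysql2"

# Placeholder connection settings; local_infile must be enabled for LOAD DATA LOCAL INFILE.
client = Mysql2::Client.new(
  host: "localhost",
  username: "app",
  password: "secret",
  database: "app_production",
  local_infile: true
)

# Drop and recreate the table with its fixed schema.
client.query("DROP TABLE IF EXISTS products")
client.query(<<~SQL)
  CREATE TABLE products (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    sku VARCHAR(64) NOT NULL,
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10, 2)
  )
SQL

# Bulk-load the published CSV (assumes a header row and comma-separated fields).
client.query(<<~SQL)
  LOAD DATA LOCAL INFILE 'products.csv'
  INTO TABLE products
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\\n'
  IGNORE 1 LINES
  (sku, name, price)
SQL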

However, I want to redo how I manage data changes. Since the app is in production, I need a suggestion on how to manage newly published data over time so that I can (1) push data updates frequently without breaking the application while it serves user requests, and (2) make the new set of data "testable" before it goes live, with the ability to roll back to the previous tables/data if something goes wrong.

What I'm thinking is keeping a table of "versions" and creating a record each time a new rebuild is done. The latest version ID could be stored in database.yml, and each model could read its table name from database.yml. A script could move the version forward or backward to verify everything is OK after a new import, without destroying the old version.
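As a hypothetical sketch of that idea (the import_version key, the Product model, and the versioned table names are made up for illustration, not part of the app):

# Read the active import version from a custom key in database.yml (assumed key name),
# then point each imported model at the matching versioned table.
IMPORT_VERSION = Rails.configuration.database_configuration[Rails.env]["import_version"]

class Product < ActiveRecord::Base
  self.table_name = "products_v#{IMPORT_VERSION}"
end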

Is that a good approach? Are there existing patterns like this? It seems somewhat similar to Rails migrations. Are there any plugins or gems that help with this sort of data management?

UPDATE/current solution: I ended up adding a separate connection configuration to database.yml and creating the tables in that database at import time. The data doesn't change based on the environment, so the entry is a "peer" to the environment-specific config. Since there are only four models to update, I added the database connection explicitly in each:

establish_connection Rails.configuration.database_configuration["other_db"]
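For context, that line sits inside each of the four models, e.g. (the model name here is a made-up example):

class Product < ActiveRecord::Base
  # All reads/writes for this model go to the separately imported database.
  establish_connection Rails.configuration.database_configuration["other_db"]
end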

This way migrations and queries work as normal with Rails. So that I can keep running imports, I update the database name in that separate config entry for each import; if something goes wrong, I can manually point it back at the previous database version and restart the app.

path = File.join("config", "database.yml")
config = YAML.load_file(path)
config["other_db"]["database"] = OTHER_DB_NAME
File.open(path, "w") { |f| f.write(config.to_yaml) }

One option would be to use soft deletes or an "is active" column. If you need to know when records were replaced or deleted, you can also add columns for the date imported and the date deleted. When you load new data, default "is active" to false. Your application can preview the newly loaded data with different queries from those the production application uses, and when you're ready to promote the new data, you can do it in a single transaction so the production application picks up the change atomically.

This would be simpler than trying to maintain multiple tables, but there would be some complexity around distinguishing previously deleted rows from incoming rows that have just been imported but haven't been made active yet.
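A minimal sketch of the promotion step under that scheme, assuming (as an example, not the answer's exact design) an is_active boolean, a deleted_at timestamp, and a hypothetical Product model:

# Flip the newly imported batch live and soft-delete the rows it replaces,
# in one transaction so readers see the switch atomically.
ActiveRecord::Base.transaction do
  # Retire the currently active rows and stamp when they were replaced.
  Product.where(is_active: true).update_all(is_active: false, deleted_at: Time.current)
  # Activate the freshly imported rows (inactive and never soft-deleted).
  Product.where(is_active: false, deleted_at: nil).update_all(is_active: true)
end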
