Loading initial data with Django 1.7+ and data migrations

I recently switched from Django 1.6 to 1.7, and I began using migrations (I never used South).

Before 1.7, I used to load initial data with a fixtures/initial_data.json file, which was loaded by the python manage.py syncdb command (when creating the database).

Now that I have started using migrations, this behavior is deprecated:

If an application uses migrations, there is no automatic loading of fixtures. Since migrations will be required for applications in Django 2.0, this behavior is considered deprecated. If you want to load initial data for an app, consider doing it in a data migration. ( https://docs.djangoproject.com/en/1.7/howto/initial-data/#automatically-loading-initial-data-fixtures )

The official documentation does not have a clear example of how to do it, so my question is:

What is the best way to import such initial data using data migrations:

  1. Write Python code with multiple calls to mymodel.create(...),
  2. Use or write a Django function (like calling loaddata) to load data from a JSON fixture file.

I prefer the second option.

I don't want to use South, as Django seems to be able to do it natively now.

Update: See @GwynBleidD's comment below for the problems this solution can cause, and see @Rockallite's answer below for an approach that's more durable to future model changes.


Assuming you have a fixture file in <yourapp>/fixtures/initial_data.json

  1. Create your empty migration:

    In Django 1.7:

     python manage.py makemigrations --empty <yourapp>

    In Django 1.8+, you can provide a name:

     python manage.py makemigrations --empty <yourapp> --name load_initial_data
  2. Edit your migration file <yourapp>/migrations/0002_auto_xxx.py

    2.1. Custom implementation, inspired by Django's loaddata (initial answer):

     import os

     from django.core import serializers
     from django.db import migrations

     fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
     fixture_filename = 'initial_data.json'


     def load_fixture(apps, schema_editor):
         fixture_file = os.path.join(fixture_dir, fixture_filename)

         # Deserialize the fixture and save every object it contains.
         with open(fixture_file, 'rb') as fixture:
             objects = serializers.deserialize('json', fixture, ignorenonexistent=True)
             for obj in objects:
                 obj.save()


     def unload_fixture(apps, schema_editor):
         "Brutally deleting all entries for this model..."
         MyModel = apps.get_model("yourapp", "ModelName")
         MyModel.objects.all().delete()


     class Migration(migrations.Migration):

         dependencies = [
             ('yourapp', '0001_initial'),
         ]

         operations = [
             migrations.RunPython(load_fixture, reverse_code=unload_fixture),
         ]

    2.2. A simpler solution for load_fixture (per @juliocesar's suggestion):

     import os

     from django.core.management import call_command

     fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
     fixture_filename = 'initial_data.json'


     def load_fixture(apps, schema_editor):
         # Pass the full path so loaddata reads exactly this file.
         fixture_file = os.path.join(fixture_dir, fixture_filename)
         call_command('loaddata', fixture_file)

    Useful if you want to use a custom directory.

    2.3. Simplest: calling loaddata with app_label will load fixtures from the <yourapp>'s fixtures dir automatically:

     from django.core.management import call_command

     fixture = 'initial_data'


     def load_fixture(apps, schema_editor):
         call_command('loaddata', fixture, app_label='yourapp')

    If you don't specify app_label, loaddata will try to load the fixture filename from all apps' fixtures directories (which you probably don't want).

  3. Run it

     python manage.py migrate <yourapp>

Short version

You should NOT use the loaddata management command directly in a data migration.

# Bad example for a data migration
from django.db import migrations
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # No, it's wrong. DON'T DO THIS!
    call_command('loaddata', 'your_data.json', app_label='yourapp')


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

Long version

loaddata utilizes django.core.serializers.python.Deserializer, which uses the most up-to-date models to deserialize historical data in a migration. That's incorrect behavior.

For example, suppose that there is a data migration which uses the loaddata management command to load data from a fixture, and it has already been applied in your development environment.

Later, you decide to add a new required field to the corresponding model, so you do it and make a new migration against your updated model (and possibly provide a one-off value for the new field when ./manage.py makemigrations prompts you).

You run the next migration, and all is well.

Finally, you're done developing your Django application, and you deploy it on the production server. Now it's time for you to run all the migrations from scratch in the production environment.

However, the data migration fails. That's because the deserialized model from the loaddata command, which represents the current code, can't be saved with empty data for the new required field you added. The original fixture lacks the necessary data for it!

But even if you update the fixture with the required data for the new field, the data migration still fails. When the data migration is running, the next migration, which adds the corresponding column to the database, has not been applied yet. You can't save data to a column which does not exist!

Conclusion: in a data migration, the loaddata command introduces potential inconsistency between the model and the database. You should definitely NOT use it directly in a data migration.
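
The second failure mode can be reproduced outside Django. The following is only a minimal sketch using Python's built-in sqlite3 module: the table yourapp_country and the column iso_code are hypothetical names chosen for illustration, and the raw SQL merely stands in for what the ORM would attempt with the current model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema as it exists while the data migration runs: only migration 0001
# has been applied, so the table has no iso_code column yet.
conn.execute(
    "CREATE TABLE yourapp_country (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def insert_like_current_model(connection):
    """Insert a row that includes the column added by a later migration."""
    connection.execute(
        "INSERT INTO yourapp_country (name, iso_code) VALUES (?, ?)",
        ("France", "FR"),
    )


try:
    insert_like_current_model(conn)
    error = None
except sqlite3.OperationalError as exc:
    # The column does not exist yet, so the insert is rejected.
    error = str(exc)

# Stand-in for the later schema migration that adds the new required field.
conn.execute(
    "ALTER TABLE yourapp_country ADD COLUMN iso_code TEXT NOT NULL DEFAULT ''"
)
insert_like_current_model(conn)
row_count = conn.execute("SELECT COUNT(*) FROM yourapp_country").fetchone()[0]

print(error)
print(row_count)
```

The first insert fails because the column is missing; the same insert succeeds once the schema change has been applied, which is the ordering problem described above.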

The Solution

The loaddata command relies on the django.core.serializers.python._get_model function to get the corresponding model from a fixture, which returns the most up-to-date version of a model. We need to monkey-patch it so it gets the historical model.

(The following code works for Django 1.8.x)

# Good example for a data migration
from django.db import migrations
from django.core.serializers import base, python
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # Save the old _get_model() function
    old_get_model = python._get_model

    # Define new _get_model() function here, which utilizes the apps argument to
    # get the historical version of a model. This piece of code is directly stolen
    # from django.core.serializers.python._get_model, unchanged. However, here it
    # has a different context, specifically, the apps variable.
    def _get_model(model_identifier):
        try:
            return apps.get_model(model_identifier)
        except (LookupError, TypeError):
            raise base.DeserializationError("Invalid model identifier: '%s'" % model_identifier)

    # Replace the _get_model() function on the module, so loaddata can utilize it.
    python._get_model = _get_model

    try:
        # Call loaddata command
        call_command('loaddata', 'your_data.json', app_label='yourapp')
    finally:
        # Restore old _get_model() function
        python._get_model = old_get_model


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

Inspired by some of the comments (namely n__o's) and the fact that I have a lot of initial_data.* files spread out over multiple apps, I decided to create a Django app that would facilitate the creation of these data migrations.

Using django-migration-fixture you can simply run the following management command, and it will search through all your INSTALLED_APPS for initial_data.* files and turn them into data migrations.

./manage.py create_initial_data_fixtures
Migrations for 'eggs':
  0002_auto_20150107_0817.py:
Migrations for 'sausage':
  Ignoring 'initial_data.yaml' - migration already exists.
Migrations for 'foo':
  Ignoring 'initial_data.yaml' - not migrated.

See django-migration-fixture for install/usage instructions.

In order to give your database some initial data, write a data migration. In the data migration, use the RunPython operation to load your data.

Don't call the loaddata command for this, as that approach is deprecated.

Your data migrations will be run only once. Migrations form an ordered sequence. When the 003_xxxx.py migration is run, Django records in the database that this app has been migrated up to that migration (003), and afterwards runs only the migrations that follow it.
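
As a rough sketch of that approach, the two functions below could be passed to RunPython. The Country model, its code and name fields, and the rows themselves are assumptions made up for illustration; none of them come from the question.

```python
# Hypothetical initial rows; a real project would define its own data.
INITIAL_COUNTRIES = [
    {"code": "FR", "name": "France"},
    {"code": "DE", "name": "Germany"},
]


def load_countries(apps, schema_editor):
    """Forward step: create the rows using the historical model."""
    # apps.get_model returns the model as it looked at this point in the
    # migration history, not the class currently defined in models.py.
    Country = apps.get_model("yourapp", "Country")
    for row in INITIAL_COUNTRIES:
        # get_or_create keeps the step safe to run against existing rows.
        Country.objects.get_or_create(
            code=row["code"], defaults={"name": row["name"]}
        )


def unload_countries(apps, schema_editor):
    """Reverse step: delete only the rows this migration is responsible for."""
    Country = apps.get_model("yourapp", "Country")
    codes = [row["code"] for row in INITIAL_COUNTRIES]
    Country.objects.filter(code__in=codes).delete()
```

In the migration module these would be wired up with migrations.RunPython(load_countries, reverse_code=unload_countries) inside the operations list.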

The solutions presented above didn't work for me, unfortunately. I found that every time I change my models I have to update my fixtures. Ideally I would instead write data migrations to modify created data and fixture-loaded data similarly.

To facilitate this I wrote a quick function which looks in the fixtures directory of the current app and loads a fixture. Put this function into a migration at the point in the model history that matches the fields in the migration.
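
A function along those lines might look like the sketch below. It is not the author's code: the name make_fixture_loader is hypothetical, it assumes the usual <app>/migrations/ and <app>/fixtures/ layout with a JSON fixture, and it only handles plain field values (foreign keys and many-to-many fields are not handled).

```python
import json
import os


def make_fixture_loader(migration_file, fixture_name):
    """Return a RunPython-compatible callable that loads one JSON fixture."""
    # Assumed layout: <app>/migrations/<migration file> next to
    # <app>/fixtures/<fixture_name>.
    app_dir = os.path.dirname(os.path.dirname(os.path.abspath(migration_file)))
    fixture_path = os.path.join(app_dir, "fixtures", fixture_name)

    def load_fixture(apps, schema_editor):
        with open(fixture_path, encoding="utf-8") as handle:
            records = json.load(handle)
        for record in records:
            # Fixture entries name their model as "app_label.modelname".
            app_label, model_name = record["model"].split(".")
            # Historical model: matches the schema at this migration.
            Model = apps.get_model(app_label, model_name)
            Model.objects.update_or_create(
                pk=record["pk"], defaults=record["fields"]
            )

    return load_fixture
```

Inside a migration it could be used as migrations.RunPython(make_fixture_loader(__file__, 'initial_data.json')). Because the rows are written through the historical model, the fixture only has to match the fields that exist at that migration.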

In my opinion fixtures are a bit bad. If your database changes frequently, keeping them up-to-date will soon become a nightmare. Actually, it's not only my opinion; the book "Two Scoops of Django" explains it much better.

Instead, I write a Python file to provide the initial setup. If you need something more, I suggest you look at Factory Boy.

If you need to migrate some data you should use data migrations.

There's also "Burn Your Fixtures, Use Model Factories" about using fixtures.

On Django 2.1, I wanted to load some models (like country names, for example) with initial data.

But I wanted this to happen automatically right after the execution of the initial migrations.

So I thought that it would be great to have an sql/ folder inside each application that required initial data to be loaded.

Then within that sql/ folder I would have .sql files with the required DML statements to load the initial data into the corresponding models, for example:

INSERT INTO appName_modelName(fieldName)
VALUES
    ('country 1'),
    ('country 2'),
    ('country 3'),
    ('country 4');

To be more descriptive, this is how an app containing an sql/ folder would look: [image]

Also, I found some cases where I needed the SQL scripts to be executed in a specific order. So I decided to prefix the file names with a consecutive number, as seen in the image above.

Then I needed a way to load any SQL files available inside any application folder automatically by running python manage.py migrate.

So I created another application named initial_data_migrations and added it to the list of INSTALLED_APPS in the settings.py file. Then I created a migrations folder inside it and added a file called run_sql_scripts.py (which is actually a custom migration), as seen in the image below:

[image]

I created run_sql_scripts.py so that it takes care of running all SQL scripts available within each application. It is then fired when someone runs python manage.py migrate. This custom migration also adds the involved applications as dependencies, so that it attempts to run the SQL statements only after the required applications have executed their 0001_initial.py migrations (we don't want to attempt running a SQL statement against a non-existent table).

Here is the source of that script:

import os
import itertools

from django.db import migrations
from YourDjangoProjectName.settings import BASE_DIR, INSTALLED_APPS

SQL_FOLDER = "/sql/"

APP_SQL_FOLDERS = [
    (os.path.join(BASE_DIR, app + SQL_FOLDER), app) for app in INSTALLED_APPS
    if os.path.isdir(os.path.join(BASE_DIR, app + SQL_FOLDER))
]

SQL_FILES = [
    sorted([path + file for file in os.listdir(path) if file.lower().endswith('.sql')])
    for path, app in APP_SQL_FOLDERS
]


def load_file(path):
    with open(path, 'r') as f:
        return f.read()


class Migration(migrations.Migration):

    dependencies = [
        (app, '__first__') for path, app in APP_SQL_FOLDERS
    ]

    operations = [
        migrations.RunSQL(load_file(f)) for f in list(itertools.chain.from_iterable(SQL_FILES))
    ]

I hope someone finds this helpful; it worked just fine for me! If you have any questions please let me know.

NOTE: This might not be the best solution since I'm just getting started with Django. However, I still wanted to share this "how-to" with you all since I didn't find much information while googling about this.
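
One detail to watch with numbered file names: a plain sorted() compares strings, so 10_cities.sql would sort before 2_countries.sql. A possible sketch of a helper that orders files by their numeric prefix instead is shown below. The names collect_sql_files and sql_sort_key are hypothetical, and the <app>/sql/ layout is the one assumed in this answer.

```python
import os
import re


def sql_sort_key(filename):
    """Order by leading integer prefix (e.g. '2_cities.sql'), then by name."""
    match = re.match(r"(\d+)", filename)
    # Files without a numeric prefix are placed after the numbered ones.
    prefix = int(match.group(1)) if match else float("inf")
    return (prefix, filename)


def collect_sql_files(base_dir, app_names, folder="sql"):
    """Return full paths of every .sql file under <base_dir>/<app>/<folder>/."""
    paths = []
    for app in app_names:
        sql_dir = os.path.join(base_dir, app, folder)
        if not os.path.isdir(sql_dir):
            # Apps without an sql/ folder are simply skipped.
            continue
        names = [n for n in os.listdir(sql_dir) if n.lower().endswith(".sql")]
        for name in sorted(names, key=sql_sort_key):
            paths.append(os.path.join(sql_dir, name))
    return paths
```

The resulting list could then feed the RunSQL operations in place of the chained SQL_FILES list, keeping apps in INSTALLED_APPS order and files in numeric order within each app.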

What about natural keys?

Although @rockallite's answer is excellent, it does not explain how to handle fixtures with natural keys instead of integer pk values.

Simplified version

First, note that @rockallite's solution can be simplified by using unittest.mock.patch as a context manager, and by patching apps instead of _get_model:

...
from unittest.mock import patch
...

def load_fixture(apps, schema_editor):
    with patch('django.core.serializers.python.apps', apps):
        call_command('loaddata', 'your_data.json', ...)

...

This works well, as long as your fixtures do not rely on natural keys.

If they do, you're likely to see a DeserializationError: ... value must be an integer... .

The problem with natural keys

Under the hood, loaddata uses django.core.serializers.deserialize() to load your fixture objects.

The deserialization of fixtures based on natural keys relies on two things:

  • the presence of a get_by_natural_key() method on the model's manager
  • the presence of a natural_key() method on the model itself

The get_by_natural_key() method is necessary for the deserializer to know how to interpret the natural key, instead of an integer pk value.

Both methods are necessary for the deserializer to get existing objects from the database by natural key, as also explained here.

However, the apps registry which is available in your migrations uses historical models, and these do not have access to custom managers or custom methods such as natural_key().

Possible solution: step 1

The problem of the missing get_by_natural_key() method from our custom model manager is relatively easy to solve: just set use_in_migrations=True on your custom manager, as described in the documentation.

This ensures that your historical models can access the current get_by_natural_key() during migrations, and fixture loading should now succeed.

However, your historical models still don't have a natural_key() method. As a result, your fixtures will be treated as new objects, even if they are already present in the database. This may lead to a variety of errors if the data migration is ever re-applied, such as:

  • unique-constraint violations (if your models have unique constraints)
  • duplicate fixture objects (if your models do not have unique constraints)
  • "get returned multiple objects" errors (due to duplicate fixture objects created previously)

So, effectively, you're still missing out on a kind of get_or_create-like behavior during deserialization.

To experience this, just apply a data migration as described above (in a test environment), then roll back the same data migration (without removing the data), then re-apply the data migration.

Possible solution: step 2

The problem of the missing natural_key() method from the model itself is a bit more difficult to solve. One solution would be to assign the natural_key() method from the current model to the historical model, for example:

...
from unittest.mock import patch

from django.apps import apps as current_apps
from django.core.management import call_command
...


def load_fixture(apps, schema_editor):
    def _get_model_patch(model_identifier):
        """ add natural_key method from current model to historical model """
        # model_identifier has the form "app_label.ModelName"
        HistoricalModel = apps.get_model(model_identifier)
        CurrentModel = current_apps.get_model(model_identifier)
        HistoricalModel.natural_key = CurrentModel.natural_key
        return HistoricalModel

    with patch('django.core.serializers.python._get_model', _get_model_patch):
        call_command('loaddata', 'your_data.json', ...)

...

Notes:

  • For clarity, I omitted things like error handling and attribute checking from the example. You should implement those where necessary.
  • This solution uses the current model's natural_key method, which may still lead to trouble in certain scenarios, but the same goes for Django's use_in_migrations option for model managers.
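
As one possible illustration of the attribute checking mentioned in the first note, the lookup could be built by a small factory like the one below. This is only a sketch: make_get_model is a hypothetical name, and it assumes both registries expose get_model(model_identifier) and raise LookupError for unknown models, as Django's app registry does.

```python
def make_get_model(historical_apps, current_apps):
    """Build a _get_model replacement that copies natural_key when available."""

    def _get_model(model_identifier):
        # A LookupError from the historical registry is left to propagate.
        historical_model = historical_apps.get_model(model_identifier)
        try:
            current_model = current_apps.get_model(model_identifier)
        except LookupError:
            # The model no longer exists in the current code base, so there
            # is no natural_key to borrow.
            return historical_model

        natural_key = getattr(current_model, "natural_key", None)
        if natural_key is not None and not hasattr(historical_model, "natural_key"):
            historical_model.natural_key = natural_key
        return historical_model

    return _get_model
```

It would replace the inline function in the patch call, for example patch('django.core.serializers.python._get_model', make_get_model(apps, current_apps)).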
