Loading initial data with Django 1.7+ and data migrations

Question

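I recently switched from Django 1.6 to 1.7 and started using migrations (I never used South).

Before 1.7, I loaded initial data from a fixture/initial_data.json file, which was loaded by the python manage.py syncdb command (when creating the database).

Now that I have started using migrations, this behavior is deprecated: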

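If an application uses migrations, there is no automatic loading of fixtures. Since migrations will be required for applications in Django 2.0, this behavior is considered deprecated. If you want to load initial data for an app, consider doing it in a data migration. (https://docs.djangoproject.com/en/1.7/howto/initial-data/#automatically-loading-initial-data-fixtures)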

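The official documentation does not have a clear example of how to do this, so my question is:

What is the best way to import such initial data using data migrations:

Write Python code with multiple calls to mymodel.create(...),
Use or write a Django function (like calling loaddata) to load data from a JSON fixture file.

I prefer the second option.

I don't want to use South, as Django now seems to be able to do this natively.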

Answer 1

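Update: see @GwynBleidD's comment below about the problems this solution can cause, and see @Rockallite's answer below for an approach that is more durable against future model changes.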

Assuming you have a fixture file in <yourapp>/fixtures/initial_data.json

1. Create an empty migration:

In Django 1.7:

 python manage.py makemigrations --empty <yourapp>

In Django 1.8+, you can provide a name:

 python manage.py makemigrations --empty <yourapp> --name load_initial_data

2. Edit your migration file <yourapp>/migrations/0002_auto_xxx.py

2.1. Custom implementation, inspired by Django's loaddata (initial answer):

import os

from django.core import serializers
from django.db import migrations

fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
fixture_filename = 'initial_data.json'


def load_fixture(apps, schema_editor):
    fixture_file = os.path.join(fixture_dir, fixture_filename)

    # Deserialize every object in the fixture and save it.
    with open(fixture_file, 'rb') as fixture:
        objects = serializers.deserialize('json', fixture, ignorenonexistent=True)
        for obj in objects:
            obj.save()


def unload_fixture(apps, schema_editor):
    "Brutally deleting all entries for this model..."
    MyModel = apps.get_model("yourapp", "ModelName")
    MyModel.objects.all().delete()


class Migration(migrations.Migration):

    dependencies = [
        ('yourapp', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(load_fixture, reverse_code=unload_fixture),
    ]

2.2. A simpler solution for load_fixture (per @juliocesar's suggestion):

import os

from django.core.management import call_command

fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
fixture_filename = 'initial_data.json'


def load_fixture(apps, schema_editor):
    fixture_file = os.path.join(fixture_dir, fixture_filename)
    call_command('loaddata', fixture_file)

This is useful if you want to use a custom directory.

2.3. Simplest: calling loaddata with app_label automatically loads fixtures from the fixtures directory of <yourapp>:

from django.core.management import call_command

fixture = 'initial_data'


def load_fixture(apps, schema_editor):
    call_command('loaddata', fixture, app_label='yourapp')

If you don't specify app_label, loaddata will try to load the fixture filename from the fixtures directories of all apps (which you probably don't want).

3. Run the migration:
```
 python manage.py migrate <yourapp> 
```
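For readers who have not seen one, a fixture in Django's JSON format is a list of objects with `model`, `pk`, and `fields` keys. The sketch below writes such a file using only the standard library. The `yourapp.modelname` label and the `name` field are placeholders, not details taken from the question.

```python
import json
import tempfile
from pathlib import Path


def write_fixture(fixture_path, records):
    """Write records in Django's JSON fixture layout to fixture_path."""
    fixture_path.parent.mkdir(parents=True, exist_ok=True)
    fixture_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return fixture_path


# Hypothetical rows for a hypothetical model "yourapp.modelname".
records = [
    {"model": "yourapp.modelname", "pk": 1, "fields": {"name": "first"}},
    {"model": "yourapp.modelname", "pk": 2, "fields": {"name": "second"}},
]

# A temporary directory stands in for the project root.
app_dir = Path(tempfile.mkdtemp()) / "yourapp"
fixture_file = write_fixture(app_dir / "fixtures" / "initial_data.json", records)

# Read the file back to confirm it round-trips.
loaded = json.loads(fixture_file.read_text(encoding="utf-8"))
print(len(loaded), loaded[0]["model"])
```

In a real project the file would live in the app's own fixtures directory, which is where the steps above expect to find it.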

Answer 2

Short version

You should NOT use the loaddata management command directly in a data migration.

# Bad example for a data migration
from django.db import migrations
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # No, it's wrong. DON'T DO THIS!
    call_command('loaddata', 'your_data.json', app_label='yourapp')


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

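Long version

loaddata uses django.core.serializers.python.Deserializer, which uses the most up-to-date models to deserialize historical data in a migration. That is incorrect behavior.

For example, suppose there is a data migration that uses the loaddata management command to load data from a fixture, and it has already been applied in your development environment.

Later, you decide to add a new required field to the corresponding model, so you do it and make a new migration against your updated model (possibly providing a one-off value for the new field when ./manage.py makemigrations prompts you).

You run the next migration, and all is well.

Finally, you finish developing your Django application and deploy it on the production server. Now it is time to run all the migrations from scratch in the production environment.

However, the data migration fails. That is because the deserialized model from the loaddata command, which represents the current code, cannot be saved with empty data for the new required field you added. The original fixture lacks the necessary data for it!

But even if you update the fixture with the required data for the new field, the data migration still fails. While the data migration is running, the next migration, which adds the corresponding column to the database, has not been applied yet. You can't save data to a column that does not exist!

Conclusion: in a data migration, the loaddata command introduces a potential inconsistency between the model and the database. You should definitely NOT use it directly in a data migration.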
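The second failure can be reproduced without Django at all. The following is a minimal sketch using the standard library's sqlite3 module. The table name and the `slug` column are made up for illustration, and the insert stands in for what saving an instance of the newest model would attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema as it exists when the data migration runs: no "slug" column yet.
conn.execute(
    "CREATE TABLE yourapp_modelname (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def insert_like_current_model(connection):
    """Insert a row the way the newest model would, including the new field."""
    connection.execute(
        "INSERT INTO yourapp_modelname (id, name, slug) VALUES (?, ?, ?)",
        (1, "first", "first"),
    )


try:
    insert_like_current_model(conn)
    error = None
except sqlite3.OperationalError as exc:
    # The column is only added by a later schema migration.
    error = str(exc)

# Once the later migration has added the column, the same insert succeeds.
conn.execute("ALTER TABLE yourapp_modelname ADD COLUMN slug TEXT NOT NULL DEFAULT ''")
insert_like_current_model(conn)
row_count = conn.execute("SELECT COUNT(*) FROM yourapp_modelname").fetchone()[0]
print(error, row_count)
```

The same statement fails or succeeds depending only on which migrations have already run, which is why the model used inside a data migration has to match the schema at that point in history.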

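The solution

The loaddata command relies on the django.core.serializers.python._get_model function to get the corresponding model from a fixture, and that function returns the most up-to-date version of the model. We need to monkey-patch it so that it returns the historical model instead.

(The following code works for Django 1.8.x)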

# Good example for a data migration
from django.db import migrations
from django.core.serializers import base, python
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # Save the old _get_model() function
    old_get_model = python._get_model

    # Define new _get_model() function here, which utilizes the apps argument to
    # get the historical version of a model. This piece of code is directly stolen
    # from django.core.serializers.python._get_model, unchanged. However, here it
    # has a different context, specifically, the apps variable.
    def _get_model(model_identifier):
        try:
            return apps.get_model(model_identifier)
        except (LookupError, TypeError):
            raise base.DeserializationError("Invalid model identifier: '%s'" % model_identifier)

    # Replace the _get_model() function on the module, so loaddata can utilize it.
    python._get_model = _get_model

    try:
        # Call loaddata command
        call_command('loaddata', 'your_data.json', app_label='yourapp')
    finally:
        # Restore old _get_model() function
        python._get_model = old_get_model


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]
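An alternative that avoids patching a private function is to skip loaddata and create the rows through the historical models directly. The following is a minimal sketch, not taken from this answer: `load_fixture_records` is a hypothetical helper, and it assumes a fixture that contains only plain (non-relational) fields, since foreign keys and many-to-many values would need extra handling.

```python
import json
from pathlib import Path


def load_fixture_records(apps, fixture_path):
    """Create or update one row per fixture record using historical models.

    `apps` is the registry passed to a RunPython function, so
    apps.get_model() returns the model as of this migration.
    Returns the number of records processed.
    """
    records = json.loads(Path(fixture_path).read_text(encoding="utf-8"))
    for record in records:
        # "model" holds an "app_label.modelname" string.
        Model = apps.get_model(record["model"])
        Model.objects.update_or_create(pk=record["pk"], defaults=record["fields"])
    return len(records)


# Inside a migration module this would be wired up roughly as:
#
#     FIXTURE = Path(__file__).resolve().parent.parent / "fixtures" / "your_data.json"
#
#     def load_fixture(apps, schema_editor):
#         load_fixture_records(apps, FIXTURE)
#
#     operations = [migrations.RunPython(load_fixture, migrations.RunPython.noop)]
```

Using update_or_create keeps the operation safe to re-apply, and the no-op reverse leaves the loaded rows in place when the migration is rolled back. Whether that is the right reverse behavior depends on the project.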

Answer 3

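Inspired by some of the comments (namely n__o's) and by the fact that I have a lot of initial_data.* files spread across multiple apps, I decided to create a Django app that makes it easier to create these data migrations.

With django-migration-fixture you simply run the following management command, and it searches all of your INSTALLED_APPS for initial_data.* files and turns them into data migrations.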

./manage.py create_initial_data_fixtures
Migrations for 'eggs':
  0002_auto_20150107_0817.py:
Migrations for 'sausage':
  Ignoring 'initial_data.yaml' - migration already exists.
Migrations for 'foo':
  Ignoring 'initial_data.yaml' - not migrated.

See django-migration-fixture for install/usage instructions.

Answer 4

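To give the database some initial data, write a data migration. In the data migration, use the RunPython function to load your data.

Don't write any loaddata command, as this way is deprecated.

Your data migrations will run only once. Migrations are an ordered sequence. When the 003_xxxx.py migration runs, Django records in the database that this app has been migrated up to that one (003), and afterwards runs only the migrations that follow it.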
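The bookkeeping described above can be pictured with a toy model. This is an illustration only, not Django's implementation: the `applied` set stands in for the place where Django records which migrations have been applied, and the migration names are made up.

```python
def run_pending(migrations_in_order, applied):
    """Run each migration not yet recorded in `applied`, in order.

    migrations_in_order: list of (name, callable) pairs.
    applied: set of names already recorded as applied; updated in place.
    Returns the names that were run by this call.
    """
    ran = []
    for name, operation in migrations_in_order:
        if name in applied:
            continue  # already recorded, so it is skipped
        operation()
        applied.add(name)
        ran.append(name)
    return ran


loaded_rows = []
plan = [
    ("0001_initial", lambda: None),
    ("0002_load_initial_data", lambda: loaded_rows.append("row")),
    ("0003_xxxx", lambda: None),
]

applied = set()
first_run = run_pending(plan, applied)   # runs all three migrations
second_run = run_pending(plan, applied)  # nothing is left to run
print(first_run, second_run, loaded_rows)
```

The data-loading step executes on the first call only, which is the property that makes a data migration a reasonable home for initial data.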

Answer 5

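Unfortunately, the solutions presented above didn't work for me. I found that every time I change my models I have to update my fixtures. Ideally I would instead write data migrations that modify both the created data and the fixture-loaded data.

To make this easier, I wrote a quick function that looks in the fixtures directory of the current app and loads a fixture. Put this function into a migration at the point in the model history that matches the fields in the fixture.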

Answer 6

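In my opinion, fixtures are a bit of a bad idea. If your database changes frequently, keeping them up to date soon becomes a nightmare. Actually, it's not only my opinion: the book "Two Scoops of Django" explains it much better.

Instead, I would write a Python file to provide the initial setup. If you need something more, I suggest you look at Factory Boy.

If you need to migrate some data, you should use data migrations.

On the subject of fixtures, there is also "Burn Your Fixtures, Use Model Factories".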
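To make the "model factory" idea concrete, here is a plain-Python sketch. It is not Factory Boy's API: `Country` and `make_country` are hypothetical stand-ins for a model and its factory, showing defaults that a caller overrides only where a test or setup script cares about the value.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Country:
    """Stand-in for a Django model; a real project would use its own model."""
    name: str
    code: str
    active: bool = True


# Shared counter so each generated object gets distinct default values.
_sequence = count(1)


def make_country(**overrides):
    """Build a Country with generated defaults, overridden per call."""
    n = next(_sequence)
    values = {"name": f"Country {n}", "code": f"C{n}"}
    values.update(overrides)
    return Country(**values)


first = make_country()
custom = make_country(name="Iceland", active=False)
print(first, custom)
```

Because the defaults live in code rather than in a static file, adding a field to the model means updating one function instead of every fixture record.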

Answer 7

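On Django 2.1, I wanted to load some models (for example, country names) with initial data.

But I wanted this to happen automatically right after the initial migrations ran.

So I thought it would be great to have an sql/ folder inside each application that needs initial data loaded.

Then, within that sql/ folder, I would have .sql files with the DML required to load the initial data into the corresponding models, for example: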

INSERT INTO appName_modelName(fieldName)
VALUES
    ('country 1'),
    ('country 2'),
    ('country 3'),
    ('country 4');

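To be more descriptive, each such app simply contains an sql/ folder alongside its other files.

I also found some cases where I needed the sql scripts to be executed in a specific order. So I decided to prefix the file names with a consecutive number.

Then I needed a way to load any SQL files available inside any application folder automatically by running python manage.py migrate.

So I created another application named initial_data_migrations and added it to the INSTALLED_APPS list in settings.py. Then I created a migrations folder inside it and added a file called run_sql_scripts.py (which is actually a custom migration).

I wrote run_sql_scripts.py so that it takes care of running all the sql scripts available within each application. It is triggered when someone runs python manage.py migrate. This custom migration also adds the involved applications as dependencies, so that it attempts to run the SQL statements only after the required applications have executed their 0001_initial.py migrations (we don't want to run a SQL statement against a table that doesn't exist).

Here is the source of that script: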

import os
import itertools

from django.db import migrations
from YourDjangoProjectName.settings import BASE_DIR, INSTALLED_APPS

SQL_FOLDER = "/sql/"

APP_SQL_FOLDERS = [
    (os.path.join(BASE_DIR, app + SQL_FOLDER), app) for app in INSTALLED_APPS
    if os.path.isdir(os.path.join(BASE_DIR, app + SQL_FOLDER))
]

SQL_FILES = [
    sorted([path + file for file in os.listdir(path) if file.lower().endswith('.sql')])
    for path, app in APP_SQL_FOLDERS
]


def load_file(path):
    with open(path, 'r') as f:
        return f.read()


class Migration(migrations.Migration):

    dependencies = [
        (app, '__first__') for path, app in APP_SQL_FOLDERS
    ]

    operations = [
        migrations.RunSQL(load_file(f)) for f in list(itertools.chain.from_iterable(SQL_FILES))
    ]

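I hope someone finds this helpful; it worked just fine for me! If you have any questions, please let me know.

Note: this might not be the best solution, since I am just getting started with Django, but I still wanted to share this "how-to" because I did not find much information while searching for it.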
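One caveat about the script above: sorted() compares file names as strings, so 10_cities.sql would sort before 2_countries.sql unless the prefixes are zero-padded. Below is a small sketch of sorting by the numeric prefix instead. The file names and the `numeric_prefix` helper are illustrative assumptions, not part of the original script.

```python
import re


def numeric_prefix(filename):
    """Return (leading number, filename) so files sort by number first.

    Files without a leading number sort after the numbered ones.
    """
    match = re.match(r"(\d+)", filename)
    number = int(match.group(1)) if match else float("inf")
    return (number, filename)


files = ["10_cities.sql", "2_countries.sql", "1_regions.sql", "notes.sql"]

lexicographic = sorted(files)                  # plain string order
by_number = sorted(files, key=numeric_prefix)  # order by numeric prefix
print(lexicographic)
print(by_number)
```

Passing a key like this to the sorted() call in the script, or zero-padding the prefixes (01_, 02_, ...), keeps the execution order stable once there are ten or more files.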

Answer 8

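What about natural keys?

Although @rockallite's answer is excellent, it does not explain how to handle fixtures that rely on natural keys instead of integer pk values.

Simplified version

First, note that @rockallite's solution can be simplified by using unittest.mock.patch as a context manager, and by patching apps instead of _get_model: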

...
from unittest.mock import patch
...

def load_fixture(apps, schema_editor):
    with patch('django.core.serializers.python.apps', apps):
        call_command('loaddata', 'your_data.json', ...)

...

This works well, as long as your fixtures do not rely on natural keys.

If they do, you are likely to see a DeserializationError: ... value must be an integer....

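The problem with natural keys

Under the hood, loaddata uses django.core.serializers.deserialize() to load your fixture objects.

Deserializing fixtures based on natural keys relies on two things:

the presence of a get_by_natural_key() method on the model's default manager
the presence of a natural_key() method on the model itself

The get_by_natural_key() method is necessary for the deserializer to know how to interpret a natural key instead of an integer pk value.

Both methods are necessary for the deserializer to get existing objects from the database by natural key, as also explained here.

However, the apps registry available in a migration uses historical models, and these do not have access to custom managers or custom methods such as natural_key().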
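To make the difference concrete, the sketch below builds the same fixture record twice using only the standard library: once with integer keys, and once in the natural-key form, where the primary key is omitted and the foreign key is written as a list. The `library.book` label and all field names and values are hypothetical.

```python
import json

# Integer-keyed form: the row and its foreign key are identified by pk.
book_by_pk = {
    "model": "library.book",
    "pk": 1,
    "fields": {"title": "Example Title", "author": 42},
}

# Natural-key form: no "pk", and the foreign key is the author's natural key.
book_by_natural_key = {
    "model": "library.book",
    "fields": {"title": "Example Title", "author": ["Douglas", "Adams"]},
}


def uses_natural_keys(record, fk_field="author"):
    """Rough check for this example only.

    A record relies on natural keys if it has no "pk" or if the given
    foreign-key field holds a list rather than an integer. Many-to-many
    fields are also lists, so this is not a general-purpose test.
    """
    return "pk" not in record or isinstance(record["fields"].get(fk_field), list)


fixture_text = json.dumps([book_by_pk, book_by_natural_key], indent=2)
flags = [uses_natural_keys(record) for record in json.loads(fixture_text)]
print(flags)
```

A deserializer that has no get_by_natural_key() to call can only treat the list as a raw key value, which is consistent with the "value must be an integer" error mentioned above.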

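Possible solution: step 1

The problem of the missing get_by_natural_key() method on our custom model manager is relatively easy to solve: just set use_in_migrations=True on your custom manager, as described in the documentation.

This ensures that your historical models can access the current get_by_natural_key() during migrations, and fixture loading should now succeed.

However, your historical models still don't have a natural_key() method. As a result, your fixtures will be treated as new objects, even if they are already present in the database. If the data migration is ever re-applied, this may lead to a variety of errors, such as:

unique-constraint violations (if your models have unique constraints)
duplicate fixture objects (if your models do not have unique constraints)
"get returned more than one object" errors (due to duplicate fixture objects created previously)

So, effectively, you are still missing a kind of get_or_create-like behavior during deserialization.

To experience this, just apply a data migration as described above (in a test environment), then roll back the same data migration (without removing the data), then re-apply the data migration.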
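The failure mode can be imitated outside Django with the standard library's sqlite3 module. This is a sketch under stated assumptions: the `country` table is hypothetical, `insert_blindly` stands in for a deserializer that cannot look objects up by natural key, and `get_or_create` is a hand-written helper, not Django's method of the same name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")


def insert_blindly(name):
    """Treat every fixture object as new (no natural-key lookup)."""
    conn.execute("INSERT INTO country (name) VALUES (?)", (name,))


def get_or_create(name):
    """Look the row up by its natural key first; insert only if missing."""
    row = conn.execute("SELECT id FROM country WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0], False
    cursor = conn.execute("INSERT INTO country (name) VALUES (?)", (name,))
    return cursor.lastrowid, True


insert_blindly("Iceland")  # first application of the data migration

try:
    insert_blindly("Iceland")  # re-applying without a lookup
    duplicate_error = None
except sqlite3.IntegrityError as exc:
    duplicate_error = str(exc)

existing_id, created = get_or_create("Iceland")  # re-applying with a lookup
print(duplicate_error, existing_id, created)
```

With the unique constraint in place the blind re-insert raises an error; without it, the same call would silently create a duplicate row, which matches the first two failure cases listed above.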

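Possible solution: step 2

The problem of the missing natural_key() method on the model itself is a bit harder to solve. One solution is to assign the natural_key() method from the current model to the historical model, for example: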

...
from unittest.mock import patch

from django.apps import apps as current_apps
from django.core.management import call_command
...


def load_fixture(apps, schema_editor):
    def _get_model_patch(app_label):
        """ add natural_key method from current model to historical model """
        HistoricalModel = apps.get_model(app_label=app_label)
        CurrentModel = current_apps.get_model(app_label=app_label)
        HistoricalModel.natural_key = CurrentModel.natural_key
        return HistoricalModel

    with patch('django.core.serializers.python._get_model', _get_model_patch):
        call_command('loaddata', 'your_data.json', ...)

...

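Notes:

For clarity, I omitted things like error handling and attribute checking from the example. You should implement those where necessary.
This solution uses the current model's natural_key method, which may still lead to trouble in certain scenarios, but the same is true for the use_in_migrations option of Django's model managers.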

8 answers

Answer 1: score 78, accepted, 2014-09-22 19:38:14
Answer 2: score 43, 2016-09-28 09:38:05
Answer 3: score 6, 2015-01-06 21:27:00
Answer 4: score 2, 2014-09-22 07:17:43
Answer 5: score 1, 2015-10-10 21:07:05
Answer 6: score 0, 2014-09-21 16:52:30
Answer 7: score 0, 2019-01-24 21:26:06
Answer 8: score 0, 2022-09-15 14:52:41
