
Django migration 11 million rows, need to break it down

I have a table which I am working on, and it contains 11 million rows or thereabouts. I need to run a migration on this table, but since Django tries to store it all in memory, I run out of RAM or disk space, whichever comes first, and the migration comes to an abrupt halt.

I'm curious to know if anyone has faced this issue and come up with a solution to essentially "paginate" migrations, maybe into blocks of 10-20k rows at a time?

Just to give a bit of background: I am using Django 1.10 and Postgres 9.4, and I want to keep this automated if possible (which I still think it can be).

Thanks, Sam

The issue comes from PostgreSQL, which rewrites every row in the table when you add a new column (field) with a default value.
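To make the cost difference concrete, here is a small sketch (model `Book` and field `status` are hypothetical names) contrasting the two forms of `AddField`. On Postgres versions before 11, the form with a `default` forces a full-table rewrite, while the nullable form is a catalog-only change:

```python
from django.db import migrations, models

# Slow on Postgres < 11: to populate existing rows, Django adds the
# column with a temporary database default (rewriting every row) and
# then drops that default again.
slow_add = migrations.AddField(
    model_name='book',
    name='status',
    field=models.CharField(max_length=20, default='available'),
)

# Fast: a nullable column with no default only touches the system
# catalog, near-instant even on 11 million rows.
fast_add = migrations.AddField(
    model_name='book',
    name='status',
    field=models.CharField(max_length=20, null=True),
)
```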

What you would need to do is write your own migrations in the following way:

  1. Add the new column with null=True . In this case the data will not be rewritten, and the migration will finish pretty fast.
  2. Migrate it.
  3. Add a default value.
  4. Migrate it again.

That is basically a simple pattern for how to deal with adding a new column to a huge Postgres database.
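As a concrete illustration of those steps, here is a minimal sketch under assumed names (app label `library`, model `Book`, new field `status`; the batch size and default value are arbitrary). The middle migration goes beyond the four steps above: it backfills existing rows in bounded chunks, which also addresses the "paginate into 10-20k blocks" part of the question.

```python
# 0002_add_status.py -- step 1: add the column as nullable, so Postgres
# only updates the catalog and the migration finishes almost instantly.
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [('library', '0001_initial')]
    operations = [
        migrations.AddField(
            model_name='book',
            name='status',
            field=models.CharField(max_length=20, null=True),
        ),
    ]
```

```python
# 0003_backfill_status.py -- optional data migration: fill existing rows
# in chunks so neither Django nor Postgres handles 11M rows in one go.
from django.db import migrations

BATCH_SIZE = 20000

def backfill(apps, schema_editor):
    Book = apps.get_model('library', 'Book')
    while True:
        # Fetch one batch of primary keys, issue a single UPDATE for
        # that batch, and repeat until no unfilled rows remain.
        pks = list(
            Book.objects.filter(status__isnull=True)
                .values_list('pk', flat=True)[:BATCH_SIZE]
        )
        if not pks:
            break
        Book.objects.filter(pk__in=pks).update(status='available')

class Migration(migrations.Migration):
    # Django 1.10 supports non-atomic migrations, so each batch commits
    # on its own instead of accumulating in one huge transaction.
    atomic = False

    dependencies = [('library', '0002_add_status')]
    operations = [
        migrations.RunPython(backfill, migrations.RunPython.noop),
    ]
```

```python
# 0004_status_default.py -- step 3: attach the default. Django applies
# defaults in Python rather than in the database schema, so this does
# not rewrite the table. null=True is kept here; switching to NOT NULL
# later would trigger a full-table scan to validate the constraint.
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [('library', '0003_backfill_status')]
    operations = [
        migrations.AlterField(
            model_name='book',
            name='status',
            field=models.CharField(max_length=20, null=True,
                                   default='available'),
        ),
    ]
```

Before applying each step you can run `python manage.py sqlmigrate library 0002` (and so on) to inspect the exact SQL Django will execute.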
