How do I perform a batch insert in Django?

In MySQL, you can insert any number of rows into a table with a single query:

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9), ..., (n-2, n-1, n);

Is there a way to achieve the above with Django queryset methods? Here's an example:

values = [(1, 2, 3), (4, 5, 6), ...]

for value in values:
    SomeModel.objects.create(first=value[0], second=value[1], third=value[2])

I believe the above issues a separate INSERT query on each iteration of the for loop. I'm looking for a single query. Is that possible in Django?

These answers are outdated. bulk_create was added in Django 1.4:

https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create

I recently looked for such a thing myself (inspired by QuerySet.update() , as I imagine you are too). To my knowledge, no bulk create exists in the current production framework (1.1.1 as of today). We ended up creating a custom manager for the model that needed bulk-create, and created a function on that manager to build an appropriate SQL statement with the sequence of VALUES parameters.

Something like (apologies if this does not work... hopefully I've adapted this runnably from our code):

from django.db import models, connection

class MyManager(models.Manager):

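    def create_in_bulk(self, values):
        # NOTE: the table name and column list are hard-coded placeholders;
        # they must match the model's table, and each tuple in `values`
        # must supply one value per listed column.
        base_sql = "INSERT INTO tbl_name (a,b,c) VALUES "
        values_sql = []
        values_data = []

        for value_list in values:
            # One "%s" placeholder per value, grouped as "(%s,%s,%s)".
            placeholders = ['%s'] * len(value_list)
            values_sql.append("(%s)" % ','.join(placeholders))
            values_data.extend(value_list)

        sql = '%s%s' % (base_sql, ', '.join(values_sql))

        # Pass the values separately so the database driver escapes them.
        curs = connection.cursor()
        curs.execute(sql, values_data)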

class MyObject(models.Model):
    # model definition as usual... assume:
    foo = models.CharField(max_length=128)

    # custom manager
    objects = MyManager()

MyObject.objects.create_in_bulk( [('hello',), ('bye',), ('c', )] )

This approach does run the risk of being very specific to a particular database. In our case, we wanted the function to return the IDs just created, so we had a postgres-specific query in the function to generate the requisite number of IDs from the primary key sequence for the table that represents the object. That said, it does perform significantly better in tests versus iterating over the data and issuing separate QuerySet.create() statements.
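
The statement-building step described above can be tried outside Django. The following is only a sketch, not the answerer's original code: build_bulk_insert is a hypothetical helper, and the standard-library sqlite3 module stands in for the database so the example runs on its own. sqlite3 uses "?" placeholders, while a Django cursor expects the default "%s".

```python
import sqlite3


def build_bulk_insert(table, columns, rows, placeholder="%s"):
    """Return (sql, params) for a single multi-row INSERT statement.

    `table` and `columns` are interpolated directly into the SQL text,
    so they must come from trusted code, never from user input.
    """
    rows = [tuple(row) for row in rows]
    if not rows:
        raise ValueError("at least one row is required")
    if any(len(row) != len(columns) for row in rows):
        raise ValueError("every row needs one value per column")

    # One parenthesized placeholder group per row: (?, ?, ?), (?, ?, ?)
    group = "(%s)" % ", ".join([placeholder] * len(columns))
    sql = "INSERT INTO %s (%s) VALUES %s" % (
        table,
        ", ".join(columns),
        ", ".join([group] * len(rows)),
    )
    # Flatten the rows into one parameter list, in placeholder order.
    params = [value for row in rows for value in row]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_name (a INTEGER, b INTEGER, c INTEGER)")

sql, params = build_bulk_insert(
    "tbl_name", ["a", "b", "c"], [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
    placeholder="?",
)
conn.execute(sql, params)  # one statement inserts all three rows
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM tbl_name").fetchone()[0]
```

Keeping the values out of the SQL string and passing them as parameters leaves escaping to the database driver, as in the manager method above.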

Here is a way to do batch inserts that still goes through Django's ORM (and thus retains the many benefits the ORM provides). This approach involves subclassing the InsertQuery class and creating a custom manager that prepares model instances for insertion into the database in much the same way that Django's save() method does. Most of the code for the BatchInsertQuery class below comes straight from the InsertQuery class, with just a few key lines added or modified. To use the batch_insert method, pass in the model instances that you want to insert into the database. This approach frees the code in your views from having to translate model instances into valid SQL values; the manager class, in conjunction with the BatchInsertQuery class, handles that.

from django.db import models, connection
from django.db.models.sql import InsertQuery

class BatchInsertQuery( InsertQuery ):

    ####################################################################

    def as_sql(self):
        """
        Constructs a SQL statement for inserting all of the model instances
        into the database.

        Differences from base class method:        

        - The VALUES clause is constructed differently to account for the
        grouping of the values (actually, placeholders) into
        parenthetically-enclosed groups. I.e., VALUES (a,b,c),(d,e,f)
        """
        qn = self.connection.ops.quote_name
        opts = self.model._meta
        result = ['INSERT INTO %s' % qn(opts.db_table)]
        result.append('(%s)' % ', '.join([qn(c) for c in self.columns]))
        result.append( 'VALUES %s' % ', '.join( '(%s)' % ', '.join( 
            values_group ) for values_group in self.values ) ) # This line is different
        params = self.params
        if self.return_id and self.connection.features.can_return_id_from_insert:
            col = "%s.%s" % (qn(opts.db_table), qn(opts.pk.column))
            r_fmt, r_params = self.connection.ops.return_insert_id()
            result.append(r_fmt % col)
            params = params + r_params
        return ' '.join(result), params

    ####################################################################

    def insert_values( self, insert_values ):
        """
        Adds the insert values to the instance. Can be called multiple times
        for multiple instances of the same model class.

        Differences from base class method:

        -Clears self.columns so that self.columns won't be duplicated for each
        set of inserted_values.        
        -appends the insert_values to self.values instead of extends so that
        the values (actually the placeholders) remain grouped separately for
        the VALUES clause of the SQL statement. I.e., VALUES (a,b,c),(d,e,f)
        -Removes inapplicable code
        """
        self.columns = [] # This line is new

        placeholders, values = [], []
        for field, val in insert_values:
            placeholders.append('%s')

            self.columns.append(field.column)
            values.append(val)

        self.params += tuple(values)
        self.values.append( placeholders ) # This line is different

########################################################################

class ManagerEx( models.Manager ):
    """
    Extended model manager class.
    """
    def batch_insert( self, *instances ):
        """
        Issues a batch INSERT using the specified model instances.
        """
        cls = instances[0].__class__
        query = BatchInsertQuery( cls, connection )
        for instance in instances:
            # Prepare each field value the same way save() would.
            values = [ (f, f.get_db_prep_save( f.pre_save( instance, True ) ) )
                for f in cls._meta.local_fields ]
            query.insert_values( values )

        return query.execute_sql()

########################################################################

class MyModel( models.Model ):
    myfield = models.CharField(max_length=255)
    objects = ManagerEx()

########################################################################

# USAGE:
object1 = MyModel(myfield="foo")
object2 = MyModel(myfield="bar") 
object3 = MyModel(myfield="bam")
MyModel.objects.batch_insert(object1, object2, object3)

You might get the performance you need by managing transactions manually. This lets you run all of the inserts inside one transaction and then commit them all at once. Hopefully this will help you: http://docs.djangoproject.com/en/dev/topics/db/transactions/

No, it is not possible, because Django models are objects rather than tables, so table-level actions do not apply to them. Django creates an object and then inserts its data into the table, so you can't create multiple objects at once.
