Django "emulate" database trigger behavior on bulk insert/update/delete

It's a self-explanatory question, but here we go. I'm creating a business app in Django, and I didn't want to "spread" the logic across both the app and the database; on the other hand, I didn't want to let the database handle this task either (which is possible through the use of triggers).

So I wanted to "reproduce" the behavior of database triggers, but inside the Model class in Django (I'm currently using Django 1.4).

After some research, I figured out that for single objects I could override the "save" and "delete" methods of the "models.Model" class, inserting "before" and "after" hooks that run before and after the parent's save/delete. Like this:

     from django.db import models
     from django.db.transaction import commit_on_success  # Django 1.4 API

     class MyModel(models.Model):

         def __before(self):
             pass

         def __after(self):
             pass

         @commit_on_success  # the decorator only ensures that everything occurs inside the same transaction
         def save(self, *args, **kwargs):
             self.__before()
             super(MyModel, self).save(*args, **kwargs)
             self.__after()

The BIG problem is with bulk operations. Django doesn't call the models' save/delete when running "update()"/"delete()" from a QuerySet. Instead, it uses the QuerySet's own method. And to make things a little worse, it doesn't trigger any signal either.

Edit: Just to be a little more specific: the model loading inside the view is dynamic, so it's impossible to define a "model specific" way. In this case, I should create an abstract class and handle it there.

My last attempt was to create a custom Manager and, in it, override the update method, looping over the models inside the queryset and triggering the "save()" of each model (taking into consideration the implementation above, or the "signals" system). It works, but results in a database "overload" (imagine a 10k-row queryset being updated).

First, instead of overriding save to add __before and __after methods, you can use the built-in pre_save, post_save, pre_delete, and post_delete signals. https://docs.djangoproject.com/en/1.4/topics/signals/

from django.db import models
from django.db.models.signals import post_save

class YourModel(models.Model):
    pass

def after_save_your_model(sender, instance, **kwargs):
    pass

# register the signal
post_save.connect(after_save_your_model, sender=YourModel, dispatch_uid=__file__)

pre_delete and post_delete will get triggered when you call delete() on a queryset.

For bulk updating, however, you'll have to manually call the function you want to trigger yourself. You can wrap it all in a transaction as well.

To call the proper trigger function if you're using dynamic models, you can inspect the model's ContentType. For example:

from django.contrib.contenttypes.models import ContentType
from django.db.models import get_model  # location in Django 1.4

def view(request, app, model_name, method):
    ...
    model = get_model(app, model_name)
    content_type = ContentType.objects.get_for_model(model)
    # call the trigger function that matches the model (pass the arguments it expects)
    if content_type == ContentType.objects.get_for_model(YourModel):
        after_save_your_model(model)
    elif content_type == ContentType.objects.get_for_model(AnotherModel):
        another_trigger_function(model)
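
If the if/elif chain grows, the same dispatch can be expressed as a dictionary keyed by model class. This is only a sketch of that alternative, not something Django provides: TRIGGERS, register_trigger, and run_trigger are hypothetical names, and plain classes stand in for the models so the snippet runs on its own.

```python
# Registry mapping a model class to its "trigger" function.
TRIGGERS = {}


def register_trigger(model):
    """Decorator that registers a function as the trigger for `model`."""
    def decorator(func):
        TRIGGERS[model] = func
        return func
    return decorator


def run_trigger(model, *args, **kwargs):
    """Call the trigger registered for `model`; return None if there is none."""
    func = TRIGGERS.get(model)
    if func is None:
        return None
    return func(model, *args, **kwargs)


# Stand-ins for models resolved dynamically (e.g. via get_model in a view).
class YourModel(object):
    pass


class AnotherModel(object):
    pass


@register_trigger(YourModel)
def after_save_your_model(model, **kwargs):
    return "your_model trigger for %s" % model.__name__


@register_trigger(AnotherModel)
def another_trigger_function(model, **kwargs):
    return "another trigger for %s" % model.__name__
```

In the view, run_trigger(model) would then replace the ContentType comparison; models without a registered trigger are simply skipped.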

With a few caveats, you can override the queryset's update method to fire the signals, while still using an SQL UPDATE statement:

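from django.db.models.query import QuerySet
from django.db.models.signals import pre_save, post_save
from django.db.transaction import commit_on_success  # Django 1.4 API

class CustomQuerySet(QuerySet):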
    @commit_on_success
    def update(self, **kwargs):
        for instance in self:
            pre_save.send(sender=instance.__class__, instance=instance, raw=False, 
                          using=self.db, update_fields=kwargs.keys())
        # use self instead of self.all() if you want to reload all data 
        # from the db for the post_save signal
        result = super(CustomQuerySet, self.all()).update(**kwargs)
        for instance in self:
            post_save.send(sender=instance.__class__, instance=instance, created=False,
                           raw=False, using=self.db, update_fields=kwargs.keys())
        return result

    update.alters_data = True

I clone the current queryset (using self.all()), because the update method will clear the cache of the queryset object.

There are a few issues that may or may not break your code. First of all, it introduces a race condition: the pre_save signal's receivers act on data that may no longer be accurate by the time the database is updated.

There may also be some serious performance issues with large querysets. Unlike the plain update method, all models have to be loaded into memory, and then the signals still need to be executed. Especially if the signals themselves have to interact with the database, performance can be unacceptably slow. And unlike the regular pre_save signal, changing the model instance will not automatically cause the database to be updated, as the model instance is not used to save the new data.

There are probably some more issues that will cause a problem in a few edge cases.

Anyway, if you can live with these issues without running into serious problems, I think this is the best way to do it. It produces as little overhead as possible while still loading the models into memory, which is pretty much required to correctly execute the various signals.
