How to do a bulk insert or increment type operation with the Django ORM
I have a model as defined here:
class VectorSet(models.Model):
    word = models.CharField(max_length=255)
    weight = models.IntegerField()
    session = models.ForeignKey(ResearchSession)
I want to write a function that will take a list of words and a ResearchSession, and for each word in that list, if it doesn't already exist, create a new row with a weight of 1; otherwise take that row and increment weight by 1.
So far I've gotten this:
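from django.db.models import F

def train(words, session):
    for i in words:
        result, created = VectorSet.objects.get_or_create(word=i, session=session,
                                                          defaults={'weight': 1})
        if not created:
            # F() makes the database do the increment, avoiding a read-modify-write race.
            result.weight = F('weight') + 1
            result.save()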
I'm fairly confident that there is a way to do this with one query, but I can't quite figure out what it might be, or whether it's possible with Django code rather than raw SQL.
There is currently no out-of-the-box solution for bulk inserts other than bulk_create, I think. Another solution, depending on your database, is to perform get_or_create within a transaction by using atomic. For example:
from django.db import transaction
from django.db.models import F

@transaction.atomic
def train(words, session):
    # All get_or_create and save calls below run inside one transaction.
    for i in words:
        result, created = VectorSet.objects.get_or_create(word=i, session=session,
                                                          defaults={'weight': 1})
        if not created:
            result.weight = F('weight') + 1
            result.save()
Otherwise, you might be able to use the DB API executemany:
cursor.executemany('INSERT INTO vectorset (field1, field2, field3) VALUES (?, ?, ?)', data)
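As a rough sketch of how the raw-SQL route could cover the insert-or-increment case from the question, here is a version using Python's standard sqlite3 module directly rather than Django's cursor. It rests on several assumptions that are not in the original post: the table is named vectorset as in the line above, there is a unique constraint on (word, session_id) (the model in the question does not declare one), and the SQLite library is 3.24 or newer so that ON CONFLICT ... DO UPDATE is available. Other databases have their own upsert syntax, so treat this as an illustration only.

```python
import sqlite3


def train(conn, words, session_id):
    """Insert each word with weight 1, or add 1 to its weight if the row exists."""
    rows = [(word, session_id) for word in words]
    # executemany runs the same upsert statement once per (word, session_id) pair.
    conn.executemany(
        "INSERT INTO vectorset (word, weight, session_id) VALUES (?, 1, ?) "
        "ON CONFLICT (word, session_id) DO UPDATE SET weight = weight + 1",
        rows,
    )
    conn.commit()


# Hypothetical in-memory table standing in for the Django-managed one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vectorset ("
    " id INTEGER PRIMARY KEY,"
    " word VARCHAR(255) NOT NULL,"
    " weight INTEGER NOT NULL,"
    " session_id INTEGER NOT NULL,"
    " UNIQUE (word, session_id))"  # assumed constraint; the upsert relies on it
)

train(conn, ["apple", "pear"], 1)
train(conn, ["apple", "plum"], 1)
weights = dict(conn.execute(
    "SELECT word, weight FROM vectorset WHERE session_id = 1"))
```

After the two calls, "apple" has been seen twice and the other words once. The statement is still executed once per word, so this is not literally a single query, but the existence check and the increment happen inside the database in one step per word.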
The logic is simple, but it hits the database several times, which means several queries:
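qs = VectorSet.objects.filter(word__in=words, session=session)
existing = set(qs.values_list('word', flat=True))  # words that already have a row
qs.update(weight=models.F('weight') + 1)  # one UPDATE for all existing rows
VectorSet.objects.bulk_create(  # one INSERT for the words that are new
    [VectorSet(session=session, word=w, weight=1)
     for w in words if w not in existing])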
There is also an update_or_create in Django 1.7, but currently it does not distinguish defaults for update from defaults for create:
for w in words:
    VectorSet.objects.update_or_create(word=w, session=session,
                                       defaults={'weight': models.F('weight')+1})
Thus the above code will fail when it has to create a row, at int(models.F('weight')+1). (We could override the __int__ method, but that is too much of a hack to make sense, IMO.)