Django multiple queries optimization
I have a Django view that returns a map of the edit distances between thousands of strings. These strings are fields of the MyModel class. I calculate the distances in the myView function.
I profiled this code and found that accessing the queryset inside the loop consumes a lot of time.
How could I optimize this?
# models.py
from django.db import models

class MyModel(models.Model):
    str1 = models.CharField(max_length=300)
    str2 = models.CharField(max_length=300)

# views.py
import json

import Levenshtein
import numpy
from django.http import HttpResponse

from .models import MyModel

def compare(a, b):
    # Edit distance normalized by the length of the longer string
    return Levenshtein.distance(a, b) / max(len(a), len(b))

def myView(request):
    query_set = MyModel.objects.filter(....)
    size = query_set.count()
    arr = numpy.zeros(size ** 2).reshape(size, size)
    for i in range(size):
        m1 = query_set[i].str1
        for j in range(size):
            m2 = query_set[j].str1
            arr[i][j] = compare(m1, m2)
    json_out = json.dumps({'data': arr.tolist()})
    return HttpResponse(json_out, content_type="application/json")
EDIT
I think the problem is related to database access, because I tried a similar approach that stores the data in an external txt file instead, and it was much faster:
# file.txt
[{'par1': ....}, {'par1': ....}, ...]

# views.py
import ast

def myView(request):
    with open('file.txt', 'r') as out:
        # Assumption: the file holds a Python literal (a list of dicts), so parse it
        data = ast.literal_eval(out.read())
    size = len(data)
    arr = numpy.zeros(size ** 2).reshape(size, size)
    for i in range(size):
        for j in range(size):
            m1 = data[i]['par1']
            m2 = data[j]['par1']
            arr[i][j] = compare(m1, m2)
    json_out = json.dumps({'data': arr.tolist()})
    return HttpResponse(json_out, content_type="application/json")
How many queries is myView actually doing? It should be doing 1, or possibly 2 (one for count() and one for the actual data). But I would start by verifying that. I use https://github.com/dobarkod/django-queryinspect, but most folks use https://github.com/jazzband/django-debug-toolbar to find out how many queries are being done.
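One thing worth checking while counting queries: as far as I know, indexing a QuerySet that has not been evaluated yet (as in query_set[i]) sends a separate query on every access, so the nested loop in the question may hit the database on the order of size² times. Below is a minimal sketch of the alternative, assuming the strings fit in memory: fetch them once into a plain list, then compute the matrix from that list. The helper names levenshtein and distance_matrix are hypothetical, the pure-Python levenshtein only stands in for Levenshtein.distance so the example runs without extra packages, and nested lists replace the numpy array for the same reason.

```python
def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (stand-in for Levenshtein.distance)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def compare(a: str, b: str) -> float:
    """Normalized distance; the empty-string guard is an added assumption."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def distance_matrix(strings):
    """Build the symmetric matrix from an in-memory list of strings."""
    size = len(strings)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # The distance is symmetric and zero on the diagonal,
        # so only the upper triangle needs to be computed.
        for j in range(i + 1, size):
            d = compare(strings[i], strings[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# In the view, the list would come from a single query, for example:
#   strings = list(query_set.values_list("str1", flat=True))
strings = ["kitten", "sitting", "kitten"]
matrix = distance_matrix(strings)
```

In the view itself, json.dumps({'data': matrix}) would then serialize the result as before. Computing only the upper triangle roughly halves the number of comparisons, which may matter with thousands of strings even once the database is out of the loop.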