Slow Django database operations on Google App Engine

I'm testing Google App Engine and Django-nonrel with free quota. It seems to me that database operations against the Datastore are hideously slow.

Take for example this simplified function processing a request, which takes in multipart/form-data of XML blobs, parses them and inserts them into the database:

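import cgi

from django.db import transaction

def post(request):
    # Parse the multipart/form-data body into form fields.
    fields = cgi.FieldStorage(request)
    # Commit on success, roll back on exception (Django's transaction API).
    with transaction.commit_on_success():
        for xmlblob in fields.getlist('xmlblob'):
            blob_object = parse_xml(xmlblob)  # parse_xml is defined elsewhere
            blob_object.save()  # one save call per entity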

Blob_object has five fields, all of them of type CharField.

For just ca. 30 blobs (with about 1 kB of XML altogether), that function takes 5 seconds to return, and uses over 30000 api_cpu_ms. CPU time should be equivalent to the amount of work a 1.2 GHz Intel x86 processor could do in that time, but I am pretty sure it would not take 30 seconds to insert 30 rows into a database on any x86 processor available.
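To put those figures in per-entity terms, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above (30 blobs, 5 seconds, 30000 api_cpu_ms, the last being a lower bound); the variable names are made up for illustration.

```python
# Figures quoted in the question.
blobs = 30
wallclock_s = 5.0
api_cpu_ms = 30000  # "over 30000", so treat this as a lower bound

# Average per save() call.
wallclock_ms_per_save = wallclock_s * 1000 / blobs
api_cpu_ms_per_save = api_cpu_ms / blobs

print(round(wallclock_ms_per_save), "ms wallclock per save")
print(api_cpu_ms_per_save, "api_cpu_ms billed per save")
```

So each save averages well under a second of real time, while the billed api_cpu_ms per save is several times larger than the wallclock time it took.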

Without saving objects to the database (that is, just parsing the XML and throwing away the result) the request takes merely milliseconds.

So should Google App Engine really be so slow that I can't save even a few dozen entities to the Datastore in a normal request, or am I missing something here? And of course, even if I did the inserts in some Backend or by using a Task Queue, it would still cost hundreds of times more than what would seem acceptable.

Edit: I found out that, by default, GAE does two index writes per property for each entity. Most of those properties should not be indexed, so the question is: how can I set properties unindexed on Django-nonrel?

I still feel, though, that even with index writes, the database operation is taking a ridiculous amount of time.

In the absence of batch operations, there's not much you can do to reduce wallclock times. Batch operations are pretty essential to reducing wallclock time on App Engine (or any distributed platform with RPCs, really).

Under the current billing model, CPU milliseconds reported by the datastore reflect the cost of the operation rather than the actual time it took, and are a way of billing for resources. Under the new billing model, these will be billed explicitly as datastore operations, instead.
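As a rough illustration of why batching matters, the sketch below simulates a datastore where every call pays one fixed network round trip. `FakeDatastore`, `put_multi`, the 10 ms latency and the batch size of 500 are all hypothetical stand-ins chosen for the example, not App Engine or Django-nonrel APIs. (App Engine's native `db.put()` does accept a list of entities; whether your Django-nonrel version exposes an equivalent is something to check.)

```python
import time


class FakeDatastore:
    """Hypothetical stand-in for a remote datastore; not a real App Engine API."""

    def __init__(self, rpc_latency_s=0.01):
        self.rpc_latency_s = rpc_latency_s  # assumed cost of one round trip
        self.rpc_count = 0
        self.stored = []

    def put_multi(self, entities):
        # One simulated network round trip, regardless of batch size.
        time.sleep(self.rpc_latency_s)
        self.rpc_count += 1
        self.stored.extend(entities)


def save_one_by_one(store, entities):
    # Mirrors calling save() in a loop: one round trip per entity.
    for entity in entities:
        store.put_multi([entity])


def save_batched(store, entities, batch_size=500):
    # Sends entities in chunks: one round trip per chunk.
    for start in range(0, len(entities), batch_size):
        store.put_multi(entities[start:start + batch_size])


entities = ["blob-%d" % i for i in range(30)]

slow = FakeDatastore()
save_one_by_one(slow, entities)

fast = FakeDatastore()
save_batched(fast, entities)

print("round trips, one by one:", slow.rpc_count)
print("round trips, batched:", fast.rpc_count)
```

Both runs store the same 30 entities; only the number of round trips differs, which is where the wallclock time goes.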

I have not found a real answer yet, but I made some calculations for the cost. Currently every indexed property field costs around $0.20 to $0.30 per 10k inserts. With the upcoming billing model (Pricing FAQ) the cost will be exactly $0.10 per 100k operations, or $0.20 per indexed field per 100k inserts, with 2 index write operations per insert.

So as the price seems to go down by a factor of ten, the observed slowness is indeed unexpected behaviour. As the free quota is more than enough for my test runs, and with the new pricing model coming, I won't let it bother me at this time.
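The arithmetic behind that comparison can be sketched as follows. The prices and the two index writes per field are the figures quoted above; the function and constant names are made up for illustration, and only index writes are counted, as in the estimate above.

```python
NEW_PRICE_PER_100K_OPS = 0.10  # USD, upcoming model, as quoted above
INDEX_WRITES_PER_FIELD = 2     # per insert, as quoted above


def new_cost_per_100k_inserts(indexed_fields):
    """Index-write cost in USD for 100k inserts under the upcoming model."""
    ops_per_insert = indexed_fields * INDEX_WRITES_PER_FIELD
    return ops_per_insert * NEW_PRICE_PER_100K_OPS


# Current model: $0.20 to $0.30 per field per 10k inserts, scaled to 100k.
old_low = 0.20 * 10
old_high = 0.30 * 10

new_per_field = new_cost_per_100k_inserts(1)
new_five_fields = new_cost_per_100k_inserts(5)  # the five CharFields

print("current, per field per 100k inserts: $%.2f to $%.2f" % (old_low, old_high))
print("upcoming, per field per 100k inserts: $%.2f" % new_per_field)
print("upcoming, five indexed fields per 100k inserts: $%.2f" % new_five_fields)
print("price ratio: %.0fx to %.0fx" % (old_low / new_per_field, old_high / new_per_field))
```

Under these assumptions the per-field cost drops from a few dollars to about twenty cents per 100k inserts, roughly the factor of ten mentioned.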
