Google App Engine DB Query Memory Usage

When I run a query on a large set of small objects (15k objects with only a few short string and boolean properties), without doing anything with these objects, I see my instance's memory usage continuously increasing (by about 70 MB). The increase looks out of proportion to the amount of data the query should ever need to hold in memory.

The loop I use is the following:

cursor = None
while True:
  query = MyModel.all()
  if cursor:
    query.with_cursor(cursor)
  fetched = 0
  for result in query.run(batch_size=500):
    fetched += 1

    # Do something with 'result' here. Actually leaving it empty for 
    # testing to be sure I don't retain anything myself

    if fetched == 500:
      cursor = query.cursor()
      break
  else:
    break

To be sure this is not due to appstats, I call appstats.recording.dont_record() to not record any stats.

Does anyone have any clue what might be going on? Or any pointers on how to debug/profile this?

Update 1: I turned on gc.set_debug(gc.DEBUG_STATS) in the production code, and I see the garbage collector being called regularly, so it is trying to collect garbage. When I call gc.collect() at the end of the loop (which is also the end of the request), it returns 0 and doesn't help.
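
For context on that return value, here is a minimal sketch of what gc.collect() reports. It uses only the standard library; the Node class and both function names are hypothetical stand-ins and are not part of the App Engine SDK. The count is the number of unreachable objects the cycle collector found, so a result of 0 suggests the memory is either still referenced from somewhere or was never held in reference cycles to begin with.

```python
import gc


class Node(object):
    """Hypothetical stand-in object; not part of the App Engine SDK."""

    def __init__(self):
        self.other = None


def collect_after_cycle():
    """Create a two-object reference cycle, drop it, and return gc.collect()'s count."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        gc.collect()  # clear any pre-existing cyclic garbage first
        a, b = Node(), Node()
        a.other, b.other = b, a  # a and b now reference each other
        del a, b  # unreachable, but the refcounts never drop to zero
        return gc.collect()  # number of unreachable objects found
    finally:
        if was_enabled:
            gc.enable()


def collect_with_live_references():
    """Objects that are still referenced are not garbage, so nothing is found."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        gc.collect()
        retained = [Node() for _ in range(1000)]  # still reachable via 'retained'
        found = gc.collect()
        return found, len(retained)
    finally:
        if was_enabled:
            gc.enable()
```

The first function should report at least the two Node instances; the second should report nothing, even though a thousand objects are alive, because they are all still reachable.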

Update 2: I did some hacking to get guppy to work on dev_appserver, and this seemed to indicate that, after an explicit gc.collect() at the end of the loop, most of the memory was consumed by a 'dict of google.appengine.datastore.entity_pb.Property'.
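
Where guppy is awkward to install, a rougher version of the same breakdown can be sketched with the standard library alone. This is not what guppy does internally and it reports object counts rather than bytes; the function name live_object_counts is made up for illustration. Note that gc.get_objects() only lists container objects tracked by the collector, so plain strings and integers will not show up.

```python
import gc
from collections import Counter


def live_object_counts():
    """Count gc-tracked live objects, grouped by type name."""
    gc.collect()  # drop cyclic garbage first so only live objects are counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def top_types(limit=10):
    """Return the `limit` most common type names with their counts."""
    return live_object_counts().most_common(limit)
```

Calling top_types() before and after the query loop and comparing the two lists should show which types are accumulating.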

Each model entity has some overhead.

For starters, your query returns objects as protobufs.

So you will have a series of batched protobufs for the result set.

Then each one is decoded. Each decoded entity includes the property names as well as the data for each entity. You have 15K entities. How big are your property names, for instance?
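
To get a feel for how much the names and containers can add, here is a rough sketch using plain Python dicts. It does not use the actual entity_pb structures, and the sample property names and values are invented. Key sizes are counted once per entity to mirror the point above about names being stored with every decoded entity; ordinary Python dicts would usually share those key strings.

```python
import sys


def approx_entity_size(entity):
    """Shallow estimate: the dict container plus every key and value object."""
    size = sys.getsizeof(entity)
    for name, value in entity.items():
        size += sys.getsizeof(name) + sys.getsizeof(value)
    return size


def raw_payload_size(entity):
    """Characters of string data only; each boolean is counted as one byte."""
    return sum(len(v) if isinstance(v, str) else 1 for v in entity.values())


# Hypothetical entity with a few short string and boolean properties.
sample = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "is_active": True,
    "is_admin": False,
}
```

Comparing approx_entity_size(sample) with raw_payload_size(sample), and multiplying the difference by 15,000, gives an order-of-magnitude idea of the per-entity overhead. The exact figures depend on the Python version and platform.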

So you have at least two copies of the result set in memory in various forms (possibly more), not including anything else you do with instances of the model class.

Your loop gives the garbage collector no opportunity to run; collection can (and will) happen later.

Have a look at tools like apptrace to help with memory profiling.

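I have reported this issue to the App Engine team, and they seem to confirm that it is actually a problem (they suspect it is related to the handling of cursors).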
