
Google AppEngine - Big datastore reads

I need to read all the entries in a Google AppEngine datastore to do some initialization work. There are a lot of entities (80k currently) and the number continues to grow. I'm starting to hit the 30-second datastore query timeout limit.

Are there any best practices for how to shard these types of huge reads in the datastore? Any examples?

You can tackle this in several ways:

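  1. Execute your code on a Task Queue, which has a 10-minute timeout instead of the 30 s limit for normal requests (which is closer to 60 s in practice). The easiest way to do this is via DeferredTask.

    Warning: a DeferredTask must be serializable, so it's hard to pass it complex data. Also, don't make it a (non-static) inner class, since it would carry an implicit reference to the enclosing instance, which would then have to be serialized too.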

  2. See backends. Requests served by a backend instance have no time limit.

  3. Finally, if you need to break up a big task and execute it in parallel, then look at mapreduce.
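As a rough illustration of the first option, one common pattern is to process a single page of results per task and have each task schedule the next one, so no individual request runs long. The sketch below is not App Engine code: `process_page`, `fake_fetch_page` and `fake_enqueue` are hypothetical names, and the fetch and enqueue steps are passed in as plain callables with in-memory stand-ins so it runs anywhere. In the Python runtime those roles would presumably be played by something like `ndb`'s `Query.fetch_page` and `deferred.defer`, but that mapping is an assumption you should check against the SDK documentation.

```python
from collections import deque


def process_page(fetch_page, handle, enqueue, cursor=None, page_size=500):
    """Handle one page of results, then schedule the next page as a new task."""
    results, next_cursor, more = fetch_page(page_size, cursor)
    for entity in results:
        handle(entity)
    if more:
        # Hand the cursor to a fresh task instead of looping in this request.
        enqueue(process_page, fetch_page, handle, enqueue, next_cursor, page_size)


# In-memory stand-ins (hypothetical) so the sketch runs outside App Engine.
DATA = list(range(1234))


def fake_fetch_page(page_size, cursor):
    """Return (results, next_cursor, more); here the cursor is a list index."""
    start = cursor or 0
    end = start + page_size
    return DATA[start:end], end, end < len(DATA)


pending = deque()


def fake_enqueue(func, *args):
    """Stand-in for a task queue: remember the call so it can run later."""
    pending.append((func, args))


seen = []
process_page(fake_fetch_page, seen.append, fake_enqueue)
while pending:  # drain the fake queue, one "task" at a time
    func, args = pending.popleft()
    func(*args)
```

Each simulated task touches at most `page_size` entities, which is the property that keeps a real task under its deadline.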

This answer on StackExchange served me well:

Expired queries and appengine

I had to slightly modify it to work for me:

import logging


def loop_over_objects_in_batches(batch_size, object_class, callback):
    """Call callback on every element of object_class, batch_size at a time.

    object_class must support count() and slicing, as in the code below.
    """
    num_els = object_class.count()
    # Number of full batches; // keeps this an integer on Python 3 as well.
    num_loops = num_els // batch_size
    remainder = num_els - num_loops * batch_size
    logging.info(
        "Calling batched loop with batch_size: %d, num_els: %s, num_loops: %s, "
        "remainder: %s, object_class: %s, callback: %s",
        batch_size, num_els, num_loops, remainder, object_class, callback)
    offset = 0
    # Process the full batches.
    while offset < num_loops * batch_size:
        logging.info("Processing batch (%d:%d)", offset, offset + batch_size)
        query = object_class[offset:offset + batch_size]
        for q in query:
            callback(q)

        offset = offset + batch_size

    # Process whatever is left over after the last full batch.
    if remainder:
        logging.info("Processing remainder batch (%d:%d)", offset, num_els)
        query = object_class[offset:num_els]
        for q in query:
            callback(q)
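A possible simplification, sketched here under the same assumption that the object supports slicing: keep slicing until a batch comes back empty. That removes the up-front `count()` call and the separate remainder branch. The names `iter_batches` and `loop_in_batches` are made up for this example, and a plain list stands in for the query object.

```python
def iter_batches(sliceable, batch_size):
    """Yield successive lists of at most batch_size items from sliceable."""
    offset = 0
    while True:
        batch = list(sliceable[offset:offset + batch_size])
        if not batch:
            return  # an empty slice means we have run past the last element
        yield batch
        offset += batch_size


def loop_in_batches(sliceable, batch_size, callback):
    """Apply callback to every item, fetching batch_size items at a time."""
    for batch in iter_batches(sliceable, batch_size):
        for item in batch:
            callback(item)


# Usage with a plain list standing in for the query object (an assumption).
collected = []
loop_in_batches(list(range(10)), 4, collected.append)
```

The trade-off is one extra, empty fetch at the end whenever the total is an exact multiple of `batch_size`.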
