
Hundreds of parallel DynamoDB queries

I'm trying to find the best practice for running hundreds of parallel DynamoDB queries within a single request. I am currently using Python, but I'm open to any language or framework that works best for this use case. Here is basically what I want to do; I shortened it to only 4 values here, but in the end I would like it to query 500 at once.

import boto3
from boto3.dynamodb.conditions import Key

# Shortened to 4 keys for the example; the real dict has 500 entries.
variables = {'random1': None, 'random2': None, 'random3': None, 'random500': None}

table = boto3.resource('dynamodb', 'eu-west-1').Table('sometable')
for v in variables:
    # COUNT query: returns the number of items under partition key v.
    variables[v] = table.query(KeyConditionExpression=Key('k').eq(v), Select='COUNT')['Count']

print(variables)
# expected output: {'random1': 12, 'random2': 30, 'random3': 230, 'random500': 5}

So I'm running COUNT queries to get the item count for each key in the table. The output of this "function" is something I need to return from the service. Each individual query responds quickly, around 40ms. But running them sequentially obviously scales linearly, which doesn't work: I want a total response time under 150ms (maximum) for all 500 variables.
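
For reference, the direction I've been experimenting with is a thread pool over the same loop. A minimal sketch (the pool size of 50 and the max_pool_connections value are numbers I'd still have to tune, and I switched to the low-level client since boto3 resources aren't thread-safe):

import boto3
from botocore.config import Config
from concurrent.futures import ThreadPoolExecutor

# Shortened to 4 keys again; the real dict has 500 entries.
variables = {'random1': None, 'random2': None, 'random3': None, 'random500': None}

# The low-level client is thread-safe (boto3 resources are not), and
# max_pool_connections must be raised above the default of 10 or the
# threads end up waiting on free connections.
client = boto3.client('dynamodb', 'eu-west-1',
                      config=Config(max_pool_connections=50))

def count_for(key):
    # Same COUNT query as before, expressed against the low-level client.
    resp = client.query(
        TableName='sometable',
        KeyConditionExpression='k = :v',
        ExpressionAttributeValues={':v': {'S': key}},
        Select='COUNT',
    )
    return key, resp['Count']

with ThreadPoolExecutor(max_workers=50) as pool:
    for key, count in pool.map(count_for, list(variables)):
        variables[key] = count

print(variables)

Since the work is pure network I/O, the GIL isn't the bottleneck here, but I'm not sure this reliably fits the 150ms budget at 500 keys, hence the question.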

Has anyone done anything similar? Any advice would be greatly appreciated!

My advice would be to not do this.

If you need aggregations in DDB, the preferred approach would be to enable streams and have a Lambda update/write an aggregation entry in the existing table (or a new one).

Here's a good article... Real-Time Aggregation with DynamoDB Streams
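
A minimal sketch of the stream-handler side, assuming a hypothetical aggregate table sometable_counts with an item_count attribute (and only handling INSERT events; a real handler would handle REMOVE as well):

import boto3

# Hypothetical aggregate table; one row per partition key of the source table.
counts = boto3.resource('dynamodb', 'eu-west-1').Table('sometable_counts')

def handler(event, context):
    # Each stream record describes one item-level change in the source table.
    for record in event['Records']:
        if record['eventName'] != 'INSERT':
            continue
        key = record['dynamodb']['Keys']['k']['S']
        # ADD atomically increments the counter, creating it if missing.
        counts.update_item(
            Key={'k': key},
            UpdateExpression='ADD item_count :one',
            ExpressionAttributeValues={':one': 1},
        )

Your request path then becomes 500 GetItem reads (or five BatchGetItem calls at 100 keys each) against the aggregate table, which is far easier to fit inside your latency budget than 500 COUNT queries.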
