Query all data in DynamoDB without using primary partition key

I am very new to DynamoDB. I want to query all data within a certain time range.

"timerange" is the primary sort key.
"name" is the primary partition key.

I want to get all data between two timerange values.

This is my query:

from decimal import Decimal
from boto3.dynamodb.conditions import Key, Attr
import boto3
import time
from datetime import datetime

dynamo = boto3.resource('dynamodb')
table = dynamo.Table('tablename')
a = time.mktime(datetime.strptime('2020-03-26 14:29:10','%Y-%m-%d %H:%M:%S').timetuple())
b = time.mktime(datetime.strptime('2020-03-26 14:30:10','%Y-%m-%d %H:%M:%S').timetuple())

response = table.query(
    KeyConditionExpression =
        Key('timerange').between(Decimal(a), Decimal(b)))

This gives me an error:

ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element:

After searching the internet, I found out that the query must include the primary partition key, so I tried the Contains method from https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Condition.html

response = table.query(
    KeyConditionExpression =
        Key('name').contains('a', 'b') &
        Key('timerange').between(Decimal(a), Decimal(b)))

I clearly do not fully understand how this works.

How can I get all data within given timerange [a,b]?

You cannot solve this kind of problem easily in DynamoDB, at least not in the general case that would allow you to make one query, and only one query, to get all records within an arbitrary date range, regardless of name (the partition key).

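DynamoDB is a key/value database. Your queries are typically against individual keys, optionally with a range of values for a sort key. The condition on the partition key must be an equality test, which is why a Contains condition on name cannot be used in a key condition. A query for name=A and timestamp between X and Y fits this model perfectly and runs very efficiently.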
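As a rough illustration of that efficient case, here is a minimal sketch of a query for one name and a range of timerange values. The helper names to_epoch and query_name_in_range are made up for this example, and the table argument is assumed to be a boto3 Table resource such as boto3.resource('dynamodb').Table('tablename'). The key condition is written as a string with placeholders because "name" is a DynamoDB reserved word.

```python
from datetime import datetime
from decimal import Decimal


def to_epoch(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as local time; return epoch seconds as Decimal."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return Decimal(int(parsed.timestamp()))


def query_name_in_range(table, name, start, end):
    """Return every item for one partition key whose sort key lies in [start, end].

    `table` is assumed to behave like a boto3 DynamoDB Table resource.
    """
    kwargs = {
        # "name" is a reserved word, so attribute names go through placeholders.
        "KeyConditionExpression": "#n = :name AND #t BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#n": "name", "#t": "timerange"},
        "ExpressionAttributeValues": {":name": name, ":start": start, ":end": end},
    }
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        # One Query call returns at most 1 MB of data; follow LastEvaluatedKey.
        if "LastEvaluatedKey" not in page:
            return items
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

With a real table this would be called as query_name_in_range(table, 'A', to_epoch('2020-03-26 14:29:10'), to_epoch('2020-03-26 14:30:10')). It still needs a concrete name, which is exactly the limitation described above.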

To do what you want, you would most typically create a Global Secondary Index (GSI) whose primary key is a composite of:

  • YYMMDD of the timestamp (the partition key)
  • the timestamp (the sort key)

Now you can query for items with a timestamp in a certain range, regardless of name, but they must all fall on the same date. If you need this query to work more broadly, say with ranges of up to a month, then your GSI would have a primary key that is a composite of:

  • YYMM of the timestamp (the partition key)
  • the timestamp (the sort key)

And now you can query on all items within a given range of dates/times in the same month.
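A minimal sketch of the daily-bucket variant might look like the following. Everything named here is an assumption for illustration: each item is assumed to carry an extra string attribute called bucket holding the YYMMDD value (computed in UTC), and the GSI is assumed to be called bucket-timerange-index with bucket as its partition key and timerange as its sort key. A range that crosses a date boundary is handled by running one query per bucket and merging the results.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def day_bucket(epoch_seconds):
    """Return the UTC date of a timestamp as a 'YYMMDD' string."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%y%m%d")


def buckets_between(start, end):
    """List every 'YYMMDD' bucket touched by the closed range [start, end]."""
    day = datetime.fromtimestamp(int(start), tz=timezone.utc).date()
    last = datetime.fromtimestamp(int(end), tz=timezone.utc).date()
    result = []
    while day <= last:
        result.append(day.strftime("%y%m%d"))
        day += timedelta(days=1)
    return result


def query_range_via_gsi(table, start, end, index_name="bucket-timerange-index"):
    """Run one Query per bucket against the hypothetical GSI and merge the items."""
    items = []
    for bucket in buckets_between(start, end):
        kwargs = {
            "IndexName": index_name,
            "KeyConditionExpression": "#b = :bucket AND #t BETWEEN :start AND :end",
            "ExpressionAttributeNames": {"#b": "bucket", "#t": "timerange"},
            "ExpressionAttributeValues": {
                ":bucket": bucket,
                ":start": Decimal(int(start)),
                ":end": Decimal(int(end)),
            },
        }
        while True:
            page = table.query(**kwargs)
            items.extend(page.get("Items", []))
            # Follow pagination within the current bucket.
            if "LastEvaluatedKey" not in page:
                break
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    return items
```

When writing an item, the bucket attribute would be set with day_bucket(timestamp) so the index can pick it up. Switching to monthly buckets only changes the format string to "%y%m" and the step used when enumerating buckets.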
