
How to query DynamoDB GSI with compound conditions

I have a DynamoDB table called 'frank' with a single GSI. The partition key is called PK, the sort key is called SK, the GSI partition key is called GSI1_PK and the GSI sort key is called GSI1_SK. I have a single 'data' map storing the actual data.

Populated with some test data, it looks like this: [image: enter image description here]

The GSI partition key and sort key map directly to the attributes with the same names within the table.

I can run a PartiQL query to fetch the results shown in the image. Here's the PartiQL code:

select PK, SK, GSI1_PK, GSI1_SK, data from "frank"."GSI1"
where 
("GSI1_PK"='tesla')
and 
(
 (  "GSI1_SK" >= 'A_VISITOR#2021-06-01-00-00-00-000' and  "GSI1_SK" <= 'A_VISITOR#2021-06-20-23-59-59-999' )
 or
 (  "GSI1_SK" >= 'B_INTERACTION#2021-06-01-00-00-00-000'  and   "GSI1_SK" <= 'B_INTERACTION#2021-06-20-23-59-59-999' )
)

Note how the PartiQL code references "GSI1_SK" multiple times. The PartiQL query works and returns the data shown in the image. All great so far.

However, I now want to move this into a Lambda function. How do I structure an AWS.DynamoDB.DocumentClient query to do exactly what this PartiQL query is doing?

I can get this to work in my Lambda function:

const visitorStart = "A_VISITOR#2021-06-01-00-00-00-000";
const visitorEnd = "A_VISITOR#2021-06-20-23-59-59-999";

var params = {
  TableName: "frank",
  IndexName: "GSI1",
  KeyConditionExpression: "#GSI1_PK=:tmn AND #GSI1_SK BETWEEN :visitorStart AND :visitorEnd",
  ExpressionAttributeNames: { "#GSI1_PK": "GSI1_PK", "#GSI1_SK": "GSI1_SK" },
  ExpressionAttributeValues: {
    ":tmn": lowerCaseTeamName,
    ":visitorStart": visitorStart,
    ":visitorEnd": visitorEnd
  }
};

const data = await documentClient.query(params).promise();
console.log(data);

But as soon as I try a more complex compound condition I get this error:

ValidationException: Invalid operator used in KeyConditionExpression: OR

Here is the more complex attempt:

const visitorStart = "A_VISITOR#2021-06-01-00-00-00-000";
const visitorEnd = "A_VISITOR#2021-06-20-23-59-59-999";
const interactionStart = "B_INTERACTION#2021-06-01-00-00-00-000";
const interactionEnd = "B_INTERACTION#2021-06-20-23-59-59-999";

var params = {
  TableName: "frank",
  IndexName: "GSI1",
  KeyConditionExpression: "#GSI1_PK=:tmn AND (#GSI1_SK BETWEEN :visitorStart AND :visitorEnd OR #GSI1_SK BETWEEN :interactionStart AND :interactionEnd) ",
  ExpressionAttributeNames: { "#GSI1_PK": "GSI1_PK", "#GSI1_SK": "GSI1_SK" },
  ExpressionAttributeValues: {
    ":tmn": lowerCaseTeamName,
    ":visitorStart": visitorStart,
    ":visitorEnd": visitorEnd,
    ":interactionStart": interactionStart,
    ":interactionEnd": interactionEnd
  }
};

const data = await documentClient.query(params).promise();
console.log(data);

The docs say that a KeyConditionExpression doesn't support 'OR'. So, how do I replicate my more complex PartiQL query in Lambda using AWS.DynamoDB.DocumentClient?

The documentation of PartiQL for DynamoDB warns that PartiQL will fall back to a full table scan to get your data if it has to: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.select.html#ql-reference.select.syntax

To ensure that a SELECT statement does not result in a full table scan, the WHERE clause condition must specify a partition key. Use the equality or IN operator.

In those cases PartiQL would run a scan and use a FilterExpression to filter out the data.

Of course in your example you provided a partition key, so I'd assume that PartiQL would run a query with the partition key and a FilterExpression to apply the rest of the condition.

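You could replicate it that way, and depending on the size of your partitions this might work just fine. However, a Query reads at most 1 MB of data per request before the filter is applied, so if the partition grows beyond 1 MB and most of the data is filtered out, you'll need to deal with pagination even when a page returns no items. Note, though, that the Query API does not accept key attributes of the queried table or index in a FilterExpression, so the condition on GSI1_SK could not be moved there as-is.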

Because of that, I'd suggest you simply split it up: run each branch of the OR condition as a separate query and merge the data on the client.
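A minimal sketch of that split-and-merge approach, assuming the SDK v2 AWS.DynamoDB.DocumentClient from the question. The helper names (buildRangeParams, queryAllPages, queryRanges) are made up for illustration, and the client is passed in as an argument rather than created here:

```javascript
// Sketch only: `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).
// All function names below are hypothetical helpers, not part of the SDK.

// Build Query params for one contiguous sort-key range on GSI1 (no OR needed).
function buildRangeParams(teamName, start, end) {
  return {
    TableName: "frank",
    IndexName: "GSI1",
    KeyConditionExpression: "#pk = :tmn AND #sk BETWEEN :start AND :end",
    ExpressionAttributeNames: { "#pk": "GSI1_PK", "#sk": "GSI1_SK" },
    ExpressionAttributeValues: { ":tmn": teamName, ":start": start, ":end": end }
  };
}

// Run one Query to completion, following LastEvaluatedKey across pages.
async function queryAllPages(documentClient, params) {
  const items = [];
  let lastKey;
  do {
    const request = lastKey ? { ...params, ExclusiveStartKey: lastKey } : params;
    const page = await documentClient.query(request).promise();
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}

// Run one Query per [start, end] range in parallel and concatenate the results
// in the order the ranges were given.
async function queryRanges(documentClient, teamName, ranges) {
  const perRange = await Promise.all(
    ranges.map(([start, end]) =>
      queryAllPages(documentClient, buildRangeParams(teamName, start, end)))
  );
  return perRange.flat();
}
```

Inside the Lambda handler this could be called as `await queryRanges(documentClient, lowerCaseTeamName, [[visitorStart, visitorEnd], [interactionStart, interactionEnd]])`. Because the two ranges do not overlap, the merged list should not need de-duplication.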

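Unfortunately, DynamoDB does not support OR in a KeyConditionExpression: it accepts an equality test on the partition key and, optionally, a single condition on the sort key. The PartiQL query you are executing is probably reading and filtering more data than it returns in order to produce the results.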

If you want to replicate the PartiQL query using the DocumentClient, you could use the scan operation with a FilterExpression. If you want to avoid using scan, you could perform two separate query operations and join the results in your application code.
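For comparison, the scan variant might look roughly like the sketch below. It assumes the same SDK v2 DocumentClient passed in as an argument; buildScanParams and scanAllPages are hypothetical helper names. A Scan reads every item in the index and applies the filter afterwards, so it is billed for everything it reads, not just what it returns:

```javascript
// Sketch only: `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).
// Function names are hypothetical helpers, not part of the SDK.

// Build Scan params whose FilterExpression ORs together one BETWEEN per range.
function buildScanParams(teamName, ranges) {
  const values = { ":tmn": teamName };
  const clauses = ranges.map(([start, end], i) => {
    values[`:s${i}`] = start;
    values[`:e${i}`] = end;
    return `(#sk BETWEEN :s${i} AND :e${i})`;
  });
  return {
    TableName: "frank",
    IndexName: "GSI1",
    FilterExpression: `#pk = :tmn AND (${clauses.join(" OR ")})`,
    ExpressionAttributeNames: { "#pk": "GSI1_PK", "#sk": "GSI1_SK" },
    ExpressionAttributeValues: values
  };
}

// Scan to completion; a page can be empty and still carry a LastEvaluatedKey.
async function scanAllPages(documentClient, params) {
  const items = [];
  let lastKey;
  do {
    const request = lastKey ? { ...params, ExclusiveStartKey: lastKey } : params;
    const page = await documentClient.scan(request).promise();
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}
```

For a table of any real size, the two-query approach is likely the cheaper option, since each Query only touches the items inside its sort-key range.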
