
DynamoDB scan returns multiple scan results

I've written the function below. This version is abridged and the data is anonymized, but the critical components are there.

The function takes a list of parameters from an API Gateway call, scans a DynamoDB table once for each of them, then returns the combined results.

The scan runs perfectly with one parameter, but returns duplicate data when more than one is passed. From the logs I can see that the scans are running multiple times when multiple params are passed.

For example, with one param the function logs return

2020-03-19 20:27:42.974 Starting the 0 scan with 3 as the id 
2020-03-19 20:27:43.047 The 0 scan has completed successfully

With two params the logs are

2020-03-19 20:28:42.189 Starting the 0 scan with 2 as the id
2020-03-19 20:28:42.261 The 0 scan has completed successfully
2020-03-19 20:28:42.262 Starting the 1 scan with 3 as the id
2020-03-19 20:28:42.267 The 0 scan has completed successfully
2020-03-19 20:28:42.293 The 1 scan has completed successfully

And with 3 params the logs are

2020-03-19 20:29:49.209 Starting the 0 scan with 1 as the id
2020-03-19 20:29:49.323 The 0 scan has completed successfully
2020-03-19 20:29:49.325 Starting the 1 scan with 2 as the id
2020-03-19 20:29:49.329 The 0 scan has completed successfully
2020-03-19 20:29:49.380 The 1 scan has completed successfully
2020-03-19 20:29:49.381 Starting the 2 scan with 3 as the id
2020-03-19 20:29:49.385 The 1 scan has completed successfully
2020-03-19 20:29:49.437 The 2 scan has completed successfully

Here is the code that runs the for loop and the scan. I've hardcoded the parameters and left out the parts that aren't relevant.

    const params = ['1','2','3'];
    for (let i = 0; i < params.length; i++) {
      console.log("Starting the " + i + " scan with " + params[i] + " as the scan parameter")
      const scanParams = {
        TableName: "Dynamo_Table",
        FilterExpression: "Org = :Org",
        ExpressionAttributeValues: { ":Org": params[i] },
        ProjectionExpression: "User_ID, Org, first_name, last_name"
      };
      await dynamoClient.scan(scanParams, function(err, data) {
        if (err) {
          console.log("data retrival failed, error logged is :" + err);
          return err;
        }
        else {
          console.log("The " + i +" scan has completed successfully")
          //console.log("data retrival successful: " + JSON.stringify(data));
          userData = userData.concat(data.Items)
          //console.log("partial data structure is " + data)
        }
      }).promise();
    }
    responseData = JSON.stringify(userData)
    console.log("Complete response is " + responseData)
    console.log("data after execution scan is " + data)

I've tried to force the program to wait for each scan's completion by using await together with the AWS SDK's .promise() function. However, these don't seem to block execution. I'm not sure exactly why it's launching multiple scans, though. The for loop isn't running more times than it should, so why is the scan being called again?

Whenever you want to look something up in a DynamoDB table, it's recommended that you use Query instead of Scan.

This is because Scan reads every item in the table and only then applies the filter, whereas Query reads only the items that match the given hash key (partition key).

If you want to look up data by a particular attribute that is not the table's key, you can create a Global Secondary Index with that attribute as the hash key and, optionally, a sort key of your choice. This might also help with your problem of the table returning the results multiple times.
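As a rough illustration of the index approach, here is a minimal sketch of a query against a Global Secondary Index. It assumes the AWS SDK for JavaScript v2 DocumentClient, and it assumes an index named Org-index (a hypothetical name) exists on Dynamo_Table with Org as its hash key and with the listed attributes projected into it. The helper name queryByOrg is also made up for this example.

```javascript
// Hypothetical helper: `client` is assumed to be an AWS.DynamoDB.DocumentClient
// (SDK v2), or any object exposing the same query(params).promise() shape.
async function queryByOrg(client, org) {
  const params = {
    TableName: 'Dynamo_Table',
    IndexName: 'Org-index', // assumption: a GSI with Org as its hash key
    KeyConditionExpression: 'Org = :org',
    ExpressionAttributeValues: { ':org': org },
    // assumption: these attributes are projected into the index
    ProjectionExpression: 'User_ID, Org, first_name, last_name',
  };

  // Promise style only: no callback is passed to query()
  const data = await client.query(params).promise();
  return data.Items;
}

// Usage sketch (assumes aws-sdk v2 is installed and credentials are configured):
// const AWS = require('aws-sdk');
// const client = new AWS.DynamoDB.DocumentClient();
// queryByOrg(client, '3').then(items => console.log(JSON.stringify(items)));
```

Unlike the filtered scan, this reads only the items stored under the requested Org value in the index.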

Here is an example of how to use the DynamoDB DocumentClient to query multiple items by partition key and collect the results. It uses the promisified variant of the query() call and waits for all of the query promises to be fulfilled using Promise.all().

var AWS = require('aws-sdk');
AWS.config.update({ region: 'us-east-1' });

const dc = new AWS.DynamoDB.DocumentClient();

// Array of organization IDs we want to query
const orgs = ['1', '2', '3'];

// Async function to query for one specific organization ID
const queryOrg = async org => {
  const params = {
    TableName: 'orgs',
    KeyConditionExpression: 'org = :o1',
    ExpressionAttributeValues: { ':o1': org, },
  };

  return dc.query(params).promise();
}

// Async IIFE because you cannot use await outside of an async function
(async () => {
  // Array of promises representing async organization queries made
  const promises = orgs.map(org => queryOrg(org));

  // Wait for all queries to complete and collect the results in an array
  const items = await Promise.all(promises);

  // Results are present in the same order that the queries were mapped
  for (const item of items) {
    console.log('Item:', item.Items[0]);
  }
})();
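
Applied to the loop in the question, the same promise-only style might look like the sketch below. In version 2 of the AWS SDK for JavaScript, passing a callback to scan() already dispatches the request, and calling .promise() on the returned request appears to dispatch it a second time, which would explain the repeated "completed successfully" log lines. The helper name scanOrgs is hypothetical, and the client is passed in as an argument purely to keep the example self-contained.

```javascript
// Hypothetical helper: `client` is assumed to be an AWS.DynamoDB.DocumentClient
// (SDK v2), or any object exposing the same scan(params).promise() shape.
async function scanOrgs(client, orgs) {
  let userData = [];

  for (let i = 0; i < orgs.length; i++) {
    console.log('Starting the ' + i + ' scan with ' + orgs[i] + ' as the scan parameter');
    const scanParams = {
      TableName: 'Dynamo_Table',
      FilterExpression: 'Org = :Org',
      ExpressionAttributeValues: { ':Org': orgs[i] },
      ProjectionExpression: 'User_ID, Org, first_name, last_name',
    };

    // No callback here: use either a callback or .promise(), not both
    const data = await client.scan(scanParams).promise();
    console.log('The ' + i + ' scan has completed successfully');
    userData = userData.concat(data.Items);
  }

  return userData;
}

// Usage sketch (assumes aws-sdk v2 is installed and credentials are configured):
// const AWS = require('aws-sdk');
// const dynamoClient = new AWS.DynamoDB.DocumentClient();
// scanOrgs(dynamoClient, ['1', '2', '3'])
//   .then(items => console.log('Complete response is ' + JSON.stringify(items)))
//   .catch(err => console.log('data retrieval failed, error logged is: ' + err));
```

Errors now surface as a rejected promise, so a single try/catch or .catch() around the call replaces the per-scan error branch of the callback.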
