Delete all items in a DynamoDB table using bash with both partition and sort keys

I'm trying to delete all items in a DynamoDB table that has both partition and sort keys, using the AWS CLI in bash. The best thing I've found so far is:

aws dynamodb scan --table-name $TABLE_NAME --attributes-to-get "$KEY" \
--query "Items[].$KEY.S" --output text | \
tr "\t" "\n" | \
xargs -t -I keyvalue aws dynamodb delete-item --table-name $TABLE_NAME \
--key "{\"$KEY\": {\"S\": \"keyvalue\"}}"

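But this does not work for a table that has both a partition key and a sort key, and I have not yet been able to adapt it to one. Any idea how to modify the script so that it works for a table with composite keys?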

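Depending on the size of your table, this may be too expensive and can result in downtime. Remember that a delete costs the same as a write, so you will be throttled by your provisioned WCU. It would be much simpler and faster to just delete and recreate the table: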

# this uses jq, but basically we're just removing
# some of the JSON fields that describe an existing
# DynamoDB table and are not actually part of the table schema/definition
aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .ProvisionedThroughput.NumberOfDecreasesToday)' > schema.json
# delete the table
aws dynamodb delete-table --table-name $table_name
# create table with same schema (including name and provisioned capacity)
aws dynamodb create-table --cli-input-json file://schema.json
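To make the cost point above concrete, here is a rough back-of-the-envelope sketch in TypeScript. It assumes that deleting an item consumes 1 WCU per 1 KB of item size (rounded up) and that the table has no secondary indexes adding extra writes. The function name `estimateDeleteCost` and its parameters are hypothetical, and using an average item size is only an approximation.

```typescript
// Hypothetical helper: estimates the write capacity needed to delete every item.
// Assumption: each delete consumes 1 WCU per 1 KB of item size, rounded up,
// and no secondary indexes add extra write cost.
export function estimateDeleteCost(
  itemCount: number,
  avgItemSizeBytes: number,
  provisionedWcu: number
): { wcuPerDelete: number; totalWcu: number; minSeconds: number } {
  // At least 1 WCU per delete, even for tiny items
  const wcuPerDelete = Math.max(1, Math.ceil(avgItemSizeBytes / 1024));
  const totalWcu = itemCount * wcuPerDelete;
  // Lower bound on duration if the whole provisioned capacity is spent on deletes
  const minSeconds = Math.ceil(totalWcu / provisionedWcu);
  return { wcuPerDelete, totalWcu, minSeconds };
}
```

Under these assumptions, one million items averaging 1500 bytes on a table provisioned with 100 WCU would need about 2,000,000 WCU, so at least 20,000 seconds (roughly five and a half hours) of deletes that compete with normal write traffic.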

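If you really want to, you can delete each item individually, and you are on the right track: you just need to specify both the hash and range keys in your scan projection and in the delete command.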

# Scan only the key attributes, then use jq to put each item on its own line.
# Newlines are replaced with NUL terminators so that xargs -0 leaves the
# quotes and other special characters in the JSON untouched.
# The whole scanned item is used as the key to delete
# (dynamo keys *are* dynamo items).
# Comments are kept out of the pipeline because a comment after a
# line-continuation backslash would break the command.
aws dynamodb scan \
  --table-name "$TABLE_NAME" \
  --attributes-to-get "$HASH_KEY" "$RANGE_KEY" \
  --query "Items[*]" \
  | jq --compact-output '.[]' \
  | tr '\n' '\0' \
  | xargs -0 -t -I keyItem \
    aws dynamodb delete-item --table-name "$TABLE_NAME" --key=keyItem

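If you want to get super fancy, you can use the describe-table call to fetch the hash and range key names to populate $HASH_KEY and $RANGE_KEY, but I'll leave that as an exercise for you.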
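As one possible take on that exercise, the sketch below reads the key names out of the JSON that describe-table returns. It relies on the `Table.KeySchema` array, whose elements carry an `AttributeName` and a `KeyType` of `HASH` or `RANGE`. The function name `keyNames` and the interface names are hypothetical, and loading the JSON (for example from the CLI output) is left to the caller.

```typescript
// Minimal shape of the describe-table output that this sketch depends on
interface KeySchemaElement {
  AttributeName: string;
  KeyType: "HASH" | "RANGE";
}
interface DescribeTableOutput {
  Table: { KeySchema: KeySchemaElement[] };
}

// Hypothetical helper: returns the partition (hash) key name and,
// when the table has a composite key, the sort (range) key name.
export function keyNames(desc: DescribeTableOutput): { hashKey: string; rangeKey?: string } {
  const schema = desc.Table.KeySchema;
  const hash = schema.find(k => k.KeyType === "HASH");
  if (!hash) {
    throw new Error("KeySchema has no HASH element");
  }
  const range = schema.find(k => k.KeyType === "RANGE");
  // Only include rangeKey when the table actually has one
  return range
    ? { hashKey: hash.AttributeName, rangeKey: range.AttributeName }
    : { hashKey: hash.AttributeName };
}
```

The order of the entries in `KeySchema` is not assumed here; the function looks each key up by its `KeyType`.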

A correction to what @Cheruvian posted. The following commands work; a few more fields need to be excluded when creating schema.json.

aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime)' > schema.json

aws dynamodb delete-table --table-name $table_name

aws dynamodb create-table --cli-input-json file://schema.json

If you are interested in doing this with Node.js, have a look at this example (I'm using TypeScript here). More related information can be found in the AWS docs.

import AWS from 'aws-sdk';
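const DynamoDb = new AWS.DynamoDB.DocumentClient({
  region: 'eu-west-1'
});

// Note: a single scan call returns at most 1 MB of data; larger tables
// need to follow LastEvaluatedKey to read the remaining pages.
export const getAllItemsFromTable = async (TableName: string) => {
  const Res = await DynamoDb.scan({ TableName }).promise();
  return Res.Items ?? [];
};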

export const deleteAllItemsFromTable = async (TableName = '', items: Record<string, any>[]) => {
  let counter = 0;
  // Split items into batches of 25, the maximum that batchWrite accepts
  await asyncForEach(split(items, 25), async (patch, i) => {
    const RequestItems = {
      // Computed property: the table name maps to its list of delete requests
      [TableName]: patch.map(item => {
        return {
          DeleteRequest: {
            // Assumes the hash key is named "id"; for a composite key,
            // add the sort key attribute here as well
            Key: {
              id: item.id
            }
          }
        };
      })
    };
    await DynamoDb.batchWrite({ RequestItems }).promise();
    counter += patch.length;
    console.log('counter : ', counter);
  });
};

function split(arr, n) {
  var res = [];
  while (arr.length) {
    res.push(arr.splice(0, n));
  }
  return res;
}

async function asyncForEach(array, callback) {
  for (let index = 0; index < array.length; index++) {
    await callback(array[index], index, array);
  }
}

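const tableName = "table";
// assuming the table's hash key is named "id";
// the scan must finish before its items can be passed on for deletion
getAllItemsFromTable(tableName)
  .then(items => deleteAllItemsFromTable(tableName, items ?? []));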
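The example above deletes by a single key named "id", while the question asks about tables with both a partition and a sort key. A minimal sketch of how the batch requests could be built for a composite key follows. The names `chunk` and `buildDeleteBatches` are hypothetical, and the key attribute names are supplied by the caller rather than discovered automatically.

```typescript
type Item = Record<string, unknown>;

// Returns consecutive slices of at most `size` elements without mutating `arr`
export function chunk<T>(arr: readonly T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    out.push(arr.slice(i, i + size));
  }
  return out;
}

// Hypothetical helper: builds one batchWrite parameter object per group of
// up to 25 items, keeping only the key attributes in each DeleteRequest.
export function buildDeleteBatches(
  tableName: string,
  items: readonly Item[],
  keyAttrs: readonly string[]
): { RequestItems: Record<string, { DeleteRequest: { Key: Item } }[]> }[] {
  return chunk(items, 25).map(batch => ({
    RequestItems: {
      [tableName]: batch.map(item => {
        // Copy only the partition and sort key attributes into the key
        const key: Item = {};
        for (const attr of keyAttrs) {
          key[attr] = item[attr];
        }
        return { DeleteRequest: { Key: key } };
      }),
    },
  }));
}
```

Each returned object is meant to be passed to `DynamoDb.batchWrite(...)` from the example above. Unlike the `split` helper, `chunk` uses `slice`, so the caller's array is left intact.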

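We have some tables with indexes, so more fields had to be removed, including ".ProvisionedThroughput.LastDecreaseDateTime". Since I'm completely new to jq, it took a bit of work ;-) but this is how it worked for us: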

    aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime, .ProvisionedThroughput.LastDecreaseDateTime, .GlobalSecondaryIndexes[].IndexSizeBytes, .GlobalSecondaryIndexes[].ProvisionedThroughput.NumberOfDecreasesToday, .GlobalSecondaryIndexes[].IndexStatus, .GlobalSecondaryIndexes[].IndexArn, .GlobalSecondaryIndexes[].ItemCount)' > schema.json

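Building on the answers from @Adel and @codeperson here, I created a function using the Amplify CLI (with the Hello World template), where the table name must be passed in via the event object: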

/* Amplify Params - DO NOT EDIT
    API_DEALSPOON_GRAPHQLAPIENDPOINTOUTPUT
    API_DEALSPOON_GRAPHQLAPIIDOUTPUT
Amplify Params - DO NOT EDIT */

const AWS = require('aws-sdk')
const environment = process.env.ENV
const region = process.env.REGION
const apiDealspoonGraphQLAPIIdOutput = process.env.API_DEALSPOON_GRAPHQLAPIIDOUTPUT

exports.handler = async (event) => {

    const DynamoDb = new AWS.DynamoDB.DocumentClient({region});

    // const tableName = "dev-invite";
    // const hashKey = "InviteToken";
    let {tableName, hashKey} = event
    
    tableName = `${tableName}-${apiDealspoonGraphQLAPIIdOutput}-${environment}`
    
    // Customization 4: add logic to determine which items to delete (return true to delete the respective item)
    // If you don't want to filter anything out, just return true in this function (or remove the filter step below, where this filter is used)
    const shouldDeleteItem = (item) => {
        return item.Type === "SECURE_MESSAGE" || item.Type === "PATIENT";
    };

    const getAllItemsFromTable = async (lastEvaluatedKey) => {
        const res = await DynamoDb.scan({
            TableName: tableName,
            ExclusiveStartKey: lastEvaluatedKey
        }).promise();
        return {items: res.Items, lastEvaluatedKey: res.LastEvaluatedKey};
    };

    const deleteAllItemsFromTable = async (items) => {
        let numItemsDeleted = 0;
        // Split items into patches of 25
        // 25 items is max for batchWrite
        await asyncForEach(split(items, 25), async (patch, i) => {
            const requestItems = {
                [tableName]: patch.filter(shouldDeleteItem).map(item => {
                    numItemsDeleted++;
                    return {
                        DeleteRequest: {
                            Key: {
                                [hashKey]: item[hashKey]
                            }
                        }
                    };
                })
            };
            if (requestItems[tableName].length > 0) {
                await DynamoDb.batchWrite({RequestItems: requestItems}).promise();
                console.log(`finished deleting ${numItemsDeleted} items this batch`);
            }
        });

        return {numItemsDeleted};
    };

    function split(arr, n) {
        const res = [];
        while (arr.length) {
            res.push(arr.splice(0, n));
        }
        return res;
    }

    async function asyncForEach(array, callback) {
        for (let index = 0; index < array.length; index++) {
            await callback(array[index], index, array);
        }
    }

    let lastEvaluatedKey;
    let totalItemsFetched = 0;
    let totalItemsDeleted = 0;

    console.log(`------ Deleting from table ${tableName}`);

    do {
        const {items, lastEvaluatedKey: lek} = await getAllItemsFromTable(lastEvaluatedKey);
        totalItemsFetched += items.length;
        console.log(`--- a group of ${items.length} was fetched`);

        const {numItemsDeleted} = await deleteAllItemsFromTable(items);
        totalItemsDeleted += numItemsDeleted;
        console.log(`--- ${numItemsDeleted} items deleted`);

        lastEvaluatedKey = lek;
    } while (!!lastEvaluatedKey);

    console.log("Done!");
    console.log(`${totalItemsFetched} items total fetched`);
    console.log(`${totalItemsDeleted} items total deleted`);
};
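One gap worth noting in the batch-based examples above: a batch write can succeed overall while reporting some requests back in `UnprocessedItems`, for instance when the table is throttled, and those items are then never deleted. The sketch below shows one way to retry them with exponential backoff. It is not part of the original answers; the name `batchWriteWithRetry`, the retry count, and the delay values are assumptions, and the actual write call is injected so the helper does not depend on a particular SDK version.

```typescript
type RequestItems = Record<string, unknown[]>;
type BatchWriteFn = (params: { RequestItems: RequestItems }) => Promise<{ UnprocessedItems?: RequestItems }>;

// Hypothetical helper: resubmits whatever the service reports as unprocessed
// until nothing is left or maxAttempts calls have been made.
export async function batchWriteWithRetry(
  send: BatchWriteFn,
  requestItems: RequestItems,
  maxAttempts = 5,
  baseDelayMs = 100
): Promise<void> {
  let pending = requestItems;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { UnprocessedItems } = await send({ RequestItems: pending });
    if (!UnprocessedItems || Object.keys(UnprocessedItems).length === 0) {
      return; // everything in this batch was written
    }
    pending = UnprocessedItems;
    // Exponential backoff: baseDelayMs, then 2x, 4x, ... before the next attempt
    const delayMs = baseDelayMs * Math.pow(2, attempt);
    await new Promise<void>(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error("Unprocessed items remain after " + maxAttempts + " attempts");
}
```

With the `DocumentClient` used above, the injected function could be something like `params => DynamoDb.batchWrite(params).promise()`, assuming its result type is compatible with the minimal shape declared here.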

I created a Node module to do this:

https://www.npmjs.com/package/dynamodb-empty

yarn global add dynamodb-empty
dynamodb-empty --table tableName

I took some of the examples here and put together code that actually takes a parameter, then deletes and recreates the table... it works fine:

TABLE_NAME='<your_table_name>' ;\
aws dynamodb describe-table --table-name $TABLE_NAME \
|jq '.Table + .Table.BillingModeSummary
|del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, 
.TableStatus, .ProvisionedThroughput, .BillingModeSummary, 
.LastUpdateToPayPerRequestDateTime, .GlobalSecondaryIndexes[].IndexStatus, 
.GlobalSecondaryIndexes[].IndexSizeBytes, 
.GlobalSecondaryIndexes[].ItemCount, .GlobalSecondaryIndexes[].IndexArn, 
.GlobalSecondaryIndexes[].ProvisionedThroughput)' > tb_schema.json
aws dynamodb delete-table --table-name $TABLE_NAME
aws dynamodb create-table --cli-input-json file://tb_schema.json
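Several answers above keep extending the `del(...)` list as new read-only fields turn up. An alternative, sketched here in TypeScript, is an allow-list: copy only the fields that create-table accepts and ignore everything else. This is a different technique from the jq commands above, and it covers only the fields shown (key schema, attribute definitions, throughput or on-demand billing, and secondary indexes); streams, encryption, and tags would need to be added by hand. The name `toCreateTableInput` and the interface names are hypothetical.

```typescript
interface Throughput {
  ReadCapacityUnits: number;
  WriteCapacityUnits: number;
  [extra: string]: unknown;
}
interface IndexDescription {
  IndexName: string;
  KeySchema: unknown[];
  Projection: unknown;
  ProvisionedThroughput?: Throughput;
  [extra: string]: unknown;
}
interface TableDescription {
  TableName: string;
  AttributeDefinitions: unknown[];
  KeySchema: unknown[];
  ProvisionedThroughput?: Throughput;
  BillingModeSummary?: { BillingMode?: string };
  GlobalSecondaryIndexes?: IndexDescription[];
  LocalSecondaryIndexes?: IndexDescription[];
  [extra: string]: unknown;
}

// Keep only the two settable capacity values, dropping counters and timestamps
const capacity = (t: Throughput) => ({
  ReadCapacityUnits: t.ReadCapacityUnits,
  WriteCapacityUnits: t.WriteCapacityUnits,
});

// Hypothetical helper: takes the "Table" object from describe-table output
// and returns an object suitable for create-table --cli-input-json.
export function toCreateTableInput(table: TableDescription): Record<string, unknown> {
  const onDemand = table.BillingModeSummary?.BillingMode === "PAY_PER_REQUEST";
  const input: Record<string, unknown> = {
    TableName: table.TableName,
    AttributeDefinitions: table.AttributeDefinitions,
    KeySchema: table.KeySchema,
  };
  if (onDemand) {
    // On-demand tables carry a billing mode instead of provisioned throughput
    input.BillingMode = "PAY_PER_REQUEST";
  } else if (table.ProvisionedThroughput) {
    input.ProvisionedThroughput = capacity(table.ProvisionedThroughput);
  }
  if (table.GlobalSecondaryIndexes && table.GlobalSecondaryIndexes.length > 0) {
    input.GlobalSecondaryIndexes = table.GlobalSecondaryIndexes.map(gsi => {
      const index: Record<string, unknown> = {
        IndexName: gsi.IndexName,
        KeySchema: gsi.KeySchema,
        Projection: gsi.Projection,
      };
      if (!onDemand && gsi.ProvisionedThroughput) {
        index.ProvisionedThroughput = capacity(gsi.ProvisionedThroughput);
      }
      return index;
    });
  }
  if (table.LocalSecondaryIndexes && table.LocalSecondaryIndexes.length > 0) {
    input.LocalSecondaryIndexes = table.LocalSecondaryIndexes.map(lsi => ({
      IndexName: lsi.IndexName,
      KeySchema: lsi.KeySchema,
      Projection: lsi.Projection,
    }));
  }
  return input;
}
```

The trade-off is that an allow-list silently drops settings it does not know about, so it is worth comparing the result against the original describe-table output before deleting anything.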

