DynamoDB scan query to get the output file in JSON format

Question

I am quite new to using DynamoDB with boto3. I would like to scan all the rows in a DynamoDB table and store them in JSON format in a file, for additional data processing.

I am presently using the script shown below to fetch the details (pagination will be involved):

from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal
from boto3.dynamodb.conditions import Key, Attr

# Helper class to convert a DynamoDB item to JSON.
class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            if o % 1 > 0:
                return float(o)
            else:
                return int(o)
        return super(DecimalEncoder, self).default(o)

dynamodb = boto3.resource('dynamodb')

table = dynamodb.Table('Movies')

#fe = Key('year').between(1951, 1964)
pe = "#yr, title, info.rating"
# Expression Attribute Names for Projection Expression only.
ean = { "#yr": "year", }
esk = None


response = table.scan(
#    FilterExpression=fe,
    ProjectionExpression=pe,
    ExpressionAttributeNames=ean
    )

for i in response['Items']:
    print(json.dumps(i, cls=DecimalEncoder)) 

while 'LastEvaluatedKey' in response:
    response = table.scan(
        ProjectionExpression=pe,
#        FilterExpression=fe,
        ExpressionAttributeNames= ean,
        ExclusiveStartKey=response['LastEvaluatedKey']
        )

    for i in response['Items']:
        print(json.dumps(i, cls=DecimalEncoder))

This gives me output like the following (a sample from 5000 rows):

{"info": {"rating": 6.5}, "year": 2004, "title": "The Polar Express"}
{"info": {"rating": 5.7}, "year": 2004, "title": "The Prince & Me"}
{"info": {"rating": 5.3}, "year": 2004, "title": "The Princess Diaries 2: Royal Engagement"}
{"info": {"rating": 6.3}, "year": 2004, "title": "The Punisher"}
{"info": {"rating": 6.8}, "year": 2004, "title": "The SpongeBob SquarePants Movie"}

I am unable to get the output in the desired format (as shown below). I am expecting a file containing a single JSON array.

[
    {"info": {"rating": 6.5}, "year": 2004, "title": "The Polar Express"},
    {"info": {"rating": 5.7}, "year": 2004, "title": "The Prince & Me"},
    {"info": {"rating": 5.3}, "year": 2004, "title": "The Princess Diaries 2: Royal Engagement"},
    {"info": {"rating": 6.3}, "year": 2004, "title": "The Punisher"},
    {"info": {"rating": 6.8}, "year": 2004, "title": "The SpongeBob SquarePants Movie"}
]

Could anyone please provide some hints or pointers on how to investigate this further?

Answer 1

Isn't response['Items'] the list you are looking for? The response is a dictionary with that list inside. A single response may hold only a subset of the table, so you need to iterate over multiple responses and combine their lists if needed.

Reference: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.scan
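
The idea above can be sketched roughly as follows: collect response['Items'] from every page into one list, then write that list out once so the file holds a single JSON array. This is only a minimal sketch, not a definitive implementation. The helper names scan_all_items and write_items_as_json and the output file name movies.json are assumptions made for illustration; the table name and projection come from the question. The Decimal check uses `o % 1` rather than `o % 1 > 0` so that negative fractional values are also kept as floats.

```python
import decimal
import json


class DecimalEncoder(json.JSONEncoder):
    """Convert the Decimal values boto3 returns into int or float."""

    def default(self, o):
        if isinstance(o, decimal.Decimal):
            # A non-zero remainder means the value has a fractional part.
            return float(o) if o % 1 else int(o)
        return super().default(o)


def scan_all_items(table, **scan_kwargs):
    """Collect the items from every page of a scan into one list."""
    items = []
    response = table.scan(**scan_kwargs)
    items.extend(response['Items'])
    # LastEvaluatedKey is present in the response while more pages remain.
    while 'LastEvaluatedKey' in response:
        response = table.scan(
            ExclusiveStartKey=response['LastEvaluatedKey'], **scan_kwargs
        )
        items.extend(response['Items'])
    return items


def write_items_as_json(items, path):
    """Write the whole list as a single JSON array."""
    with open(path, 'w') as f:
        json.dump(items, f, cls=DecimalEncoder, indent=4)


def main():
    # Imported here so the helpers above can be used without AWS access.
    import boto3

    table = boto3.resource('dynamodb').Table('Movies')
    items = scan_all_items(
        table,
        ProjectionExpression="#yr, title, info.rating",
        ExpressionAttributeNames={"#yr": "year"},
    )
    write_items_as_json(items, 'movies.json')  # hypothetical file name
```

Calling main() with valid AWS credentials should produce a file in the bracketed format the question asks for. Note that this holds every item in memory before writing, which is fine for a few thousand rows but may need a streaming approach for very large tables.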

Answer 1 was posted on 2020-02-04 10:01:50.