How to set retry delay options for DynamoDB using Boto3 with Python?
I'm trying to avoid ProvisionedThroughputExceededException by setting a custom exponential backoff with the "base" option, as we can do in JavaScript according to this answer:
AWS.config.update({
  maxRetries: 15,
  retryDelayOptions: {base: 500}
});
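As explained in this documentation, the "base" parameter defines the number used to increase the delay, so "base: 500" would increase the delay like this: 500, 1000, 1500, ...
I'm trying to apply the same settings to Boto3 with Python 3.8, but the Boto3 documentation seems to allow setting only the maximum number of retry attempts, not the delay options: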
from botocore.client import Config

config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'standard'
    }
)
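The "mode" option accepts only three values: "legacy", "standard" and "adaptive". The documentation also mentions a _retry.json file where these options are described, and it looks like the "base" option is hardcoded in that file:
"dynamodb": {
  "__default__": {
    "max_attempts": 10,
    "delay": {
      "type": "exponential",
      "base": 0.05,
      "growth_factor": 2
    }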
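For reference, the schedule these values produce can be worked out by hand. The sketch below assumes the legacy-mode delay is computed as `base * growth_factor ** (attempt - 1)`, and that ten attempts means nine retries. The function name `legacy_delays` is made up for illustration and is not part of botocore.

```python
def legacy_delays(base=0.05, growth_factor=2, retries=9):
    """Return the delay in seconds before each retry, assuming
    delay = base * growth_factor ** (attempt - 1)."""
    return [base * growth_factor ** (attempt - 1) for attempt in range(1, retries + 1)]


if __name__ == "__main__":
    # Print the assumed schedule for the DynamoDB defaults shown above.
    for attempt, delay in enumerate(legacy_delays(), start=1):
        print(f"retry {attempt}: {delay:.2f}s")
```

Under these assumptions the waits stay short: the first retry waits 0.05 seconds, the ninth waits 12.8 seconds, and all nine add up to roughly 25.5 seconds.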
So, my question is: is there any way to set the exponential backoff for DynamoDB using Boto3 with Python?
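Unfortunately, this is hardcoded in Boto3, and there is no way to change the base retry delay through the Python SDK. It is not possible unless you write your own wrapper around the SDK calls.
It may be worth opening an issue in the public repository so that it can be picked up, or contributing the change directly.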
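A wrapper of the kind mentioned above could look roughly like the following. This is a minimal sketch, not an SDK feature: `retry_with_backoff` and its parameters are hypothetical names, and the only botocore behavior it relies on is that `ClientError` exposes the error code under `err.response['Error']['Code']`. The `sleep` argument is injectable so the delays can be checked without waiting.

```python
import functools
import time

THROTTLE_CODE = "ProvisionedThroughputExceededException"


def retry_with_backoff(base=0.5, growth_factor=2, max_retries=5, sleep=time.sleep):
    """Hypothetical decorator: retry a call on throttling errors, waiting
    base * growth_factor ** n seconds before retry number n + 1."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as err:
                    # ClientError carries the service error code in .response;
                    # any other exception is re-raised untouched.
                    response = getattr(err, "response", None)
                    code = (
                        response.get("Error", {}).get("Code")
                        if isinstance(response, dict)
                        else None
                    )
                    if code != THROTTLE_CODE or attempt == max_retries:
                        raise
                    sleep(base * growth_factor ** attempt)
        return wrapper
    return decorator
```

Decorating a small function that performs a single `table.query(**params)` call with `@retry_with_backoff(base=0.5)` would then give delays of 0.5, 1, 2, ... seconds. Setting `max_attempts` to 1 in the client `Config` would probably be needed as well, so that the SDK's own retries do not stack on top of the wrapper's.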
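As @Ermiya mentioned, I had to implement it myself. I didn't want to modify the boto3 defaults, so we have to catch the exception and continue paginating from where it left off.
This is how we can do it: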
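# Variant 1: using the Table resource
from time import sleep

import boto3
from boto3.dynamodb.conditions import Key
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
db_table = dynamodb.Table('<my_table_name>')

# The Table resource supplies TableName itself, so it is not repeated here.
query_params = {
    "IndexName": 'movies-index',
    "KeyConditionExpression": Key('movies').eq('thriller'),
}


def query_items(table, params, max_retries=6):
    """Yield every item of the query, backing off exponentially when throttled."""
    params = dict(params)  # work on a copy so the caller's dict is not mutated
    retries = 1
    while True:
        if retries >= max_retries:
            raise Exception(f'ProvisionedThroughputExceededException: max_retries was reached: {max_retries}')
        try:
            page = table.query(**params)
        except ClientError as err:
            if 'ProvisionedThroughputExceededException' not in err.response['Error']['Code']:
                raise
            sleep(2 ** retries)  # wait 2, 4, 8, ... seconds
            retries += 1
            continue
        retries = 1  # the call succeeded, so reset the backoff
        yield from page.get('Items', [])
        if page.get('LastEvaluatedKey'):
            # Start the next request where this page ended.
            params.update(ExclusiveStartKey=page['LastEvaluatedKey'])
            sleep(2)  # pause between pages to ease pressure on the table
        else:
            break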
# Variant 2: using the low-level client and its paginator
from time import sleep

import boto3
from boto3.dynamodb.conditions import ConditionExpressionBuilder, Key
from boto3.dynamodb.types import TypeDeserializer, TypeSerializer
from botocore.exceptions import ClientError

db_client = boto3.client('dynamodb', region_name='us-west-2')
td = TypeDeserializer()
ts = TypeSerializer()

query_params = {
    "TableName": '<my_dynamodb_table>',
    "IndexName": 'movies-index',
    "KeyConditionExpression": Key('movies').eq('thriller'),
}

# The client needs the condition as an expression string plus placeholders.
builder = ConditionExpressionBuilder()
condition = query_params["KeyConditionExpression"]
expr = builder.build_expression(condition, is_key_condition=True)
query_params.update({
    "KeyConditionExpression": expr.condition_expression,
    "ExpressionAttributeNames": expr.attribute_name_placeholders,
    "ExpressionAttributeValues": {k: ts.serialize(v) for k, v in expr.attribute_value_placeholders.items()},
})


def query_items(client, params, max_retries=6):
    """Yield every deserialized item, resuming after throttling errors."""
    params = dict(params)  # work on a copy so the caller's dict is not mutated
    paginator = client.get_paginator('query')
    total = 0
    retries = 1
    while True:
        if retries >= max_retries:
            raise Exception(f'ProvisionedThroughputExceededException: max_retries was reached: {max_retries}')
        try:
            # A fresh page iterator is created on every pass; after a failure
            # it resumes from ExclusiveStartKey if one has been recorded.
            for page in paginator.paginate(**params):
                retries = 1
                for db_item in page.get('Items', []):
                    yield {k: td.deserialize(v) for k, v in db_item.items()}
                total += page.get('Count', 0)
                print(f"{total=}", end='\r')
                # Remember where this page ended so a retry resumes after it.
                if page.get('LastEvaluatedKey'):
                    params['ExclusiveStartKey'] = page['LastEvaluatedKey']
        except ClientError as err:
            if 'ProvisionedThroughputExceededException' not in err.response['Error']['Code']:
                raise
            sleep(2 ** retries)
            retries += 1
        else:
            break
    print(f"{total=}")
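Both variants rest on the same idea: remember the cursor of the last page that succeeded and reuse it after a throttled call. The sketch below exercises that idea without calling AWS. `FakeTable`, `Throttled` and `query_all` are stand-ins invented for this example, the integer cursor is a simplification of DynamoDB's `LastEvaluatedKey`, and the `sleep` argument is injectable so the backoff can be observed without waiting.

```python
import time


class Throttled(Exception):
    """Stand-in for a ClientError carrying ProvisionedThroughputExceededException."""


class FakeTable:
    """Hypothetical in-memory table that serves pages of two items and
    throttles on the call numbers listed in fail_on."""

    def __init__(self, items, fail_on=()):
        self.items = list(items)
        self.fail_on = set(fail_on)
        self.calls = 0

    def query(self, ExclusiveStartKey=0):
        self.calls += 1
        if self.calls in self.fail_on:
            raise Throttled()
        end = ExclusiveStartKey + 2
        page = {"Items": self.items[ExclusiveStartKey:end]}
        if end < len(self.items):
            page["LastEvaluatedKey"] = end  # more pages remain
        return page


def query_all(table, max_retries=6, sleep=time.sleep):
    """Yield every item, resuming from the last good cursor after a throttle."""
    params = {}
    retries = 1
    while True:
        if retries >= max_retries:
            raise RuntimeError(f"max_retries was reached: {max_retries}")
        try:
            page = table.query(**params)
        except Throttled:
            sleep(2 ** retries)  # back off, then retry the same page
            retries += 1
            continue
        retries = 1  # success resets the backoff
        yield from page["Items"]
        if "LastEvaluatedKey" not in page:
            break
        params["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

For example, `list(query_all(FakeTable(range(5), fail_on={2, 3}), sleep=print))` throttles twice on the second page and still returns every item exactly once, because the cursor only advances after a page has been delivered.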