
Updating the complete DynamoDB table with a new index

I am using a DynamoDB table with around 15000 items. Each item has 4 indexes (attributes): "url", "date", "html_data" and "org". I added one more index named "base_url" to this table.

The index "url" contains links to websites, such as https://stackoverflow.com/questions/ask and https://www.goal.com/en/news/neville-calls-man-utd-bunch-of-whingebags-ronaldo-seen/blt71f0824c3e8eaf1e

I need to fill the new index "base_url" with the base URLs of the links stored in the index "url".

By base URL I mean, for example, https://stackoverflow.com/ and https://www.goal.com/

I can update each item individually, but how can this be done for all 15000 items? I found that there is a batch write item operation, but I didn't find anything like a batch update.

I am using Python and boto3 for this.

response = table.get_item(
    Key={
        'url': 'https://stackoverflow.com/questions/ask',
        "date" : "2021-12-28"
    }
)
item = response['Item']
print(item)

and

table.update_item(
    Key={
        'url': 'https://stackoverflow.com/questions/ask',
        "date" : "2021-12-28"
    },
    UpdateExpression='SET base_url = :val1',
    ExpressionAttributeValues={
        ':val1': "https://stackoverflow.com/"
    }
)

The base URL can be obtained using

import requests
url = "https://stackoverflow.com/questions/ask"
parsed = requests.urllib3.util.parse_url(url)
print(f"{parsed.scheme}://{parsed.host}/")  # scheme and host only, not .path

Use awswrangler to insert your data, either as a DataFrame using put_df or as a list of dicts using put_items.
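
DynamoDB's BatchWriteItem only supports put and delete requests, so there is no batch update as such. A backfill therefore either rewrites whole items in batches or calls update_item once per item. Below is a minimal sketch of both options with plain boto3 rather than awswrangler, assuming the key is "url" plus "date" as in the question. The helper names (base_url, scan_all, backfill_with_put, backfill_with_update) and the table name in the usage comment are made up for illustration.

```python
from urllib.parse import urlsplit


def base_url(url):
    """Return scheme and host with a trailing slash."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def scan_all(table):
    """Yield every item, following LastEvaluatedKey across scan pages."""
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        kwargs["ExclusiveStartKey"] = last_key


def backfill_with_put(table):
    """Rewrite each item with base_url added, using batched puts."""
    # Read everything first so the writes do not interleave with the scan.
    items = list(scan_all(table))
    with table.batch_writer() as batch:
        for item in items:
            # put_item replaces the whole item, so carry over all attributes.
            batch.put_item(Item={**item, "base_url": base_url(item["url"])})
    return len(items)


def backfill_with_update(table):
    """Set only base_url on each item, one update_item call per item."""
    count = 0
    for item in list(scan_all(table)):
        table.update_item(
            Key={"url": item["url"], "date": item["date"]},
            UpdateExpression="SET base_url = :val1",
            ExpressionAttributeValues={":val1": base_url(item["url"])},
        )
        count += 1
    return count


# Usage against a real table (the table name is hypothetical):
# import boto3
# table = boto3.resource("dynamodb").Table("my-table")
# backfill_with_put(table)
```

The put variant sends fewer requests but overwrites each item in full, so a change made to an item between the scan and the write would be lost. The update variant touches only base_url, and at about 15000 items one call per item is probably still manageable.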
