Reading XML files from an S3 bucket in Python - only the last file's content is stored

Question

I have 4 XML files in an S3 bucket directory. When I try to read the content of all the files, only the content of the last file (XML4) gets stored.

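import boto3
import xml.etree.ElementTree as ET

s3 = boto3.resource('s3')  # assumed setup: s3 is a boto3 resource
s3_bucket_name = 'test'
bucket = s3.Bucket(s3_bucket_name)
bucket_list = []
# Collect the keys of all XML objects under the 'auto' prefix
for file in bucket.objects.filter(Prefix='auto'):
    if ".xml" in file.key:
        bucket_list.append(file.key)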

In bucket_list, I can see that there are 4 files.

for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()

tree = ET.ElementTree(ET.fromstring(data))

What changes should be made to the code to read the content of all the XML files?

Answer 1

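In your loop, data is overwritten on every iteration, and the tree is built only once, after the loop has finished, so it only ever sees the last file. Since you have a list of files, you need a corresponding list of trees: parse each file inside the loop and collect the results.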

tree_list = []

for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()
    # Parse inside the loop so every file gets its own tree
    tree_list.append(ET.ElementTree(ET.fromstring(data)))

You can then use tree_list for whatever you need.
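If you also need to know which tree came from which file, one option is a dict keyed by the object key instead of a plain list. The following is only a sketch under some assumptions: parse_xml_objects and fetch are hypothetical names, not part of boto3, and fake_bucket is an in-memory stand-in for S3 so the example runs without AWS credentials.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable


def parse_xml_objects(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Dict[str, ET.ElementTree]:
    """Parse every key into its own tree, indexed by object key.

    `fetch` is a hypothetical callable that returns the raw XML bytes
    for a single key.
    """
    trees = {}
    for key in keys:
        data = fetch(key)  # raw XML bytes for this one object
        # Parsing happens inside the loop, so no file overwrites another
        trees[key] = ET.ElementTree(ET.fromstring(data))
    return trees


# Stand-in for S3 (assumption): maps object keys to XML payloads.
fake_bucket = {
    "auto/a.xml": b"<root><item>1</item></root>",
    "auto/b.xml": b"<root><item>2</item></root>",
}

trees = parse_xml_objects(fake_bucket, fake_bucket.__getitem__)
for key, tree in trees.items():
    print(key, tree.getroot().find("item").text)
```

With the setup from the question, fetch could be something like lambda key: s3.Object(s3_bucket_name, key).get()['Body'].read(), and keys would be bucket_list.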

Answer 1: accepted, 2022-05-05 02:33:07