
Reading xml files from S3 bucket in Python - Only the content of the last file is getting stored

I have 4 XML files inside an S3 bucket directory. When I try to read the content of all the files, only the content of the last file (XML4) ends up stored.

import boto3
import xml.etree.ElementTree as ET

s3 = boto3.resource('s3')

s3_bucket_name = 'test'
bucket = s3.Bucket(s3_bucket_name)
bucket_list = []
for file in bucket.objects.filter(Prefix='auto'):
    file_name = file.key
    if file_name.find(".xml") != -1:
        bucket_list.append(file.key)

In the 'bucket_list', I can see that there are 4 files

for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()

tree = ET.ElementTree(ET.fromstring(data))

What changes should be made in the code to read the content of all the XML files?

In your loop, `data` is reassigned on every iteration and the tree is built only once, after the loop exits, so only the last file's content survives. Since you have a list of files, you need a corresponding list of trees, built inside the loop.

tree_list = []

for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()
    tree_list.append(ET.ElementTree(ET.fromstring(data)))

Then you can start using tree_list for whatever purpose.
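To illustrate the pattern without needing S3 access, here is a minimal sketch where the byte strings stand in for the payloads that `obj.get()['Body'].read()` would return (the XML content shown is hypothetical):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the file bodies read from S3.
xml_payloads = [
    b"<items><item>a</item></items>",
    b"<items><item>b</item><item>c</item></items>",
]

# Build one tree per file, as in the answer above.
tree_list = []
for data in xml_payloads:
    tree_list.append(ET.ElementTree(ET.fromstring(data)))

# Every tree keeps its own document, so no file's content is lost.
for tree in tree_list:
    root = tree.getroot()
    print(root.tag, [item.text for item in root.iter("item")])
```

Each `ElementTree` is an independent object, so appending them preserves every file's content instead of overwriting a single `data` variable.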
