How to read content of a file from a folder in S3 bucket using python?

I am trying to read a file from a folder structure in an S3 bucket using Python with boto3.

I want to return a boolean value indicating whether the report is present in the S3 bucket or not.

Code

import boto3
import json

S3_BUCKET_NAME = ''
KEY = '@@@/%%%.json'


def notification():
    report = get_report()
    print(report)


def get_report():
    s3_client = boto3.client('s3')
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Prefix=PREFIX, Key=KEY)
    data = response['Body'].read()
    report = json.loads(data)
    return report

How can I check if the report is present and return a boolean value?

Two answers to your questions:

  1. How to read content of a file from a folder in S3 bucket using Python?
  2. How to check if the report is present and return a boolean value?

Get S3-object

S3-object as bytes

    s3_client = boto3.client('s3')
    # get_object has no Prefix parameter; the "folder" is simply part of KEY
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=KEY)
    data = response['Body'].read()  # returns bytes in Python 3

NOTE: In Python 3, read() returns bytes. json.loads accepts bytes directly since Python 3.6, but if you want to get a string out of it, you must use .decode(charset) on it:

pythonObject = json.loads(response['Body'].read().decode('utf-8'))

S3-object as string

See "Open S3 object as a string with Boto3".
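The two reading styles above (bytes decoded to a string, then parsed as JSON) can be sketched as a pair of small helpers. This is only an illustration: the names read_object_text and read_object_json are hypothetical, the client is passed in by the caller, and UTF-8 is an assumed default encoding.

```python
import json


def read_object_text(s3_client, bucket, key, encoding='utf-8'):
    """Return the object's content decoded to str (encoding is an assumption)."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response['Body'].read().decode(encoding)


def read_object_json(s3_client, bucket, key):
    """Return the object's content parsed as JSON."""
    return json.loads(read_object_text(s3_client, bucket, key))
```

With a real client this would be called as read_object_json(boto3.client('s3'), S3_BUCKET_NAME, KEY). Passing the client in also makes the helpers easy to exercise with a stub.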

Check if S3-object is present

For example, to check whether the report is available as an S3.Object, load its metadata and treat a 404 error as "not found":

import json

import boto3
from botocore.exceptions import ClientError

S3_BUCKET_NAME = ''
KEY = 'fee_summary/fee_summary_report.json'


def send_fee_summary_notification():
    fee_summary_report = get_fee_summary_report()
    print(fee_summary_report)


def get_fee_summary_report():
    s3_client = boto3.client('s3')
    # get_object takes only Bucket and Key; the folder is part of KEY
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=KEY)
    data = response['Body'].read()
    fee_summary_report = json.loads(data)
    return fee_summary_report


def has_fee_summary_report():
    s3 = boto3.resource('s3')  # Object() belongs to the resource API, not the client
    try:
        s3.Object(S3_BUCKET_NAME, KEY).load()  # HEAD request: metadata only
    except ClientError as error:
        if error.response['Error']['Code'] == '404':
            return False  # report not found
        raise  # e.g. access denied, which does not mean "missing"
    return True
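If you prefer to stay with the low-level client, as the question's code does, a HEAD request is one way to test for the report without downloading it. The following is a minimal sketch under assumptions: object_exists is a hypothetical helper name, the client is supplied by the caller, and a missing key is assumed to surface as a ClientError whose error code is '404'.

```python
def object_exists(s3_client, bucket, key):
    """Return True if the key exists, False if S3 answers with 404."""
    try:
        s3_client.head_object(Bucket=bucket, Key=key)  # metadata only, no download
    except s3_client.exceptions.ClientError as error:
        if error.response['Error']['Code'] == '404':
            return False
        raise  # other errors (e.g. 403) are not evidence of absence
    return True
```

Usage would look like object_exists(boto3.client('s3'), S3_BUCKET_NAME, KEY). Re-raising anything other than 404 is a deliberate choice here, so that a permissions problem is not silently reported as a missing report.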

Use paging to literally scan (for debugging)

You can also iterate over all objects in your bucket via paging and test whether the desired report (with the specified KEY) exists:

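def has_report_by_scan():
    s3 = boto3.resource('s3')  # resource API, needed for Bucket().objects
    for page in s3.Bucket(S3_BUCKET_NAME).objects.pages():
        for obj in page:
            print(obj.key)  # debug print
            if obj.key == KEY:
                return True
    return False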
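The same scan can be sketched with the client API and a paginator, optionally narrowed to one "folder" through a key prefix. The helper name key_in_listing and its prefix argument are assumptions made for illustration; the client is passed in by the caller.

```python
def key_in_listing(s3_client, bucket, key, prefix=''):
    """Scan the bucket listing page by page and report whether key appears."""
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page holds no objects
        for entry in page.get('Contents', []):
            if entry['Key'] == key:
                return True
    return False
```

For example, key_in_listing(boto3.client('s3'), S3_BUCKET_NAME, KEY, prefix='fee_summary/') would list only keys under that prefix. Like the loop above, this is a debugging aid rather than an efficient existence check.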

See the example below, which I have created for you.

S3 bucket and JSON file

import json
import boto3


def lambda_handler(event, context):

    S3_BUCKET_NAME = 'feesummarybucketmmmm'
    KEY = 'fee_summary_report.json'
    s3_client = boto3.client('s3')
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=KEY)
    data = response['Body'].read()  # raw bytes of the JSON file
    print(response)
    print(data)
    fee_summary_report = json.loads(data)

    return {
        'statusCode': 200,
        'body': fee_summary_report
    }

Output

https://github.com/mmakadiya/public_files/blob/main/read_s3_file.py
