
How to open a stream rather than loading the whole file into memory inside a Python Lambda

Hi, I am new to Lambda and Python. I have a use case where I need to read the content of a large file, say greater than 1 GB, and just log its content line by line.

I have written a Lambda function as below:

import boto3

def lambda_handler(event, context):
    """Read file from s3 on trigger."""
    s3 = boto3.resource('s3')
    file_obj = event['Records'][0]

    bucketname = str(file_obj['s3']['bucket']['name'])
    filename = str(file_obj['s3']['object']['key'])

    iterator = s3.Object(bucketname, filename).get()['Body'].iter_lines()
    for line in iterator:
        print(line)

    return 'Lambda executed successfully.'

But it is not printing anything in the logs.

I think s3.Object(bucketname, filename).get()['Body'] is trying to load the whole file into memory. Is my understanding correct? I ask because this works fine for small files.

If yes, how can I open a file as a stream without loading it fully into memory?

This works for me:

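import boto3

# BUCKET and key are placeholders for the bucket name and object key.
s3 = boto3.resource('s3')
obj = s3.Object(BUCKET, key)
# _raw_stream is a private attribute of the streaming body, so it may change between botocore versions.
for line in obj.get()['Body']._raw_stream:
    print(line)  # do something with line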
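
For comparison, here is a minimal sketch that sticks to the public interface. It relies on the fact that the 'Body' returned by get() is a botocore StreamingBody, which reads from the network on demand, so calling read() with a size pulls only that many bytes at a time. The helper name iter_lines, the 1 MiB chunk size and the return message are illustrative choices, not part of the original question or answer. The handler assumes the function is triggered by an S3 event, as in the question.

```python
import io
from urllib.parse import unquote_plus


def iter_lines(stream, chunk_size=1024 * 1024):
    """Yield lines (without line endings) from a binary file-like object.

    Only chunk_size bytes plus one partial line are held in memory at a time.
    """
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines = (pending + chunk).split(b"\n")
        pending = lines.pop()  # last piece is an incomplete line (or b"")
        for line in lines:
            yield line.rstrip(b"\r")  # drop the carriage return of CRLF endings
    if pending:
        yield pending.rstrip(b"\r")


def lambda_handler(event, context):
    # Imported here so the helper above can be tried without AWS access.
    import boto3

    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded.
    key = unquote_plus(record["s3"]["object"]["key"])

    body = boto3.resource("s3").Object(bucket, key).get()["Body"]
    count = 0
    for line in iter_lines(body):
        print(line.decode("utf-8", errors="replace"))
        count += 1
    return f"Logged {count} lines."


if __name__ == "__main__":
    # Local check with an in-memory stream and a deliberately tiny chunk size.
    sample = io.BytesIO(b"first\nsecond\r\nthird")
    print(list(iter_lines(sample, chunk_size=4)))
```

StreamingBody also offers its own iter_lines() and iter_chunks() methods, which the question's code already uses. The helper above just makes the chunked reading explicit and lets the logic be tested against any file-like object.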
