Convert a txt file to csv using AWS Lambda and S3

Question

I have this code that must take a TXT file from a source bucket and convert it to CSV in a destination bucket. The response says that the variable or object (z) that should contain the CSV file cannot be opened because it is null. It seems that the code I use is not transforming the object correctly. Please, I need help to correct it.

The code is the following:

import pandas as pd
import json
import boto3
from io import BytesIO

def lambda_handler(evenBytesIOt, context):

    s3_resource = boto3.resource('s3')
    source_bucket = 'testsigma2'
    target_bucket = 'testsigma3'

    my_bucket = s3_resource.Bucket(source_bucket)

    for file in my_bucket.objects.all():
        if(str(file.key).endswith('.txt')):

            zip_obj = s3_resource.Object(bucket_name=source_bucket, key=file.key)

            buffer = BytesIO(zip_obj.get()['Body'].read())

            dataframe1 = pd.read_csv(buffer)
            z = dataframe1.to_csv(buffer, index=None)

            response = s3_resource.meta.client.upload_fileobj(
                z.open(filename),
                Bucket = target_bucket,
                key = f'{filename}'
            )

        else:
            print(file.key + 'is not a zip file.')

Response
{
  "errorMessage": "'NoneType' object has no attribute 'open'",
  "errorType": "AttributeError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 25, in lambda_handler\n    z.open(filename),\n"
  ]
}

Answer 1

It looks like you are trying to open the z object after calling the to_csv method, but to_csv does not return a file object. When you pass it a buffer, it writes the CSV data directly to that buffer and returns None, which is why z.open fails. You can fix this by calling the seek method on the buffer after to_csv, to reset the file pointer to the beginning of the file, and then uploading the buffer itself:

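dataframe1 = pd.read_csv(buffer)

# to_csv returns None when given a buffer, so there is nothing to assign.
# Write into a fresh BytesIO: reusing the input buffer can leave the
# original text in front of the CSV data.
out_buffer = BytesIO()
dataframe1.to_csv(out_buffer, index=False)

# Reset the position of the file pointer to the beginning of the file
out_buffer.seek(0)

# filename is not defined in the question, so build the key from file.key
s3_resource.meta.client.upload_fileobj(
    out_buffer,
    Bucket=target_bucket,
    Key=file.key[:-len('.txt')] + '.csv',
)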

You can then use the buffer object as the file object to be uploaded to S3.
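To see the write-then-rewind pattern in isolation, here is a minimal sketch that runs without pandas or S3. It uses the standard library csv module as a stand-in for read_csv/to_csv; the helper name txt_to_csv_buffer, the delimiter parameter, and the UTF-8 encoding are assumptions for illustration and do not come from the question.

```python
import csv
from io import BytesIO, StringIO


def txt_to_csv_buffer(raw: bytes, delimiter: str = ",") -> BytesIO:
    """Parse delimited text bytes and return a rewound BytesIO holding CSV."""
    # Decode and parse the source text (UTF-8 is an assumption).
    text_in = StringIO(raw.decode("utf-8"), newline="")
    rows = list(csv.reader(text_in, delimiter=delimiter))

    # Serialize the rows as comma-separated text.
    text_out = StringIO(newline="")
    csv.writer(text_out, lineterminator="\n").writerows(rows)

    # Write into a fresh binary buffer; after write() the position is at the end.
    out = BytesIO()
    out.write(text_out.getvalue().encode("utf-8"))

    # Rewind so that a consumer reading from the buffer starts at byte 0.
    out.seek(0)
    return out
```

The returned buffer is a file-like object opened for reading at position 0, which is the kind of object upload_fileobj expects as its first argument.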

Answer 2

The to_csv method on a pandas dataframe doesn't return the buffer; when you pass it one, it writes to that buffer and returns None. So z is None, and calling z.open(filename) raises the error you see. Try passing buffer itself to upload_fileobj. Additionally, I don't think filename is defined anywhere, so be aware of that.
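Since filename is never assigned, one option is to derive the destination key from the source key. A minimal sketch, assuming the target object should keep the same name with a .csv extension; csv_key_for is a hypothetical helper name, not something from the question or from boto3.

```python
def csv_key_for(source_key: str) -> str:
    """Return the S3 key with a trailing .txt replaced by .csv."""
    suffix = ".txt"
    if not source_key.endswith(suffix):
        raise ValueError(f"expected a key ending in {suffix}: {source_key!r}")
    # Slice off the old extension and append the new one; any prefix
    # ("folder" part of the key) is left untouched.
    return source_key[: -len(suffix)] + ".csv"
```

Inside the loop this could be called as csv_key_for(file.key) and the result passed as the Key argument of upload_fileobj.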

For documentation on the specific resources you're using, check these out:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
