Unable to get the content of an Amazon S3 file and edit it with Python and boto3

Question

I am trying to get the data from a file in Amazon S3, manipulate the content, and then save it to another bucket.

import json
import urllib.parse
import boto3

print('Loading function')


s3 = boto3.client('s3')

def lambda_handler(event, context):
    
    bucket = event['Records'][0]['s3']['bucket']['name']
    file_name = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    s3_object = s3.get_object(Bucket=bucket, Key=file_name)
    file_content = s3_object['Body'].read()
    
    initial_data = json.load(file_content)
    # some file manipulation comes here
    
    
    data=json.dumps(initial_data, ensure_ascii=False)
    s3.put_object(Bucket="new bucket name", Body=data, Key=file_name)

The error message leads me to think that this has something to do with encoding:

Response:

{
  "errorMessage": "'bytes' object has no attribute 'read'",
  "errorType": "AttributeError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 25, in lambda_handler\n    data_initlal = json.load(file_content)\n",
    "  File \"/var/lang/lib/python3.8/json/__init__.py\", line 293, in load\n    return loads(fp.read(),\n"
  ]
}

Additionally, if I remove the following line from my code:

initial_data = json.load(file_content)

I get this error:

Response:
{
  "errorMessage": "Object of type bytes is not JSON serializable",
  "errorType": "TypeError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 29, in lambda_handler\n    data=json.dumps(file_content, ensure_ascii=False)\n",
    "  File \"/var/lang/lib/python3.8/json/__init__.py\", line 234, in dumps\n    return cls(\n",
    "  File \"/var/lang/lib/python3.8/json/encoder.py\", line 199, in encode\n    chunks = self.iterencode(o, _one_shot=True)\n",
    "  File \"/var/lang/lib/python3.8/json/encoder.py\", line 257, in iterencode\n    return _iterencode(o, 0)\n",
    "  File \"/var/lang/lib/python3.8/json/encoder.py\", line 179, in default\n    raise TypeError(f'Object of type {o.__class__.__name__} '\n"
  ]
}

The file I am trying to edit is in JSON format, and the output should also be JSON.

Answer 1

This line:

initial_data = json.load(file_content)

Should be:

initial_data = json.loads(file_content)

Alternatively, replace these two lines:

file_content = s3_object['Body'].read()
    
initial_data = json.load(file_content)

with:

initial_data = json.load(s3_object['Body'])

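The difference is json.load() versus json.loads(): json.load() reads from a file-like object that has a .read() method, while json.loads() parses a string (or bytes) that is already in memory.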
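To make the distinction concrete, here is a minimal sketch that runs without AWS access. It assumes `io.BytesIO` is an acceptable stand-in for `s3_object['Body']` (both expose `.read()`), and the sample payload is invented for illustration.

```python
import io
import json

# Stand-in for the bytes returned by s3_object['Body'].read() (sample data is made up)
raw = '{"name": "café", "count": 1}'.encode("utf-8")

# json.loads parses data already in memory; since Python 3.6 it accepts bytes as well as str
from_bytes = json.loads(raw)

# json.load expects a file-like object with a .read() method
body = io.BytesIO(raw)  # stand-in for s3_object['Body']
from_stream = json.load(body)

# Passing bytes to json.load fails, because bytes has no .read() method
try:
    json.load(raw)
except AttributeError as exc:
    error_text = str(exc)
```

Both calls should yield the same dictionary, and `error_text` should correspond to the AttributeError reported in the question.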

Answer 2

The file_content that you are trying to read is a UTF-8 encoded bytes object. Decode it to a string before parsing it as JSON.

Try this:

initial_data = json.loads(file_content.decode('utf-8'))
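As a rough sketch of the full round trip (decode, parse, edit, serialize), the helper below may be useful. The function name `transform_json_bytes` and the `processed` key are hypothetical, and the sketch assumes the file holds a JSON object at the top level. It returns UTF-8 bytes, which could then be passed as the Body argument of put_object.

```python
import json


def transform_json_bytes(file_content: bytes) -> bytes:
    """Hypothetical helper: parse UTF-8 JSON bytes, edit them, and re-serialize."""
    # Decode explicitly, as suggested in this answer, then parse
    initial_data = json.loads(file_content.decode("utf-8"))

    # Placeholder for "some file manipulation"; assumes a top-level JSON object
    initial_data["processed"] = True

    # ensure_ascii=False keeps non-ASCII characters readable in the output
    data = json.dumps(initial_data, ensure_ascii=False)

    # Encode back to bytes so the stored object is explicitly UTF-8
    return data.encode("utf-8")


# Sample input is invented for illustration
result = transform_json_bytes('{"city": "北京"}'.encode("utf-8"))
```

Keeping the transformation in a pure function like this makes it possible to test the parsing logic locally, separate from the S3 calls in lambda_handler.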
