
Using AWS Lambda to convert JSON files stored in S3 Bucket to CSV

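I am new to Lambda and Python, and I have run into a problem with my Lambda function. I have several JSON files stored in an S3 bucket, and I want to convert all of them to CSV format. I am working from the Lambda function published in this tutorial: https://sysadmins.co.za/convert-csv-to-json-files-with-aws-lambda-and-s3-events/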

Lambda function:

import json
import csv
import boto3
import os
import datetime as dt

s3 = boto3.client('s3')

def lambda_handler(event, context):
    
    datestamp = dt.datetime.now().strftime("%Y/%m/%d")
    timestamp = dt.datetime.now().strftime("%s")
    
    filename_json = "/tmp/file_{ts}.json".format(ts=timestamp)
    filename_csv = "/tmp/file_{ts}.csv".format(ts=timestamp)
    keyname_s3 = "uploads/output/{ds}/{ts}.json".format(ds=datestamp, ts=timestamp)
    
    json_data = []

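    # Note: only the bucket and key of the last record are kept, because the
    # download below runs after this loop ends rather than inside it.
    for record in event['Records']:
        bucket_name = record['s3']['bucket']['name']
        key_name = record['s3']['object']['key']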
        
    s3_object = s3.get_object(Bucket=bucket_name, Key=key_name)
    data = s3_object['Body'].read()
    contents = data.decode('utf-8')
    
    with open(filename_csv, 'a') as csv_data:
        csv_data.write(contents)
    
    with open(filename_csv) as csv_data:
        csv_reader = csv.DictReader(csv_data)
        for csv_row in csv_reader:
            json_data.append(csv_row)
            
    with open(filename_json, 'w') as json_file:
        json_file.write(json.dumps(json_data))
    
    with open(filename_json, 'r') as json_file_contents:
        response = s3.put_object(Bucket=bucket_name, Key=keyname_s3, Body=json_file_contents.read())

    os.remove(filename_csv)
    os.remove(filename_json)

    return {
        'statusCode': 200,
        'body': json.dumps('CSV converted to JSON and available at: {bucket}/{key}'.format(bucket=bucket_name,key=keyname_s3))
    }

I want to achieve a similar result with Lambda, but going from JSON to CSV. How can I do that?

I suggest taking a look at the convtools library:

from convtools import conversion as c
from convtools.contrib.tables import Table
import json

# input.json
"""
{
    "records": [
        {"a": 1, "b": "c"},
        {"a": 2, "b": "d"},
        {"a": 3, "b": "e"}
    ]
}
"""
with open("input.json") as f:
    input_data = json.load(f)


Table.from_rows(input_data["records"]).into_csv("output.csv")

# output.csv
"""
a,b
1,c
2,d
3,e
"""
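
If bundling an extra dependency into the Lambda deployment package is not wanted, the same conversion can probably be done with the standard library alone. The following is only a rough sketch, not a tested deployment. It assumes each JSON file is either a list of flat objects or an object wrapping that list under a "records" key (as in the input.json above), that the function is triggered by S3 object-created events, and that the output should go under an "uploads/output/" prefix. The names json_to_csv, records_key and out_key are made up for illustration.

```python
import csv
import io
import json
from urllib.parse import unquote_plus


def json_to_csv(json_text, records_key="records"):
    """Convert a JSON document holding a list of flat objects into CSV text."""
    data = json.loads(json_text)
    # Accept either a bare list or an object wrapping the list under records_key.
    rows = data[records_key] if isinstance(data, dict) else data

    # Collect column names in first-seen order so rows with differing keys still fit.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


def lambda_handler(event, context):
    # boto3 is imported here so json_to_csv can be tried locally without it.
    import boto3

    s3 = boto3.client("s3")
    written = []
    for record in event["Records"]:
        bucket_name = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key_name = unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket_name, Key=key_name)["Body"].read()
        csv_text = json_to_csv(body.decode("utf-8"))

        # Hypothetical output location: same base name, .csv extension.
        base_name = key_name.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        out_key = "uploads/output/" + base_name + ".csv"
        s3.put_object(Bucket=bucket_name, Key=out_key, Body=csv_text.encode("utf-8"))
        written.append(out_key)

    return {"statusCode": 200, "body": json.dumps({"written": written})}
```

Calling json_to_csv on the input.json shown above should give the same four lines as output.csv. Nested objects are not flattened here; they would be written as their Python string representation, so deeper structures need their own handling. It is also worth making sure the S3 trigger does not fire on the output prefix, otherwise the function would be invoked again on its own CSV files.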
