
Using AWS Lambda to convert JSON files stored in S3 Bucket to CSV

I'm new to Lambda and Python and have run into an issue with my Lambda function. I have several JSON files stored in an S3 bucket, and I want to convert all of them to CSV format. I have been referring to the Lambda function posted in this tutorial, which converts in the opposite direction (CSV to JSON): https://sysadmins.co.za/convert-csv-to-json-files-with-aws-lambda-and-s3-events/

Lambda function:

import json
import csv
import boto3
import os
import datetime as dt

s3 = boto3.client('s3')

def lambda_handler(event, context):
    
    datestamp = dt.datetime.now().strftime("%Y/%m/%d")
    timestamp = dt.datetime.now().strftime("%s")
    
    filename_json = "/tmp/file_{ts}.json".format(ts=timestamp)
    filename_csv = "/tmp/file_{ts}.csv".format(ts=timestamp)
    keyname_s3 = "uploads/output/{ds}/{ts}.json".format(ds=datestamp, ts=timestamp)
    
    json_data = []

    for record in event['Records']:
        bucket_name = record['s3']['bucket']['name']
        key_name = record['s3']['object']['key']
        
    s3_object = s3.get_object(Bucket=bucket_name, Key=key_name)
    data = s3_object['Body'].read()
    contents = data.decode('utf-8')
    
    with open(filename_csv, 'a') as csv_data:
        csv_data.write(contents)
    
    with open(filename_csv) as csv_data:
        csv_reader = csv.DictReader(csv_data)
        for csv_row in csv_reader:
            json_data.append(csv_row)
            
    with open(filename_json, 'w') as json_file:
        json_file.write(json.dumps(json_data))
    
    with open(filename_json, 'r') as json_file_contents:
        response = s3.put_object(Bucket=bucket_name, Key=keyname_s3, Body=json_file_contents.read())

    os.remove(filename_csv)
    os.remove(filename_json)

    return {
        'statusCode': 200,
        'body': json.dumps('CSV converted to JSON and available at: {bucket}/{key}'.format(bucket=bucket_name,key=keyname_s3))
    }

I want to achieve a similar outcome using Lambda, but from JSON to CSV instead. How may I go about doing this?

I'd suggest having a look at the convtools library:

from convtools import conversion as c
from convtools.contrib.tables import Table
import json

# input.json
"""
{
    "records": [
        {"a": 1, "b": "c"},
        {"a": 2, "b": "d"},
        {"a": 3, "b": "e"}
    ]
}
"""
with open("input.json") as f:
    input_data = json.load(f)


Table.from_rows(input_data["records"]).into_csv("output.csv")

# output.csv
"""
a,b
1,c
2,d
3,e
"""
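
If adding a third-party package to the Lambda deployment is not an option, the same conversion can probably be done with the standard library alone. The following is only a minimal sketch, not a tested deployment: it assumes each JSON file holds either a top-level list of flat objects or an object with a "records" list (as in the input.json example above), and the helper name json_to_csv_text and the "uploads/output/" output prefix are made up for illustration.

```python
import csv
import io
import json
import urllib.parse


def json_to_csv_text(json_text):
    """Convert JSON text holding a list of flat objects into CSV text."""
    data = json.loads(json_text)
    # Assumption: rows are either the top-level list or live under "records",
    # mirroring the input.json example above.
    rows = data["records"] if isinstance(data, dict) else data

    # Collect column names in first-seen order so rows with differing keys work.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    s3 = boto3.client("s3")
    converted = []
    for record in event["Records"]:
        bucket_name = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications arrive URL-encoded.
        key_name = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket_name, Key=key_name)["Body"].read()
        csv_text = json_to_csv_text(body.decode("utf-8"))

        # Hypothetical output location: same base name, .csv suffix.
        base_name = key_name.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        out_key = "uploads/output/" + base_name + ".csv"
        s3.put_object(Bucket=bucket_name, Key=out_key, Body=csv_text.encode("utf-8"))
        converted.append(out_key)

    return {"statusCode": 200, "body": json.dumps({"converted": converted})}
```

Building the CSV in memory avoids the temporary files under /tmp used in the tutorial function. If the function writes back to the same bucket that triggers it, the S3 event filter should be limited to the .json suffix (or to a separate input prefix) so that the generated .csv objects do not invoke the function again. Nested JSON objects would need flattening first; this sketch does not attempt that.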
