I am stuck on a task where I need to combine multiple JSON files into a single JSON file and compress it into an S3 folder.
I got something working, but the JSON contents end up merged as dictionary keys. I loaded the file contents into a dictionary because when I tried loading them as a list it threw me a JSONDecodeError: "Extra data: line 1 column 432 (char 431)".
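As a minimal illustration of that error (not my actual data): `json.loads` parses exactly one top-level JSON value, and anything following it triggers "Extra data":

```python
import json

# json.loads expects exactly one top-level JSON value; any trailing
# content raises a JSONDecodeError with the message "Extra data"
try:
    json.loads('{"abc": "bcd"}{"hqeddeqf": "5765354"}')
except json.JSONDecodeError as exc:
    print(exc.msg)  # Extra data
```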
My files look like below: file 1 (there is no .json extension)
{"abc":"bcd","12354":"31354321"}
file 2
{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"}
my code:
import json
import boto3

s3_client = boto3.client('s3')
bucket_name = '<my bucket>'

def lambda_handler(event, context):
    key = '<Bucket key>'
    jsonfilesname = ['<name of the json files which stored in list>']
    result = []
    json_data = {}
    for f in range(len(jsonfilesname)):
        s3_client.download_file(bucket_name, key + jsonfilesname[f], '/tmp/' + key + jsonfilesname[f])
        infile = open('/tmp/' + jsonfilesname[f]).read()
        json_data[infile] = result
    with open('/tmp/merged_file', 'w') as outfile:
        json.dump(json_data, outfile)
The outfile produced by the above code looks like:
{
"{"abc":"bcd","12354":"31354321"}: []",
"{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"} :[]"
}
my expectation is:
{"abc":"bcd","12354":"31354321"},{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"}
Could someone please advise what needs to be done to get my expected output?
First of all, file 2 is not a valid JSON file; correctly it should be:
{
"abc": "bcd",
"12354": "31354321",
"hqeddeqf": "5765354"
}
Also, your expected output is not valid JSON; what you would expect after merging two JSON files is an array of JSON objects:
[
{
"abc": "bcd",
"12354": "31354321"
},
{
"abc": "bcd",
"12354": "31354321",
"hqeddeqf": "5765354"
}
]
Knowing this, we can write a Lambda to merge JSON files:
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = '...'
    jsonfilesname = ['file1.json', 'file2.json']
    result = []
    for key in jsonfilesname:
        data = s3.get_object(Bucket=bucket, Key=key)
        content = json.loads(data['Body'].read().decode("utf-8"))
        result.append(content)
    # Do something with the merged content
    print(json.dumps(result))
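Since the question also asks about compressing the merged result into S3, here is a sketch of that step: gzip the serialized array, then write it back with `put_object`. The bucket and key names are placeholders, and the upload call is left commented so the snippet runs anywhere:

```python
import gzip
import json

def compress_merged(objects):
    """Serialize a list of dicts to JSON and gzip the bytes for upload."""
    return gzip.compress(json.dumps(objects).encode("utf-8"))

merged = [
    {"abc": "bcd", "12354": "31354321"},
    {"abc": "bcd", "12354": "31354321", "hqeddeqf": "5765354"},
]
body = compress_merged(merged)
# Then write it back to S3, e.g. (placeholder bucket/key):
# s3.put_object(Bucket=bucket, Key='merged/merged_file.json.gz', Body=body)
```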
If you are using AWS, I would recommend looking at S3DistCp for JSON file merging, as it provides a fault-tolerant, distributed way that can keep up with large files as well by leveraging MapReduce. However, it does not seem to support in-place merging.
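As a rough sketch of that approach (bucket, prefixes, and the regex are placeholders): an `s3-dist-cp` step run on an EMR cluster can concatenate small files under a prefix into larger, compressed outputs via `--groupBy`:

```shell
# EMR step: concatenate files whose --groupBy capture group matches
# into combined outputs of roughly --targetSize MB, gzip-compressed
s3-dist-cp \
  --src s3://my-bucket/json-input/ \
  --dest s3://my-bucket/json-merged/ \
  --groupBy '.*(json-input).*' \
  --targetSize 128 \
  --outputCodec gz
```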