I have JSON data coming to S3 in the format:
{\n \"data\": {\n \"event_type\": \"message.received\",\n \"id\": \"819\",\n \"occurred_at\": \"2020-10\",\n \"payload\": {\n \"cc\": [],\n \"completed_at\": null,\n \"cost\": null,\n \"direction\": \"inbound\",\n \"encoding\": \"GSM-7\",\n \"errors\": [],\n \"from\": {\n \"carrier\": \"Verizon\",\n \"line_type\": \"Wireless\",\n \"phone_number\": \"+111111111\"\n },\n \"id\": \"e8e0d1e3-dce3-\",\n \"media\": [],\n \"messaging_profile_id\": \"400176\",\n \"organization_id\": \"717d556f-ba4f-\",\n \"parts\": 1,\n \"received_at\": \"2020-1\",\n \"record_type\": \"message\",\n \"sent_at\": null,\n \"tags\": [],\n \"text\": \"Hi \",\n \"to\": [\n {\n \"carrier\": \"carr\",\n \"line_type\": \"Wireless\",\n \"phone_number\": \"+111111111\",\n}\n}"
I want it to be converted like this:
{
"data": {
"event_type": "message.received",
"id": "76a60230",
"occurred_at": "2020-12-1",
"payload": {
"cc": [],
"completed_at": null,
"cost": null,
"direction": "inbound",
"encoding": "GSM-7",
"errors": [],
"from": {
"carrier": "Verizon",
"line_type": "Wireless",
"phone_number": "+1111111111"
},
"id": "06c9c765",
"media": [],
"messaging_profile_id": "40017",
"organization_id": "717d5",
"parts": 1,
"received_at": "2020-1",
"record_type": "message",
"sent_at": null,
"tags": [],
"text": "Hi",
"to": [
{
"carrier": "abc",
"line_type": "Wireless",
"phone_number": "+1111111111",
"status": "delivered"
}
],
"type": "SMS",
"valid_until": null,
"failover_url": null,
"url": "https://639hpj"
},
"record_type": "event"
},
"meta": {
"attempt": 1,
"delivered_to": "https://639hpj"
}
}
The first JSON sample above arrived as one long line containing \n sequences rather than as a structured document. I did not keep the actual data, but it was in a similar format (and valid). I would like to run a Lambda function that makes the JSON data free of \n and whitespace.
The two samples above are not the same data, but I will be receiving the first kind and would like to convert it to the second kind, which is free of whitespace and \n.
Did you realize that those spaces and \n sequences are exactly the indentation and newlines that print displays?
Let us call t your first JSON (I fixed it by adding the missing brackets at the end):
t = '''{\n "data": {\n "event_type": "message.received",\n "id": "819",\n "occurred_at": "2020-10",\n "payload": {\n "cc": [],\n "completed_at": null,\n "cost": null,\n "direction": "inbound",\n "encoding": "GSM-7",\n "errors": [],\n "from": {\n "carrier": "Verizon",\n "line_type": "Wireless",\n "phone_number": "+111111111"\n },\n "id": "e8e0d1e3-dce3-",\n "media": [],\n "messaging_profile_id": "400176",\n "organization_id": "717d556f-ba4f-",\n "parts": 1,\n "received_at": "2020-1",\n "record_type": "message",\n "sent_at": null,\n "tags": [],\n "text": "Hi ",\n "to": [\n {\n "carrier": "carr",\n "line_type": "Wireless",\n "phone_number": "+111111111"\n}\n]\n}}}'''
It prints as:
>>> print(t)
{
"data": {
"event_type": "message.received",
"id": "819",
"occurred_at": "2020-10",
"payload": {
"cc": [],
"completed_at": null,
"cost": null,
"direction": "inbound",
"encoding": "GSM-7",
"errors": [],
...
To obtain the expected representation you should:
load it into a Python object: js = json.loads(t)
dump it back into a string with 2 as indentation: t2 = json.dumps(js, indent=2)
t2 actually looks like '{\n  "data": {\n    "event_type": "message.received",\n    "id": "819",\n    "occurred_at": "2020-10",\n    "payload": {\n      "cc": [],\n      "completed_at": null,\n      "cost": null,\n...
print it:
>>> print(t2)
{
  "data": {
    "event_type": "message.received",
    "id": "819",
    "occurred_at": "2020-10",
    "payload": {
      "cc": [],
      "completed_at": null,
      "cost": null,
      "direction": "inbound",
      "encoding": "GSM-7",
      "errors": [],
      "from": {
        "carrier": "Verizon",
        "line_type": "Wireless",
...
A one liner could be:
print(json.dumps(json.loads(t), indent=2))
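If the goal is a string with no newlines or optional whitespace at all, rather than a pretty-printed one, the same load/dump round trip can be used with compact separators. The following is a minimal sketch; the helper name minify_json and the sample string are made up for illustration and are not part of the question's data.

```python
import json

# Hypothetical sample standing in for the pretty-printed input.
pretty = '{\n  "data": {\n    "id": "819",\n    "payload": {"cc": [], "cost": null}\n  }\n}'

def minify_json(text):
    """Parse a JSON string and re-serialize it without optional whitespace."""
    # separators=(",", ":") drops the spaces json.dumps normally puts
    # after commas and colons; omitting indent keeps everything on one line.
    return json.dumps(json.loads(text), separators=(",", ":"))

compact = minify_json(pretty)
print(compact)  # {"data":{"id":"819","payload":{"cc":[],"cost":null}}}
```

Whitespace inside string values (for example "Hi ") is data, not formatting, so it is preserved.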
You can first load it as a Python dictionary with:
import json
myDict = json.loads(jsonString)
And then, convert it back to a minimized/indented JSON string:
minimizedJSON = json.dumps(myDict)
indentedJSON = json.dumps(myDict, indent = <# of spaces>)
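The escaped quotes (\") in the question's first sample suggest, though this is only a guess, that the payload may have been JSON-encoded twice, so that the first json.loads returns a string instead of a dictionary. A small sketch of handling that case follows; the function name load_possibly_double_encoded and the sample data are hypothetical.

```python
import json

def load_possibly_double_encoded(text):
    """Decode JSON; if the result is itself a JSON string, decode once more."""
    obj = json.loads(text)
    if isinstance(obj, str):
        # Assumption: the inner string is a complete JSON document.
        obj = json.loads(obj)
    return obj

# Hypothetical input: a JSON document wrapped in a second layer of encoding.
inner = '{\n "data": {\n "id": "819"\n}\n}'
wrapped = json.dumps(inner)

my_dict = load_possibly_double_encoded(wrapped)
print(my_dict)  # {'data': {'id': '819'}}
```

A payload that was encoded only once passes through the same function unchanged, since the first json.loads already yields a dictionary.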
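This can also be done through ast.literal_eval, which will take your string and parse it into a Python dictionary. Note that it only accepts Python literals, so JSON containing null, true or false raises a ValueError; for such data json.loads is the safer choice: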
import ast
# replace with full string of yours
s='{\n "data": {\n "event_type": "message.received",\n "id": "819"\n}\n}'
result = ast.literal_eval(s)
print(type(result), result)
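The question's payload contains null values, which are valid JSON but not Python literals. The sketch below, using a made-up two-field sample, shows how the two parsers behave on such input.

```python
import ast
import json

# Hypothetical sample containing a JSON null.
s = '{"sent_at": null, "parts": 1}'

try:
    ast.literal_eval(s)
    literal_eval_ok = True
except ValueError:
    # null is parsed as an unknown name, which literal_eval rejects.
    literal_eval_ok = False

result = json.loads(s)  # json maps null to None
print(literal_eval_ok, result)  # False {'sent_at': None, 'parts': 1}
```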