How to remove white spaces and \n in the JSON file in Python
I have JSON data arriving in S3 in the following format:
{\n \"data\": {\n \"event_type\": \"message.received\",\n \"id\": \"819\",\n \"occurred_at\": \"2020-10\",\n \"payload\": {\n \"cc\": [],\n \"completed_at\": null,\n \"cost\": null,\n \"direction\": \"inbound\",\n \"encoding\": \"GSM-7\",\n \"errors\": [],\n \"from\": {\n \"carrier\": \"Verizon\",\n \"line_type\": \"Wireless\",\n \"phone_number\": \"+111111111\"\n },\n \"id\": \"e8e0d1e3-dce3-\",\n \"media\": [],\n \"messaging_profile_id\": \"400176\",\n \"organization_id\": \"717d556f-ba4f-\",\n \"parts\": 1,\n \"received_at\": \"2020-1\",\n \"record_type\": \"message\",\n \"sent_at\": null,\n \"tags\": [],\n \"text\": \"Hi \",\n \"to\": [\n {\n \"carrier\": \"carr\",\n \"line_type\": \"Wireless\",\n \"phone_number\": \"+111111111\",\n}\n}"
I want it converted to look like this:
{
  "data": {
    "event_type": "message.received",
    "id": "76a60230",
    "occurred_at": "2020-12-1",
    "payload": {
      "cc": [],
      "completed_at": null,
      "cost": null,
      "direction": "inbound",
      "encoding": "GSM-7",
      "errors": [],
      "from": {
        "carrier": "Verizon",
        "line_type": "Wireless",
        "phone_number": "+1111111111"
      },
      "id": "06c9c765",
      "media": [],
      "messaging_profile_id": "40017",
      "organization_id": "717d5",
      "parts": 1,
      "received_at": "2020-1",
      "record_type": "message",
      "sent_at": null,
      "tags": [],
      "text": "Hi",
      "to": [
        {
          "carrier": "abc",
          "line_type": "Wireless",
          "phone_number": "+1111111111",
          "status": "delivered"
        }
      ],
      "type": "SMS",
      "valid_until": null,
      "failover_url": null,
      "url": "https://639hpj"
    },
    "record_type": "event"
  },
  "meta": {
    "attempt": 1,
    "delivered_to": "https://639hpj"
  }
}
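The first JSON sample I saved shows up as a single row rather than in struct format. I did not keep the actual JSON data, but it has a similar format (and is valid). I want to run a Lambda function in which the JSON data contains no \n and no spaces.
The two JSON samples above are not the same data, but I will be receiving JSON of the first kind, and I want to convert it to the second kind, without the spaces and \n.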
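Do you realize that the spaces and newlines are simply what print uses to format the output?
Let us call your first sample t (I fixed it by adding the missing brackets at the end):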
t = '''{\n "data": {\n "event_type": "message.received",\n "id": "819",\n "occurred_at": "2020-10",\n "payload": {\n "cc": [],\n "completed_at": null,\n "cost": null,\n "direction": "inbound",\n "encoding": "GSM-7",\n "errors": [],\n "from": {\n "carrier": "Verizon",\n "line_type": "Wireless",\n "phone_number": "+111111111"\n },\n "id": "e8e0d1e3-dce3-",\n "media": [],\n "messaging_profile_id": "400176",\n "organization_id": "717d556f-ba4f-",\n "parts": 1,\n "received_at": "2020-1",\n "record_type": "message",\n "sent_at": null,\n "tags": [],\n "text": "Hi ",\n "to": [\n {\n "carrier": "carr",\n "line_type": "Wireless",\n "phone_number": "+111111111"\n}\n]\n}}}'''
It prints as:
>>> print(t)
{
"data": {
"event_type": "message.received",
"id": "819",
"occurred_at": "2020-10",
"payload": {
"cc": [],
"completed_at": null,
"cost": null,
"direction": "inbound",
"encoding": "GSM-7",
"errors": [],
...
To get the expected representation, you should:
load it into a Python object: js = json.loads(t)
dump it back to a string with an indent of 2: t2 = json.dumps(js, indent=2)
t2 actually looks like '{\n  "data": {\n    "event_type": "message.received",\n    "id": "819",\n    "occurred_at": "2020-10",\n    "payload": {\n      "cc": [],\n      "completed_at": null,\n      "cost": null,\n...
Print it:
>>> print(t2)
{
  "data": {
    "event_type": "message.received",
    "id": "819",
    "occurred_at": "2020-10",
    "payload": {
      "cc": [],
      "completed_at": null,
      "cost": null,
      "direction": "inbound",
      "encoding": "GSM-7",
      "errors": [],
      "from": {
        "carrier": "Verizon",
        "line_type": "Wireless",
        ...
A one-liner could be:
print(json.dumps(json.loads(t), indent=2))
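The first sample in the question shows literal backslash-n sequences and escaped quotes, which may mean the document was serialized twice before it reached S3. That is only an assumption, since the real payload is not shown. Under that assumption, a minimal sketch could decode a second time when the first pass returns a string; `decode_possibly_double_encoded` is a hypothetical helper name, not part of the `json` module.

```python
import json


def decode_possibly_double_encoded(raw: str):
    """Parse JSON text, decoding once more if the first pass yields a string.

    Hypothetical helper: assumes the payload may have been serialized twice.
    """
    obj = json.loads(raw)
    # A doubly serialized document decodes to a str that itself holds JSON.
    if isinstance(obj, str):
        obj = json.loads(obj)
    return obj


# Simulate a payload that was serialized twice (assumed scenario).
original = {"data": {"event_type": "message.received", "id": "819"}}
double_encoded = json.dumps(json.dumps(original, indent=2))

decoded = decode_possibly_double_encoded(double_encoded)
pretty = json.dumps(decoded, indent=2)
print(pretty)
```

If the payload turns out to be serialized only once, the helper behaves like a plain `json.loads`, so it should be harmless to leave in place.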
You can first load it as a Python dictionary:
import json
myDict = json.loads(jsonString)
Then convert it back to a minified or indented JSON string:
minimizedJSON = json.dumps(myDict)
indentedJSON = json.dumps(myDict, indent = <# of spaces>)
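Note that `json.dumps` with default arguments still puts a space after each comma and colon. If the goal is to drop every space outside string values, the `separators` parameter can be used. The sketch below uses a small made-up dictionary (`my_dict` and its contents are illustrative, not the real payload).

```python
import json

# Illustrative data only; stands in for the parsed payload.
my_dict = {"data": {"event_type": "message.received", "parts": 1, "cc": []}}

# Default separators are ", " and ": ", so some spaces remain.
default_dump = json.dumps(my_dict)

# Separators without spaces give the most compact form.
compact_dump = json.dumps(my_dict, separators=(",", ":"))

print(default_dump)
print(compact_dump)
```

Whitespace inside string values, such as the trailing space in "Hi ", is part of the data and is preserved either way.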
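This can also be done with ast.literal_eval, which takes your string and parses it into a Python dictionary (it only accepts Python literals, so JSON values such as null, true, or false are rejected):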
import ast
# replace with full string of yours
s='{\n "data": {\n "event_type": "message.received",\n "id": "819"\n}\n}'
result = ast.literal_eval(s)
print(type(result), result)
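The sample above parses because it contains only strings. The payload in the question also contains `null`, which is not a Python literal, so a quick comparison with `json.loads` may be useful. This is a small sketch with a made-up two-key string, not the full payload.

```python
import ast
import json

# Made-up sample containing a JSON null, as the question's payload does.
s = '{"sent_at": null, "parts": 1}'

# json.loads maps JSON null to Python None.
parsed = json.loads(s)
print(parsed)

# ast.literal_eval only accepts Python literals; null is not one of them.
try:
    ast.literal_eval(s)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False
print(literal_eval_ok)
```

For data that is known to be JSON, `json.loads` is therefore the safer choice.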