Python subprocess - git log wrong JSON Format
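I am trying to format the git log as JSON, but it fails.
I use the command below for the formatting. I don't think that is where my problem lies, but hey, you never know.
These are my functions.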
import subprocess

def call_git_log():
    # One JSON-like object per commit, followed by a trailing comma.
    format: str = '{%n "commit": "%H",%n "abbreviated_commit": "%h",%n "tree": "%T",%n "abbreviated_tree": "%t",%n "parent": "%P",%n "abbreviated_parent": "%p",%n "refs": "%D",%n "encoding": "%e",%n "subject": "%s",%n "sanitized_subject_line": "%f",%n "body": "%b",%n "commit_notes": "%N",%n "verification_flag": "%G?",%n "signer": "%GS",%n "signer_key": "%GK",%n "author": {%n "name": "%aN",%n "email": "%aE",%n "date": "%aD"%n },%n "commiter": {%n "name": "%cN",%n "email": "%cE",%n "date": "%cD"%n }%n},'
    output = subprocess.Popen(["git", "log", f"--pretty=format:{format}"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = output.communicate()
    return stdout.decode("utf-8")

output = call_git_log()
# The output/ directory must already exist.
with open("output/test.json", "w") as file:
    print(str(output), file=file)
As a result I get this file, which is malformed JSON. Why is that, and what went wrong?
output/test.json
{
"commit": "4099117e564e7106b7ee7e315e3e8b8458a8fdce",
"abbreviated_commit": "4099117",
"tree": "6b1eb2fbf81de876d14781ffa82b5ee5db973af6",
"abbreviated_tree": "6b1eb2f",
"parent": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
"abbreviated_parent": "37445d7",
"refs": "HEAD -> master, master/master",
"encoding": "",
"subject": "ue04-plots - A.3 fertig",
"sanitized_subject_line": "ue04-plots-A.3-fertig",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:50:33 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:50:33 +0100"
}
},
{
"commit": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
"abbreviated_commit": "37445d7",
"tree": "caa7df1bd70b5fd2319e903331c2a96d80f08152",
"abbreviated_tree": "caa7df1",
"parent": "cb484ec66468c5bbac1f78a8ed87852202207701",
"abbreviated_parent": "cb484ec",
"refs": "",
"encoding": "",
"subject": "ue04-plots - arrows",
"sanitized_subject_line": "ue04-plots-arrows",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:48:45 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:48:45 +0100"
}
},
{
"commit": "cb484ec66468c5bbac1f78a8ed87852202207701",
"abbreviated_commit": "cb484ec",
"tree": "73e2e71396290d9627b9301451ca5a1bb7ba6df4",
"abbreviated_tree": "73e2e71",
"parent": "becd22ff715defbe00e064181ee71266e3d1db45",
"abbreviated_parent": "becd22f",
"refs": "",
"encoding": "",
"subject": "ue04-plots - titel",
"sanitized_subject_line": "ue04-plots-titel",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:33:59 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:33:59 +0100"
}
},
What do I have to change to make this a valid JSON document that json.loads() can process?
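It looks like you are manipulating the output of git log to turn it into a JSON file, then passing it to some other JSON parser and finding errors there?
Yes, your output is not valid JSON: to be an "array", it needs an enclosing opening and closing bracket.
For a post-processing example, see https://stackoverflow.com/a/4600561/9035237 → https://gist.github.com/textarcana/1306223 . All of the code in the links you mentioned shows this as well.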
If you are using Python, you can do:
output = "[" + output + "]"
output = output.replace("},]", "}]")
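A slightly more defensive variant of the two lines above, as a minimal sketch. It strips trailing whitespace and the final comma instead of relying on the exact `},]` sequence, which no longer matches once a trailing newline is present (for example after print() has written the text to a file and it is read back). The helper name wrap_as_json_array and the shortened sample string are made up for illustration.

```python
import json


def wrap_as_json_array(raw: str) -> str:
    """Turn '{...},{...},' into '[{...},{...}]' (hypothetical helper)."""
    # Drop surrounding whitespace, then the comma left after the last object.
    body = raw.strip()
    if body.endswith(","):
        body = body[:-1]
    return "[" + body + "]"


# Shortened stand-in for the real git log output, including a trailing newline.
sample = '{\n "commit": "4099117"\n},\n{\n "commit": "37445d7"\n},\n'
commits = json.loads(wrap_as_json_array(sample))
print(len(commits))
```

This only repairs the array wrapper; it does nothing about quotes or line breaks inside the fields themselves.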
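However, your format still has problems: JSON does not accept raw line breaks inside a string, and a " in any field will always break the format. Both can occur in commit messages, so your format should be changed.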
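Both failure modes can be reproduced without git. The sketch below builds the JSON text the same way the format string does, by pasting field values between quotes verbatim; the sample subject and body are invented for illustration.

```python
import json

# A commit subject containing a double quote, pasted in verbatim as %s would be.
subject = 'fix "quoted" title'
broken_subject = '{"subject": "' + subject + '"}'
try:
    json.loads(broken_subject)
    subject_ok = True
except json.JSONDecodeError:
    subject_ok = False

# A raw line break inside a string, as a multi-line %b body would produce.
body = "line one\nline two"
broken_body = '{"body": "' + body + '"}'
try:
    json.loads(broken_body)
    body_ok = True
except json.JSONDecodeError:
    body_ok = False

print(subject_ok, body_ok)
```

Neither document parses: the first ends its string early at the inner quote, and the second contains an unescaped control character inside a string.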
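According to https://gist.github.com/varemenos/e95c2e098e657c7688fd?permalink_comment_id=3260906#gistcomment-3260906 , you can use a hack: pick some string that is unlikely to appear in any field, for example ^^^^, and use it in the format as a temporary placeholder for the quotes. Then escape any special characters (for example, a line break becomes \n and " becomes \"), and finally replace ^^^^ with ". Don't do any JSON pretty-printing in this step; at most, hand that off to a JSON formatter afterwards.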
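The placeholder steps could look roughly like this in Python. This is a sketch under assumptions, not the code from the linked gist: record_to_json is a hypothetical helper, it handles one commit record at a time, and the record is assumed to come from a single-line format without %n, so that every line break in the text belongs to a field value.

```python
import json

PLACEHOLDER = "^^^^"  # assumed never to occur in any commit field


def record_to_json(raw_record: str) -> str:
    """Convert one placeholder-quoted record into JSON text (hypothetical helper)."""
    escaped = (
        raw_record
        .replace("\\", "\\\\")  # backslashes first, so later escapes are not doubled
        .replace('"', '\\"')    # real quotes inside fields
        .replace("\n", "\\n")   # line breaks inside fields
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    # Only now turn the placeholders into the structural JSON quotes.
    return escaped.replace(PLACEHOLDER, '"')


# What a single-line format such as
#   {^^^^subject^^^^: ^^^^%s^^^^, ^^^^body^^^^: ^^^^%b^^^^}
# might emit for one commit (sample text, not real git output).
raw = '{^^^^subject^^^^: ^^^^fix "quoted" title^^^^, ^^^^body^^^^: ^^^^line one\nline two^^^^}'
commit = json.loads(record_to_json(raw))
print(commit["subject"])
print(commit["body"])
```

Other control characters are not handled here, and the approach still fails if the placeholder itself ever shows up in a commit message, so it remains a hack rather than a general solution.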