
Extract specific JSON from log file in Python

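I am trying to extract a specific JSON object from a log file that contains several JSON objects mixed with plain text. In this case I want the JSON that follows the text "Output payload". I have tried several approaches but could not extract the required JSON. The file format is: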

[2020-05-17 15:32:11.698000] INFO [worker-1] org.mule.api.processor.LoggerMessageProcessor [[cloudhub-us-claim-services-1-0-0-prod].post:/claims/{claimNumber}/predictionScores:experience-claims-predictionscore-api.config.7.771]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746: Initialization: Request successfully logged to mirror queue
[2020-05-17 15:32:12.190000] INFO [worker-1] org.mule.transformer.simple.MessagePropertiesTransformer [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: Property with key 'response', not found on message using 'null'. Since the value was marked optional, nothing was set on the message for this property
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746:Datarobot API Call: Output payload received from Datarobot API: {
  "prediction": "N",
  "predictionScore": 0.0000629713,
  "predictionExplanations": "lineItem : 0|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_TIME_ZERO|Value: Y|strength: -1.4469371757,\nlineItem : 1|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_PRICE|Value: Y|strength: -1.1968554807,\nlineItem : 2|feature: MONTHS_DIFF_CLAIM_REPAIR_FACILITY_FIRST_CLAIM|Value: 61|strength: -1.0681064444"
}

You can probably read the file as text and then parse it with a regular expression. Something like this:

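import re

# logfilepath is assumed to hold the path to your log file
with open(logfilepath, 'r') as logfile:
    log = logfile.read()

# Group 1 is the "Output payload ...:" prefix, group 2 is the JSON text.
# The non-greedy match ends at the first "}", so this fits flat objects like the sample.
objects = re.findall(r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})", log)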

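I tested the regular expression against the sample you gave and it works, so this code should work as well. re.findall returns a list of (prefix, JSON text) tuples, one per match. Once you have all the JSON objects, you can easily find the one you need.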
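To pick out the object you need, the matched text still has to be parsed. Below is a minimal sketch of that step, assuming the matches are flat JSON objects like the one in the sample. The shortened `log` string and the `payloads` list are illustrative names, not part of the original question.

```python
import json
import re

# Shortened sample in the same shape as the log excerpt in the question (illustrative only)
log = '''[2020-05-17 15:32:12.190000] INFO [worker-1] some.Logger: nothing was set on the message
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug: Datarobot API Call: Output payload received from Datarobot API: {
  "prediction": "N",
  "predictionScore": 0.0000629713
}
'''

pattern = re.compile(r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})")

payloads = []
for prefix, raw in pattern.findall(log):
    try:
        # Turn the matched text into a Python dict
        payloads.append(json.loads(raw))
    except json.JSONDecodeError:
        # The non-greedy match stops at the first "}", so a nested object
        # would be cut short and fail to parse; skip it here
        continue

for payload in payloads:
    print(payload["prediction"], payload["predictionScore"])
```

Each entry in `payloads` is an ordinary dict, so filtering on a key or value is a plain list comprehension from here.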

Happy hacking :)

Edit: Modified the regex to match the revised question. The regex now looks for the "Output payload" string.
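If some payloads contain nested objects or a "}" inside a string value, the non-greedy regex would stop too early. One possible alternative, not part of the original answer, is to locate the "Output payload" marker and let the standard library's `json.JSONDecoder.raw_decode` find where the object ends. The helper name `extract_payloads` and the `sample` string are hypothetical, and the sketch assumes the JSON starts at the first "{" after the marker.

```python
import json


def extract_payloads(log, marker="Output payload"):
    """Return every JSON value that starts at the first "{" after each marker."""
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while True:
        start = log.find(marker, pos)
        if start == -1:
            break
        brace = log.find("{", start)
        if brace == -1:
            break
        try:
            # raw_decode parses one JSON value starting at index `brace`
            # and returns it together with the index just past its end
            obj, end = decoder.raw_decode(log, brace)
            results.append(obj)
            pos = end
        except json.JSONDecodeError:
            # Not valid JSON at this brace; move past it and keep scanning
            pos = brace + 1
    return results


# Illustrative input with a nested object and a "}" inside a string
sample = 'DEBUG Output payload received: {"a": {"b": 1}, "c": "}"}\nINFO other line {not json}\n'
print(extract_payloads(sample))
```

Because the decoder tracks braces and string quoting itself, this version does not depend on how the JSON is indented or nested.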
