
Parsing an unstructured log file in Python

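I want to parse a log file that contains unstructured text. I need to get the core id and pass/fail values as JSON. I have only been programming for a week, so any help would be appreciated.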

AMPTTK v25: RSA ALL THREADS
================RSACores X RSACores==============
time: 421045.73
Num Threads Available to process: 256
Num Cores   Requested to execute: 256
TSC freq: 1600629120.0

Memory allocated @ main (not all used by program): 3842.000000 MB

  RSA thread:       : 0
wrkspace addr       : 7f0483400000
wrkspace size       : f00000

        # cores:   16
        core id:      0,      1,      2,      3,      4,      5,      6,      7,     64,     65,     66,     67,     68,     69,     70,     71,
      pass/fail:   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,
       test ipc:  4.497,  4.503,  4.489,  4.476,  4.537,  4.471,  4.499,  4.459,  4.934,  4.946,  4.892,  4.933,  4.927,  4.927,  4.882,  4.886,
     aperf(MHz):   2826,   2814,   2826,   2826,   2826,   2826,   2827,   2826,   2909,   2909,   2909,   2909,   2909,   2909,   2909,   2909,
      aperf ipc:  2.392,  2.408,  2.397,  2.392,  2.397,  2.388,  2.397,  2.388,  2.341,  2.341,  2.340,  2.341,  2.341,  2.341,  2.340,  2.340,
     mce status:   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,   pass,

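My general approach is to look for structure in the log file and then pull it out. Looking at the data you shared, each line of interest has a single : character followed by sixteen comma-separated values. Since the data is not in a form that can go straight into JSON, I store it in a temporary dictionary and then convert that to a JSON string. Example below: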

import json

# parse the log file and store in dictionary
raw_data = {}
with open('unstructured_data.txt') as log:
    for line in log:
        line = line.rstrip()
        if line.count(':') == 1:
            heading, data = line.split(':')
            fields = data.split(',')
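            # keep only the per-core rows (16 values, plus one empty
            # field produced by the trailing comma on each line)
            if len(fields) > 15: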
                raw_data[heading.lstrip()] = fields

# Put only the data of interest into another Python dictionary
result_data = {}
for i in range(len(raw_data['core id'])):
    result_data[raw_data['core id'][i].strip()] = raw_data['pass/fail'][i].strip()

# Convert python dictionary to json string
result_json = json.dumps(result_data)

print(result_json)

With your log file, this gives the following:

$ python3 parse_log.py 
{"0": "pass", "1": "pass", "2": "pass", "3": "pass", "4": "pass", "5": "pass", "6": "pass", "7": "pass", "64": "pass", "65": "pass", "66": "pass", "67": "pass", "68": "pass", "69": "pass", "70": "pass", "71": "pass", "": ""}

Although this is not a perfect result (note the empty "": "" entry at the end), hopefully it can be improved using the real data.
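The empty "": "" entry in the output above comes from the trailing comma on each data line, which leaves one empty field after splitting. One possible refinement is sketched below. It is not part of the original answer: the function name parse_core_results and the shortened sample lines (including the single "fail" value) are hypothetical, and it assumes each "core id" line is followed by its matching "pass/fail" line, as in the log shown in the question.

```python
import json


def parse_core_results(lines):
    """Map each core id to its pass/fail status, skipping empty fields."""
    results = {}
    core_ids = []
    for line in lines:
        # Split on the first ':' only, so extra colons in the data are harmless.
        heading, sep, data = line.partition(':')
        if not sep:
            continue
        # Drop the empty field left behind by the trailing comma.
        fields = [f.strip() for f in data.split(',') if f.strip()]
        key = heading.strip()
        if key == 'core id':
            core_ids = fields
        elif key == 'pass/fail' and core_ids:
            # Pair each core id with the status in the same column.
            results.update(zip(core_ids, fields))
            core_ids = []
    return results


# Hypothetical, shortened sample in the same layout as the log above.
sample = [
    "        # cores:   4",
    "        core id:      0,      1,     64,     65,",
    "      pass/fail:   pass,   pass,   fail,   pass,",
    "     mce status:   pass,   pass,   pass,   pass,",
]
print(json.dumps(parse_core_results(sample)))
# prints {"0": "pass", "1": "pass", "64": "fail", "65": "pass"}
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, for example parse_core_results(log) inside the with open(...) block. If the real log contains several thread sections, the results from each section are merged into the same dictionary.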

