How to parse this log to get date/time out for plotting with Python3
I have a log that looks like this:
**  Wed; Feb 20 2019 at 12:38:10:734 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:12:742 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:14:721 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:16:777 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:18:729 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:20:700 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:22:697 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:24:706 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
**  Wed; Feb 20 2019 at 12:38:26:783 PM : ** **  GnssLocationListener; \- 41** \- onSatelliteStatusChanged() : fixCount = 7
I am trying to get the following data out of it:
12:38:10 PM , 7
12:38:12 PM , 7
12:38:14 PM , 7
12:38:16 PM , 7
12:38:18 PM , 7
...
I am trying to do this with what I know of Python, which is very basic.
import matplotlib.pyplot as plt
import matplotlib.dates as md
import numpy as np
import datetime as dt
import time
import csv
data = []
datafile = open('fix_count_02-20-2019-day.txt' , 'r')
datareader = csv.reader((x.replace('\0','') for x in datafile), delimiter=':')
for row in datareader:
    data.append(row)
np_data = np.asarray(data)
print(np_data)
plt.subplots_adjust(bottom=0.2)
plt.xticks( rotation=25 )
ax=plt.gca()
#xfmt = md.DateFormatter('%H:%M:%S')
#ax.xaxis.set_major_formatter(xfmt)
plt.plot(np_data)
plt.show()
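I have tried some gymnastics with split and join, but that didn't really work out for me. Ultimately I want to plot something like the graph in this question, probably (I guess) using a numpy array: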
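Edit: updated, because I just realized you are trying to use numpy to plot the results, not to parse the data.
You'll want to use a simple regular expression pattern to parse this log file. You can then build the list of results however you like.
Here is a regex pattern that captures your time and fixCount as match groups: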
.*((?:\d{2}:){3}\d{3} (?:PM|AM)).*fixCount = (\d+)
See it in action: https://regexr.com/48ph8
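To make the two match groups concrete, here is a minimal sketch that applies the pattern above to one line copied from the log in the question. The names `sample`, `pattern`, `time_text` and `fix_count` are only illustrative.

```python
import re

# One line copied from the log in the question
sample = (
    "**  Wed; Feb 20 2019 at 12:38:10:734 PM : ** **  GnssLocationListener; "
    "\\- 41** \\- onSatelliteStatusChanged() : fixCount = 7"
)

pattern = re.compile(r'.*((?:\d{2}:){3}\d{3} (?:PM|AM)).*fixCount = (\d+)')

m = pattern.match(sample)
time_text = m.group(1)       # timestamp, including the milliseconds part
fix_count = int(m.group(2))  # digits that follow "fixCount = "
print(time_text, fix_count)
```

Group 1 holds the time text and group 2 holds the count as a string, which is why the sketch converts it with `int()`.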
See https://pythonicways.wordpress.com/2016/12/20/log-file-parsing-in-python/ for a good example of how to do what you want.
The solution would look something like this:
import re

log_file_path = 'fix_count_02-20-2019-day.txt'
regex = r'.*((?:\d{2}:){3}\d{3} (?:PM|AM)).*fixCount = (\d+)'
match_list = []
with open(log_file_path, 'r') as f:
    data = f.read()
# No re.S flag: '.' must not cross newlines, so each match stays on one log line
for match in re.finditer(regex, data):
    # group(1) is the time text, group(2) is the fixCount
    match_text = match.group(1), match.group(2)
    match_list.append(match_text)
    print(match_text)
# do something with match_list here
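Since the end goal is a plot, the matched text still has to become real datetime values. The following is a rough sketch of one way to do that, not the only one: `LINE_RE`, `parse_log` and `plot_fix_count` are hypothetical names, the pattern is a variant of the one above that also captures the date, and the `strptime` format assumes English month abbreviations and AM/PM markers.

```python
import re
from datetime import datetime

# Hypothetical pattern: captures the date, the time with milliseconds, and fixCount
LINE_RE = re.compile(
    r'(\w{3} \d{1,2} \d{4}) at ((?:\d{2}:){3}\d{3} (?:AM|PM)).*fixCount = (\d+)'
)

def parse_log(lines):
    """Return (timestamps, counts) for every line that matches LINE_RE."""
    times, counts = [], []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip lines without a timestamp/fixCount pair
        stamp = datetime.strptime(
            '{} {}'.format(m.group(1), m.group(2)),
            '%b %d %Y %I:%M:%S:%f %p',
        )
        times.append(stamp)
        counts.append(int(m.group(3)))
    return times, counts

def plot_fix_count(times, counts):
    """Plot fixCount against time with an HH:MM:SS x axis."""
    # Imported here so that parsing works even without matplotlib installed
    import matplotlib.pyplot as plt
    import matplotlib.dates as md

    fig, ax = plt.subplots()
    ax.plot(times, counts, marker='o')
    ax.xaxis.set_major_formatter(md.DateFormatter('%H:%M:%S'))
    fig.autofmt_xdate(rotation=25)
    ax.set_ylabel('fixCount')
    plt.show()

# Two lines copied from the log in the question
sample_lines = [
    "**  Wed; Feb 20 2019 at 12:38:10:734 PM : ** **  GnssLocationListener; "
    "\\- 41** \\- onSatelliteStatusChanged() : fixCount = 7",
    "**  Wed; Feb 20 2019 at 12:38:12:742 PM : ** **  GnssLocationListener; "
    "\\- 41** \\- onSatelliteStatusChanged() : fixCount = 7",
]
times, counts = parse_log(sample_lines)
print(times, counts)
```

With a real file, passing the open file object to `parse_log` and then calling `plot_fix_count(times, counts)` should be enough, because matplotlib can plot a list of datetime objects directly without going through a numpy array first.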
Assuming the log file is named log.txt
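with open('log.txt', 'r') as log:
    lines = log.readlines()
    for line in lines:
        line = line.strip()
        # Fixed character offsets: these depend on the exact layout of the raw log lines,
        # and line[-1] only picks up the last digit of fixCount
        print('{} {} , {}'.format(line[29:37], line[42:44], line[-1]))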
Output
12:38:10 PM , 7
12:38:12 PM , 7
12:38:14 PM , 7
12:38:16 PM , 7
12:38:18 PM , 7
12:38:20 PM , 7
12:38:22 PM , 7
12:38:24 PM , 7
12:38:26 PM , 7
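If the column positions in the log ever shift, or fixCount grows beyond one digit, fixed slices stop working. As a hedged alternative, the sketch below produces the same "time , count" rows with a regular expression instead of character offsets. `ROW_RE` and `to_row` are hypothetical names, and the second sample line is made up to show a two-digit count.

```python
import re

# Captures hh:mm:ss, skips the :mmm milliseconds, then AM/PM and the fixCount digits
ROW_RE = re.compile(r'(\d{2}:\d{2}:\d{2}):\d{3} (AM|PM).*fixCount = (\d+)')

def to_row(line):
    """Return 'hh:mm:ss AM/PM , count' for a log line, or None if it does not match."""
    m = ROW_RE.search(line)
    if m is None:
        return None
    return '{} {} , {}'.format(m.group(1), m.group(2), m.group(3))

lines = [
    # Copied from the log in the question
    "**  Wed; Feb 20 2019 at 12:38:10:734 PM : ** **  GnssLocationListener; "
    "\\- 41** \\- onSatelliteStatusChanged() : fixCount = 7",
    # Made-up line (not from the original log) with a two-digit count
    "**  Wed; Feb 20 2019 at 12:38:12:742 PM : ** **  GnssLocationListener; "
    "\\- 41** \\- onSatelliteStatusChanged() : fixCount = 12",
]
rows = [to_row(line) for line in lines]
for row in rows:
    print(row)
```

Returning None for lines that do not match keeps unrelated log entries from raising an error; the caller can filter them out before writing or plotting the rows.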