Looking for an efficient way to combine lines in Python
I'm writing a program to aggregate strace output lines on a Linux host. When strace runs with the "-f" option, it interleaves system call lines from different processes like so:
close(255 <unfinished ...>
<... rt_sigprocmask resumed> NULL, 8) = 0
<... close resumed> ) = 0
[pid 19199] close(255 <unfinished ...>
[pid 19198] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid 19199] <... close resumed> ) = 0
I would like to iterate through the output and combine "unfinished" lines with "resumed" lines. So in the output above, the following two lines:
close(255 <unfinished ...>
.....
<... close resumed> ) = 0
would be combined into:
close(255) = 0
I was thinking about splitting the "unfinished" lines at ">" and putting the result into a list. If a later line contained "resumed", I would iterate through this list to see whether the system call and pid are present. If they are, I would split() the line at ">" and combine the two. Is there a better way to do this?
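One way the matching step described above might be sketched is with a dictionary keyed by pid and system call name, so that a resumed line finds its partner with a single lookup instead of a list scan. This is only a sketch: the two regular expressions are assumptions based on the sample lines shown above, not on the full strace output format, and merge_strace_lines is a hypothetical helper name.

```python
import re

# Assumed line shapes, taken only from the sample output above:
#   [pid 19199] close(255 <unfinished ...>
#   [pid 19199] <... close resumed> ) = 0
# The "[pid N]" prefix is treated as optional.
UNFINISHED = re.compile(
    r"^(?:\[pid\s+(?P<pid>\d+)\]\s+)?"
    r"(?P<head>(?P<call>\w+)\(.*?) <unfinished \.\.\.>\s*$"
)
RESUMED = re.compile(
    r"^(?:\[pid\s+(?P<pid>\d+)\]\s+)?"
    r"<\.\.\.\s+(?P<call>\w+) resumed>\s*(?P<tail>.*)$"
)


def merge_strace_lines(lines):
    """Yield lines with unfinished/resumed pairs joined (hypothetical helper)."""
    pending = {}  # (pid, syscall name) -> text before "<unfinished ...>"
    for raw in lines:
        line = raw.rstrip("\n")
        unfinished = UNFINISHED.match(line)
        resumed = RESUMED.match(line)
        if unfinished:
            key = (unfinished.group("pid"), unfinished.group("call"))
            pending[key] = unfinished.group("head")
        elif resumed:
            key = (resumed.group("pid"), resumed.group("call"))
            head = pending.pop(key, None)
            if head is None:
                # No matching start was seen; pass the line through unchanged.
                yield line
            else:
                prefix = "[pid {}] ".format(key[0]) if key[0] else ""
                yield prefix + head + resumed.group("tail")
        else:
            yield line


if __name__ == "__main__":
    sample = [
        "[pid 19199] close(255 <unfinished ...>",
        "[pid 19198] <... rt_sigprocmask resumed> NULL, 8) = 0",
        "[pid 19199] <... close resumed> ) = 0",
    ]
    for merged in merge_strace_lines(sample):
        print(merged)
```

In this sketch a merged line is emitted at the position of its resumed half, and a resumed line whose start was never seen is passed through untouched.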
* Update *
Thanks for the awesome feedback! I came up with the following and would love to get your thoughts on the code:
import sys

holding_cell = list()

if len(sys.argv) > 1:
    strace_file = open(sys.argv[1], "r")
else:
    strace_file = sys.stdin

for line in strace_file.read().splitlines():
    if "clone" in line:
        print(line)
    if "unfinished" in line:
        # Keep everything before "<unfinished ...>" until the call resumes
        holding_cell.append(line.split("<")[0])
    elif "resumed" in line:
        # Get the name of the system call / pid so we can try
        # to match this line w/ one in the buffer
        identifier = line.split()[1]
        for cell in holding_cell:
            if identifier in cell:
                print(cell + line.split(">")[1])
                holding_cell.remove(cell)
                # Stop here: removing from a list while iterating skips items
                break
    else:
        print(line)
Is there a more Pythonic way to write this? Thanks again for the awesome feedback!
Some iterators, such as file objects, can be consumed by nested loops. Assuming you are reading this from a file-like object, you could just create an inner loop to do the combining. I'm not sure what the formatting rules for strace logs are, but nominally, it could be something like
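def get_logs(filename):
    # Open the path passed in, not the literal string 'filename'
    with open(filename) as log:
        for line in log:
            if "<unfinished " in line:
                # Keep everything before the "<unfinished ...>" marker
                preamble = line.split("<unfinished", 1)[0].strip()
                # The inner loop continues reading from the same file object.
                # It pairs the call with the next resumed line it sees and
                # drops any lines in between.
                for line in log:
                    if " resumed>" in line:
                        yield "{}) = {}\n".format(preamble,
                                                  line.split('=')[-1].strip())
                        break
            else:
                yield line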
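The nested-loop idea relies on the inner and outer loops sharing one iterator. A small sketch of that behaviour, using io.StringIO as a stand-in for an open log file (the "start" and "end" marker lines are made up for illustration):

```python
import io

# io.StringIO stands in for an open log file; both iterate line by line
# and remember their position between loops.
log = io.StringIO("a\nstart\nb\nend\nc\n")

seen_outer = []
seen_inner = []
for line in log:
    line = line.strip()
    if line == "start":
        # The inner loop pulls from the same iterator, so it picks up
        # right after "start".
        for inner in log:
            inner = inner.strip()
            if inner == "end":
                break
            seen_inner.append(inner)
        # The outer loop then resumes after "end".
    else:
        seen_outer.append(line)

print(seen_outer)  # ['a', 'c']
print(seen_inner)  # ['b']
```

A plain list would not behave this way, because each for loop over a list starts a fresh iterator from the first element; wrapping the list in iter() once and looping over that object gives the shared behaviour.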