Looking for an efficient way to combine lines in Python

I'm writing a program to aggregate strace output lines on a Linux host. When strace runs with the "-f" option, it intermixes system call lines like so:

close(255 <unfinished ...>
<... rt_sigprocmask resumed> NULL, 8) = 0
<... close resumed> )       = 0
[pid 19199] close(255 <unfinished ...>
[pid 19198] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid 19199] <... close resumed> )       = 0

I would like to iterate through the output and combine "unfinished" lines with "resumed" lines. So in the output above, the following two lines:

close(255 <unfinished ...>
.....
<... close resumed> )       = 0

would be combined into:

close(255) = 0

I was thinking about splitting the "unfinished" lines at ">" and putting the result into a list. If a later line contained "resumed", I would iterate through this list to see if the system call and pid are present. If they are, I would split() the line at ">" and combine the two. Is there a better way to do this?
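One way to make the matching step concrete is to extract the pid and system call name from each line, so that the pair can serve as a lookup key instead of scanning a list. The sketch below is only an assumption based on the sample lines above: the patterns `UNFINISHED_RE` and `RESUMED_RE` and the helper `match_key` are hypothetical names, and real strace output may contain variations they do not cover.

```python
import re

# Assumed line shapes, inferred from the sample output above
# (not taken from a formal description of strace's format).
UNFINISHED_RE = re.compile(
    r"^(?:\[pid\s+(?P<pid>\d+)\]\s+)?(?P<call>\w+)\((?P<args>.*?)\s*<unfinished \.\.\.>\s*$"
)
RESUMED_RE = re.compile(
    r"^(?:\[pid\s+(?P<pid>\d+)\]\s+)?<\.\.\. (?P<call>\w+) resumed>\s*(?P<rest>.*)$"
)


def match_key(line):
    """Return (kind, pid, call) for unfinished/resumed lines, else None.

    pid is None when the line carries no "[pid N]" prefix.
    """
    for kind, pattern in (("unfinished", UNFINISHED_RE), ("resumed", RESUMED_RE)):
        m = pattern.match(line)
        if m:
            return kind, m.group("pid"), m.group("call")
    return None
```

With a key like `(pid, call)`, an unfinished line can be stored in a dict and a resumed line can look up its partner directly.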

* Update *

Thanks for the awesome feedback! I came up with the following and would love to get your thoughts on the code:

import sys

holding_cell = list()

if len(sys.argv) > 1:
    strace_file = open(sys.argv[1], "r")
else:
    strace_file = sys.stdin

for line in strace_file.read().splitlines():
    if "clone" in line:
        print(line)
    if "unfinished" in line:
        holding_cell.append(line.split("<")[0])
    elif "resumed" in line:
        # Get the name of the system call / pid so we can try
        # to match this line w/ one in the buffer
        identifier = line.split()[1]
        for cell in holding_cell:
            if identifier in cell:
                print(cell + line.split(">")[1])
                holding_cell.remove(cell)
                # Stop here: the list was just modified mid-iteration
                break
    else:
        print(line)

Is there a more Pythonic way to write this? Thanks again for the awesome feedback!

Some iterators, such as file objects, can be used in nested loops: an inner loop over the same object picks up where the outer loop left off. Assuming you are reading this from a file-like object, you could just create an inner loop to do the combining. I'm not sure what the formatting rules for strace logs are, but nominally, it could be something like:

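def get_logs(filename):
    with open(filename) as log:
        for line in log:
            if "<unfinished " in line:
                # Keep everything before the "<unfinished ...>" marker
                preamble = line.split('<unfinished ', 1)[0].strip()
                # Advance the same file iterator to the next resumed line;
                # lines in between are consumed and not yielded
                for line in log:
                    if " resumed>" in line:
                        yield "{}) = {}\n".format(preamble,
                            line.split('=')[-1].strip())
                        break
            else:
                yield line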
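
The nested loop above pairs an unfinished call with the next resumed line without checking the pid or call name, and it consumes the lines in between. A possible variant, sketched below, keeps pending calls in a dict keyed by (pid, syscall) and passes unrelated lines through unchanged. The names `split_pid` and `merge_strace` are hypothetical, and the parsing rules are assumptions inferred from the sample output in the question, not from strace's documented format.

```python
UNFINISHED = " <unfinished ...>"
RESUMED = " resumed>"


def split_pid(line):
    """Return (pid, rest); pid is None when there is no '[pid N] ' prefix."""
    if line.startswith("[pid"):
        head, _, rest = line.partition("] ")
        return head.split()[1], rest
    return None, line


def merge_strace(lines):
    """Yield lines with unfinished/resumed pairs joined (hypothetical helper)."""
    pending = {}  # (pid, syscall name) -> text before "<unfinished ...>"
    for raw in lines:
        line = raw.rstrip("\n")
        pid, body = split_pid(line)
        if body.endswith(UNFINISHED):
            # Hold the start of the call until its resumed line shows up
            call = body.split("(", 1)[0]
            pending[(pid, call)] = line[: -len(UNFINISHED)]
        elif body.startswith("<... ") and RESUMED in body:
            head, _, tail = body.partition(RESUMED)
            call = head[len("<... "):]
            start = pending.pop((pid, call), None)
            if start is None:
                yield line  # no matching start seen; pass through unchanged
                continue
            tail = tail.strip()
            # Assumption: a tail starting with ")" closes the argument list
            if tail.startswith(")"):
                yield start + ") " + tail[1:].lstrip()
            else:
                yield start + " " + tail
        else:
            yield line
```

The merged line is emitted at the position of the resumed line, so something like `for line in merge_strace(open(path)): print(line)` would stream the result. In this sketch, calls that are never resumed stay in `pending` and are not emitted.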

