
Parsing a list as another function argument - Python

I have a .log file that I want to check for errors/warnings:

    2018-03-05 10:55:54,636 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M740.aswc.arxml is well-formed
    2018-03-05 10:55:55,193 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M740.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
    2018-03-05 10:55:55,227 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M741.aswc.arxml is well-formed
    2018-03-05 10:55:55,795 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M741.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
    2018-03-05 10:55:55,831 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M742.aswc.arxml is well-formed
    2018-03-05 10:55:56,403 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M742.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
    2018-03-05 10:55:56,438 WARNING ASWC_M740_MSI is without connector
    2018-03-05 10:55:56,438 ERROR ASWC_M741_MSI is without connector
    2018-03-05 10:55:56,438 WARNING PRP_CS_VehicleSPeed is without connector

So far I've written the following function, but it does not work:

def checkLog(path, level, message):
    """
    path = used for defining the file to be checked
    level = severity level: INFO, WARNING, ERROR
    message = string to be matched
    """
    datafile = open(path)
    line_file = datafile.readline()
    while line_file != "":
        for text in message:
            if level + " " + text in line_file:
                return True
            line_file = datafile.readline()
    return False

checkLog(r"C:\test\Abu\TRS.ABU.GEN.003_1\output\result.log", "WARNING", ["PRP_CS_VehicleSPeed", "ASWC_M740_MSI", "ASWC_M741_MSI"])

Where am I going wrong?

The second readline() is inside the for loop that iterates over the possible messages to be matched, so the code moves on to the next line before all messages have been checked.

Try moving it to the outer scope:

def checkLog(path, level, message):
    datafile = open(path)
    line_file = datafile.readline()
    while line_file != "":
        for text in message:
            if level + " " + text in line_file:
                return True
        line_file = datafile.readline()
    return False

Your code could be better written like this:

def checkLog(path, level, message):
    with open(path) as datafile:
        for line in datafile:
            for text in message:
                if (level + " " + text) in line:
                    return True
    return False

This iterates over the file object directly instead of calling readline(), which simplifies the code. It also opens the file with a context manager (the with statement), which ensures the file is properly closed.
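The inner loop can also be folded into any(). The following is a minimal sketch, assuming the same level + " " + text matching rule as above; the name check_log and the temporary file used for the demo are illustrative and not part of the original post:

```python
import os
import tempfile


def check_log(path, level, messages):
    """Return True if any line contains '<level> <message>' for any message."""
    needles = [level + " " + text for text in messages]
    with open(path) as datafile:
        # any() stops at the first matching line, like the early return above.
        return any(needle in line for line in datafile for needle in needles)


# Demo on a throwaway file holding the last three lines of the sample log.
sample = (
    "2018-03-05 10:55:56,438 WARNING ASWC_M740_MSI is without connector\n"
    "2018-03-05 10:55:56,438 ERROR ASWC_M741_MSI is without connector\n"
    "2018-03-05 10:55:56,438 WARNING PRP_CS_VehicleSPeed is without connector\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

found_warning = check_log(demo_path, "WARNING", ["PRP_CS_VehicleSPeed"])
found_error = check_log(demo_path, "ERROR", ["PRP_CS_VehicleSPeed"])
os.remove(demo_path)

print(found_warning, found_error)  # True False
```

Building the search strings once, outside the loop, avoids repeating the concatenation for every line.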

I recommend you use pandas for this. Here is an illustrative example.

Setup

import pandas as pd, numpy as np
from io import StringIO

mystr = StringIO(r"""2018-03-05 10:55:54,636 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M740.aswc.arxml is well-formed
2018-03-05 10:55:55,193 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M740.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
2018-03-05 10:55:55,227 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M741.aswc.arxml is well-formed
2018-03-05 10:55:55,795 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M741.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
2018-03-05 10:55:55,831 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M742.aswc.arxml is well-formed
2018-03-05 10:55:56,403 INFO The file: C:/test/Abu/TRS.ABU.GEN.003_1/input\ASWC_M742.aswc.arxml is valid with the AUTOSAR4.2.2-STRICT schema
2018-03-05 10:55:56,438 WARNING ASWC_M740_MSI is without connector
2018-03-05 10:55:56,438 ERROR ASWC_M741_MSI is without connector
2018-03-05 10:55:56,438 WARNING PRP_CS_VehicleSPeed is without connector
""")

df = pd.read_csv(mystr, sep=',', header=None, names=['Timestamp', 'Message'])

Solution

df['Message_Error'] = df.loc[df['Message'].str.contains('WARNING|ERROR'), 'Message'].apply(lambda x: x.split(' ')[:3])
df['Message_Error'] = df['Message_Error'].apply(lambda x: x if isinstance(x, list) else [])
df = df.join(pd.DataFrame(df['Message_Error'].values.tolist()))

#              Timestamp                                            Message  \
# 0  2018-03-05 10:55:54  636 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 1  2018-03-05 10:55:55  193 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 2  2018-03-05 10:55:55  227 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 3  2018-03-05 10:55:55  795 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 4  2018-03-05 10:55:55  831 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 5  2018-03-05 10:55:56  403 INFO The file: C:/test/Abu/TRS.ABU.GEN.003...   
# 6  2018-03-05 10:55:56     438 WARNING ASWC_M740_MSI is without connector   
# 7  2018-03-05 10:55:56       438 ERROR ASWC_M741_MSI is without connector   
# 8  2018-03-05 10:55:56  438 WARNING PRP_CS_VehicleSPeed is without con...   

#                          Message_Error     0        1                    2  
# 0                                   []  None     None                 None  
# 1                                   []  None     None                 None  
# 2                                   []  None     None                 None  
# 3                                   []  None     None                 None  
# 4                                   []  None     None                 None  
# 5                                   []  None     None                 None  
# 6        [438, WARNING, ASWC_M740_MSI]   438  WARNING        ASWC_M740_MSI  
# 7          [438, ERROR, ASWC_M741_MSI]   438    ERROR        ASWC_M741_MSI  
# 8  [438, WARNING, PRP_CS_VehicleSPeed]   438  WARNING  PRP_CS_VehicleSPeed 

Example query

q = {'PRP_CS_VehicleSPeed', 'ASWC_M740_MSI', 'ASWC_M741_MSI'}
mask = (df[1] == 'WARNING') & df[2].isin(q)

df_mask = df[mask]

#              Timestamp                                            Message  \
# 6  2018-03-05 10:55:56     438 WARNING ASWC_M740_MSI is without connector   
# 8  2018-03-05 10:55:56  438 WARNING PRP_CS_VehicleSPeed is without con...   

#                          Message_Error    0        1                    2  
# 6        [438, WARNING, ASWC_M740_MSI]  438  WARNING        ASWC_M740_MSI  
# 8  [438, WARNING, PRP_CS_VehicleSPeed]  438  WARNING  PRP_CS_VehicleSPeed 
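
If pandas is not available, a similar level/name split can be approximated with the standard library. This is only a sketch under an assumed line layout (date, time with milliseconds, level, then the message), inferred from the sample log; LINE_RE and find_entries are hypothetical names introduced for illustration.

```python
import re

# Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm LEVEL first_word rest of message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<name>\S+)"
)


def find_entries(lines, level, names):
    """Return (timestamp, level, name) for lines whose level and first word match."""
    wanted = set(names)
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match["level"] == level and match["name"] in wanted:
            hits.append((match["timestamp"], match["level"], match["name"]))
    return hits


sample_lines = [
    "2018-03-05 10:55:56,438 WARNING ASWC_M740_MSI is without connector",
    "2018-03-05 10:55:56,438 ERROR ASWC_M741_MSI is without connector",
    "2018-03-05 10:55:56,438 WARNING PRP_CS_VehicleSPeed is without connector",
]
hits = find_entries(
    sample_lines,
    "WARNING",
    {"PRP_CS_VehicleSPeed", "ASWC_M740_MSI", "ASWC_M741_MSI"},
)
print([name for _, _, name in hits])  # ['ASWC_M740_MSI', 'PRP_CS_VehicleSPeed']
```

Unlike splitting on the comma, this keeps the milliseconds attached to the timestamp.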

I think you need to unindent the line line_file = datafile.readline(). What you are currently doing is checking whether the first line contains the first message; if not, you jump to the second line and check whether it contains the second message. So it does not check whether every line contains one of those three messages.
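
To make that pairing visible, the sketch below replays both loop layouts on an in-memory copy of the last three log lines and records which (line number, message) pairs get compared. trace_pairs is a hypothetical helper written for this illustration; unlike checkLog, it does not stop at the first match, so the whole comparison pattern shows up.

```python
import io

LOG = (
    "2018-03-05 10:55:56,438 WARNING ASWC_M740_MSI is without connector\n"
    "2018-03-05 10:55:56,438 ERROR ASWC_M741_MSI is without connector\n"
    "2018-03-05 10:55:56,438 WARNING PRP_CS_VehicleSPeed is without connector\n"
)
MESSAGES = ["PRP_CS_VehicleSPeed", "ASWC_M740_MSI", "ASWC_M741_MSI"]


def trace_pairs(readline_inside_for):
    """Record which (line number, message) pairs each loop layout compares."""
    datafile = io.StringIO(LOG)
    pairs = []
    line_no = 1
    line_file = datafile.readline()
    while line_file != "":
        for text in MESSAGES:
            pairs.append((line_no, text))
            if readline_inside_for:
                # Original layout: advance after every single comparison.
                line_file = datafile.readline()
                line_no += 1
        if not readline_inside_for:
            # Fixed layout: advance only after all messages were tried.
            line_file = datafile.readline()
            line_no += 1
    return pairs


buggy = trace_pairs(readline_inside_for=True)
fixed = trace_pairs(readline_inside_for=False)

print(buggy)
# [(1, 'PRP_CS_VehicleSPeed'), (2, 'ASWC_M740_MSI'), (3, 'ASWC_M741_MSI')]
print(len(fixed))  # 9
```

With the original layout each line is compared against only one message, and here none of those three pairings match, which is why the function returns False even though every message is present in the log.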

You are creating a file pointer but not iterating over it, so you are unable to parse the whole file.

I would suggest using a context manager (the with statement):

with open(path, 'r') as datafile:
    all_lines = datafile.readlines()
    for line in all_lines:
        if line:
            pass  # rest of your logic
