Data structure suggestion for an exercise

Question

As part of a Data Structures course, my teacher gave me an extra exercise that is a bit more difficult and challenging. I've tried to work out which data structure I need for this problem, but I don't have any ideas. I also want to code it myself, beyond the exercise, to improve my Python skills.

About the exercise: 1. I have a text file with logs that looks like this:

M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in

There are 2 types of logs: M is Master and S is Slave. I need a data structure that can split each row and map its values to specific columns. For example, the M-1 columns will be:

M, 1, Datetime, Error Level, DeviceId, UserId, Message

but the S-1 columns will be:

S, 1, Datetime, Error Level, DeviceId, Action, Message

Note: as you can see, S,1 has Action but no UserId.

In the end, I need to be able to enter on the command line the columns I want written to stdout, plus a condition (e.g. Error Level > 50).

What I thought about was a dictionary, but that way I won't be able to support an unlimited number of versions (if it is possible, please explain how).

Thanks!

Answer 1

I would probably use a namedtuple class from the collections module to hold each parsed item, since it allows you to access each field by index and also by name. Moreover, new namedtuple classes can be created dynamically rather easily by passing a list of column names.

from collections import namedtuple

Master = namedtuple('Master', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'UserName', 'Message'])
Slave = namedtuple('Slave', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'Action', 'Message'])

n_cols = 7

logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in"""


master_list = []
slave_list = []

for r in logfileasstring.splitlines(False):
    if not r:
        continue
    values = [value.strip() for value in r.split(',', n_cols - 1)]
    if r[0] == 'M':
        master_list.append(Master(*values))
    else:
        slave_list.append(Slave(*values))


print(master_list[0][6]) # by index
print(master_list[0].Message) # by column name if name known in advance
column_name = 'Message'
print(getattr(master_list[0], column_name)) # by column name if name not known in advance
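
To illustrate the point about creating namedtuple classes dynamically, here is a minimal sketch that might address the question's concern about supporting any number of log versions. The SCHEMAS registry, the ROW_CLASSES mapping, the parse_line helper, and the column name Version are assumptions made for this illustration and are not part of the original answer; the column lists follow the question.

```python
from collections import namedtuple

# Hypothetical schema registry: (log type, version) -> column names.
# Supporting a new type or version only means adding one entry here.
SCHEMAS = {
    ("M", "1"): ["Type", "Version", "Datetime", "ErrorLevel",
                 "DeviceId", "UserId", "Message"],
    ("S", "1"): ["Type", "Version", "Datetime", "ErrorLevel",
                 "DeviceId", "Action", "Message"],
}

# Build one namedtuple class per schema entry (class names "M1", "S1", ...).
ROW_CLASSES = {
    key: namedtuple(f"{key[0]}{key[1]}", columns)
    for key, columns in SCHEMAS.items()
}


def parse_line(line):
    """Parse one log line into the namedtuple matching its type and version."""
    # Read only the first two fields to decide which schema applies.
    log_type, version, _ = (part.strip() for part in line.split(",", 2))
    row_class = ROW_CLASSES[(log_type, version)]
    # Split at most len(fields) - 1 times so commas inside Message survive.
    values = [v.strip() for v in line.split(",", len(row_class._fields) - 1)]
    return row_class(*values)


row = parse_line(
    "S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in"
)
print(row.Action)                  # access by name known in advance
print(getattr(row, "ErrorLevel"))  # access by a name held in a string
```

With this layout, a dictionary is only used to look up the right row class, so the number of versions is not limited by the code itself.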


Answer 2

Does this help?

logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in"""
listoflist = [[v.strip() for v in r.split(",", maxsplit=6)]
               for r in logfileasstring.splitlines(keepends=False) 
               if r]

grouped = {("M", "1"): [], ("S", "1"): []}
for row in listoflist:
    datasets_for = grouped[row[0], row[1]]
    datasets_for.append(row[2:])


# must be set by script
fields = [0, 1, 2]
for k in grouped:
    print(k, "::")
    for row in grouped[k]:
        print("  -", [row[f] for f in fields])

Answer 1
1 accepted 2019-08-24 13:53:15

Answer 2
0 2019-08-23 12:04:51
