简体   繁体   English

从 python 中的列表项中解析特定字符串

[英]parsing specific strings from list items in python

I have the following code in python, which contains log messages to debug SSH.我在 python 中有以下代码,其中包含调试 SSH 的日志消息。

for log_item in ssh_log:
   print(log_item.rstrip())

#will show ...
2022-04-06 01:55:15,085 10.x Remote version/idstring: SSH-2.0-ConfD-4.3.11.4
2022-04-06 01:55:15,085 20.x Connected (version 2.0, client ConfD-4.3.11.4)
2022-04-06 01:55:15,161 10.x kex algos:['diffie-hellman-group14-sha1'] server key:['ssh-rsa']
...

What is the approach to get the values in bold assign my variables, maybe some regex as part of the for loop or something else to get the following:获取粗体值的方法是什么分配我的变量,可能是一些正则表达式作为 for 循环的一部分或其他东西来获得以下内容:

idstring = SSH-2.0-ConfD-4.3.11.4
kex_algos = ['diffie-hellman-group14-sha1']
key_type = ['ssh-rsa']

If all the data is in the same format as the data given here, You can use the following regex:如果所有数据的格式都与此处给出的数据相同,则可以使用以下正则表达式:

import re
a = """
2022-04-06 01:55:15,085 10.x Remote version/idstring: SSH-2.0-ConfD-4.3.11.4
2022-04-06 01:55:15,085 20.x Connected (version 2.0, client ConfD-4.3.11.4)
2022-04-06 01:55:15,161 10.x kex algos:['diffie-hellman-group14-sha1'] server key:['ssh-rsa']"""

idstring = re.findall("idstring: (.*)", a)[0] # Remove zero to get a list if 
                                              # multiple values are present
print(idstring)
kex_algos = re.findall("algos:\['(.*)'\] ", a)
print(kex_algos)
key_type = re.findall("key:\['(.*)'\]", a)
print(key_type)

Output: Output:

'SSH-2.0-ConfD-4.3.11.4'
['diffie-hellman-group14-sha1']
['ssh-rsa']

Solution without regex.没有正则表达式的解决方案。 See comments inline below.请参阅下面的内联评论。

for log_item in ssh_log:
    line = log_item.rstrip()
    if 'idstring' in line:
        print('idstring = ',line.split(':')[-1]) #Pick last value after ':'
    if 'kex algos' in line:
        print('kex algos = ', line[line.find('['):line.find(']')+1]) #find between first set of square brackets.
    if 'key:' in line:
        key = line.split('key:')[1] #Get values after 'key:'
        print('key_type = ', key)

You can update prints to variable assignments if this is what you required.如果这是您的需要,您可以将打印更新为变量分配。

You can also use ttp template to parse your data if your data has similar structure.如果您的数据具有相似的结构,您还可以使用 ttp 模板来解析您的数据。

from ttp import ttp
import json

with open("log.txt") as f:
    data_to_parse = f.read()

ttp_template = """
{{ignore}} {{ignore}} {{ignore}} {{ignore}} version/idstring: {{version_id_string}}
{{ignore}} {{ignore}} {{ignore}} {{ignore}} algos:{{key_algos}} server key:{{key_type}}
"""

parser = ttp(data=data_to_parse, template=ttp_template)
parser.parse()

# print result in JSON format
results = parser.result(format='json')[0]
# print(results)

result = json.loads(results)

# print(result)

for i in result:
    print(i["key_algos"])
    print(i["key_type"])
    print(i["version_id_string"])

The output is: output 是:

['diffie-hellman-group14-sha1']
['ssh-rsa']
SSH-2.0-ConfD-4.3.11.4

With the 3 lines of sample data from the original question in a file, one could take this approach:使用文件中原始问题的 3 行示例数据,可以采用以下方法:

import re

with open('ssh.log') as sshlog:
    for line in map(str.strip, sshlog):
        _, _, _, kw, *rem = line.split()
        match kw:
            case 'Remote':
                print(f'ID string = {rem[-1]}')
            case 'kex':
                m = re.findall("(?<=\[').+?(?='\])", line)
                print(f'algos = {m[0]}')
                print(f'type = {m[1]}')
            case _:
                pass

The assumption here is that only lines with either of the keywords 'Remote' or 'kex' are of interest.这里的假设是只有带有关键字“Remote”或“kex”的行才有意义。

Output: Output:

ID string = SSH-2.0-ConfD-4.3.11.4
algos = diffie-hellman-group14-sha1
type = ssh-rsa

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM