Converting a text snippet to dictionary key/value pairs / JSON in Python

Question

I am trying to convert Prowler output in the following format to a dictionary, and then write the dictionary to a JSON file.

 0.1  Generating AWS IAM Credential Report... 

 1  Identity and Access Management **************************************** 

 1.1  Avoid the use of the root account (Scored).
       INFO! Root account last accessed (password key_1 key_2): 1970-01-01 00:00:00 N/A N/A 

 1.2  Ensure multi-factor authentication (MFA) is enabled for all IAM users that have a console password (Scored)
       WARNING! User XXXX has Password enabled but MFA disabled 
       WARNING! User XXXX has Password enabled but MFA disabled 

 1.3  Ensure credentials unused for 90 days or greater are disabled (Scored)
       WARNING! User "XXXX" has not logged in during the last 90 days  
       WARNING! User "XXXX" has not logged in during the last 90 days   
       OK!  User "XXXX" found with credentials used in the last 90 days
       OK!  User "XXXX" found with credentials used in the last 90 days

 1.4  Ensure access keys are rotated every 90 days or less (Scored)
       WARNING!  XXXXXXX has not rotated access key1 in over 90 days   

 1.5  Ensure IAM password policy requires at least one uppercase letter (Scored)
       OK!  Password Policy requires upper case

 1.6  Ensure IAM password policy require at least one lowercase letter (Scored)
       OK!  Password Policy requires lower case

In Python, I have this function to parse the prowler.txt file. It uses a regex to find each section header and use it as a dictionary key, then re-reads the file after a header match to add the lines underneath as the value for that key.

def create_master_report(ec2_info):
    prowler_file = 'reports/prowler.txt'
    findings = {}
    with open(prowler_file, 'r') as f:
        for line in f:
            if re.search('\s\d\.\d\d*\s\s\w', line):
                header = line.strip()
                findings.update({header: []})
    for i in findings:
        prowler_findings = []
        with open(prowler_file, 'r') as f:
            for index, line in enumerate(f, start=1):
                if line.strip() == i:
                    for line in enumerate(f, start=index+1):
                        if line != r'\\n':
                            #if re.search('WARNING!', line):
                            prowler_findings.append(str(line).strip())
                        if line == r'\\n':
                            break
        findings.update({i: prowler_findings})
    report_json['Prowler Results'].update(findings)
    with open(master_report, 'w') as outfile:
        json.dump(report_json, outfile, sort_keys=True)

However, I seem to be looping through the entire document and adding much more than anticipated as the value for each key. The end goal here is to parse the document starting at the line after the header, and then break once a blank line is detected. I think a while loop would work, but I can't seem to implement one that loops through each line and breaks on a blank line. In addition, I only want to pull in lines that contain 'WARNING!', but I have that commented out in order to test the basic functionality.

Can anyone provide any insight on how to do this?

Answer 1

You can do this in one loop. For each line, if it is a header, update the current key; otherwise append the line to the findings under the current key:

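import re

findings = {}
key = None  # header of the section currently being read
with open('reports/prowler.txt') as f:
    for line in f:
        if re.search(r'\s\d\.\d\d*\s\s\w', line):
            # New header: start an empty list of findings for it
            key = line.strip()
            findings[key] = []
        elif key is not None:
            findings[key].append(line)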
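
Building on the single-loop idea, here is a minimal sketch that also covers the rest of the question: it ends a section at the first blank line, keeps only lines containing 'WARNING!', and serialises the result with json.dumps. The function name parse_prowler, the only_warnings parameter, and the inline sample text are assumptions made for illustration; they are not part of Prowler or of the original answer.

```python
import json
import re

# Same header pattern as in the question, e.g. " 1.2  Ensure ..."
HEADER_RE = re.compile(r'\s\d\.\d\d*\s\s\w')

def parse_prowler(lines, only_warnings=True):
    """Map each check header to the finding lines directly below it.

    Hypothetical helper: `lines` is any iterable of strings,
    such as an open file or a list of lines.
    """
    findings = {}
    key = None
    for line in lines:
        if HEADER_RE.search(line):
            key = line.strip()
            findings[key] = []
        elif not line.strip():
            # A blank line ends the current section.
            key = None
        elif key is not None:
            if not only_warnings or 'WARNING!' in line:
                findings[key].append(line.strip())
    return findings

# Made-up sample in the format shown in the question.
sample = """\
 1.1  Avoid the use of the root account (Scored).
       INFO! Root account last accessed (password key_1 key_2): 1970-01-01 00:00:00 N/A N/A

 1.2  Ensure multi-factor authentication (MFA) is enabled for all IAM users that have a console password (Scored)
       WARNING! User XXXX has Password enabled but MFA disabled
       WARNING! User XXXX has Password enabled but MFA disabled
"""

report = parse_prowler(sample.splitlines())
print(json.dumps(report, indent=2, sort_keys=True))
```

To run it against the real report, pass the open file instead of the sample, for example parse_prowler(f) inside with open('reports/prowler.txt') as f. Headers with no matching findings stay in the result with an empty list.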

Answer 2

There is already a good answer. But note that:

if line == r'\\n':

does not work if you want to match (or not) a single newline: the raw string r'\\n' is three characters (two backslashes and an n), not a line break. It should simply be:

if line == '\n':
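
To see why the original comparison never matches, it helps to inspect the strings directly. The short check below is only an illustration, and its sample text is made up. It also shows a second problem in the question's inner loop: iterating over enumerate(f, ...) yields (index, line) tuples, so comparing the loop variable to any string is never equal.

```python
import io

# r'\\n' is a raw string of three characters: backslash, backslash, 'n'.
print(len(r'\\n'), len('\n'))

# Stand-in for an open file; iterating yields lines with their trailing '\n'.
f = io.StringIO("header\nfinding\n\nnext header\n")
lines = list(f)
print(lines)

# A blank line reads back as exactly '\n', never as r'\\n'.
blank_positions = [i for i, line in enumerate(lines) if line == '\n']
print(blank_positions)

# enumerate() yields tuples, so a tuple-vs-string comparison is never equal.
first = next(enumerate(lines, start=1))
print(first, first == '\n')
```

Testing line.strip() for emptiness is a slightly more forgiving alternative, since it also treats lines that contain only spaces as blank.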

Answer 1 (0, accepted, 2018-03-28 14:06:33)

Answer 2 (0, 2018-03-28 14:14:51)
