How to get all keys from deeply nested json using python?

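Problem statement: remove/rename special characters (#, $, backslash, etc.) in JSON keys and write the result back to the main JSON file.

Approach:

  1. First, I tried to get all the keys of the deeply nested JSON.
  2. Then check each key for special characters, rename/replace them, and write the result back to the JSON file.

Problem:

  1. My JSON is deeply nested, so the logic I wrote works for simple JSON but not for deeply nested JSON.

Code: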

import json


def getKeys(obj, prev_key=None):
    # Leaf value (anything that is not a dict): return the path that led here.
    # Lists count as leaves, so dicts nested inside lists are never visited.
    if not isinstance(obj, dict):
        return [prev_key]
    new_keys = []
    for k, v in obj.items():
        # Build a dotted path such as "detail.findings"
        new_key = "{}.{}".format(prev_key, k) if prev_key is not None else k
        new_keys.extend(getKeys(v, new_key))
    return new_keys

The code above works for the JSON below; it prints all the JSON keys:

json_string= '{"Relate:0/name": "securityhub-ec2-instance-managed-by-ssm-dc0c9f18","RelatedAWSResources:0/type": "AWS::Config::ConfigRule","aws/securityhub/ProductName": "Security Hub","aws/securityhub/CompanyName": "AWS"}'

Output:

['Relate:0/name', 'RelatedAWSResources:0/type', 'aws/securityhub/ProductName', 'aws/securityhub/CompanyName']

But it does not work for the following JSON:

{
  "version": "0",
  "id": "ffd8a756-9fe6-fa54-af4e-cf85fa3d2896",
  "detail-type": "Security Hub Findings - Imported",
  "source": "aws.securityhub",
  "account": "220307202362",
  "time": "2021-10-17T14:26:25Z",
  "region": "us-west-2",
  "resources": [
    "arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f"
  ],
  "detail": {
    "findings": [
      {
        "ProductArn": "arn:aws:securityhub:us-west-2::product/aws/securityhub",
        "Types": [
          "Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS"
        ],
        "Description": "This control checks for the CloudWatch metric filters using the following pattern { $.userIdentity.type = \"Root\" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != \"AwsServiceEvent\" } It checks that the log group name is configured for use with active multi-region CloudTrail, that there is at least one Event Selector for a Trail with IncludeManagementEvents set to true and ReadWriteType set to All, and that there is at least one active subscriber to an SNS topic associated with the alarm.",
        "Compliance": {
          "Status": "FAILED",
          "StatusReasons": [
            {
              "Description": "Multi region CloudTrail with the required configuration does not exist in the account",
              "ReasonCode": "CLOUDTRAIL_MULTI_REGION_NOT_PRESENT"
            }
          ],
          "RelatedRequirements": [
            "PCI DSS 7.2.1"
          ]
        },
        "ProductName": "Security Hub",
        "FirstObservedAt": "2021-10-17T14:26:18.383Z",
        "CreatedAt": "2021-10-17T14:26:18.383Z",
        "LastObservedAt": "2021-10-17T14:26:21.346Z",
        "CompanyName": "AWS",
        "FindingProviderFields": {
          "Types": [
            "Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS"
          ],
          "Severity": {
            "Normalized": 40,
            "Label": "MEDIUM",
            "Product": 40,
            "Original": "MEDIUM"
          }
        },
        "ProductFields": {
          "StandardsArn": "arn:aws:securityhub:::standards/pci-dss/v/3.2.1",
          "StandardsSubscriptionArn": "arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1",
          "ControlId": "PCI.CW.1",
          "RecommendationUrl": "https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation",
          "StandardsControlArn": "arn:aws:securityhub:us-west-2:220307202362:control/pci-dss/v/3.2.1/PCI.CW.1",
          "aws/securityhub/ProductName": "Security Hub",
          "aws/securityhub/CompanyName": "AWS",
          "aws/securityhub/annotation": "Multi region CloudTrail with the required configuration does not exist in the account",
          "Resources:0/Id": "arn:aws:iam::220307202362:root",
          "aws/securityhub/FindingId": "arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f"
        },
        "Remediation": {
          "Recommendation": {
            "Text": "For directions on how to fix this issue, consult the AWS Security Hub PCI DSS documentation.",
            "Url": "https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation"
          }
        },
        "SchemaVersion": "2018-10-08",
        "GeneratorId": "pci-dss/v/3.2.1/PCI.CW.1",
        "RecordState": "ACTIVE",
        "Title": "PCI.CW.1 A log metric filter and alarm should exist for usage of the \"root\" user",
        "Workflow": {
          "Status": "NEW"
        },
        "Severity": {
          "Normalized": 40,
          "Label": "MEDIUM",
          "Product": 40,
          "Original": "MEDIUM"
        },
        "UpdatedAt": "2021-10-17T14:26:18.383Z",
        "WorkflowState": "NEW",
        "AwsAccountId": "220307202362",
        "Region": "us-west-2",
        "Id": "arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f",
        "Resources": [
          {
            "Partition": "aws",
            "Type": "AwsAccount",
            "Region": "us-west-2",
            "Id": "AWS::::Account:220307202362"
          }
        ]
      }
    ]
  }
} 
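The reason the function stops short on this document is that it treats every non-dict value as a leaf, so the objects inside the "findings" and "Resources" lists are never visited. A minimal sketch of a list-aware variant follows; the name `get_all_keys` and the `[i]` notation for list positions are choices made here for illustration, and unlike `getKeys` it records every key, not only the leaf paths.

```python
import json


def get_all_keys(obj, prefix=None):
    """Return dotted key paths for every dict key, descending into lists too."""
    keys = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            path = k if prefix is None else f"{prefix}.{k}"
            keys.append(path)
            keys.extend(get_all_keys(v, path))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            # List positions are written as [i]; this notation is an assumption
            path = f"[{i}]" if prefix is None else f"{prefix}[{i}]"
            keys.extend(get_all_keys(item, path))
    return keys


# A cut-down document with the same shape as the one above
sample = json.loads(
    '{"detail": {"findings": [{"Resources:0/Id": "x", "Severity": {"Label": "MEDIUM"}}]}}'
)
all_keys = get_all_keys(sample)
print(all_keys)
```

Scalars fall through both branches and contribute nothing, so only dict keys end up in the result.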

Function to strip punctuation:

import string
from typing import Optional, Iterable, Union


delete_dict = {sp_character: '' for sp_character in string.punctuation}

PUNCT_TABLE = str.maketrans(delete_dict)


def strip_punctuation(s: str,
                      exclude_chars: Optional[Union[str, Iterable]] = None) -> str:
    """
    Remove punctuation from a string (spaces are left untouched).

    If `exclude_chars` is passed, certain characters will not be removed
    from the string.

    """
    punct_table = PUNCT_TABLE.copy()
    if exclude_chars:
        for char in exclude_chars:
            punct_table.pop(ord(char), None)

    # Next, remove the desired punctuation from the string
    return s.translate(punct_table) 

Usage:

cleaned_keys = json.loads(json_string)  # the JSON data, loaded as a dict
for key, value in cleaned_keys.items():
    actual_key = strip_punctuation(key)

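Problem statement: remove/rename special characters (#, $, backslash, etc.) in JSON keys and write the result back to the main JSON file.

If I understand you correctly, you don't need to write your own function (for example a recursive one) to iterate over the JSON data.

The good news is that this can be done while the JSON string is being loaded into a Python object, by using the object_pairs_hook parameter of json.loads. When you supply a callable for this parameter, the decoder calls it for every JSON object it decodes (nested ones included), passing a list of tuples in which each tuple is a key-value pair from that object. The value the callable returns is used in place of the decoded object, so all you need to do is replace the keys in the pairs you receive.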
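To make the mechanism concrete, here is a small sketch that only records what the hook receives. The hook name `record_pairs` and the tiny input document are made up for illustration.

```python
import json

calls = []


def record_pairs(pairs):
    # `pairs` holds the (key, value) tuples of ONE JSON object
    calls.append(list(pairs))
    # Whatever is returned here replaces that object in the final result
    return dict(pairs)


doc = '{"a": 1, "b": {"c/d": 2}, "e": [{"f": 3}]}'
result = json.loads(doc, object_pairs_hook=record_pairs)

for pairs in calls:
    print(pairs)
```

The inner objects are reported first and the top-level object last, because an object is only handed to the hook once all of its values have been decoded. Objects inside lists are covered as well.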

Here is a somewhat contrived example that wraps every JSON key (nested or not) with exclamation marks (!!) on both sides:

import json


json_string = r"""
{
  "version": "0",
  "id": "ffd8a756-9fe6-fa54-af4e-cf85fa3d2896",
  "detail-type": "Security Hub Findings - Imported",
  "source": "aws.securityhub",
  "account": "220307202362",
  "time": "2021-10-17T14:26:25Z",
  "region": "us-west-2",
  "resources": [
    "arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f"
  ],
  "detail": {
    "findings": [
      {
        "ProductArn": "arn:aws:securityhub:us-west-2::product/aws/securityhub",
        "Types": [
          "Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS"
        ],
        "Description": "This control checks for the CloudWatch metric filters using the following pattern { $.userIdentity.type = \"Root\" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != \"AwsServiceEvent\" } It checks that the log group name is configured for use with active multi-region CloudTrail, that there is at least one Event Selector for a Trail with IncludeManagementEvents set to true and ReadWriteType set to All, and that there is at least one active subscriber to an SNS topic associated with the alarm.",
        "Compliance": {
          "Status": "FAILED",
          "StatusReasons": [
            {
              "Description": "Multi region CloudTrail with the required configuration does not exist in the account",
              "ReasonCode": "CLOUDTRAIL_MULTI_REGION_NOT_PRESENT"
            }
          ],
          "RelatedRequirements": [
            "PCI DSS 7.2.1"
          ]
        },
        "ProductName": "Security Hub",
        "FirstObservedAt": "2021-10-17T14:26:18.383Z",
        "CreatedAt": "2021-10-17T14:26:18.383Z",
        "LastObservedAt": "2021-10-17T14:26:21.346Z",
        "CompanyName": "AWS",
        "FindingProviderFields": {
          "Types": [
            "Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS"
          ],
          "Severity": {
            "Normalized": 40,
            "Label": "MEDIUM",
            "Product": 40,
            "Original": "MEDIUM"
          }
        },
        "ProductFields": {
          "StandardsArn": "arn:aws:securityhub:::standards/pci-dss/v/3.2.1",
          "StandardsSubscriptionArn": "arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1",
          "ControlId": "PCI.CW.1",
          "RecommendationUrl": "https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation",
          "StandardsControlArn": "arn:aws:securityhub:us-west-2:220307202362:control/pci-dss/v/3.2.1/PCI.CW.1",
          "aws/securityhub/ProductName": "Security Hub",
          "aws/securityhub/CompanyName": "AWS",
          "aws/securityhub/annotation": "Multi region CloudTrail with the required configuration does not exist in the account",
          "Resources:0/Id": "arn:aws:iam::220307202362:root",
          "aws/securityhub/FindingId": "arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f"
        },
        "Remediation": {
          "Recommendation": {
            "Text": "For directions on how to fix this issue, consult the AWS Security Hub PCI DSS documentation.",
            "Url": "https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation"
          }
        },
        "SchemaVersion": "2018-10-08",
        "GeneratorId": "pci-dss/v/3.2.1/PCI.CW.1",
        "RecordState": "ACTIVE",
        "Title": "PCI.CW.1 A log metric filter and alarm should exist for usage of the \"root\" user",
        "Workflow": {
          "Status": "NEW"
        },
        "Severity": {
          "Normalized": 40,
          "Label": "MEDIUM",
          "Product": 40,
          "Original": "MEDIUM"
        },
        "UpdatedAt": "2021-10-17T14:26:18.383Z",
        "WorkflowState": "NEW",
        "AwsAccountId": "220307202362",
        "Region": "us-west-2",
        "Id": "arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f",
        "Resources": [
          {
            "Partition": "aws",
            "Type": "AwsAccount",
            "Region": "us-west-2",
            "Id": "AWS::::Account:220307202362"
          }
        ]
      }
    ]
  }
}
"""


def clean_keys(o):
    return {f'!!{k}!!': v for k, v in o}


r = json.loads(json_string, object_pairs_hook=clean_keys)
print(r)

Resulting object:

{'!!version!!': '0', '!!id!!': 'ffd8a756-9fe6-fa54-af4e-cf85fa3d2896', '!!detail-type!!': 'Security Hub Findings - Imported', '!!source!!': 'aws.securityhub', '!!account!!': '220307202362', '!!time!!': '2021-10-17T14:26:25Z', '!!region!!': 'us-west-2', '!!resources!!': ['arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f'], '!!detail!!': {'!!findings!!': [{'!!ProductArn!!': 'arn:aws:securityhub:us-west-2::product/aws/securityhub', '!!Types!!': ['Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS'], '!!Description!!': 'This control checks for the CloudWatch metric filters using the following pattern { $.userIdentity.type = "Root" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != "AwsServiceEvent" } It checks that the log group name is configured for use with active multi-region CloudTrail, that there is at least one Event Selector for a Trail with IncludeManagementEvents set to true and ReadWriteType set to All, and that there is at least one active subscriber to an SNS topic associated with the alarm.', '!!Compliance!!': {'!!Status!!': 'FAILED', '!!StatusReasons!!': [{'!!Description!!': 'Multi region CloudTrail with the required configuration does not exist in the account', '!!ReasonCode!!': 'CLOUDTRAIL_MULTI_REGION_NOT_PRESENT'}], '!!RelatedRequirements!!': ['PCI DSS 7.2.1']}, '!!ProductName!!': 'Security Hub', '!!FirstObservedAt!!': '2021-10-17T14:26:18.383Z', '!!CreatedAt!!': '2021-10-17T14:26:18.383Z', '!!LastObservedAt!!': '2021-10-17T14:26:21.346Z', '!!CompanyName!!': 'AWS', '!!FindingProviderFields!!': {'!!Types!!': ['Software and Configuration Checks/Industry and Regulatory Standards/PCI-DSS'], '!!Severity!!': {'!!Normalized!!': 40, '!!Label!!': 'MEDIUM', '!!Product!!': 40, '!!Original!!': 'MEDIUM'}}, '!!ProductFields!!': {'!!StandardsArn!!': 'arn:aws:securityhub:::standards/pci-dss/v/3.2.1', 
'!!StandardsSubscriptionArn!!': 'arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1', '!!ControlId!!': 'PCI.CW.1', '!!RecommendationUrl!!': 'https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation', '!!StandardsControlArn!!': 'arn:aws:securityhub:us-west-2:220307202362:control/pci-dss/v/3.2.1/PCI.CW.1', '!!aws/securityhub/ProductName!!': 'Security Hub', '!!aws/securityhub/CompanyName!!': 'AWS', '!!aws/securityhub/annotation!!': 'Multi region CloudTrail with the required configuration does not exist in the account', '!!Resources:0/Id!!': 'arn:aws:iam::220307202362:root', '!!aws/securityhub/FindingId!!': 'arn:aws:securityhub:us-west-2::product/aws/securityhub/arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f'}, '!!Remediation!!': {'!!Recommendation!!': {'!!Text!!': 'For directions on how to fix this issue, consult the AWS Security Hub PCI DSS documentation.', '!!Url!!': 'https://docs.aws.amazon.com/console/securityhub/PCI.CW.1/remediation'}}, '!!SchemaVersion!!': '2018-10-08', '!!GeneratorId!!': 'pci-dss/v/3.2.1/PCI.CW.1', '!!RecordState!!': 'ACTIVE', '!!Title!!': 'PCI.CW.1 A log metric filter and alarm should exist for usage of the "root" user', '!!Workflow!!': {'!!Status!!': 'NEW'}, '!!Severity!!': {'!!Normalized!!': 40, '!!Label!!': 'MEDIUM', '!!Product!!': 40, '!!Original!!': 'MEDIUM'}, '!!UpdatedAt!!': '2021-10-17T14:26:18.383Z', '!!WorkflowState!!': 'NEW', '!!AwsAccountId!!': '220307202362', '!!Region!!': 'us-west-2', '!!Id!!': 'arn:aws:securityhub:us-west-2:220307202362:subscription/pci-dss/v/3.2.1/PCI.CW.1/finding/b5a325b7-eab1-439f-b14d-1dc52c3a423f', '!!Resources!!': [{'!!Partition!!': 'aws', '!!Type!!': 'AwsAccount', '!!Region!!': 'us-west-2', '!!Id!!': 'AWS::::Account:220307202362'}]}]}}
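If the goal is only to list the keys, as the question title asks, the same parameter can be used without renaming anything. The following is a sketch under that assumption; `collect_keys` and `seen_keys` are hypothetical names and the input is a cut-down version of the document above.

```python
import json

seen_keys = set()


def collect_keys(pairs):
    # Record each key of this object, then build the dict unchanged
    seen_keys.update(key for key, _ in pairs)
    return dict(pairs)


doc = (
    '{"detail": {"findings": [{"Resources:0/Id": "x", '
    '"aws/securityhub/ProductName": "Security Hub"}]}}'
)
json.loads(doc, object_pairs_hook=collect_keys)
print(sorted(seen_keys))
```

A set is used here, so a key that occurs in several objects (such as "Severity" in the full document) is reported once and its position in the hierarchy is not kept.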

Edit: using the strip_punctuation function provided in the question, the clean_keys function would be defined as follows:

def clean_keys(o):
    return {strip_punctuation(k): v for k, v in o}
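One thing to watch with the dict comprehension is that two different keys can become identical once punctuation is removed, in which case the later one silently overwrites the earlier one. Below is a self-contained sketch that raises an error instead. It uses a simplified stand-in for strip_punctuation (no exclude_chars support), and the collision policy is an assumption, not something the question specifies.

```python
import json
import string

# Simplified stand-in for the question's strip_punctuation (assumption:
# every character in string.punctuation is dropped, nothing is excluded)
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_keys(pairs):
    cleaned = {}
    for key, value in pairs:
        new_key = key.translate(PUNCT_TABLE)
        if new_key in cleaned:
            # e.g. "a/b" and "a:b" would both become "ab"
            raise ValueError(f"key collision after cleaning: {key!r} -> {new_key!r}")
        cleaned[new_key] = value
    return cleaned


raw = (
    '{"detail-type": "x", "ProductFields": '
    '{"aws/securityhub/ProductName": "Security Hub", "Resources:0/Id": "y"}}'
)
cleaned_doc = json.loads(raw, object_pairs_hook=clean_keys)
cleaned_json = json.dumps(cleaned_doc, indent=2)
print(cleaned_json)
```

Writing the result back to the original file, as the problem statement asks, would then be a matter of passing `cleaned_doc` to `json.dump` with the file opened for writing.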
