Parsing field from large JSON file where another field meets a condition, using Python

I've never worked with JSON files before and I cannot find a way to get certain fields out of the file using Python. The file in question is the Azure Service Tags file, which lists all the Azure services by region and lists out the IP addresses used. The latest file can be found at https://www.microsoft.com/en-us/download/details.aspx?id=56519, but a sample is below (the full file is over 70,000 lines).

{
  "changeNumber": 135,
  "cloud": "Public",
  "values": [
    {
      "name": "ActionGroup",
      "id": "ActionGroup",
      "properties": {
        "changeNumber": 7,
        "region": "",
        "regionId": 0,
        "platform": "Azure",
        "systemService": "ActionGroup",
        "addressPrefixes": [
          "13.66.60.119/32",
          "13.66.143.220/30",
          "13.66.202.14/32",
          "13.66.248.225/32",
          "13.66.249.211/32",
          "13.67.10.124/30",
          "13.69.109.132/30"
        ],
        "networkFeatures": [
          "API",
          "NSG",
          "UDR",
          "FW"
        ]
      }
    },
    {
      "name": "ApplicationInsightsAvailability",
      "id": "ApplicationInsightsAvailability",
      "properties": {
        "changeNumber": 2,
        "region": "",
        "regionId": 0,
        "platform": "Azure1",
        "systemService": "ApplicationInsightsAvailability",
        "addressPrefixes": [
          "13.86.97.224/27",
          "13.86.98.0/27",
          "13.86.98.48/28",
          "13.86.98.64/28",
          "20.37.156.64/27",
          "20.37.192.80/29"
        ],
        "networkFeatures": [
          "API",
          "NSG",
          "UDR",
          "FW"
        ]
      }
    }
  ]
}

What I am trying to do is "if the regionId is between 27 and 30, print the addressPrefixes". I've tried doing this with jsonpath_rw, jsonpath_rw_ext, pandas, and probably some other ways (I've been looking at this sporadically for a while).

This code will print a list of unique region IDs (using jsonpath_rw_ext):

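import json
import jsonpath_rw_ext as jp  # assumed import; the snippet refers to the module as jp

with open(r'C:\Temp\ServiceTags_Public_20210215.json') as f:
    data = json.load(f)

# Collect each regionId once, in the order it first appears
listRegionId = []
for regionId in jp.match("$..properties.regionId", data):
    if regionId not in listRegionId:
        listRegionId.append(regionId)
print(listRegionId)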

If I change the if statement to if regionId == 30: how can I then reference the addressPrefixes field?

Many thanks

Edit to add: I'm a network engineer who can do a bit of Python, so my code isn't the best.

I hope this is what you want:

import json

with open('test.json', 'r') as f:
    data = json.load(f)

# Each item in data['values'] is one service tag
for element in data['values']:
    region_id = element['properties']['regionId']
    # Strict comparison: matches 28 and 29 only; use <= to include 27 and 30
    if 27 < region_id < 30:
        print('RegionId: ' + str(region_id))
        print(element['properties']['addressPrefixes'])

I tried it on the JSON data you attached. I can't display the entire output, so here's a sample of it:

RegionId: 29
['2603:1020:305:402::178/125']
RegionId: 28
['2603:1020:605:402::178/125']
RegionId: 29
['13.87.122.84/31', '13.87.123.144/28', '2603:1020:305:402::140/124']
RegionId: 28
['51.137.136.0/32', '51.140.210.84/31', '51.140.211.176/28', '2603:1020:605:402::140/124']
RegionId: 29
['2603:1020:305:402::a0/123']
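
If "between 27 and 30" is meant to include both ends, a variation along these lines might help. This is only a sketch: the function name prefixes_by_region is made up for illustration, the bounds are inclusive by assumption, and the sample data is invented (documentation address ranges, not real Azure prefixes) so the snippet runs without the real file.

```python
from collections import defaultdict


def prefixes_by_region(data, low, high):
    """Group addressPrefixes by regionId for low <= regionId <= high (inclusive)."""
    grouped = defaultdict(list)
    for entry in data["values"]:
        props = entry["properties"]
        if low <= props["regionId"] <= high:
            grouped[props["regionId"]].extend(props["addressPrefixes"])
    return dict(grouped)


# Invented sample in the same shape as the Service Tags file.
# The prefixes are documentation ranges, not real Azure addresses.
sample = {
    "values": [
        {"name": "TagA", "properties": {"regionId": 0, "addressPrefixes": ["192.0.2.0/24"]}},
        {"name": "TagB", "properties": {"regionId": 27, "addressPrefixes": ["198.51.100.0/24"]}},
        {"name": "TagC", "properties": {"regionId": 30, "addressPrefixes": ["203.0.113.0/24", "2001:db8::/32"]}},
        {"name": "TagD", "properties": {"regionId": 27, "addressPrefixes": ["198.51.100.0/25"]}},
    ]
}

result = prefixes_by_region(sample, 27, 30)
for region_id, prefixes in sorted(result.items()):
    print(region_id, prefixes)
```

With the real file, you would pass the dictionary returned by json.load(f) in place of sample.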

You can reach deeper levels of a nested Python dictionary by chaining square brackets, one pair per level, for example entry['properties']['regionId'].

This should do the trick:

import json

with open('ServiceTags_Public_20210405.json') as f:
    data = json.load(f)

# Strict comparison: prints prefixes for regionId 28 and 29
for entry in data["values"]:
    if 27 < entry['properties']["regionId"] < 30:
        print(entry['properties']["addressPrefixes"])
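
To make the chained lookup more concrete, here is a small sketch on a single invented entry (the tag name and prefix are placeholders, not taken from the real file). It shows that chaining brackets is just two lookups in a row, and that dict.get can be used if a key might be absent.

```python
# One invented entry in the same shape as an item of data["values"]
entry = {
    "name": "ExampleTag",
    "properties": {"regionId": 28, "addressPrefixes": ["192.0.2.0/24"]},
}

# Chained form: look up "properties", then "addressPrefixes" inside it
prefixes = entry["properties"]["addressPrefixes"]

# Equivalent two-step form
props = entry["properties"]
same_prefixes = props["addressPrefixes"]

# dict.get returns a default instead of raising KeyError for a missing key
missing = entry.get("properties", {}).get("networkFeatures", [])

print(prefixes, same_prefixes, missing)
```

Plain brackets raise KeyError on a missing key, which is usually what you want for a file with a fixed layout; the .get form is only worth it if some entries may lack a field.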
