
How to find the frequency of IP addresses that belong to the same subnet using regex?

I have an enormous JSON file with entries that contain IPv4 addresses. Assume a /24 subnet mask. Sample:

const json = [
  { "ip": "154.16.58.206"},
  { "ip": "154.16.58.218"},
  { "ip": "154.16.46.180"},
  { "ip": "154.16.60.181"},
  { "ip": "154.16.46.167"},
  { "ip": "154.16.58.131"},
  { "ip": "154.16.60.173"},
  { "ip": "154.16.62.147"},
  { "ip": "154.16.62.175"},
  { "ip": "154.16.50.216"},
  { "ip": "154.16.58.141"}
]

const str = JSON.stringify(json)

I want some sort of mapping that shows which groups exist and how many IPs are in each group, like:

{
  "154.16.58.0" => 4
  "154.16.46.0" => 2
  "154.16.60.0" => 2
  "154.16.62.0" => 2
  "154.16.50.0" => 1
}

I might be able to come up with some naive JS solution, but because there is so much data, I need a performant regex solution. The only thing I can come up with is something like /(\d+\.\d+\.\d+)\.\d+/g (with the dots escaped, since an unescaped . matches any character).

Assuming file.json contains:

[
  { "ip": "154.16.58.206"},
  { "ip": "154.16.58.218"},
  { "ip": "154.16.46.180"},
  { "ip": "154.16.60.181"},
  { "ip": "154.16.46.167"},
  { "ip": "154.16.58.131"},
  { "ip": "154.16.60.173"},
  { "ip": "154.16.62.147"},
  { "ip": "154.16.62.175"},
  { "ip": "154.16.50.216"},
  { "ip": "154.16.58.141"}
]

Then you could try the following Python code:

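#!/usr/bin/python

import re, json
from collections import Counter

with open('file.json') as f:
    net = []
    for i in json.load(f):
        # Replace the last octet with 0 to get the /24 network address
        ip = re.sub(r'\d+$', '0', i["ip"])
        net.append(ip)
# Count how many addresses fall into each network
c = Counter(net)
print(json.dumps(c, indent=2))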

Output:

{
  "154.16.46.0": 2, 
  "154.16.50.0": 1, 
  "154.16.62.0": 2, 
  "154.16.60.0": 2, 
  "154.16.58.0": 4
}
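
Since the question is about JavaScript, here is a rough port of the same idea (replace the last octet with 0, then count) as a minimal sketch. The names entries, countSubnets and subnetCounts are made up for illustration, and the data is copied from the question rather than read from file.json.

```javascript
// Sample data copied from the question; in practice this would come from the parsed JSON file.
const entries = [
  { ip: "154.16.58.206" }, { ip: "154.16.58.218" }, { ip: "154.16.46.180" },
  { ip: "154.16.60.181" }, { ip: "154.16.46.167" }, { ip: "154.16.58.131" },
  { ip: "154.16.60.173" }, { ip: "154.16.62.147" }, { ip: "154.16.62.175" },
  { ip: "154.16.50.216" }, { ip: "154.16.58.141" },
];

// Hypothetical helper: map each address to its /24 network and count occurrences.
function countSubnets(list) {
  const counts = new Map();
  for (const { ip } of list) {
    // Replace the trailing octet with 0, e.g. "154.16.58.206" -> "154.16.58.0".
    const net = ip.replace(/\d+$/, "0");
    counts.set(net, (counts.get(net) || 0) + 1);
  }
  return counts;
}

const subnetCounts = countSubnets(entries);
console.log(subnetCounts);
```

Unlike running one regex over the stringified JSON, this works on the parsed objects, so it only looks at the ip field of each entry.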

I managed to find a solution that worked for me:

const json = [
  { "ip": "154.16.58.206" },
  { "ip": "154.16.58.218" },
  { "ip": "154.16.46.180" },
  { "ip": "154.16.60.181" },
  { "ip": "154.16.46.167" },
  { "ip": "154.16.58.131" },
  { "ip": "154.16.60.173" },
  { "ip": "154.16.62.147" },
  { "ip": "154.16.62.175" },
  { "ip": "154.16.50.216" },
  { "ip": "154.16.58.141" }
]
const str = JSON.stringify(json)

const ipMap = new Map()

const regxp = /(\d+\.\d+\.\d+)\.\d+/g
const matches = str.matchAll(regxp);

for (const match of matches) {
  const [, ip] = match
  const val = ipMap.get(ip)
  ipMap.set(ip, val ? val + 1 : 1)
}
console.log(ipMap)

Output:

(5) {
  154.16.58 => 4
  154.16.46 => 2
  154.16.60 => 2
  154.16.62 => 2
  154.16.50 => 1
}

You can try it on JSitor: https://jsitor.com/-bUGo7lZC
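
The keys above lack the trailing .0 shown in the question's desired output. A possible variation, as a sketch only: append ".0" to each captured prefix and sort the groups by count. The names rawText, tallyPrefixes and sortedTally are hypothetical, and the sample is a shortened version of the question's data. Note that this kind of regex picks up any dotted quad in the string, not only values of the ip field.

```javascript
// A shortened sample in the same shape as the question's data (assumed for illustration).
const rawText = JSON.stringify([
  { ip: "154.16.58.206" },
  { ip: "154.16.58.218" },
  { ip: "154.16.46.180" },
  { ip: "154.16.50.216" },
]);

// Hypothetical helper: count /24 prefixes found in a string, keyed as "a.b.c.0".
function tallyPrefixes(text) {
  const tally = new Map();
  // \b keeps a match from starting or ending in the middle of a longer digit run.
  const pattern = /\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b/g;
  for (const [, prefix] of text.matchAll(pattern)) {
    const key = `${prefix}.0`;
    tally.set(key, (tally.get(key) || 0) + 1);
  }
  return tally;
}

// Rebuild the Map with the largest group first (a Map iterates in insertion order).
const sortedTally = new Map(
  [...tallyPrefixes(rawText)].sort((a, b) => b[1] - a[1])
);
console.log(sortedTally);
```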
