I have an enormous JSON file with entries that contain IPv4 addresses. Assume a /24 subnet mask. Sample:
json = [
{ "ip": "154.16.58.206"},
{ "ip": "154.16.58.218"},
{ "ip": "154.16.46.180"},
{ "ip": "154.16.60.181"},
{ "ip": "154.16.46.167"},
{ "ip": "154.16.58.131"},
{ "ip": "154.16.60.173"},
{ "ip": "154.16.62.147"},
{ "ip": "154.16.62.175"},
{ "ip": "154.16.50.216"},
{ "ip": "154.16.58.141"}
]
const str = JSON.stringify(json)
I want some sort of mapping that shows which groups exist and how many IPs are in each group, like:
{
"154.16.58.0" => 4
"154.16.46.0" => 2
"154.16.60.0" => 2
"154.16.62.0" => 2
"154.16.50.0" => 1
}
I might be able to come up with some greedy JS solution, but because there is so much data, I need a performant regex solution. The only thing I can come up with is something like /(\d+\.\d+\.\d+)\.\d+/g
Assuming file.json contains:
[
{ "ip": "154.16.58.206"},
{ "ip": "154.16.58.218"},
{ "ip": "154.16.46.180"},
{ "ip": "154.16.60.181"},
{ "ip": "154.16.46.167"},
{ "ip": "154.16.58.131"},
{ "ip": "154.16.60.173"},
{ "ip": "154.16.62.147"},
{ "ip": "154.16.62.175"},
{ "ip": "154.16.50.216"},
{ "ip": "154.16.58.141"}
]
Then you could try the following Python code:
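#!/usr/bin/python
import re, json
from collections import Counter

with open('file.json') as f:
    net = []
    for i in json.load(f):
        # Replace the last octet with 0 to get the /24 network address.
        ip = re.sub(r'\d+$', '0', i["ip"])
        net.append(ip)

# Count how many addresses fall into each network.
c = Counter(net)
print(json.dumps(c, indent=2))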
Output:
{
"154.16.46.0": 2,
"154.16.50.0": 1,
"154.16.62.0": 2,
"154.16.60.0": 2,
"154.16.58.0": 4
}
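Since the question asks about JavaScript, the same idea (replace the last octet with 0, then count) could be sketched there as well. This is only a rough translation, assuming the data has already been parsed into an array; the names `toSlash24` and `countBySubnet` are made up for illustration and are not part of the answer above.

```javascript
// Sample data from the question; in practice this would come from
// JSON.parse on the file contents.
const entries = [
  { ip: "154.16.58.206" }, { ip: "154.16.58.218" }, { ip: "154.16.46.180" },
  { ip: "154.16.60.181" }, { ip: "154.16.46.167" }, { ip: "154.16.58.131" },
  { ip: "154.16.60.173" }, { ip: "154.16.62.147" }, { ip: "154.16.62.175" },
  { ip: "154.16.50.216" }, { ip: "154.16.58.141" },
];

// Hypothetical helper: replace the last octet with 0 to get the /24 network.
function toSlash24(ip) {
  return ip.replace(/\d+$/, "0");
}

// Hypothetical helper: count entries per /24 network.
// A Map keeps keys in first-seen order.
function countBySubnet(list) {
  const counts = new Map();
  for (const { ip } of list) {
    const net = toSlash24(ip);
    counts.set(net, (counts.get(net) || 0) + 1);
  }
  return counts;
}

console.log(countBySubnet(entries));
```

Working on the parsed objects avoids matching dotted numbers that are not IP addresses, at the cost of parsing the whole file first.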
I managed to find a solution that worked for me:
const json = [
{ "ip": "154.16.58.206" },
{ "ip": "154.16.58.218" },
{ "ip": "154.16.46.180" },
{ "ip": "154.16.60.181" },
{ "ip": "154.16.46.167" },
{ "ip": "154.16.58.131" },
{ "ip": "154.16.60.173" },
{ "ip": "154.16.62.147" },
{ "ip": "154.16.62.175" },
{ "ip": "154.16.50.216" },
{ "ip": "154.16.58.141" }
]
const str = JSON.stringify(json)
const ipMap = new Map()
const regxp = /(\d+\.\d+\.\d+)\.\d+/g
const matches = str.matchAll(regxp);
for (const match of matches) {
  const [, ip] = match
  const val = ipMap.get(ip)
  ipMap.set(ip, val ? val + 1 : 1)
}
console.log(ipMap)
Output:
(5) {
154.16.58 => 4
154.16.46 => 2
154.16.60 => 2
154.16.62 => 2
154.16.50 => 1
}
You can try it on JSitor: https://jsitor.com/-bUGo7lZC
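Note that the keys in the output above are the three-octet prefixes (154.16.58), not the 154.16.58.0 form from the question. A possible variation, sketched here under some assumptions, appends ".0" to each key and ties the regex to the quoted "ip" value so that other dotted numbers in the text are not counted. The function name `countSubnetsInText` and the `version` field in the sample are hypothetical, added only to illustrate the point.

```javascript
// Hypothetical helper: count /24 networks directly from JSON text.
// Assumes every address appears as the string value of an "ip" key.
function countSubnetsInText(text) {
  const counts = new Map();
  // Capture the first three octets of each quoted "ip" value.
  // \s* tolerates pretty-printed JSON with spaces around the colon.
  const pattern = /"ip"\s*:\s*"(\d+\.\d+\.\d+)\.\d+"/g;
  for (const [, prefix] of text.matchAll(pattern)) {
    const net = `${prefix}.0`;
    counts.set(net, (counts.get(net) || 0) + 1);
  }
  return counts;
}

// Small sample; the "version" field is invented to show that
// dotted values outside the "ip" key are ignored.
const sampleText = JSON.stringify([
  { ip: "154.16.58.206" },
  { ip: "154.16.58.218" },
  { ip: "154.16.46.180", version: "1.2.3.4" },
  { ip: "154.16.50.216" },
]);

console.log(countSubnetsInText(sampleText));
```

The pattern is slightly longer than the original one, so whether it is faster or slower on a very large file would need to be measured.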