Count the number of occurrences of an IP in the JSON log file
I have the following data in JSON format. I want to find the number of occurrences (count) of each unique value of the "remoteIp" key.
{
  "insertId": "kdkddkdmdkd",
  "jsonPayload": {
    "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
    "enforcedSecurityPolicy": {
      "configuredAction": "DENY",
      "outcome": "DENY",
      "preconfiguredExprIds": [
        "owasp-crs-v030001-id942220-sqli"
      ],
      "name": "shbdbbddjdjdjd",
      "priority": 2000
    },
    "statusDetails": "body_denied_by_security_policy"
  },
  "httpRequest": {
    "requestMethod": "POST",
    "requestUrl": "https://dknnkkdkddkd/token",
    "requestSize": "3004",
    "status": 403,
    "responseSize": "274",
    "userAgent": "okhttp/3.12.2",
    "remoteIp": "182.2.169.59",
    "serverIp": "10.114.44.4",
    "latency": "0.018728s"
  }
}
The solution I have created so far fetches all the unique "remoteIp" values and saves them to a set. But somehow I am not able to count the occurrences of each unique IP in the log file.
import json

unique_ip = set()
request_url = set()
request_method = set()
status_code = set()
userAgent = set()

with open("automation.json") as file:
    data = json.load(file)

for d2 in data:
    s1 = (d2['httpRequest']['requestUrl'])
    request_url.add(''.join(s1))
    s2 = (d2['httpRequest']['requestMethod'])
    request_method.add(''.join(s2))
    s3 = (d2['httpRequest']['remoteIp'])
    unique_ip.add(''.join(s3))
    s4 = (str(d2['httpRequest']['status']))
    status_code.add(''.join(s4))
    s5 = (d2['httpRequest']['userAgent'])
    userAgent.add(''.join(s5))


def printing():
    a = str(len(unique_ip))
    b = str(len(request_url))
    c = str(len(request_method))
    d = str(len(userAgent))
    e = str(len(status_code))
    with open("output.csv", "w") as f1:
        print(
            f' {a} Unique IP List = {unique_ip} \n {b} Unique URLs = {request_url} \n {c} Unique Req Method = {request_method} \n'
            f' {d} Unique userAgent = {userAgent} \n {e} Unique statusCode = {status_code}', file=f1)


printing()
Make a frequency table instead of a set. You go through the same number of steps, but instead of skipping IPs and other values that are already present, you add to their frequency.
EDIT: added a function that wraps each attempt to get and save a value from an entry in try/except, so a missing key is counted instead of raising. Updated the example JSON to include an entry with no status code.
import json

unique_ip = {}
request_url = {}
request_method = {}
status_code = {}
userAgent = {}


def getAndSaveValueSafely(freqTable, searchDict, key):
    try:
        tmp = searchDict['httpRequest'][key]
        if tmp in freqTable:
            freqTable[tmp] += 1
        else:
            freqTable[tmp] = 1
    except KeyError:
        # The key is missing from this entry: count it under 'not_present'.
        if 'not_present' in freqTable:
            freqTable['not_present'] += 1
        else:
            freqTable['not_present'] = 1


with open("tmp.json") as file:
    data = json.load(file)

for d2 in data:
    getAndSaveValueSafely(request_url, d2, 'requestUrl')
    getAndSaveValueSafely(request_method, d2, 'requestMethod')
    getAndSaveValueSafely(unique_ip, d2, 'remoteIp')
    getAndSaveValueSafely(status_code, d2, 'status')
    getAndSaveValueSafely(userAgent, d2, 'userAgent')

print('request_url: ', request_url)
print('request_method: ', request_method)
print('unique_ip: ', unique_ip)
print('status_code: ', status_code)
print('userAgent: ', userAgent)
Example list of dicts: your example copied three times, plus two entries with different values, the second of which has no status code.
[
  {
    "insertId": "kdkddkdmdkd",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
      "enforcedSecurityPolicy": {
        "configuredAction": "DENY",
        "outcome": "DENY",
        "preconfiguredExprIds": [
          "owasp-crs-v030001-id942220-sqli"
        ],
        "name": "shbdbbddjdjdjd",
        "priority": 2000
      },
      "statusDetails": "body_denied_by_security_policy"
    },
    "httpRequest": {
      "requestMethod": "POST",
      "requestUrl": "https://dknnkkdkddkd/token",
      "requestSize": "3004",
      "status": 403,
      "responseSize": "274",
      "userAgent": "okhttp/3.12.2",
      "remoteIp": "182.2.169.59",
      "serverIp": "10.114.44.4",
      "latency": "0.018728s"
    }
  },
  {
    "insertId": "kdkddkdmdkd",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
      "enforcedSecurityPolicy": {
        "configuredAction": "DENY",
        "outcome": "DENY",
        "preconfiguredExprIds": [
          "owasp-crs-v030001-id942220-sqli"
        ],
        "name": "shbdbbddjdjdjd",
        "priority": 2000
      },
      "statusDetails": "body_denied_by_security_policy"
    },
    "httpRequest": {
      "requestMethod": "POST",
      "requestUrl": "https://dknnkkdkddkd/token",
      "requestSize": "3004",
      "status": 403,
      "responseSize": "274",
      "userAgent": "okhttp/3.12.2",
      "remoteIp": "182.2.169.59",
      "serverIp": "10.114.44.4",
      "latency": "0.018728s"
    }
  },
  {
    "insertId": "kdkddkdmdkd",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
      "enforcedSecurityPolicy": {
        "configuredAction": "DENY",
        "outcome": "DENY",
        "preconfiguredExprIds": [
          "owasp-crs-v030001-id942220-sqli"
        ],
        "name": "shbdbbddjdjdjd",
        "priority": 2000
      },
      "statusDetails": "body_denied_by_security_policy"
    },
    "httpRequest": {
      "requestMethod": "POST",
      "requestUrl": "https://dknnkkdkddkd/token",
      "requestSize": "3004",
      "status": 403,
      "responseSize": "274",
      "userAgent": "okhttp/3.12.2",
      "remoteIp": "182.2.169.59",
      "serverIp": "10.114.44.4",
      "latency": "0.018728s"
    }
  },
  {
    "insertId": "kdkddkdmdkd",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
      "enforcedSecurityPolicy": {
        "configuredAction": "DENY",
        "outcome": "DENY",
        "preconfiguredExprIds": [
          "owasp-crs-v030001-id942220-sqli"
        ],
        "name": "shbdbbddjdjdjd",
        "priority": 2000
      },
      "statusDetails": "body_denied_by_security_policy"
    },
    "httpRequest": {
      "requestMethod": "GET",
      "requestUrl": "https://temp/token",
      "requestSize": "3004",
      "status": 403,
      "responseSize": "274",
      "userAgent": "okhttp/3.11.2",
      "remoteIp": "182.2.168.59",
      "serverIp": "10.113.44.4",
      "latency": "0.018728s"
    }
  },
  {
    "insertId": "kdkddkdmdkd",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
      "enforcedSecurityPolicy": {
        "configuredAction": "DENY",
        "outcome": "DENY",
        "preconfiguredExprIds": [
          "owasp-crs-v030001-id942220-sqli"
        ],
        "name": "shbdbbddjdjdjd",
        "priority": 2000
      },
      "statusDetails": "body_denied_by_security_policy"
    },
    "httpRequest": {
      "requestMethod": "GET",
      "requestUrl": "https://temp/token",
      "requestSize": "3004",
      "responseSize": "274",
      "userAgent": "okhttp/3.11.2",
      "remoteIp": "182.2.168.59",
      "serverIp": "10.113.44.4",
      "latency": "0.018728s"
    }
  }
]
Output from running the code:
request_url: {'https://dknnkkdkddkd/token': 3, 'https://temp/token': 2}
request_method: {'POST': 3, 'GET': 2}
unique_ip: {'182.2.169.59': 3, '182.2.168.59': 2}
status_code: {403: 4, 'not_present': 1}
userAgent: {'okhttp/3.12.2': 3, 'okhttp/3.11.2': 2}
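The same frequency-table idea can be written more compactly with dict.get, which supplies a default instead of raising KeyError. The following is only a sketch under assumptions: the helper name tally and the three-entry sample list are made up for illustration and are not part of the answer's code or the real log file.

```python
def tally(records, key, missing="not_present"):
    """Return {value: count} for each record's ['httpRequest'][key]."""
    freq = {}
    for record in records:
        # .get() falls back to a default when 'httpRequest' or the key is absent.
        value = record.get("httpRequest", {}).get(key, missing)
        freq[value] = freq.get(value, 0) + 1
    return freq


# Tiny hypothetical sample, not the real log data.
sample = [
    {"httpRequest": {"remoteIp": "182.2.169.59", "status": 403}},
    {"httpRequest": {"remoteIp": "182.2.169.59", "status": 403}},
    {"httpRequest": {"remoteIp": "182.2.168.59"}},
]

ip_counts = tally(sample, "remoteIp")
status_counts = tally(sample, "status")
print(ip_counts)
print(status_counts)
```

Returning a new dict per key, rather than mutating module-level tables, keeps the helper easy to test on its own; whether that suits your script is a matter of taste.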
Probably the simplest thing to do is use a collections.Counter instead of a set to track the remoteIp values encountered. It is a dictionary subclass, so like a dictionary its keys are all unique, and in addition it keeps track of how many times each key is "added".
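As a quick illustration of that behavior, here is a minimal sketch on a hand-written list of addresses (the list and variable names are assumptions for the example, not data from the log):

```python
from collections import Counter

# Hypothetical list of addresses standing in for the values read from the log.
ips = ["182.2.169.59", "182.2.169.59", "182.2.168.59", "182.2.169.59"]

ip_counter = Counter()
for ip in ips:
    ip_counter[ip] += 1  # a missing key counts as 0, so no KeyError is raised

distinct = len(ip_counter)        # number of unique addresses
top = ip_counter.most_common(1)   # list of (address, count), most frequent first
print(distinct)
print(top)
```

Here len() gives the number of distinct keys, while indexing or most_common() gives the per-key counts.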
Here is the modified code:
from collections import Counter
import json

request_url = set()
request_method = set()
unique_ip = Counter()
status_code = set()
userAgent = set()

with open("automation.json") as file:
    data = json.load(file)

for d2 in data:
    s1 = d2['httpRequest']['requestUrl']
    request_url.add(''.join(s1))
    s2 = d2['httpRequest']['requestMethod']
    request_method.add(''.join(s2))
    s3 = d2['httpRequest']['remoteIp']
    unique_ip.update([s3])
    s4 = str(d2['httpRequest']['status'])
    status_code.add(''.join(s4))
    s5 = d2['httpRequest']['userAgent']
    userAgent.add(''.join(s5))


def printing():
    a = len(unique_ip)
    b = len(request_url)
    c = len(request_method)
    d = len(userAgent)
    e = len(status_code)
    with open("output.csv", "w") as f1:
        print(
            f' {a} Unique IP List = {unique_ip} \n {b} Unique URLs = {request_url} \n {c} Unique Req Method = {request_method} \n'
            f' {d} Unique userAgent = {userAgent} \n {e} Unique statusCode = {status_code}', file=f1)


printing()
Note that the file you are creating is not actually in CSV format, despite its ".csv" name.
With that said, if I copy your example data 3 times and add another entry with a different remoteIp, this is what is written to the "output.csv" file:
2 Unique IP List = Counter({'182.2.169.59': 3, '182.2.168.59': 1})
1 Unique URLs = {'https://dknnkkdkddkd/token'}
1 Unique Req Method = {'POST'}
1 Unique userAgent = {'okhttp/3.12.2'}
1 Unique statusCode = {'403'}
Note what is written at the beginning for the "Unique IP List": the leading 2 is the number of distinct IPs, and the Counter that follows shows how many times each one occurred.
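If a genuine CSV file is wanted, one possible approach is to write the Counter out with the standard csv module, one row per address. This is a sketch only: the function name write_ip_counts and the column headers are assumptions, and an in-memory buffer stands in for the real output file.

```python
import csv
import io
from collections import Counter


def write_ip_counts(counter, out):
    """Write one 'remoteIp,count' row per address, most frequent first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["remoteIp", "count"])   # header row (assumed column names)
    writer.writerows(counter.most_common())  # (address, count) pairs


# Counts taken from the example output above.
unique_ip = Counter({"182.2.169.59": 3, "182.2.168.59": 1})

buffer = io.StringIO()  # stand-in for open("output.csv", "w", newline="")
write_ip_counts(unique_ip, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing any writable text stream to the function means the same code works for a real file opened with newline="" or for a buffer in tests.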
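Disclaimer: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.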