Extracting a complex substring using regex with data from a string in Python
Extracting data after filtering substring using Python
I have a JSON file of DNS traffic in this format:
{
    "index": {
        "_type": "answer_query",
        "_id": 0,
        "_index": "index_name"
    }
}
{
    "answer_section": " ",
    "query_type": "A",
    "authority_section": "com. 172 IN SOA a.xxxx-xxxx.net. nstld.xxxx-xxxxcom. 1526440480 1800 900 604800 86400",
    "record_code": "NXDOMAIN",
    "ip_src": "xx.xx.xx.xx",
    "response_ip": "xx.xx.xx.xx",
    "date_time": "2018-05-16T00:57:20Z",
    "checksum": "CORRECT",
    "query_name": "xx.xxxx.com.",
    "port_src": 50223,
    "question_section": "xx.xxxx.com. IN A",
    "answer_count_section": 0
}
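I need to extract the number that follows the first space in authority_section when it is less than 300 (172 in the example), ignore the data that does not meet this requirement, and then write the output to another JSON file.
How can I do this? Thanks.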
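Assume stack1.txt is the file you posted. This writes a new file, stack2.txt, and leaves out the "authority_section" line if the value after the space is >= 300 (in that case the code below also deletes stack2.txt once the loop has finished). This solution does not need to parse the JSON, but it relies heavily on the data being stored in a consistent format.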
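import os

# Copy stack1.txt to stack2.txt line by line, skipping any "authority_section"
# line whose second whitespace-separated field is >= 300.
with open('stack1.txt') as old_file, open('stack2.txt', 'w') as new_file:
    delete_file = False
    for line in old_file:
        if not (line.strip().startswith('"authority_section"')
                and int(line.split(':')[1].split()[1]) >= 300):
            new_file.write(line)
        else:
            # A value >= 300 was found; mark the output file for removal.
            delete_file = True

# Remove the output file (now closed) if any value was >= 300.
if delete_file:
    os.remove('stack2.txt')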
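For comparison, here is a minimal sketch of a variant that does parse the JSON, so it depends less on the exact line layout. It assumes the file is a sequence of concatenated JSON objects, as in the question, and that the number of interest is always the second whitespace-separated field of authority_section. The function names (iter_json_objects, authority_number, filter_records, write_filtered) are made up for illustration and do not come from the original posts. Objects without a usable authority_section, such as the "index" entries, are dropped.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it manually.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def authority_number(record):
    """Return the second field of authority_section as an int, or None."""
    if not isinstance(record, dict):
        return None
    fields = str(record.get("authority_section", "")).split()
    if len(fields) < 2:
        return None
    try:
        return int(fields[1])
    except ValueError:
        return None


def filter_records(text, limit=300):
    """Keep only the records whose authority_section number is below limit."""
    kept = []
    for obj in iter_json_objects(text):
        number = authority_number(obj)
        if number is not None and number < limit:
            kept.append(obj)
    return kept


def write_filtered(src_path, dst_path, limit=300):
    """Read src_path, filter its records, and write them as a JSON list."""
    with open(src_path) as src:
        kept = filter_records(src.read(), limit)
    with open(dst_path, "w") as dst:
        json.dump(kept, dst, indent=2)
```

Calling write_filtered('stack1.txt', 'stack2.txt') would then write the matching records as a single JSON list. Whether a list is the right output shape depends on what consumes the file, which the question does not say.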
You can try something like this:
#!/usr/bin/python3
import json
import re

data = (
"""
{
    "answer_section": " ",
    "query_type": "A",
    "authority_section": "com. 172 IN SOA a.xxxx-xxxx.net. nstld.xxxx-xxxxcom. 1526440480 1800 900 604800 86400",
    "record_code": "NXDOMAIN",
    "ip_src": "xx.xx.xx.xx",
    "response_ip": "xx.xx.xx.xx",
    "date_time": "2018-05-16T00:57:20Z",
    "checksum": "CORRECT",
    "query_name": "xx.xxxx.com.",
    "port_src": 50223,
    "question_section": "xx.xxxx.com. IN A",
    "answer_count_section": 0
}
"""
)

json_data = json.loads(data)
print('BEFORE: ', json_data)

# Matches a number from 1 to 299 with one whitespace character on each side.
# A raw string keeps \s and \d from being read as string escapes.
r = re.compile(r'^\s([1-2]\d\d|[1-9]\d|[1-9])\s$')
found = False
key_to_delete = None

for key, value in json_data.items():
    if value == 0:
        pass
    else:
        tmp = str(value)
        # Slide windows of 3, 4 and 5 characters over the value so that
        # one-, two- and three-digit numbers can each be matched.
        for i in range(0, len(tmp)):
            if r.match(tmp[i:i+3]):
                found = True
                key_to_delete = key
                print('FOUND 1: ', value)
            elif r.match(tmp[i:i+4]):
                found = True
                key_to_delete = key
                print('FOUND 2: ', value)
            elif r.match(tmp[i:i+5]):
                found = True
                key_to_delete = key
                print('FOUND 3: ', value)

# Remove the matching key only after the loop, so the dict is not
# modified while it is being iterated.
if found:
    json_data.pop(key_to_delete)
print('RESULT: ', json_data)
I used a regular expression in my answer. Read more about regular expressions.
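The pattern above encodes the range 1 to 299 in the regular expression itself. A possible alternative, sketched here under the assumption that the number always comes right after the first run of whitespace in the string, is to capture the digits with one pattern and do the comparison numerically. The names FIRST_NUMBER, number_after_first_space and below_limit are hypothetical helpers, not part of the answer.

```python
import re

# First token, whitespace, then a run of digits: the 172 in "com. 172 IN SOA ..."
FIRST_NUMBER = re.compile(r"^\S+\s+(\d+)\b")


def number_after_first_space(text):
    """Return the integer that follows the first whitespace, or None."""
    match = FIRST_NUMBER.match(text)
    return int(match.group(1)) if match else None


def below_limit(text, limit=300):
    """True if the extracted number exists and is smaller than limit."""
    number = number_after_first_space(text)
    return number is not None and number < limit
```

Comparing as an integer keeps the threshold in one place, so changing 300 to another value does not require rewriting the pattern.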