Can someone please help me with a regex pattern for the strings below, in Python? I have a .log
file, and from lines like the ones below I need to get the user and the IP address.
I want a regex that gives me the one word before "from" and the one word after "from".
Failed password for root from 123.183.209.132 port 39706 ssh2
From the line above I want root and 123.183.209.132.
Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2
From the line above I want packer and 13.82.211.217.
reverse mapping checking getaddrinfo for undefined.datagroup.ua
[93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!
reverse mapping checking getaddrinfo for nsg-static-226.127.71.182.airtel.in [182.71.127.226] failed - POSSIBLE BREAK-IN ATTEMPT!
reverse mapping checking getaddrinfo for 179.185.44.168.static.gvt.net.br [179.185.44.168] failed - POSSIBLE BREAK-IN ATTEMPT!
From these lines I want undefined.datagroup.ua and 93.183.207.5 (probably with a new regex).
My code so far:
import re

def parse(filename, date=None):
    try:
        # string = 'Failed password for ([a-z]*|[a-z]* [a-z]* [a-z]*) from '
        string = r'Failed password for ([a-z]*|[a-z]* [a-z]* [a-z]*) from [0-9]+(?:\.[0-9]+){3}'
        # string_sub = 'for (?<user>[a-zA-Z\.]+).*?(?<ip>(?:\d{1,3}\.){3}\d{1,3})'
        # string_re = re.compile(r"^[^ ]+ - (C[^ ]*) \[([^ ]+)").match
        match_list = []
        with open(filename, 'r') as file:
            for line in file:
                for match in re.finditer(string, line, re.S):
                    match_text = match.group()
                    # Pull the last word before "from" and the IP after it
                    user_ip = re.search(r'Failed password for .*?(\w+) from (\d+(?:\.\d+){3})', match_text)
                    user = user_ip.groups()[0]
                    print(user)
    except KeyError as e:
        msg = "key %s is missing" % str(e)
        return msg
    except Exception as e:
        return str(e)
I'm stuck on the regex.
Regex may be overkill for your use case. Have you tried something simpler, like this:
s1 = "Failed password for root from 123.183.209.132 port 39706 ssh2"
s2 = "Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2"
parsed = s1.split('from',1)
user = parsed[0].split()[-1]
ip = parsed[1].split()[0]
print(f'User is {user} and IP is {ip}')
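The same split idea can be wrapped in a small helper so it also copes with lines that contain no "from" at all. This is only a sketch: the function name user_and_ip is made up for illustration, and it assumes the user is always the last word before " from " and the IP the first word after it.

```python
def user_and_ip(line):
    """Return (user, ip) for a line containing ' from ', or None otherwise."""
    # partition() splits on the first ' from ' and tells us whether it was found
    before, sep, after = line.partition(' from ')
    if not sep:
        return None
    before_words = before.split()
    after_words = after.split()
    if not before_words or not after_words:
        return None
    # last word before "from" is the user, first word after it is the IP
    return before_words[-1], after_words[0]


print(user_and_ip("Failed password for root from 123.183.209.132 port 39706 ssh2"))
print(user_and_ip("Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2"))
```

Note that this does not help with the "reverse mapping" lines, where the address sits in square brackets rather than after "from".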
If I understand correctly, you basically want the word (the username) after "for" and the IP address on that line? If that's the case, how about:
for (?P<user>[a-zA-Z\.]+).*?(?P<ip>(?:\d{1,3}\.){3}\d{1,3})
https://regex101.com/r/aojbyS/1. Note that Python's re module spells named groups (?P<name>...), not (?<name>...). Granted, this is a shorthand pattern for an IP address; to make it more correct you should use a proper IPv4 regex.
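One way to tighten the shorthand IP pattern without writing a long IPv4 regex is to match loosely and then validate each candidate with the standard-library ipaddress module. The sketch below assumes that approach; the names LOOSE_IP and valid_ipv4s are illustrative only.

```python
import ipaddress
import re

# Loose candidate pattern: four dot-separated groups of 1 to 3 digits
LOOSE_IP = re.compile(r'(?:\d{1,3}\.){3}\d{1,3}')


def valid_ipv4s(text):
    """Return the loosely matched candidates in text that are real IPv4 addresses."""
    found = []
    for candidate in LOOSE_IP.findall(text):
        try:
            ipaddress.IPv4Address(candidate)  # raises if an octet is out of range
        except ipaddress.AddressValueError:
            continue
        found.append(candidate)
    return found


print(valid_ipv4s("Failed password for root from 123.183.209.132 port 39706 ssh2"))
```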
Additionally, in your question, you don't say what should be captured from the following, which might modify the above regex.
Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2.
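If the goal for that line is to capture packer rather than invalid, one possible adjustment is to skip an optional "invalid user " prefix before the user group. The sketch below is an assumption about the intended result, not part of the pattern above, and FAILED is just an illustrative name.

```python
import re

# "(?:invalid user )?" is an assumed way to skip the optional prefix;
# (?P<name>...) is the named-group syntax of Python's re module.
FAILED = re.compile(
    r'Failed password for (?:invalid user )?(?P<user>\S+) from '
    r'(?P<ip>(?:\d{1,3}\.){3}\d{1,3})'
)

lines = [
    'Failed password for root from 123.183.209.132 port 39706 ssh2',
    'Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2',
]
for line in lines:
    m = FAILED.search(line)
    if m:
        print(m.group('user'), m.group('ip'))
```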
import re
inp = [
'Failed password for root from 123.183.209.132 port 39706 ssh2',
'Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2',
'''reverse mapping checking getaddrinfo for undefined.datagroup.ua
[93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!''',
]
for s in inp:
    result = re.search(r'(?:Failed password|reverse mapping.+?) for .*?([\w.]+)\s+(?:from |\[)(\d+(?:\.\d+){3})', s)
    print(result.groups())
Output:
('root', '123.183.209.132')
('packer', '13.82.211.217')
('undefined.datagroup.ua', '93.183.207.5')
Explanation:
(?:                  # non-capturing group
Failed password      # literally
|                    # OR
reverse mapping      # literally
.+?                  # 1 or more of any character, not greedy
)                    # end group
 for                 # literally, with a space on each side
.*?                  # 0 or more of any character, not greedy
([\w.]+)             # group 1, 1 or more word characters or dots
\s+                  # 1 or more whitespace characters
(?:from |\[)         # non-capturing group, "from " OR opening square bracket
(\d+(?:\.\d+){3})    # group 2, IP
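The explanation can also live inside the pattern itself by compiling it with re.VERBOSE, where whitespace is ignored and literal spaces must be escaped. The following is a sketch of the same regex in that form; PATTERN and sample are illustrative names, and the sample text simply joins the three example inputs.

```python
import re

PATTERN = re.compile(r'''
    (?:Failed\ password|reverse\ mapping.+?)  # either prefix
    \ for\                                    # literal " for "
    .*?                                       # optional words, not greedy
    ([\w.]+)                                  # group 1: user or host name
    \s+                                       # spaces or a line break
    (?:from\ |\[)                             # "from " or opening bracket
    (\d+(?:\.\d+){3})                         # group 2: IP
''', re.VERBOSE)

sample = '\n'.join([
    'Failed password for root from 123.183.209.132 port 39706 ssh2',
    'Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2',
    'reverse mapping checking getaddrinfo for undefined.datagroup.ua',
    '[93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!',
])

# findall() returns one (name, ip) tuple per match
for name, ip in PATTERN.findall(sample):
    print(name, ip)
```

Searching the whole text rather than one line at a time is what lets \s+ bridge the line break between the host name and the bracketed address.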