I'm having trouble extracting a substring without including the delimiters.
x = "- dubuque3528 [21/Jun/2019:15:46:"
or
x = "- - [21/Jun/2019:15:46:"
user_name = re.findall('.-.*\[', x)
That returns: "- dubuque3528 ["
or "- - [".
I would like to retrieve "dubuque3528"
or "-"
instead.
You can use
-\s*(.*?)\s*\[
See the regex demo. Details:
- - a hyphen
\s* - zero or more whitespaces
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
\s* - zero or more whitespaces
\[ - a [ char.
See the Python demo:
import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for s in x:
    m = re.search(r'-\s*(.*?)\s*\[', s)
    if m:
        print(m.group(1))
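To illustrate why the original attempt returned the delimiters, here is a minimal sketch comparing the whole match with the capturing group, using the first sample string from the question. The variable names whole_match and captured are chosen here only for illustration.

```python
import re

s = "- dubuque3528 [21/Jun/2019:15:46:"
m = re.search(r'-\s*(.*?)\s*\[', s)

# group(0) is the entire match, delimiters included.
whole_match = m.group(0)
# group(1) is only what the parentheses captured.
captured = m.group(1)

print(repr(whole_match))  # '- dubuque3528 ['
print(repr(captured))     # 'dubuque3528'
```

The hyphen and the bracket are still matched, but because they sit outside the parentheses they are not part of group 1.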
With your shown samples, try the following regex:
-\s+(\S+)\s+\[
Here is an online demo for the above regex.
You can run the regex in Python as follows (written and tested in Python 3):
import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for val in x:
    m = re.search(r'-\s+(\S+)\s+\[', val)
    if m:
        print(m.group(1))
Output will be as follows:
dubuque3528
-
Explanation of the above regex:
-\s+ ##Matching a hyphen followed by 1 or more occurrences of whitespace.
(\S+) ##Creating 1st capturing group, matching 1 or more non-whitespace characters.
\s+\[ ##Matching 1 or more occurrences of whitespace followed by [.
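Since the question used re.findall, it may help to note that when a pattern contains exactly one capturing group, re.findall returns only the text of that group rather than the whole match. The sketch below applies the same pattern to the two sample strings. The helper name extract_user_names is hypothetical and used only for illustration.

```python
import re

lines = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]

# Same pattern as above; the single capturing group holds the user name.
pattern = re.compile(r'-\s+(\S+)\s+\[')

def extract_user_names(line):
    # With one capturing group, findall returns a list of group-1 strings.
    return pattern.findall(line)

results = [extract_user_names(line) for line in lines]
print(results)  # [['dubuque3528'], ['-']]
```

This keeps the original findall call shape. The delimiters are still required for a match, but they are left out of the returned strings.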