
How can I extract a substring from a string, avoiding including the delimiters?

I'm having trouble extracting a substring without including the delimiters.

x =  "- dubuque3528 [21/Jun/2019:15:46:"

or

x = "- - [21/Jun/2019:15:46:"

user_name = re.findall('.-.*\[', x)

That returns: "- dubuque3528 [" or "- - [". I would like to retrieve "dubuque3528" or "-" instead.

You can use

-\s*(.*?)\s*\[

See the regex demo. Details:

  • - - a hyphen
  • \s* - zero or more whitespaces
  • (.*?) - Group 1: any zero or more chars other than line break chars as few as possible
  • \s* - zero or more whitespaces
  • \[ - a [ char.

See the Python demo:

import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for s in x:
    m = re.search(r'-\s*(.*?)\s*\[', s)
    if m:
        print(m.group(1))
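
Since the question used re.findall, here is a minimal sketch of the same pattern with findall. When a pattern contains exactly one capturing group, findall returns the text of that group rather than the whole match, so the delimiters drop out. The names lines, pattern and user_names are only illustrative.

```python
import re

# Sample lines from the question.
lines = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]

# With exactly one capturing group, re.findall returns the group's text,
# not the whole match, so the hyphen and the bracket are left out.
pattern = re.compile(r'-\s*(.*?)\s*\[')

user_names = [pattern.findall(line) for line in lines]
print(user_names)  # [['dubuque3528'], ['-']]
```

findall returns a list for each line; take element 0 if only the first match is needed.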

With the samples you have shown, please try the following regex.

-\s+(\S+)\s+\[

Here is the online demo for the above regex.

You can run this regex in Python as follows (written and tested in Python 3):

import re
x = ["- dubuque3528 [21/Jun/2019:15:46:", "- - [21/Jun/2019:15:46:"]
for val in x:
  m = re.search(r'-\s+(\S+)\s+\[', val)
  if m:
    print(m.group(1))

Output will be as follows:

dubuque3528
-

Explanation of the above regex:

-\s+   ## Match a hyphen followed by 1 or more whitespace characters.
(\S+)  ## Capturing group 1: match 1 or more non-whitespace characters.
\s+\[  ## Match 1 or more whitespace characters followed by a literal [.
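
The two patterns in this thread agree on the samples from the question but differ on other inputs. The sketch below compares them; the last two sample strings ("-bob[" and "- john doe [") are hypothetical inputs made up for illustration, and the names loose, strict and first_group are not from the original answers.

```python
import re

# The two patterns suggested in this thread.
loose = re.compile(r'-\s*(.*?)\s*\[')    # whitespace optional, name may contain spaces
strict = re.compile(r'-\s+(\S+)\s+\[')   # whitespace required, name has no spaces


def first_group(pattern, text):
    """Return group 1 of the first match, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None


# The first two samples come from the question; the last two are hypothetical.
samples = ["- dubuque3528 [", "- - [", "-bob[", "- john doe ["]

results = {s: (first_group(loose, s), first_group(strict, s)) for s in samples}
for text, (loose_hit, strict_hit) in results.items():
    print(repr(text), "loose:", loose_hit, "strict:", strict_hit)
```

On these inputs the \S+ version rejects names that contain spaces or that are not separated from the delimiters by whitespace, while the .*? version accepts both. Which behaviour is preferable depends on how regular the log format is.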

