
Python Regex Match Before Character AND Ignore White Space

I'm trying to write a regex to match part of a string that comes before '/' but also ignores any leading or trailing white space within the match.

So far I've got ^[^\/]* which matches everything before the '/', but I can't figure out how to ignore the white space.

      123 / some text 123

should yield

123

and

     a test / some text 123

should yield

a test

That's a little bit tricky. Start the match at a non-whitespace character, then keep matching lazily up to the position that is immediately followed by an optional run of spaces and a slash:

\S.*?(?= *\/)


If a slash could be the first non-whitespace character in the input string, replace \S with [^\s\/] :

[^\s\/].*?(?= *\/)
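
As a quick check in Python, here is a minimal sketch that runs both patterns above through re.search. The helper name first_field and the "  / foo / bar" input are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: first non-whitespace character, then a lazy run
# up to the point that is followed by optional spaces and a slash.
FIELD = re.compile(r"\S.*?(?= *\/)")
# Variant whose first character may not be a slash either.
FIELD_NO_SLASH = re.compile(r"[^\s\/].*?(?= *\/)")


def first_field(pattern, text):
    """Return the first match of pattern in text, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_field(FIELD, "      123 / some text 123"))     # 123
print(first_field(FIELD, "     a test / some text 123"))   # a test
# When the slash is the first non-whitespace character, the patterns differ:
print(first_field(FIELD, "  / foo / bar"))                 # / foo
print(first_field(FIELD_NO_SLASH, "  / foo / bar"))        # foo
```

Note that the lookahead only skips plain spaces, so a tab right before the slash would stay in the match; writing \s* in the lookahead instead should cover that case.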

This expression is what you might want to explore:

^(.*?)(\s+\/.*)$

Here we have two capturing groups: the first collects your desired output, and the second matches the part you want to discard. The pattern is bounded by start and end anchors just to be safe; they can be removed if you want:

(.*?)(\s+\/.*)

Python Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"^(.*?)(\s+\/.*)$"

test_str = ("123 / some text 123\n"
    "anything else    / some text 123")

subst = "\\1"

# You can manually specify the number of replacements by changing the 4th argument
result = re.sub(regex, subst, test_str, 0, re.MULTILINE)

if result:
    print (result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

JavaScript Demo

const regex = /^(.*?)(\s+\/.*)$/gm;
const str = `123 / some text 123
anything else / some text 123`;
const subst = `\n$1`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

RegEx

If this isn't the expression you wanted, you can modify your expressions at regex101.com.


RegEx Circuit

You can also visualize your expressions at jex.im.


Spaces

To handle whitespace before your desired output, we can simply add an optional capturing group for leading whitespace at the start of the pattern:

 ^(\s+)?(.*?)(\s+\/.*)$

JavaScript Demo

const regex = /^(\s+)?(.*?)(\s+\/.*)$/gm;
const str = ` 123 / some text 123
anything else / some text 123
123 / some text 123
anything else / some text 123`;
const subst = `$2`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
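
Since the question is about Python, the same pattern can also be tried there. This is only a sketch: the helper name field_before_slash is invented for illustration, and it reads group 2 directly instead of substituting.

```python
import re

# Group 1: optional leading whitespace, group 2: the wanted text,
# group 3: whitespace, the slash and everything after it.
LEADING = re.compile(r"^(\s+)?(.*?)(\s+\/.*)$")


def field_before_slash(line):
    """Return the text before the slash without surrounding whitespace, or None."""
    m = LEADING.match(line)
    return m.group(2) if m else None


print(field_before_slash("      123 / some text 123"))     # 123
print(field_before_slash("     a test / some text 123"))   # a test
print(field_before_slash("123/ x"))                        # None
```

The last call returns None because the third group requires at least one whitespace character before the slash; changing \s+ to \s* there would relax that.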


Here is a possible solution

Regex

(?<!\/)\S.*\S(?=\s*\/)

Example

import re  # the third-party regex module works the same way here

string = ' 123 / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints 123

string = 'a test / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints a test

Short explanation

  • (?<!\/) says the match cannot start immediately after a / symbol.
  • \S.*\S matches greedily anything ( .* ) while making sure the match does not start or end with white space ( \S ). Because of the two \S , the match is at least two characters long.
  • (?=\s*\/) means the match must be followed by a / symbol, optionally preceded by white space.
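
To see how this pattern behaves at the edges, here is a small sketch. The helper name find and the inputs "a / b / c" and "1 / x" are made up for illustration.

```python
import re

PATTERN = re.compile(r"(?<!\/)\S.*\S(?=\s*\/)")


def find(text):
    """Return the first match of PATTERN in text, or None if there is none."""
    m = PATTERN.search(text)
    return m.group(0) if m else None


print(find(" 123 / some text 123"))   # 123
# .* is greedy, so with several slashes the match runs up to the last one:
print(find("a / b / c"))              # a / b
# Two \S are required, so a single character before the slash is not found:
print(find("1 / x"))                  # None
```

If the input can contain more than one slash or a one-character field, the lazy pattern \S.*?(?= *\/) from the first answer may be the safer choice.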

You could do it without a regex

my_string = "      123 / some text 123"
match = my_string.split("/")[0].strip()
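
Wrapped in a function, the same idea might look like the sketch below. The name text_before_slash is invented here, and str.partition is swapped in for split because it stops at the first slash without splitting the rest of the string.

```python
def text_before_slash(text):
    """Return everything before the first '/', stripped of surrounding whitespace."""
    head, _sep, _tail = text.partition("/")
    return head.strip()


print(text_before_slash("      123 / some text 123"))     # 123
print(text_before_slash("     a test / some text 123"))   # a test
print(text_before_slash("a / b / c"))                     # a
print(text_before_slash("no slash here  "))               # no slash here
```

When there is no slash at all, partition returns the whole string as the head, so the function falls back to the stripped input, which is also what split("/")[0].strip() does.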
