Python regex replace whole string

I have a regex to strip the end off a request url:

re.sub('(?:^\/en\/category).*(-\d{1,4}$)', '', r)

My problem is that the docs say it will replace the matched part; however, when it matches my string it replaces the whole string, e.g.:

/en/category/specials/men-2610

I'm not sure what Python is doing, but my regex seems fine.

EDIT: I want the string with the end stripped off. Target:

/en/category/specials/men

(?<=^\/en\/category)(.*)-\d{1,4}$

Try this. Replace with \1. See demo.

https://regex101.com/r/tX2bH4/27

Your whole pattern matches the string; that is why the whole string is replaced.

PS: a match is different from a capture or group.

import re

p = re.compile(r'(?<=^\/en\/category)(.*)-\d{1,4}$', re.IGNORECASE)
test_str = "/en/category/specials/men-2610"
subst = r"\1"  # raw string; a plain "\1" would be the control character chr(1)

result = re.sub(p, subst, test_str)  # '/en/category/specials/men'
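
To see why the original call wipes out the whole string, it can help to compare what the pattern matches with what it captures. The following is a minimal sketch using the question's pattern (written as a raw string with unescaped slashes, which is equivalent); the variable names are only illustrative.

```python
import re

url = "/en/category/specials/men-2610"
original = r"(?:^/en/category).*(-\d{1,4}$)"  # the question's pattern

m = re.search(original, url)
whole_match = m.group(0)  # everything the pattern consumed
captured = m.group(1)     # only the parenthesised suffix

print(whole_match)  # /en/category/specials/men-2610
print(captured)     # -2610

# re.sub replaces group(0), not group(1), so nothing is left over.
print(repr(re.sub(original, "", url)))  # ''
```

The non-capturing group (?:...) only stops the prefix from being numbered as a group; the prefix is still part of the match and is therefore still replaced.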

As stated in the docs, the matched part is replaced. Matched is different from captured.

You will have to capture the text you don't want to remove in a capture group like so:

(^/en/category.*)-\d{1,4}$

and put it back into the string using the backreference \1:

re.sub(r'(^/en/category.*)-\d{1,4}$', r'\1', text)

Just move the capturing group to the other part and then replace the match with \1. The forward slash does not need escaping in a Python regex, and defining the pattern as a raw string keeps backslashes such as \d intact.

re.sub(r'^(/en/category.*)-\d{1,4}$', r'\1', string)

DEMO

>>> import re
>>> s = "/en/category/specials/men-2610"
>>> re.sub(r'^(/en/category.*)-\d{1,4}$', r'\1', s)
'/en/category/specials/men'

OR

>>> s.split('-')[0]
'/en/category/specials/men'
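
Note that s.split('-')[0] cuts at the first hyphen, so it only works when the path has no other hyphen. Below is a hedged sketch of a non-regex alternative that splits at the last hyphen instead. The helper name strip_numeric_suffix and the gift-cards sample path are hypothetical, and the helper does not check the /en/category prefix.

```python
def strip_numeric_suffix(path):
    """Drop a trailing '-N' where N is 1 to 4 digits; otherwise return path unchanged."""
    head, sep, tail = path.rpartition("-")  # split at the LAST hyphen
    if sep and tail.isdecimal() and 1 <= len(tail) <= 4:
        return head
    return path


hyphenated = "/en/category/gift-cards/men-2610"  # hypothetical path with two hyphens

print(hyphenated.split("-")[0])          # /en/category/gift
print(strip_numeric_suffix(hyphenated))  # /en/category/gift-cards/men
```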

>>> re.sub(r'(^\/en\/category.*)(-\d{1,4}$)',
...        r'\1', '/en/category/specials/men-2610')
'/en/category/specials/men'

Your pattern is fine, you just need to change which item is the capturing group:

Before:

(?:^\/en\/category).*(-\d{1,4}$)

After:

((?:^\/en\/category).*)-\d{1,4}$

Since the non-capturing group (?:...) is no longer necessary, we can reduce this further to:

(^\/en\/category.*)-\d{1,4}$

Notice I've moved the capturing group from the digits to the part before it.

Example:

http://ideone.com/FLAaFh
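
As a rough illustration of the change described above (not the contents of the linked example), the sketch below runs the "before" and "after" patterns over a few sample URLs. The second and third URLs are made up to show that non-matching input is left untouched.

```python
import re

# Patterns from the answer, written as raw strings with unescaped slashes.
BEFORE = re.compile(r"(?:^/en/category).*(-\d{1,4}$)")
AFTER = re.compile(r"(^/en/category.*)-\d{1,4}$")

samples = [
    "/en/category/specials/men-2610",
    "/en/category/specials/men",       # hypothetical: no numeric suffix
    "/fr/category/specials/men-2610",  # hypothetical: different prefix
]

# Pair each URL's result under the old call and under the fixed call.
results = [(BEFORE.sub("", url), AFTER.sub(r"\1", url)) for url in samples]

for url, (old, new) in zip(samples, results):
    print(f"{url!r}: before={old!r} after={new!r}")
```

Only the first URL matches either pattern: the old call empties it, while the fixed call keeps the captured prefix.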

