
How do I print matches from a regex given a string value in Python?

I have a string "/browse/advanced-computer-science-modules?title=machine-learning" in Python. I want to print the part of the string between the second "/" and the "?", which is "advanced-computer-science-modules". I wrote the regular expression ^([a-z]*[\-]*[a-z])*?$, but it finds no matches when I run the .findall() function from the re module. Is there a way I can solve this issue?

I created my own regex and imported the re module in Python. Below is a snippet of my code that returned nothing.

import re

regex = re.compile(r'^([a-z]*[\-]*[a-z])*?$')
str = '/browse/advanced-computer-science-modules?title=machine-learning'
print(regex.findall(str))

Since this appears to be a URL, I'd suggest you use URL-parsing tools instead:

>>> from urllib.parse import urlsplit
>>> url = '/browse/advanced-computer-science-modules?title=machine-learning'
>>> s = urlsplit(url)
>>> s
SplitResult(scheme='', netloc='', path='/browse/advanced-computer-science-modules', query='title=machine-learning', fragment='')
>>> s.path
'/browse/advanced-computer-science-modules'
>>> s.path.split('/')[-1]
'advanced-computer-science-modules'
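
The same steps can be wrapped in a small helper. This is only a minimal sketch: the function name `last_path_segment` is made up for illustration, and it assumes the segment you want is always the last one in the path.

```python
from urllib.parse import urlsplit


def last_path_segment(url):
    """Return the final path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path
    # rstrip('/') tolerates a trailing slash such as '/browse/foo/'
    return path.rstrip('/').split('/')[-1]


print(last_path_segment('/browse/advanced-computer-science-modules?title=machine-learning'))
```

Because `urlsplit` separates the query string before the path is split, a "/" or "?" inside the query value should not affect the result.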

The regex is as follows:

\/[a-zA-Z\-]+\?

Then take the first match and strip the leading "/" and the trailing "?" from it:

re.findall(r'\/[a-zA-Z\-]+\?', str)[0][1:-1]

This is very specific to this problem (the pattern only allows letters and hyphens in the segment), but it should work.
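
If you do want a regex, a capture group returns the segment directly, with no slicing. The sketch below also shows why the pattern in the question finds nothing: `^` and `$` force the whole string to match, but "/", "?" and "=" are not in its character classes. The second pattern is one possible choice, not the only one, and it assumes the segment itself contains no "/" or "?". The names `question_regex` and `segment_regex` are illustrative.

```python
import re

url = '/browse/advanced-computer-science-modules?title=machine-learning'

# Anchored pattern from the question: the whole string must consist of
# lowercase letters and hyphens, so '/', '?' and '=' prevent any match.
question_regex = re.compile(r'^([a-z]*[\-]*[a-z])*?$')
print(question_regex.findall(url))  # []

# Capture everything between a '/' and the '?' that follows it.
segment_regex = re.compile(r'/([^/?]+)\?')
match = segment_regex.search(url)
if match:
    print(match.group(1))
```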

Alternatively, you can use the string's split method:

str = '/browse/advanced-computer-science-modules?title=machine-learning'
result = str.split('/')[-1].split('?')[0]

print(result)
# advanced-computer-science-modules
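
Note that `[-1]` picks the last path segment, which here happens to be the one after the second "/". For a more literal reading of the question, a sketch like the following takes the segment by position instead. The variable names are illustrative, and it assumes the path starts with "/".

```python
url = '/browse/advanced-computer-science-modules?title=machine-learning'

# Drop the query string first; partition does not raise if '?' is absent.
path, _, _ = url.partition('?')

# '/browse/advanced-...'.split('/') gives ['', 'browse', 'advanced-...'],
# so index 2 holds the text after the second '/'.
segments = path.split('/')
result = segments[2] if len(segments) > 2 else None
print(result)
```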
