
How to use a Python regular expression to get the text between forward slashes

I have this string: https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com

How do I use a Python regular expression to get the pieces of text between the forward slashes?

I want to get: ['app.redretarget.com','sapp','ptag','jxy666.myshopify.com']

When I use:

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
pin_url = re.compile(r'/(.*?)/{0,1}')
print pin_url.findall(cmd)

I get an error.

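You can split on "/". filter(None, ...) then removes the empty elements that the double slash leaves in the list. Note that the scheme ('https:') is still included in the result.

string = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
# list(...) is needed on Python 3, where filter returns an iterator
print(list(filter(None, string.split("/"))))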

Output:

['https:', 'app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
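The list above still starts with 'https:', which the question does not want. A minimal sketch of one way to drop it, assuming the string always begins with a scheme such as https:// (the names `pieces` and `parts` are only illustrative):

```python
url = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'

# Split on "/" and discard the empty string left by the "//"
pieces = [p for p in url.split('/') if p]

# Assumption: the first piece is the scheme ("https:"), so skip it
parts = pieces[1:]
print(parts)
# ['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
```

If the input might not have a scheme, slicing off the first element would drop the host instead, so this only fits inputs shaped like the one in the question.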

Instead of a regex, you could use urllib.parse.urlparse and pathlib.Path (Python 3):

from urllib.parse import urlparse
from pathlib import Path

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
parsed = urlparse(cmd)
parts = (parsed.netloc, ) + Path(parsed.path).parts[1:]
print(parts)  # ('app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com')

Note that urlparse can also handle more complex URLs. In your case, the result of urlparse is:

print(parsed)
# ParseResult(scheme='https', netloc='app.redretarget.com', 
#             path='/sapp/ptag/jxy666.myshopify.com', params='', query='',
#             fragment='')
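To illustrate the "more complex URLs" point, here is a small sketch with a made-up URL (the host, port, query, and fragment below are hypothetical and not from the question) showing which attributes urlparse fills in:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL with a port, a query string and a fragment
url = 'https://example.com:8080/sapp/ptag?shop=jxy666#top'
result = urlparse(url)

print(result.netloc)    # host and port together
print(result.hostname)  # host only
print(result.port)      # port as an int
print(result.path)      # path without query or fragment
print(parse_qs(result.query))  # query string as a dict of lists
print(result.fragment)
```

Because the query and fragment are separated out, splitting result.path on "/" does not pick up anything after "?" or "#", which a plain split on the whole string would.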

I'd propose splitting twice: once on '//' to drop the scheme, then on '/' to separate the remaining parts:

cmd.split('//', 1)[1].split('/')

['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
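Since the question asks for a regular expression specifically, here is one possible pattern, offered as a sketch rather than the definitive answer. As far as I can tell, the pattern in the question only yields empty strings, because the lazy group (.*?) is allowed to match nothing. Matching runs of non-slash characters that directly follow a slash avoids that:

```python
import re

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'

# Pattern from the question: the lazy group matches the empty string
question_result = re.findall(r'/(.*?)/{0,1}', cmd)
print(question_result)  # ['', '', '', '']

# (?<=/) requires a "/" just before the match; [^/]+ takes everything
# up to the next "/". "https:" is skipped because no "/" precedes it.
parts = re.findall(r'(?<=/)[^/]+', cmd)
print(parts)
# ['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
```

This treats the input as a plain string, so a query string or fragment would end up inside the last element. For such URLs the urlparse approach is probably the safer choice.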
