How to use a Python regular expression to get the text between forward slashes
I have this string: https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com
How do I use a Python regular expression to get the text between the forward slashes?
I want to get: ['app.redretarget.com','sapp','ptag','jxy666.myshopify.com']
When I use:
cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
pin_url = re.compile(r'/(.*?)/{0,1}')
print pin_url.findall(cmd)
I get an error.
You can split the string on "/". I am using filter to remove the empty elements from the list.
string = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
# In Python 3, filter() returns an iterator, so wrap it in list() before printing.
print(list(filter(None, string.split("/"))))
Output:
['https:', 'app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
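The output above still contains the scheme piece ('https:'), which the question did not ask for. A minimal sketch of one way to drop it, assuming the first non-empty piece ends with ':' whenever a scheme is present; the helper name `path_pieces` is hypothetical and not part of the original answer:

```python
def path_pieces(url):
    """Split a URL on '/' and drop empty pieces and a leading scheme piece."""
    pieces = [piece for piece in url.split("/") if piece]
    # Assumption: a first piece ending in ':' is the scheme, e.g. 'https:'.
    if pieces and pieces[0].endswith(":"):
        pieces = pieces[1:]
    return pieces


string = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
print(path_pieces(string))
# ['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
```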
Instead of regex you could use urllib.parse.urlparse and pathlib.Path:
from urllib.parse import urlparse
from pathlib import Path
cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
parsed = urlparse(cmd)
parts = (parsed.netloc, ) + Path(parsed.path).parts[1:]
print(parts) # ('app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com')
Note that urlparse can also parse more complex URLs; the result of urlparse in your case is:
print(parsed)
# ParseResult(scheme='https', netloc='app.redretarget.com',
# path='/sapp/ptag/jxy666.myshopify.com', params='', query='',
# fragment='')
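To illustrate the point about more complex URLs, here is a small sketch using a made-up URL (the address below is hypothetical and only for demonstration) that has user info, a port, a query string, and a fragment:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL, made up for illustration only.
url = 'https://user@example.com:8443/a/b/c.html?x=1&y=2#top'
parsed = urlparse(url)

print(parsed.scheme)           # https
print(parsed.hostname)         # example.com
print(parsed.port)             # 8443
print(parsed.path)             # /a/b/c.html
print(parse_qs(parsed.query))  # {'x': ['1'], 'y': ['2']}
print(parsed.fragment)         # top
```

Splitting on "/" alone would leave the port, query, and fragment glued to the neighbouring pieces, which is the main reason to prefer urlparse for anything beyond simple URLs.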
I'd propose splitting twice:
cmd.split('//', 1)[1].split('/')
['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
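Since the question asked for a regular expression, here is a minimal sketch of a regex-based version. The pattern in the question, `/(.*?)/{0,1}`, pairs a lazy group with an optional slash, so the group can always match the empty string and findall tends to return empty strings. The sketch below instead strips the scheme and then collects every run of non-slash characters; it assumes the URL has no query string or fragment:

```python
import re

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'

# Remove a leading scheme such as 'https://', if one is present.
without_scheme = re.sub(r'^[a-zA-Z][a-zA-Z0-9+.-]*://', '', cmd)

# Every maximal run of characters that are not '/' is one piece.
parts = re.findall(r'[^/]+', without_scheme)
print(parts)
# ['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
```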