
How to use a Python regular expression to get the text between forward slashes

I have this string: https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com

How do I use a Python regular expression to get the text between the forward slashes?

I want to get: ['app.redretarget.com','sapp','ptag','jxy666.myshopify.com']

When I use:

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
pin_url = re.compile(r'/(.*?)/{0,1}')
print pin_url.findall(cmd)

I get an error.

You can split on "/". I am using filter to remove the empty elements from the list.

string = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
# filter(None, ...) drops the empty strings produced by the "//" after the scheme
print(list(filter(None, string.split("/"))))

Output:

['https:', 'app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
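If a regex is still preferred, here is a minimal sketch (not from the original posts). The pattern in the question cannot return the segments: the lazy group `(.*?)` is followed only by an optional slash, so it is satisfied by matching nothing and every captured group comes back empty. The error itself is likely unrelated to the pattern: as posted, the snippet has no `import re`, and the `print` statement is Python 2 syntax. The alternative pattern below is an assumption about what is wanted: runs of non-slash characters that directly follow a slash, which also skips the leading 'https:'.

```python
import re

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'

# Pattern from the question: the lazy group and the optional slash can both
# match nothing, so each capture is an empty string.
original = re.findall(r'/(.*?)/{0,1}', cmd)

# Alternative: one or more non-slash characters preceded by a slash.
# 'https:' is not preceded by a slash, so it is left out.
segments = re.findall(r'(?<=/)[^/]+', cmd)

print(original)
print(segments)  # ['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
```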

Instead of a regex, you could use urllib.parse.urlparse and pathlib.Path:

from urllib.parse import urlparse
from pathlib import Path

cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
parsed = urlparse(cmd)
parts = (parsed.netloc, ) + Path(parsed.path).parts[1:]
print(parts)  # ('app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com')

Note that urlparse can also parse more complex URLs; in your case the result of urlparse is

print(parsed)
# ParseResult(scheme='https', netloc='app.redretarget.com', 
#             path='/sapp/ptag/jxy666.myshopify.com', params='', query='',
#             fragment='')
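To illustrate the "more complex URLs" point, here is a small sketch with a made-up URL (the address, user name, port and query string are all hypothetical). It shows that urlparse keeps the credentials, port, query and fragment out of the path, so splitting `parsed.path` only yields the path segments.

```python
from urllib.parse import urlparse

# Hypothetical URL, invented for illustration only.
url = 'https://user@example.com:8443/sapp/ptag?shop=jxy666#top'
parsed = urlparse(url)

host = parsed.hostname  # host name without user info or port
port = parsed.port      # port as an int, or None if absent
# Split the path on "/" and drop the empty piece before the leading slash.
segments = [s for s in parsed.path.split('/') if s]

print(host, port, segments, parsed.query, parsed.fragment)
```

Splitting the raw string on "/" would instead leave the query and fragment glued to the last segment.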

I'd propose splitting twice: first on '//' to drop the scheme, then on '/':

cmd.split('//', 1)[1].split('/')

['app.redretarget.com', 'sapp', 'ptag', 'jxy666.myshopify.com']
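One caveat with splitting twice: `cmd.split('//', 1)[1]` raises IndexError when the string has no '//'. A hedged variant using str.partition is sketched below; the helper name `url_segments` is hypothetical, and dropping empty pieces (for example from a trailing slash) is an added assumption rather than part of the original suggestion.

```python
def url_segments(url):
    """Return the host and path pieces of url, ignoring any scheme."""
    # partition never raises: without '//' the whole string stays in head.
    head, sep, rest = url.partition('//')
    target = rest if sep else head
    # Drop empty pieces, such as the one produced by a trailing slash.
    return [piece for piece in target.split('/') if piece]


cmd = 'https://app.redretarget.com/sapp/ptag/jxy666.myshopify.com'
print(url_segments(cmd))
print(url_segments('app.redretarget.com/sapp/'))
```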
