How to parse a string in Python using the re library?
I have the following strings that should be parsed:
http_proxy=172.55.30.14:80
https_proxy=Administrator:some_password@172.55.30.27:443
I want to extract the following from each string:
protocol = "http" (or "https")
proxy_server = 172.55.30.14
proxy_port = 80
If a password is set, the variables username and password should also be initialized:
username = Administrator
password = some_password
My idea is to split it using re.split:
res = re.split(r':', line)
Any ideas on how to do this better?
You can use this code:
import re

# s = "http_proxy=172.55.30.14:80"
s = "https_proxy=Administrator:some_password@172.55.30.27:443"

# The "username:password@" part is optional, so its groups may be None.
regex = r"^(?P<xprotocol>http[s]?)_proxy=((?P<xusername>\w+):(?P<xpassword>\w+)@)?(?P<xserver>[\d.]+):(?P<xport>\d+)"
res = re.match(regex, s, re.I)

print("protocol: {}".format(res.group("xprotocol")))
print("proxy_server: {}".format(res.group("xserver")))
print("proxy_port: {}".format(res.group("xport")))
# These two print None when the string carries no credentials.
print("username: {}".format(res.group("xusername")))
print("password: {}".format(res.group("xpassword")))
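If the values are needed as variables rather than printed, the same pattern could be wrapped in a small function. This is only a sketch: the names PROXY_RE and parse_proxy are made up for illustration, and converting the port to int is an assumption about what the caller wants.

```python
import re

# Same pattern as above, compiled once. PROXY_RE and parse_proxy are
# hypothetical names chosen for this sketch.
PROXY_RE = re.compile(
    r"^(?P<xprotocol>http[s]?)_proxy="
    r"((?P<xusername>\w+):(?P<xpassword>\w+)@)?"
    r"(?P<xserver>[\d.]+):(?P<xport>\d+)",
    re.I,
)


def parse_proxy(line):
    """Return a dict of proxy settings, or None if the line does not match."""
    res = PROXY_RE.match(line)
    if res is None:
        return None
    settings = {
        "protocol": res.group("xprotocol"),
        "proxy_server": res.group("xserver"),
        "proxy_port": int(res.group("xport")),  # assumption: port wanted as int
    }
    # The credentials group is optional, so these groups are None when absent.
    if res.group("xusername") is not None:
        settings["username"] = res.group("xusername")
        settings["password"] = res.group("xpassword")
    return settings


print(parse_proxy("http_proxy=172.55.30.14:80"))
print(parse_proxy("https_proxy=Administrator:some_password@172.55.30.27:443"))
```

Note that \w+ only accepts word characters, so a password containing symbols such as $ or ! would not match with this pattern as written.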
Some notes:
– http[s]? = http or https
– \w = one word character: a-z, A-Z, 0-9 and _ (in Python 3, other Unicode letters and digits also match unless re.ASCII is set)
– + = one or more, so \w+ = one or more word characters
– (group)? = zero or one occurrence of the group
– \d+ = one or more digits, 0-9
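The notes above can be checked one piece at a time with re.fullmatch. The sample strings and the (ab)?c pattern below are invented for illustration; they are not part of the original answer.

```python
import re

# (pattern, sample text, whether the whole text is expected to match)
checks = [
    (r"http[s]?", "http", True),
    (r"http[s]?", "https", True),
    (r"\w+", "some_password", True),
    (r"\w+", "pa$$word", False),  # $ is not a word character
    (r"(ab)?c", "c", True),       # optional group absent
    (r"(ab)?c", "abc", True),     # optional group present
    (r"\d+", "443", True),
    (r"\d+", "44a", False),       # a is not a digit
]

# re.fullmatch succeeds only if the pattern covers the entire string.
results = [re.fullmatch(pattern, text) is not None for pattern, text, _ in checks]

for (pattern, text, expected), matched in zip(checks, results):
    print(f"{pattern!r:12} {text!r:16} -> {matched}")
    assert matched == expected
```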