How to parse a string in Python using the re library?

I have the following strings that need to be parsed:

http_proxy=172.55.30.14:80
https_proxy=Administrator:some_password@172.55.30.27:443

From each string I want to extract:

protocol = "http" (or "https")
proxy_server = 172.55.30.14
proxy_port = 80

If a password is set, the variables username and password should also be initialized:

username = Administrator
password = some_password

My idea is to split the line with re.split:

res = re.split(r':', line)

Any ideas on how to do this better?

You can use this code:

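import re

# Uncomment to try the variant without credentials:
# s = "http_proxy=172.55.30.14:80"
s = "https_proxy=Administrator:some_password@172.55.30.27:443"

# The "username:password@" part is wrapped in an optional group.
regex = r"^(?P<xprotocol>http[s]?)_proxy=((?P<xusername>\w+):(?P<xpassword>\w+)@)?(?P<xserver>[\d.]+):(?P<xport>\d+)"

res = re.match(regex, s, re.I)
if res is None:
    raise ValueError("unrecognized proxy string: {!r}".format(s))

print("protocol: {}".format(res.group("xprotocol")))
print("proxy_server: {}".format(res.group("xserver")))
print("proxy_port: {}".format(res.group("xport")))
# username and password are None when the string has no credentials
print("username: {}".format(res.group("xusername")))
print("password: {}".format(res.group("xpassword")))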
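
If the values are needed as variables rather than printed, the same pattern can be wrapped in a small function. This is only a sketch: the name parse_proxy, the constant PROXY_RE and the group names (chosen to match the variable names in the question) are my own, and converting the port to int is an assumption about what the caller wants.

```python
import re

# Same pattern as above, compiled once; group names follow the question.
PROXY_RE = re.compile(
    r"^(?P<protocol>http[s]?)_proxy="
    r"((?P<username>\w+):(?P<password>\w+)@)?"
    r"(?P<proxy_server>[\d.]+):(?P<proxy_port>\d+)",
    re.I,
)


def parse_proxy(line):
    """Return a dict of the named groups, or None if the line does not match."""
    m = PROXY_RE.match(line)
    if m is None:
        return None
    # Optional groups that did not take part in the match come back as None.
    fields = m.groupdict()
    fields["proxy_port"] = int(fields["proxy_port"])
    return fields


for line in (
    "http_proxy=172.55.30.14:80",
    "https_proxy=Administrator:some_password@172.55.30.27:443",
):
    print(parse_proxy(line))
```

Note that \w+ only covers letters, digits and underscores, so a password containing other characters would need a wider character class.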

Some notes:
http[s]? = http or https
\w = one word character: a-z, A-Z, 0-9 and _ (in Python 3 it also matches non-ASCII letters and digits unless re.ASCII is set)
+ = one or more, so \w+ = one or more word characters
(group)? = zero or one occurrence of the group, i.e. the group is optional
\d+ = one or more digits (0-9)
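
Each of these pieces can be checked in isolation. The snippet below is a small illustration, not part of the original answer; the test strings and the group names user and host are made up for the demo.

```python
import re

# http[s]? accepts "http" and "https", nothing else.
print(bool(re.fullmatch(r"http[s]?", "http")))    # True
print(bool(re.fullmatch(r"http[s]?", "https")))   # True
print(bool(re.fullmatch(r"http[s]?", "httpss")))  # False

# \w+ consumes word characters and stops at the first other character.
word = re.match(r"\w+", "some_password@172.55.30.27").group()
print(word)  # some_password

# \d+ consumes digits only.
port = re.match(r"\d+", "443/path").group()
print(port)  # 443

# (group)? may be absent: the match still succeeds and the group is None.
m = re.match(r"((?P<user>\w+)@)?(?P<host>[\d.]+)", "172.55.30.27")
print(m.group("user"), m.group("host"))  # None 172.55.30.27
```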

