How to print a substring using RegEx in Python?

Question

These are two texts:

1) 'provider:sipoutilp1.ym.ms' 
2) 'provider:sipoutqtm.ym.ms'

I would like to print ilp when it reaches the first line and qtm when it reaches the second line.

This is my solution, but it is not working.

RE_PROVIDER = re.compile(r'(?P<provider>\((ilp+|qtm+)')

or in the line below,

182938,DOMINICAN REPUBLIC-MOBILE

to get DOMINICAN REPUBLIC, can I use the same re.compile approach?

Thank you for any help.

Answer 1

Your regex is not correct because it has an escaped open parenthesis before your keywords, and there is no such character in your lines.
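A quick check of that diagnosis, as a sketch only. It assumes the pattern is exactly as quoted in the question; the balanced variant `with_paren` and the name `without_paren` are illustrative and not from the original post.

```python
import re

s1 = 'provider:sipoutilp1.ym.ms'

# As quoted in the question, the outer group is never closed,
# so the pattern does not even compile.
try:
    re.compile(r'(?P<provider>\((ilp+|qtm+)')
    compiled = True
except re.error:
    compiled = False

# Assumed balanced variant: it compiles, but \( still demands a literal "(".
with_paren = re.compile(r'(?P<provider>\((ilp+|qtm+))')
# Same pattern without the escaped parenthesis.
without_paren = re.compile(r'(?P<provider>(ilp+|qtm+))')

print(compiled)                                    # False
print(with_paren.search(s1))                       # None
print(without_paren.search(s1).group('provider'))  # ilp
```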

More generally, you can capture the alphabetical characters that come after sipout or provider:sipout.

>>> import re
>>> s1 = 'provider:sipoutilp1.ym.ms'
>>> s2 = 'provider:sipoutqtm.ym.ms'
>>> RE_PROVIDER = re.compile(r'(?P<provider>(?<=sipout)(ilp|qtm))')
>>> RE_PROVIDER.search(s1).groupdict()
{'provider': 'ilp'}
>>> RE_PROVIDER.search(s2).groupdict()
{'provider': 'qtm'}

(?<=sipout) is a positive look-behind: it makes the regex engine match the pattern only where it is preceded by sipout.
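If the keywords are not known in advance, the same look-behind can be combined with a character class. This is only a sketch: the class `[a-z]+` and the name `RE_AFTER_SIPOUT` are assumptions, chosen because both sample strings use lowercase letters right after sipout.

```python
import re

samples = ['provider:sipoutilp1.ym.ms', 'provider:sipoutqtm.ym.ms']

# Match a run of lowercase letters, but only where "sipout" comes
# immediately before it. The look-behind itself consumes nothing.
RE_AFTER_SIPOUT = re.compile(r'(?<=sipout)[a-z]+')

codes = [RE_AFTER_SIPOUT.search(s).group(0) for s in samples]
print(codes)  # ['ilp', 'qtm']
```

The match stops at the first character that is not a lowercase letter, which is the digit 1 in the first string and the dot in the second.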

After edit:

If you want to match multiple strings with different structures, you need alternative preceding patterns in front of your keywords. Because Python's re module does not allow variable-length patterns inside a look-behind, you cannot use a look-behind for this. Instead, you can use a capture-group trick.
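The fixed-width restriction can be seen directly. The sketch below tries to put both prefixes inside one look-behind; the variable name `error_message` is illustrative.

```python
import re

# "sipout" is always 6 characters, but \d+, has no fixed length,
# so the standard re module refuses to compile this look-behind.
try:
    re.compile(r'(?<=sipout|\d+,)(ilp|qtm|[A-Z\s]+)')
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(error_message)
```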

You can define the alternative preceding patterns within a non-capturing group and your keyword within a capturing group; then, after a match, take the captured group (group(1); group(0) is the whole match).

>>> RE_PROVIDER = re.compile(r'(?:sipout|\d+,)(?P<provider>(ilp|qtm|[A-Z\s]+))')
>>> RE_PROVIDER.search(s1).groupdict()
{'provider': 'ilp'}
>>> RE_PROVIDER.search(s2).groupdict()
{'provider': 'qtm'}
>>> s3 = "182938,DOMINICAN REPUBLIC-MOBILE"
>>> RE_PROVIDER.search(s3).groupdict()
{'provider': 'DOMINICAN REPUBLIC'}
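To make the difference between group(0) and group(1) concrete, here is a small sketch that reuses the pattern above. The helper name `extract_provider` is hypothetical and not part of the original answer.

```python
import re

RE_PROVIDER = re.compile(r'(?:sipout|\d+,)(?P<provider>(ilp|qtm|[A-Z\s]+))')

def extract_provider(line):
    """Return the captured keyword, or None when the line does not match."""
    match = RE_PROVIDER.search(line)
    return match.group('provider') if match else None

lines = [
    'provider:sipoutilp1.ym.ms',
    'provider:sipoutqtm.ym.ms',
    '182938,DOMINICAN REPUBLIC-MOBILE',
]

for line in lines:
    match = RE_PROVIDER.search(line)
    # group(0) includes the non-capturing prefix; group(1) is the keyword only.
    print(repr(match.group(0)), '->', repr(match.group(1)))
```

Checking for None before calling group avoids an AttributeError on lines that contain neither prefix.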

Note that groupdict doesn't work in this case because it will return

Answer 1
4, accepted, 2016-02-04 10:30:23
