
Parsing a string pattern - Python

I have a string (for an XML test reporter) that follows this pattern:

'testsets.testcases.[testset].[testcase]-[date-stamp]'

For example:

a='testsets.testcases.test_different_blob_sizes.TestDifferentBlobSizes-20150430130436'

I know I can always parse the testset and testcase names by doing:

temp = a.split("-")[0]
current = temp.split(".")
testset = '.'.join(current[:-1]) + ".py"
testcase = current[-1]

However, I want to do this in a more Pythonic way, such as a regex or some other expression that fits on a single line. How can I accomplish that?

You can try:

import re

testset, testcase = re.search(r'(.*)\.(.*)-.*', a).group(1, 2)
testset += '.py'

re.search returns a match object when the pattern matches (and None otherwise). The match object has a group method that extracts the capture groups, i.e. the parts of the regex wrapped in "()".
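As a minimal sketch of the same idea with a guard for the no-match case: the helper name parse_report_name is hypothetical, and returning None for a non-matching string is an assumption about how you would want failures handled.

```python
import re

def parse_report_name(name):
    """Return (testset, testcase), or None if name does not fit the pattern."""
    # The first group is greedy, so it runs up to the last '.' before the '-'.
    match = re.search(r'(.*)\.(.*)-.*', name)
    if match is None:
        return None
    # group(1, 2) returns both capture groups as a tuple.
    testset, testcase = match.group(1, 2)
    return testset + '.py', testcase

a = 'testsets.testcases.test_different_blob_sizes.TestDifferentBlobSizes-20150430130436'
result = parse_report_name(a)
```

Without the guard, calling .group on a failed search raises AttributeError, because re.search returned None.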

Just use groups() to get all the capture groups from the regex search:

data = re.search(r'.+\..+\.(.+)\.(.+)-(\d+)', a).groups()
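To illustrate what groups() hands back for the example string, here is a short sketch that unpacks the tuple. Parsing the stamp with datetime is an addition, and it assumes the date stamp is laid out as YYYYMMDDHHMMSS, which the question does not state.

```python
import re
from datetime import datetime

a = 'testsets.testcases.test_different_blob_sizes.TestDifferentBlobSizes-20150430130436'

# groups() returns every capture group as a tuple of strings.
testset, testcase, stamp = re.search(r'.+\..+\.(.+)\.(.+)-(\d+)', a).groups()

# Assumption: the stamp format is YYYYMMDDHHMMSS.
run_time = datetime.strptime(stamp, '%Y%m%d%H%M%S')
```

Note that this pattern captures only the short testset name, without the "testsets.testcases." prefix.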

If you strictly want to pull out the testset and testcase, i.e. "test_different_blob_sizes" and "TestDifferentBlobSizes", as in the first part of your question, you can just do:

testset, testcase = re.split(r'[.-]', a)[2:4]
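A small sketch of what the split produces for the example string may help here. The intermediate name parts is only for illustration, and the slice assumes the two fixed prefix components ("testsets" and "testcases") are always present.

```python
import re

a = 'testsets.testcases.test_different_blob_sizes.TestDifferentBlobSizes-20150430130436'

# Split on every '.' or '-' so each component becomes its own list item.
parts = re.split(r'[.-]', a)

# Items 0 and 1 are the fixed prefix; items 2 and 3 are the names we want.
testset, testcase = parts[2:4]
```

Because the slice is positional, a '.' or '-' inside a testset or testcase name would shift the items, so this approach fits best when the names contain neither character.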

For compact regex-based code based on what you already have, see Ziyao Wei's answer.
