Partial string matching in Python
I have a section id A00-A09. Anything like A01, A01.01, A02, up to A09.09 should be classified under this section id. How can I do this in Python? At the moment I can only match strings exactly, character for character.
You can use a character class ([]) with the re module:
import re
# The dot is escaped so that it matches only a literal "."
re.findall(r'A0[0-9]\.0[0-9]|A0[0-9]', 'A01')
Output:
['A01']
No match:
re.findall(r'A0[0-9]\.0[0-9]|A0[0-9]', 'A11')
Output:
[]
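Two details are worth checking in a pattern like this: an unescaped . matches any character, and re.findall looks for the pattern anywhere in the string rather than requiring the whole string to match. A small sketch of both effects follows; the sample inputs and the variable names loose and strict are made up for illustration.

```python
import re

loose = 'A0[0-9].0[0-9]|A0[0-9]'      # '.' matches any character
strict = r'A0[0-9]\.0[0-9]|A0[0-9]'   # '\.' matches only a literal dot

print(re.findall(loose, 'A01x01'))    # ['A01x01']
print(re.findall(strict, 'A01x01'))   # ['A01']
print(re.findall(strict, 'XA0123'))   # ['A01'] (found inside a longer string)
```

If the whole string has to be a valid id, anchoring the pattern is probably the safer choice.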
Use re.match() to check this. Here is an example:
import re
section_id = "A01.09"
# Raw string so that \. reaches the regex engine unchanged
if re.match(r"^A0[0-9](\.0[0-9])?$", section_id):
    print("yes")
Here the regex means that A0X is mandatory and .0X is optional, where X is a digit from 0 to 9.
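To see how the mandatory and optional parts behave, the pattern can be run against a few ids. This is only a sketch: the helper name is_in_section and the candidate list are assumptions added for illustration.

```python
import re

# A0X is required; the group (\.0X) may appear zero or one time.
PATTERN = re.compile(r"^A0[0-9](\.0[0-9])?$")

def is_in_section(section_id):
    """Return True if section_id looks like A0X or A0X.0X."""
    return PATTERN.match(section_id) is not None

for candidate in ["A01", "A01.01", "A09.09", "A11", "A01.1", "B01"]:
    print(candidate, is_in_section(candidate))
```

The first three ids are accepted and the last three are rejected, since A11 falls outside A00-A09, A01.1 lacks the leading 0 in the subsection, and B01 has a different letter.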
Slice off the first three characters of the id and compare them against the section range:
sid = "A00-A09"

def under_sid(ssid, sid):
    # Split the range into its two bounds, e.g. "A00" and "A09"
    sid_start, sid_end = sid.split("-")
    # Compare the three-character prefix lexicographically
    return sid_start <= ssid[:3] <= sid_end

for i in ["A01", "A01.01", "A02", "A09.09"]:
    assert under_sid(i, sid)
for i in ["B01", "A22.01", "A93", "A19.09"]:
    assert not under_sid(i, sid)
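The same slicing idea extends to several ranges. Here is a minimal sketch, assuming every id starts with a three-character code and every range is written as START-END; the section list and the function name find_section are hypothetical.

```python
# Hypothetical list of section ranges; only A00-A09 comes from the question.
SECTIONS = ["A00-A09", "A10-A19", "B00-B09"]

def find_section(ssid, sections=SECTIONS):
    """Return the first range containing ssid's three-character prefix, or None."""
    prefix = ssid[:3]
    for sid in sections:
        start, end = sid.split("-")
        if start <= prefix <= end:
            return sid
    return None

print(find_section("A01.01"))  # A00-A09
print(find_section("A19.09"))  # A10-A19
print(find_section("C05"))     # None
```

The comparison is a plain string comparison, so it relies on the codes being zero-padded to the same width.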
You can do partial matches using startswith() and endswith(). Assuming the full id always has the form X12.Y34, where each part is a letter followed by two digits and the parts are separated by . or - (or any other single character):
>>> id = 'A03.A07'
>>> section_id = id[:3]
>>> section_id
'A03'
>>> id.startswith('A03')
True
>>> id.startswith('A07')
False # so won't match with the subsection.
>>> sub_section_id = id[-3:]
>>> sub_section_id
'A07'
And you can convert it to uppercase if the input can sometimes be lowercase.
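A minimal sketch of that normalization, assuming the target section is A00-A09 and that ids are otherwise well formed; the helper name in_a00_a09 and the PREFIXES tuple are illustrative. It uses the fact that str.startswith() accepts a tuple of prefixes.

```python
# Prefixes "A00" through "A09"
PREFIXES = tuple("A0%d" % n for n in range(10))

def in_a00_a09(raw_id):
    """Normalize whitespace and case, then test the prefix."""
    return raw_id.strip().upper().startswith(PREFIXES)

print(in_a00_a09("a01.01"))  # True
print(in_a00_a09("A09.09"))  # True
print(in_a00_a09("a11"))     # False
```

A prefix test alone would also accept a malformed id such as A011, so it is best combined with a format check when the input is not trusted.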