Python regular expression to check start and end of a word in a string
I am working on a script to rename files. In this scenario there are three possibilities.
1. File does not exist: create a new file.
2. File exists: create a new file named with '(number of occurrences of the file)' appended, e.g. filename(1).
3. A duplicate of the file already exists: create a new file named with '(number of occurrences of the file)' appended, e.g. filename(2).
I have the filename in a string. I can check the last character of the filename using a regex, but how do I check the last characters from '(' to ')' and get the number inside them?
You just need something like this:
(?<=\()(\d+)(?=\)[^()]*$)
Explanation:
(?<=\() must be preceded by a literal (
(\d+) match and capture the digits
(?=\)[^()]*$) must be followed by ) and then no more ( or ) until the end of the string.
Example: if the file name is Foo (Bar) Baz (23).jpg, the regex above matches 23.
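As a rough sketch, the lookaround pattern above can be tried from Python as follows. The helper name last_parenthesized_number is made up for illustration, and returning None when nothing matches is an assumption about the desired behavior.

```python
import re

# Lookbehind for "(", capture the digits, then lookahead for ")" followed by
# no further parentheses up to the end of the string.
LAST_NUMBER = re.compile(r"(?<=\()(\d+)(?=\)[^()]*$)")

def last_parenthesized_number(filename):
    """Hypothetical helper: number inside the final "(n)" of filename, or None."""
    m = LAST_NUMBER.search(filename)
    return int(m.group(1)) if m else None

print(last_parenthesized_number("Foo (Bar) Baz (23).jpg"))  # 23
print(last_parenthesized_number("Foo (Bar).jpg"))           # None
print(last_parenthesized_number("Foo (12) (Bar).jpg"))      # None
```

The last call returns None because another parenthesized group follows (12), so the lookahead cannot reach the end of the string.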
Here is the code and tests to get a filename based on existing filenames:

import re

def get_name(filename, existing_names):
    # Return filename if it is unused, otherwise filename(N), where N is one
    # above the highest index already in use.
    exist = False
    index = 0
    # re.escape keeps regex metacharacters in the filename (".", "+", ...) literal.
    p = re.compile(r"^%s(\((?P<idx>\d+)\))?$" % re.escape(filename))
    for name in existing_names:
        m = p.match(name)
        if m:
            exist = True
            idx = m.group('idx')
            if idx and int(idx) > index:
                index = int(idx)
    if exist:
        return "%s(%d)" % (filename, index + 1)
    else:
        return filename

# test data
exists = ["abc(1)", "ab", "abc", "abc(2)", "ab(1)", "de", "ab(5)"]
tests = ["abc", "ab", "de", "xyz"]
expects = ["abc(3)", "ab(6)", "de(1)", "xyz"]
print(exists)
for name, exp in zip(tests, expects):
    new_name = get_name(name, exists)
    print("%s -> %s" % (name, new_name))
    assert new_name == exp
Look at this line for the regex to get the number in (*):

    p = re.compile(r"^%s(\((?P<idx>\d+)\))?$" % re.escape(filename))

It uses a named capture, (?P<idx>\d+), for the number \d+, and the capture is accessed later with m.group('idx'). Wrapping filename in re.escape is a small hardening so that characters such as . in the name are matched literally.
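Real filenames usually carry an extension, which get_name above does not treat specially. The following is a minimal sketch of one way to place the counter before the extension, using the same named-capture idea. The function name next_available_name and the choice of os.path.splitext are assumptions for illustration, not part of the original answer.

```python
import os
import re

def next_available_name(filename, existing_names):
    """Hypothetical variant of get_name that keeps the file extension last."""
    stem, ext = os.path.splitext(filename)
    # Same named capture as before; stem and extension are escaped so they match literally.
    pattern = re.compile(
        r"^%s(\((?P<idx>\d+)\))?%s$" % (re.escape(stem), re.escape(ext))
    )
    taken = False
    highest = 0
    for name in existing_names:
        m = pattern.match(name)
        if m:
            taken = True
            if m.group("idx"):
                highest = max(highest, int(m.group("idx")))
    return "%s(%d)%s" % (stem, highest + 1, ext) if taken else filename

existing = ["photo.jpg", "photo(1).jpg", "notes.txt"]
print(next_available_name("photo.jpg", existing))  # photo(2).jpg
print(next_available_name("notes.txt", existing))  # notes(1).txt
print(next_available_name("new.txt", existing))    # new.txt
```

In a real rename script the list of existing names would likely come from something like os.listdir on the target directory; that part is left out here.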