
Python (3.5) - Open File, Regex part of name, Rename - Loop

I have a bunch of files:

File Completed for 123456 1 - Platform Junk (AP .msg

File Completed for 1234566 1 - More Junk here and Stuf.msg

File Completed for 654321 1 - ® Stuff and Junk.msg

So each file name contains a 6- or 7-digit number (not including the 1 after the number), and some files also have stupid ® (registered trademark) symbols.

My goal is to search the current directory for all .msg files, find the 6- or 7-digit number, and rename the file to 123456.msg or 1234567.msg.

I have a regex that should correctly extract the number:

(?<!\\d)(\\d{6}|\\d{7})(?!\\d)

Now I just need to loop through all .msg files and rename them. I've got my foot in the door with the following code, but I don't quite know how to extract what I want and rename:

for filename in glob.glob(script_dir + '*.msg'):
    new_name = re.sub(r'(?<!\d)(\d{6}|\d{7})(?!\d)')

Any help or a step in the right direction would be much appreciated!

Only the regex is correct here (don't take it the wrong way). I'll explain step by step how to fix your code so that it renames your files:

First, the glob pattern should be built with os.path.join; otherwise script_dir would have to end with /:

for filename in glob.glob(os.path.join(script_dir,'*.msg')):
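To see why the separator matters, here is a small sketch comparing plain concatenation with os.path.join. The directory name is a made-up placeholder, not something from the question, and the comments assume a POSIX-style path.

```python
import os

# Hypothetical directory, written without a trailing slash.
script_dir = "/home/user/scripts"

# Plain concatenation glues the wildcard onto the directory name, so glob
# would look for files named "scripts*.msg" inside /home/user instead.
naive = script_dir + '*.msg'

# os.path.join inserts the path separator when it is missing.
joined = os.path.join(script_dir, '*.msg')

# A trailing slash does not produce a doubled separator.
joined_with_slash = os.path.join(script_dir + '/', '*.msg')

print(naive)
print(joined)
print(joined_with_slash)
```

With the joined pattern, glob only looks inside script_dir, whether or not the variable already ends with a separator.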

Let's test your regex, wrapped in .* on both sides so that only the matched number is kept and the rest is dropped:

>>> re.sub(r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*",r"\1.msg","File Completed for 1234566 1 - More Junk here and Stuf.msg")
'1234566.msg'
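The same check can be run against all three sample names from the question. This is only a quick sanity test of the substitution; the list and variable names are chosen here for illustration.

```python
import re

# Pattern from the answer: .* on both sides swallows everything except the
# 6- or 7-digit number captured in group 1.
pattern = r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*"

samples = [
    "File Completed for 123456 1 - Platform Junk (AP .msg",
    "File Completed for 1234566 1 - More Junk here and Stuf.msg",
    "File Completed for 654321 1 - ® Stuff and Junk.msg",
]

new_names = [re.sub(pattern, r"\1.msg", name) for name in samples]
print(new_names)  # ['123456.msg', '1234566.msg', '654321.msg']
```

The lone 1 after each number is never picked up, because the pattern requires at least six consecutive digits.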

OK, since that works, compute the new name like this:

base_filename = os.path.basename(filename)
new_name = re.sub(r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*",r"\1.msg",base_filename) # keep only matched pattern, restore .msg suffix in the end

That way the regex is applied only to the file name, not to the full path.

Last, use os.rename to rename the files. Check first that something was actually replaced, so that a file whose name contains no matching number is not renamed onto itself:

if base_filename != new_name:
   os.rename(filename,os.path.join(script_dir,new_name))
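Putting the steps together, a minimal sketch of the whole loop could look like the following. The function name rename_msg_files and the check that skips a target that already exists are additions made here for illustration; they are not part of the original answer.

```python
import glob
import os
import re

# Same pattern as above: keep the 6- or 7-digit number, drop everything else.
NUMBER_RE = re.compile(r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*")


def rename_msg_files(script_dir):
    """Rename each .msg file in script_dir to <number>.msg.

    Returns a list of (old_name, new_name) pairs for the renames performed.
    """
    renamed = []
    for filename in glob.glob(os.path.join(script_dir, '*.msg')):
        base_filename = os.path.basename(filename)
        new_name = NUMBER_RE.sub(r"\1.msg", base_filename)
        target = os.path.join(script_dir, new_name)
        # Skip names without a number (nothing was replaced) and, as an
        # added precaution, never overwrite a file that already exists.
        if base_filename == new_name or os.path.exists(target):
            continue
        os.rename(filename, target)
        renamed.append((base_filename, new_name))
    return renamed
```

Calling rename_msg_files(script_dir) on a copy of the folder first is a cheap way to confirm the result before touching the real files. Two files that share the same number would collide, which is why the sketch leaves the second one untouched instead of overwriting the first.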
