
Python (3.5) - Open File, Regex part of name, Rename - Loop

I have a bunch of files:

File Completed for 123456 1 - Platform Junk (AP .msg

File Completed for 1234566 1 - More Junk here and Stuf.msg

File Completed for 654321 1 - ® Stuff and Junk.msg

So each file contains a 6 or 7 digit number (not including that 1 after the number), also some files have stupid R (registered trademark) symbols.

My goal is to search the current directory for all .msg files, find the 6 or 7 digit number, and rename the file to 123456.msg or 1234567.msg.

I have the regex that should work properly to extract the number:

(?<!\d)(\d{6}|\d{7})(?!\d)

Now I just need to loop through all .msg files and rename them. I've got my foot in the door with the following code, but I don't quite know how to extract what I want and rename:

for filename in glob.glob(script_dir + '*.msg'):
    new_name = re.sub(r'(?<!\d)(\d{6}|\d{7})(?!\d)')

Any help or step in the right direction would be much appreciated!

Only the regex is right here, don't take it the wrong way. I'll explain how to fix your code to rename your files step by step:

First, build the glob pattern with os.path.join; otherwise script_dir would have to end with a path separator (/) for the concatenation to work:

for filename in glob.glob(os.path.join(script_dir,'*.msg')):
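To see why the concatenation in the question goes wrong, here is a small sketch. The directory name is made up for illustration; any script_dir without a trailing separator behaves the same way.

```python
import os

# Hypothetical directory, written without a trailing separator.
script_dir = os.path.join("some", "folder")

# Plain concatenation glues the wildcard onto the directory name.
broken = script_dir + "*.msg"
# os.path.join inserts the separator for the current platform.
fixed = os.path.join(script_dir, "*.msg")

print(broken)
print(fixed)
```

The first pattern looks for entries next to the folder whose names start with "folder", so glob would most likely return nothing for it.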

Let's test your regex, adapted to keep only the regex match and drop the rest:

>>> re.sub(r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*",r"\1.msg","File Completed for 1234566 1 - More Junk here and Stuf.msg")
'1234566.msg'
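The same substitution can be checked against all three sample names from the question. This is only a quick sanity check; the names PATTERN, samples and results are mine, and the ® is written as an escape so the snippet does not depend on the source file encoding.

```python
import re

# Same regex as above, wrapped in .* on both sides so the rest is dropped.
PATTERN = r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*"

samples = [
    "File Completed for 123456 1 - Platform Junk (AP .msg",
    "File Completed for 1234566 1 - More Junk here and Stuf.msg",
    "File Completed for 654321 1 - \u00ae Stuff and Junk.msg",  # \u00ae is the (R) symbol
]

# Keep only the captured number and put the .msg suffix back.
results = [re.sub(PATTERN, r"\1.msg", name) for name in samples]

for new_name in results:
    print(new_name)
```

The lone 1 after each number is never picked up, because it is neither 6 nor 7 digits long.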

Ok, now since it works, then compute the new name like this:

base_filename = os.path.basename(filename)
new_name = re.sub(r".*((?<!\d)(\d{6}|\d{7})(?!\d)).*",r"\1.msg",base_filename) # keep only matched pattern, restore .msg suffix in the end

This way the regex is applied only to the file name, not to the full path, so digits in a directory name cannot be matched by mistake.

Finally, use os.rename to rename the files. Check first that something was actually replaced: if the name contains no such number, re.sub returns it unchanged, and source and destination would be identical:

if base_filename != new_name:
    os.rename(filename, os.path.join(script_dir, new_name))
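Putting the steps together, a complete version might look like the sketch below. It is one possible arrangement, not the only one: the function name rename_msg_files and the constant NUMBER_RE are my own, it uses re.search instead of re.sub (so it takes the first matching number in a name, whereas the greedy .* version above takes the last one), and the check against overwriting an existing file is an extra precaution that was not asked for.

```python
import glob
import os
import re

# 6 or 7 digits that are not part of a longer run of digits.
NUMBER_RE = re.compile(r"(?<!\d)(\d{6}|\d{7})(?!\d)")


def rename_msg_files(script_dir):
    """Rename every matching .msg file in script_dir to '<number>.msg'.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for filename in glob.glob(os.path.join(script_dir, "*.msg")):
        base_filename = os.path.basename(filename)
        match = NUMBER_RE.search(base_filename)
        if match is None:
            continue  # no 6 or 7 digit number: leave the file alone
        new_name = match.group(1) + ".msg"
        target = os.path.join(script_dir, new_name)
        # Skip files already named correctly, and (added precaution)
        # never overwrite an existing file that has the same number.
        if base_filename == new_name or os.path.exists(target):
            continue
        os.rename(filename, target)
        renamed.append((base_filename, new_name))
    return renamed
```

Calling rename_msg_files(script_dir) and printing the returned pairs gives a simple log of what was renamed. Two files sharing the same number would otherwise silently collide, which is why the second one is skipped here.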
