
How to find digits, pad zeros with regex and replace path in Python?

I am trying to collect the paths of all .txt files in a directory, replace the root directory in each path, and zero-pad the numbers in the path, each to a different width. Consider this example file list:

./Old directory/ABC 01/XYZ 1 - M 1.txt
./Old directory/ABC 01/XYZ 1 - M 2.txt
./Old directory/ABC 01/XYZ 1 - M 3.txt

Now I need Python code that gives me this output:

./New directory/ABC 00001/XYZ 0001 - M 001.txt
./New directory/ABC 00001/XYZ 0001 - M 002.txt
./New directory/ABC 00001/XYZ 0001 - M 003.txt

The reproducible code (my effort):

import os
import re
files = []
for root, directories, files in os.walk('./Old directory'):
    files = sorted([f for f in files if os.path.splitext(f)[1] in ('.txt')])
    for file in files:
        files.append(os.path.join(root, file))
for file in files:
    file.replace('./Old directory', './New directory')

I doubt that it is that easy, but it looks like you are very close.

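import re
...
for file in files:
    file = file.replace('./Old directory', './New directory')
    # Python writes backreferences as \1 or \g<1>, not $1
    p = re.compile(r'(\d+)')
    # prepends three zeros to every number (a fixed prefix, not a fixed width)
    file = p.sub(r'000\g<1>', file)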
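To see what a fixed prefix actually produces on one of the sample paths, here is a minimal sketch. The helper name `prefix_zeros` is made up for illustration. The first two numbers happen to come out right, but the last one ends up one digit too wide for the requested output, which is why padding to a width is needed instead.

```python
import re

def prefix_zeros(path):
    # Hypothetical helper: prepend three zeros to every run of digits.
    # \g<1> is Python's unambiguous form of the backreference \1.
    return re.sub(r'(\d+)', r'000\g<1>', path)

sample = './New directory/ABC 01/XYZ 1 - M 1.txt'
result = prefix_zeros(sample)
print(result)  # ./New directory/ABC 00001/XYZ 0001 - M 0001.txt
```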


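It is fatal to use the same variable name, files, for two different purposes in your code: os.walk rebinds files on every iteration, so the list you meant to accumulate is overwritten, and appending to the list you are looping over never terminates. I renamed one instance to filenames and extended the code to do the zero-padding.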

import os
import re
filenames = []
for root, directories, files in os.walk('./Old directory'):
    # compare the extension directly; ('.txt') is a plain string, not a tuple
    files = sorted(f for f in files if os.path.splitext(f)[1] == '.txt')
    for file in files:
        filenames.append(os.path.join(root, file))
def padzeros(s, m, g, width):   # pad the group g of match m in string s
    return s[:m.start(g)] + m.group(g).zfill(width) + s[m.end(g):]
for file in filenames:
    file = file.replace('./Old directory', './New directory')
    m = re.search(r'\D+(\d+)\D+(\d+)\D+(\d+)', file)
    # important: pad from last to first match, so earlier offsets stay valid
    file = padzeros(file, m, 3, 3)
    file = padzeros(file, m, 2, 4)
    file = padzeros(file, m, 1, 5)
    print(file)
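As an alternative to slicing the string by match positions, the per-number widths can also be applied with a replacement function passed to `re.sub`. This is only a sketch, not the method used above. `pad_numbers` and its `widths` argument are names invented here, and it assumes the digit runs in the path appear in the same order as the widths (a directory name containing digits would also be counted).

```python
import re

def pad_numbers(path, widths):
    # Hypothetical helper: pad the n-th run of digits in `path` to widths[n].
    remaining = iter(widths)

    def repl(match):
        # re.sub calls this once per match, left to right.
        # Digit runs beyond the supplied widths are left unchanged.
        width = next(remaining, None)
        return match.group(0) if width is None else match.group(0).zfill(width)

    return re.sub(r'\d+', repl, path)

old = './Old directory/ABC 01/XYZ 1 - M 3.txt'
new = pad_numbers(old.replace('./Old directory', './New directory', 1), (5, 4, 3))
print(new)  # ./New directory/ABC 00001/XYZ 0001 - M 003.txt
```

Because the replacement happens in a single pass, there is no need to worry about padding from the last match to the first.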

