Replace character in file name with regex python

My script should find files in a directory via regex and replace the "|" character in their names with an "l".

The code runs, but the filenames are not replaced. What is wrong?

#!/usr/bin/python

import os
from posixpath import dirname
import re
import glob
import fnmatch

class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKCYAN = '\033[96m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m' 

#Path
file_src = dirname(os.path.abspath(__file__))

#Current directory name
print(bcolors.OKBLUE + bcolors.BOLD + 'Directory:', file_src)
'\n'

#List all files in directory
list_file = os.listdir(file_src)
print(bcolors.BOLD + 'In this directory:', '\n', list_file)
'\n'

#Finding all the "|" characters in a string
file_pattern = re.compile('[\\":<>;|*?]*')


#Replace "|" with "l"
list = str(list_file)
re.sub(file_pattern, 'l', list, re.I)

There are a few problems with your example:

list = str(list_file)

This line is

  • shadowing a built-in name in Python (list is not a reserved keyword, but you still shouldn't name a variable list),
  • I don't think it's doing what you think it's doing. It's not giving you a list of strings. It's giving you a string representation of list_file, and
  • your list_file is already a list of strings. I suspect you wrote this so that your re.sub function call would operate on a single thing, but you're better off using a list comprehension.
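To make the difference concrete, here is a small sketch with a made-up file list (the names are assumptions, not taken from the question). It shows that str() yields one long string, while a list comprehension keeps one result per file name:

```python
# Hypothetical directory listing, standing in for os.listdir(...)
list_file = ["my|file.txt", "notes.md"]

# str() produces the printed representation of the whole list: a single
# string that includes the brackets, quotes and commas.
as_text = str(list_file)

# A list comprehension handles each name separately and keeps a list.
replaced = [name.replace("|", "l") for name in list_file]

print(type(as_text).__name__, as_text)
print(replaced)
```

Running a substitution on as_text would also touch the brackets and quotes, which is rarely what is wanted for file names.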

On to the next line:

re.sub(file_pattern, 'l', list, re.I)

You'll need to perform that .sub for each str in your list_file, and assign the result to a variable:

replaced_list_file = [file_pattern.sub('l', f) for f in list_file]
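One more detail about the call re.sub(file_pattern, 'l', list, re.I) from the question: the signature is re.sub(pattern, repl, string, count=0, flags=0), so a flag passed in fourth position is taken as count. The following sketch (with a made-up input string) illustrates the effect by passing the same value by keyword:

```python
import re

# re.I is an integer flag with the value 2.
print(int(re.I))

text = "a|b|c|d"  # made-up example input

# What a positional re.I in fourth place amounts to: count=2,
# so only the first two matches are replaced.
limited = re.sub(r"\|", "l", text, count=int(re.I))

# Passing the flag by keyword leaves count at its default (replace all).
unlimited = re.sub(r"\|", "l", text, flags=re.I)

print(limited)
print(unlimited)
```

Case-insensitive matching makes no difference for a pipe character anyway, so the flag can simply be dropped here.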

But as multiple commenters have said, is that compiled pattern actually doing what you think it's doing? Have a look at this link and see if the results are what you expect.
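The pattern can also be checked locally. A minimal sketch, assuming a made-up file name: because the character class in the question is followed by *, it also matches the empty string at positions where none of the listed characters occur, so the replacement is inserted in many more places than intended.

```python
import re

name = "my|file.txt"  # made-up example name

# Pattern from the question: a character class followed by '*'.
question_pattern = re.compile('[\\":<>;|*?]*')
print(question_pattern.sub("l", name))

# Without the trailing '*', only the listed characters are replaced.
class_only = re.compile(r'[":<>;|*?]')
print(class_only.sub("l", name))

# Matching just the pipe character.
print(re.sub(r"\|", "l", name))
```

Comparing the three printed results shows quickly whether a pattern is too greedy before it gets anywhere near real file names.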

Joshua's answer and the many comments, especially the suggestions from ekhumoro, already pointed out the issues and guided towards the solution.

Fixed and improved

Here is my copy-paste-ready code, with some explanatory inline comments:

#!/usr/bin/python

import os
from posixpath import dirname
import re
import glob
import fnmatch

class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKCYAN = '\033[96m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m' 
    RESET = '\u001b[0m' # added to get regular style

def print_list(files):
    '''Print a list, one element per line.'''
    for f in files:
        print(bcolors.OKBLUE + f + bcolors.RESET)

#Path
directory = dirname(os.path.abspath(__file__))

#Current directory name
print(bcolors.BOLD + 'Directory:' + bcolors.OKBLUE, directory)
print(bcolors.RESET)

#List all files in directory
files = os.listdir(directory)
print(bcolors.BOLD + 'In this directory:' + bcolors.OKBLUE, len(files), bcolors.RESET + 'files')
print_list(files)

#Finding all the "|" characters in a string
pipe_pattern = re.compile(r'\|')  # raw string; the pipe must be escaped because in a regex it means alternation (OR)


#Replace "|" with "l"
renamed_files = []
for f in files:
    f_renamed = pipe_pattern.sub('l', f)  # no re.I here: the 4th positional argument of re.sub is count, not flags
    if f_renamed != f:
        renamed_files.append(f_renamed)

# print the list of filenames, each on a separate line
print(bcolors.BOLD, "Renamed:" + bcolors.OKGREEN, len(renamed_files), bcolors.RESET + "files")
print_list(renamed_files)

Explanation

  • A simple regex to match a pipe character is \|
  • Note: a prepended backslash is required to escape special characters (like | (or), \ (escape), ( and ) (grouping), etc.)
  • Sometimes it is useful to extract code blocks into functions (e.g. the def print_list). These can be easily tested.
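If you would rather not escape by hand, the standard library's re.escape can add the backslash for you. A minimal sketch (the file name is a made-up example):

```python
import re

literal = "|"                    # character that should be matched literally
escaped = re.escape(literal)     # adds the backslash the regex syntax needs
pipe_pattern = re.compile(escaped)

print(escaped)
print(pipe_pattern.sub("l", "my|file.txt"))
```

This is mostly useful when the character to replace comes from a variable or from user input.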

Test your replacement

To test your replacement, a simple function would help. Then you can test it with a made-up example.

def replace_pipe(file):
    return file.replace('|', 'l') # here the first argument is no regex, thus not escaped!

### Test it with an example first
print( replace_pipe('my|file.txt') )

If everything works as expected, you can add further (critical) steps.

Avoid integrating the I/O layer too early

To elaborate on the important advice from ekhumoro: os.rename is a file-system operation at the I/O layer. It has an immediate effect on your system and cannot easily be undone.

So it can be regarded as critical. Imagine your renaming does not work as expected. Then, at worst, all the files could be renamed to a cryptic mess (as harmful as ransomware).
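One way to keep the critical step separate is to compute a rename plan first and apply it only on request. The following is a sketch under assumptions, not part of the original script: the function names plan_renames and apply_renames and the dry_run parameter are made up for illustration.

```python
import os

def plan_renames(names):
    """Return (old, new) pairs for every name that contains a pipe."""
    return [(n, n.replace("|", "l")) for n in names if "|" in n]

def apply_renames(directory, plan, dry_run=True):
    """Print each planned rename; touch the file system only if dry_run is False.

    Returns the pairs that were (or, in a dry run, would be) renamed.
    """
    handled = []
    for old, new in plan:
        src = os.path.join(directory, old)
        dst = os.path.join(directory, new)
        if os.path.exists(dst):
            # Never overwrite an existing file silently.
            print("skip (target exists):", old, "->", new)
            continue
        print("rename:", old, "->", new)
        if not dry_run:
            os.rename(src, dst)
        handled.append((old, new))
    return handled

# Pure step: can be checked without any files on disk.
print(plan_renames(["my|file.txt", "notes.md"]))
```

With dry_run=True as the default, a first run only prints what would happen, so the plan can be reviewed before any file is actually renamed.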
