Rename files in Linux with regex

This is not actually a question; I solved it myself. But I wanted to post my solution here to save other people in the same situation time and effort.

I ran into a situation where I had to rename a lot of files (3000+) matching a certain pattern. In my case, the files were automatic backups from Syncthing, so a file would be renamed like this:

foo.bar -> foo~20150221-133000.bar

After a lot of searching through forums and man pages, I came up with the following one-liner, which restores the original filenames using the find, sed, xargs and mv commands in Linux:

find . -type f | sed -e 'p;s/\(.*\)~20[0-9]\{6\}-[0-9]\{6\}\(.*\)/\1\2/' | xargs -n2 -d'\n' mv

You can replace the sed part with your own pattern if you like. This command can handle whitespace in filenames, by the way (thanks to the -d'\n' flag in xargs), but not newlines. I hope some of you find this command useful.

OK, so here is some more information about what each command does:

  • find: lists all regular files (not directories) under the current directory, recursively.
  • sed: p prints every line from stdin unchanged, and s/regex/replacement/ then prints the same line with the substitution applied. So you get each file followed by the fixed filename:

     ./foo/bar~20150221-172703.txt
     ./foo/bar.txt
  • xargs: -n2 takes two lines at a time and passes them to mv as arguments; -d'\n' fixes issues with whitespace in file and folder names (the delimiter is set to newline instead of whitespace).
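To see what the sed stage hands to xargs, the pipeline can be tried on a couple of made-up paths instead of real files. This is only a sketch: the sample paths are invented for illustration, echo is placed in front of mv so nothing gets renamed, and the -d option assumes GNU xargs.

```shell
# Sample paths are invented for illustration; no files are touched.
pairs=$(printf '%s\n' \
    './foo/bar~20150221-172703.txt' \
    './my docs/notes~20150301-080000.md' |
    sed -e 'p;s/\(.*\)~20[0-9]\{6\}-[0-9]\{6\}\(.*\)/\1\2/')

# Each original path is followed by its cleaned-up version.
printf '%s\n' "$pairs"

# Dry run: echo prints the mv commands instead of executing them.
printf '%s\n' "$pairs" | xargs -n2 -d'\n' echo mv
```

Once the printed commands look right, dropping echo from the last line turns the dry run into the real rename.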

Your answer is good, but you need to proceed slightly differently if you also want to handle filenames that contain newlines, if you don't want to use sed, and if you want to make sure that only files whose basename matches the pattern are taken into account. Your method fails with, e.g., dir~20150221-172703/file, and it makes mv complain a lot about file and file being the same file whenever a filename doesn't match your pattern.
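Both problems can be reproduced without touching any files by feeding two made-up paths through the same sed expression. This is a small sketch; the paths and the variable names (fix, in_dir, no_match) are invented for illustration.

```shell
# The sed program from the one-liner, kept in a variable for reuse.
fix='p;s/\(.*\)~20[0-9]\{6\}-[0-9]\{6\}\(.*\)/\1\2/'

# Case 1: the timestamp sits in a directory name, not in the basename.
# The second output line points into ./dir, a different directory.
in_dir=$(printf '%s\n' './dir~20150221-172703/file' | sed -e "$fix")
printf '%s\n' "$in_dir"

# Case 2: the path does not match at all, so it is printed twice
# and mv would be asked to move the file onto itself.
no_match=$(printf '%s\n' './plain.txt' | sed -e "$fix")
printf '%s\n' "$no_match"
```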


One possibility, if your find supports the -print0 option (GNU find does): have find output all the filenames, and use a while loop in which Bash (not sed) performs the substitution. Something like this:

find . -type f -print0 | while IFS= read -r -d '' file; do
    dirname=${file%/*}
    basename=${file##*/}
    # Perform the substitution only on basename
    # Since you like regex, you can use them
    if [[ $basename =~ ^(.*)~20[0-9]{6}-[0-9]{6}(.*)$ ]]; then
        new_basename=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
        # Dry run: this only prints the command; remove "echo" to actually rename
        echo mv "$dirname/$basename" "$dirname/$new_basename"
    fi
done
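Before running this on real data, the loop can be exercised in a throwaway directory. The following is a sketch under a few assumptions: the scratch tree and its file names are made up, mktemp -d is available, and mv -n (do not overwrite an existing target) is used as an extra safety net that the loop above does not include.

```shell
#!/usr/bin/env bash
# Build a scratch tree so that nothing real is touched (all names are made up).
scratch=$(mktemp -d)
mkdir -p "$scratch/dir~20150221-172703" "$scratch/my docs"
touch "$scratch/dir~20150221-172703/file" \
      "$scratch/my docs/notes~20150301-080000.md" \
      "$scratch/plain.txt"

(
    cd "$scratch" || exit 1
    find . -type f -print0 | while IFS= read -r -d '' file; do
        dirname=${file%/*}
        basename=${file##*/}
        if [[ $basename =~ ^(.*)~20[0-9]{6}-[0-9]{6}(.*)$ ]]; then
            # -n: never overwrite an existing target
            mv -n -- "$dirname/$basename" \
                     "$dirname/${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
        fi
    done
)

# List what is left: only the file whose basename carried a timestamp changed.
result=$(cd "$scratch" && find . -type f | LC_ALL=C sort)
printf '%s\n' "$result"
rm -rf "$scratch"
```

The file inside dir~20150221-172703 keeps its path because only the basename is tested, and plain.txt is never passed to mv at all.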

You could also make sure find only outputs matching files by using the -regex filter (not POSIX, but GNU find supports it):

find . -regextype posix-basic -regex '.*~20[0-9]\{6\}-[0-9]\{6\}[^/]*' -type f
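The effect of the trailing [^/]* can be checked in a scratch directory: -regex is matched against the whole path, so requiring that no slash follows the timestamp restricts the match to the basename. A sketch assuming GNU find and mktemp -d; the file names are invented for illustration.

```shell
# Scratch tree with one matching basename, one matching directory name,
# and one file that does not match at all (all names are made up).
scratch=$(mktemp -d)
mkdir -p "$scratch/dir~20150221-172703" "$scratch/my docs"
touch "$scratch/dir~20150221-172703/file" \
      "$scratch/my docs/notes~20150301-080000.md" \
      "$scratch/plain.txt"

# Only the file whose own name contains the timestamp is listed.
matches=$(cd "$scratch" &&
    find . -regextype posix-basic \
           -regex '.*~20[0-9]\{6\}-[0-9]\{6\}[^/]*' -type f)
printf '%s\n' "$matches"
rm -rf "$scratch"
```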

I'm not sure this fully answers your question, but at least it fixes the flaws in your method.

Without the GNU extensions, it's more difficult to come up with something robust and clean...
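For what it's worth, here is one way a GNU-free variant might look. It is a sketch rather than a tested replacement: it relies only on find -exec ... {} +, case patterns and parameter expansion, spells out the six digits because shell patterns have no {6}, and is a dry run (echo mv). The stamp variable and the scratch tree are made up for illustration, and mktemp -d is assumed to be available.

```shell
# Scratch tree so that nothing real is touched (all names are made up).
scratch=$(mktemp -d)
mkdir -p "$scratch/dir~20150221-172703"
touch "$scratch/dir~20150221-172703/file" \
      "$scratch/bar~20150221-172703.txt" \
      "$scratch/plain.txt"

plan=$(
    cd "$scratch" || exit 1
    find . -type f -exec sh -c '
        # Shell pattern for ~20YYMMDD-HHMMSS
        stamp="~20[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]"
        for file do
            dir=${file%/*}
            base=${file##*/}
            case $base in
                *$stamp*)
                    pre=${base%$stamp*}     # text before the last timestamp
                    rest=${base#"$pre"}     # timestamp plus what follows it
                    echo mv "$dir/$base" "$dir/$pre${rest#$stamp}"
                    ;;
            esac
        done' sh {} +
)
printf '%s\n' "$plan"
rm -rf "$scratch"
```

Because find passes each path to sh as a separate argument, filenames containing spaces or newlines reach the loop intact; only the echoed preview is ambiguous for such names.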

You can use rnm:

rnm -rs '/(.*)~20[0-9]{6}-[0-9]{6}(.*)/\1\2/' -fo -dp -1 *
  1. -fo: file-only mode.
  2. -dp: depth of directory (-1 means unlimited depth, i.e. it descends into all subdirectories).

You could also use your own regex pattern exactly as it is, though in that case you would have to change the regex mode to basic (BRE):

rnm --regex basic -rs '/\(.*\)~20[0-9]\{6\}-[0-9]\{6\}\(.*\)/\1\2/' -fo -dp -1 *

Note:

  1. The only invalid characters in filenames are the null character and the path delimiter (/).
  2. The default regex mode is JavaScript.
