PowerShell renaming with regex matched group

I'm struggling to write a PowerShell command that does the following. Assume a folder containing a bunch of files with random names that match a regex pattern. I would like to capture the part that matches the pattern and rename the file to that part only.

E.g. "asdjlk-c12aa13-.pdf" should become "c12aa13.pdf" if the pattern is \w\d+\w+\d+ (or similar).

My current idea looks something like this:

Get-ChildItem | Rename-Item -NewName { $_.Name -match $pattern ... } -WhatIf

where ... needs to be replaced with something that sets the "value" of the script block (i.e. the NewName) to the matched group. In other words, I don't know how to access $matched directly after the -match command.

Also, I wonder if it's possible to do lazy matching using -match; .*? doesn't seem to do the trick.

While you could follow the -match operation with subsequent extraction of the matched part(s) via the automatic $Matches variable, it's often easier to combine the two operations with the help of the -replace operator:

To return only the parts of interest, you must match the input string in full and then ignore the parts you don't care about:

PS> 'asdjlk-c12aa13-.pdf' -replace '^.*?(\w\d+\w+\d+).*?(\.pdf)$', '$1$2'
c12aa13.pdf
  • ^.*? (lazily) matches the prefix before the part of interest.

  • (\w\d+\w+\d+) matches the part of interest, wrapped in a capture group; since it is the 1st capture group in the regex, you can refer to what it captured as $1 in the replacement operand.

  • .*? (lazily) matches everything after the part of interest, up to the .pdf filename extension.

  • (\.pdf)$ matches the filename extension .pdf at the end of the name and, as the 2nd capture group, can be referenced as $2 in the replacement operand.

  • $1$2 simply concatenates the 2 capture-group matches to output the desired name.

    • Note: Generally, use single-quoted strings for both the regex and the replacement operand, so that $ isn't accidentally interpreted by PowerShell beforehand.

    • For more information about -replace and the syntax of the replacement operand, see this answer of mine.
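
For comparison, the two-step alternative mentioned at the start of this answer (-match followed by the automatic $Matches variable) might look like the minimal sketch below. The variable names $name, $pattern and $newName are illustrative only, and the sample name is the one from the question; no files are touched.

```powershell
# Sample name and pattern taken from the question; variable names are hypothetical.
$name    = 'asdjlk-c12aa13-.pdf'
$pattern = '\w\d+\w+\d+'

if ($name -match $pattern) {
    # $Matches[0] holds the text matched by the whole pattern.
    $newName = $Matches[0] + [System.IO.Path]::GetExtension($name)
} else {
    # No match: keep the original name unchanged.
    $newName = $name
}

$newName   # c12aa13.pdf
```

Unlike -replace, this form does not require the regex to cover the whole input string, at the cost of an explicit branch for names that do not match.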


The solution in the context of your command:

Get-ChildItem |
  Rename-Item -NewName { $_.Name -replace '^.*?(\w\d+\w+\d+).*?(\.pdf)$', '$1$2' } -WhatIf

A safer method is to do a test run first (similar to -WhatIf). This example renames files from DSC12345 - X-1.jpg => DSC12345-X1.jpg

# first verify what your files will be converted to
# - gets files
# - pipes to % (ForEach-Object)
# - creates $a variable for replacement
# - echoes the replacement
Get-ChildItem . | % { $a = $_.Name -replace '^DSC(\d+)\s-\s([A-Z])-(\d)\.jpg$','DSC$1-$2$3.jpg'; echo "$($_.Name) => $a" }

# example output:
# DSC04975 - W-1.jpg => DSC04975-W1.jpg
# DSC04976 - W-2.jpg => DSC04976-W2.jpg
# DSC04977 - W-3.jpg => DSC04977-W3.jpg
# ...

# use the same command and replace "echo" with "ren"
Get-ChildItem . | % { $a = $_.Name -replace '^DSC(\d+)\s-\s([A-Z])-(\d)\.jpg$','DSC$1-$2$3.jpg'; ren $_.Name $a }

This is much safer, as renames can be disastrous when run incorrectly.
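
The same preview idea can be sketched without touching the file system at all, by building a list of old and new names and inspecting it first. The sample names and the variables $names, $plan and $changed below are hypothetical stand-ins for real Get-ChildItem output.

```powershell
# Hypothetical sample names standing in for Get-ChildItem output.
$names   = 'DSC04975 - W-1.jpg', 'DSC04976 - W-2.jpg', 'notes.txt'
$pattern = '^DSC(\d+)\s-\s([A-Z])-(\d)\.jpg$'

# Build a rename plan: -replace returns the input unchanged when the regex does not match.
$plan = foreach ($name in $names) {
    [pscustomobject]@{
        OldName = $name
        NewName = $name -replace $pattern, 'DSC$1-$2$3.jpg'
    }
}

# Only entries whose name actually changes would need a rename.
$changed = $plan | Where-Object { $_.OldName -ne $_.NewName }
$changed
```

Here notes.txt drops out of $changed because the pattern does not match it, so only the two DSC files would be renamed.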

To be honest, I am not sure if your line above will work. If "\w\d+\w+\d+" is the pattern you are looking for, I would do something like this:

[regex]$regex = '\w\d+\w+\d+'
Get-ChildItem | ?{$_.Name -match $regex} | %{Rename-Item -LiteralPath $_.FullName -NewName "$($regex.Match($_.Name).Value).pdf"}

In this case, you pipe the output of Get-ChildItem to the Where-Object filter (?{...}), and then pipe that output to the ForEach-Object loop (%{...}) to rename every object.
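
The filter-then-transform pipeline can be tried on plain strings before pointing it at real files. The sketch below spells out the full cmdlet names behind the ? and % aliases; the sample names and the $names and $newNames variables are assumptions for illustration.

```powershell
[regex]$regex = '\w\d+\w+\d+'

# Hypothetical sample names; the second one contains no digits and is filtered out.
$names = 'asdjlk-c12aa13-.pdf', 'readme.pdf'

$newNames = $names |
    Where-Object   { $_ -match $regex } |                # keep only matching names
    ForEach-Object { "$($regex.Match($_).Value).pdf" }   # first match plus extension

$newNames   # c12aa13.pdf
```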

You can put as much as you want in the script block. The [void] cast also hides the Boolean output of -match. The regex is lazy because of the "?".

Get-ChildItem | Rename-Item -NewName { [void]($_.Name -match '.+?'); $matches.0 } -WhatIf

What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/afile Destination: /Users/js/foo/a".
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/bfile Destination: /Users/js/foo/b".
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/cfile Destination: /Users/js/foo/c".
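
The effect of the lazy quantifier can be checked on a plain string. This is a small sketch using one of the sample names from the output above; the variables $name, $lazy, $greedy and $empty are illustrative only.

```powershell
$name = 'afile'

# Lazy: .+? stops after the first character it is allowed to stop at.
[void]($name -match '.+?'); $lazy = $Matches[0]     # a

# Greedy: .+ consumes as much as possible.
[void]($name -match '.+');  $greedy = $Matches[0]   # afile

# A lazy .*? with nothing after it matches the empty string.
[void]($name -match '.*?'); $empty = $Matches[0]    # (empty string)

"lazy='$lazy' greedy='$greedy' empty='$empty'"
```

A lazy quantifier matches as little as it can while still letting the rest of the pattern succeed, so a pattern ending in .*? with nothing after it captures nothing, which may be why it appeared not to work with -match.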
