
PowerShell regex to extract SID from filename

I have an array $vhdlist with contents similar to the following filenames:

UVHD-S-1-5-21-8746256374-654813465-374012747-4533.vhdx
UVHD-S-1-5-21-8746256374-654813465-374012747-6175.vhdx
UVHD-S-1-5-21-8746256374-654813465-374012747-8147.vhdx
UVHD-template.vhdx

I want to use a regex and be left with an array containing only the SID portion of the filenames.

I am using the following:

$sids = foreach ($file in $vhdlist) 
{
[regex]::split($file, '^UVHD-(?:([(\d)(\w)-]+)).vhdx$')
}

There are two problems with this: in the resulting array there are 3 blank lines for every SID, and the "template" filename matches (the resulting line in the output is just "template"). How can I get an array of SIDs as the output and not include the "template" line?

You seem to want to filter the list down to those filenames that contain an SID. Filtering is done with Where-Object (where for short); you don't need a loop.

An SID could be described as "S- and then a bunch of digits and dashes" for this simple case. That leaves us with ^UVHD-S-[\d-]*\.vhdx$ for the filename.

In combination we get:

$vhdlist | where { $_ -Match "^UVHD-S-[\d-]*\.vhdx$" }

When you don't really have an array of strings, but actually an array of files, use them directly.

dir C:\some\folder | where { $_.Name -Match "^UVHD-S-[\d-]*\.vhdx$" }

Or, possibly you can even make it as simple as:

dir C:\some\folder\UVHD-S-*.vhdx
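To try the filter without touching the file system, here is a minimal sketch that uses the sample names from the question as plain strings. The variable name $withSid is only illustrative, and the pattern is single-quoted so that PowerShell passes it to the regex engine unchanged.

```powershell
# Sample data copied from the question
$vhdlist = @(
    'UVHD-S-1-5-21-8746256374-654813465-374012747-4533.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-6175.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-8147.vhdx'
    'UVHD-template.vhdx'
)

# Keep only the names that look like 'UVHD-S-<digits and dashes>.vhdx'
$withSid = $vhdlist | Where-Object { $_ -match '^UVHD-S-[\d-]*\.vhdx$' }

$withSid
```

This keeps the three SID filenames and drops UVHD-template.vhdx. Note that it returns whole filenames, not the bare SIDs.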

EDIT

Extracting the SIDs from a list of strings can be thought of as a combined transformation (for each element, extract the SID) and filter (remove non-matches) operation.

PowerShell's ForEach-Object cmdlet (foreach for short) works like map() in other languages. It takes every input element and returns a new value. In effect it transforms a list of input elements into output elements. Together with the -replace operator you can extract SIDs this way.

$vhdlist | foreach { $_ -replace "^(?:UVHD-(S-[\d-]*)\.vhdx|.*)$","`$1" } | where { $_ -gt "" }

In .NET replacement strings, capture group 1 is referenced as $1. The $ is a special character in double-quoted PowerShell strings, so it needs to be escaped there, except when there is no ambiguity. The backtick is the PowerShell escape character. You can escape the $ in the regex as well, but there it's not necessary.
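The quoting rules can be checked with a small sketch. It assumes a single sample name from the question; the variable names $name, $a and $b are illustrative only.

```powershell
$name = 'UVHD-S-1-5-21-8746256374-654813465-374012747-4533.vhdx'

# Single quotes: PowerShell does no variable expansion, so $1 reaches the regex engine as-is
$a = $name -replace '^UVHD-(S-[\d-]*)\.vhdx$', '$1'

# Double quotes: the backtick stops PowerShell from treating $1 as a variable
$b = $name -replace "^UVHD-(S-[\d-]*)\.vhdx$", "`$1"

# Both forms should yield the bare SID
$a
$b
```

Single-quoting both arguments avoids the escaping question entirely. An unescaped "$1" in double quotes would be expanded by PowerShell as a variable named 1 before the regex engine ever sees it.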

As a final step we use where to remove empty strings (i.e. non-matches). Doing it this way around means we only need to apply the regex once, instead of twice when filtering first and replacing second.

PowerShell operators can also work on lists directly. So the above could even be shortened:

$vhdlist -replace "^(?:UVHD-(S-[\d-]*)\.vhdx|.*)$","`$1" | where { $_ -gt "" }

The shorter version only works on lists of actual strings, or of objects that produce the right thing when .ToString() is called on them.
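If you prefer to keep the loop from the question, a different technique than the -replace approach above is the -match operator together with the automatic $Matches variable. The following is a sketch under the assumption that $vhdlist holds plain strings; it uses + instead of * so that at least one character must follow "S-".

```powershell
# Sample data copied from the question
$vhdlist = @(
    'UVHD-S-1-5-21-8746256374-654813465-374012747-4533.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-6175.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-8147.vhdx'
    'UVHD-template.vhdx'
)

$sids = foreach ($file in $vhdlist) {
    # -match on a single string fills $Matches; index 1 is capture group 1
    if ($file -match '^UVHD-(S-[\d-]+)\.vhdx$') {
        $Matches[1]
    }
}

$sids
```

Non-matching names such as UVHD-template.vhdx emit nothing, so no separate filtering step is needed.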

Regex breakdown:

^                       # start-of-string anchor
(?:                     # begin non-capturing group (either...)
  UVHD-                 #   'UVHD-'
  (                     #   begin group 1
    S-[\d-]*            #     'S-' and however many digits and dashes
  )                     #   end group 1
  \.vhdx                #   '.vhdx'
  |                     #    ...or...
  .*                    #   anything else
)                       # end non-capturing group
$                       # end-of-string anchor
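The commented breakdown can also be run as written, because the .NET regex engine supports the inline option (?x), which ignores pattern whitespace and treats # as a comment marker. The sketch below assumes the sample names from the question; $pattern and $sids are illustrative names.

```powershell
# Sample data copied from the question
$vhdlist = @(
    'UVHD-S-1-5-21-8746256374-654813465-374012747-4533.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-6175.vhdx'
    'UVHD-S-1-5-21-8746256374-654813465-374012747-8147.vhdx'
    'UVHD-template.vhdx'
)

# Single-quoted here-string: nothing is expanded by PowerShell
$pattern = @'
(?x)                    # ignore pattern whitespace, allow comments
^                       # start-of-string anchor
(?:                     # begin non-capturing group (either...)
  UVHD-                 #   'UVHD-'
  (                     #   begin group 1
    S-[\d-]*            #     'S-' and however many digits and dashes
  )                     #   end group 1
  \.vhdx                #   '.vhdx'
  |                     #    ...or...
  .*                    #   anything else
)                       # end non-capturing group
$                       # end-of-string anchor
'@

# Replace each name with group 1 (empty for non-matches), then drop the empties
$sids = $vhdlist -replace $pattern, '$1' | Where-Object { $_ -ne '' }

$sids
```

Keeping the comments inside the pattern makes a longer regex easier to maintain, at the cost of a few extra lines.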
