
How does PowerShell regex work with multi-line strings?

Alright, this is driving me nuts because my regex works on Rubular, but PowerShell is not behaving as I expect.

  1. I ran Get-ChildItem on a network directory and redirected the output to a txt file.
  2. I then went to remove the directory info from the text file, which looks like the following:

(image not included)

  3. When I use PowerShell to try to write a regex that removes the directory info, I run into some problems.

When I use:

$var = Get-Content "file path"
$var -match "Directory.*"

PowerShell grabs the text I am looking for, but it doesn't grab the text that starts on the next line. I get:

Directory: \\Drive\Unit\Proposals\Names\Location\crazy folder path\even crazier folder path\unbelievable folder path\

So... when I use:

$var -match "Directory.*\n.*"

I get nothing...

When I try this on Rubular it works fine. What am I missing here? Any help would be great, thanks!

Filburt's answer is a good one, and it doesn't look like regular expressions are the best tool to use here. However, you bumped into an issue that may cause confusion again down the road. The variable you populated with Get-Content is not a multi-line string. It is an array of strings:

$var = Get-Content "file path"
$var.GetType() # Shows 'Object[]'

When you run a regex match against $var, it matches against each element of the array (each line in the file) individually. It can't match past the end of a line because the next line is a separate object.
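As a rough illustration of that per-element behavior, here is a minimal sketch. The sample lines are hypothetical stand-ins for what Get-Content might return; they are not taken from the original file.

```powershell
# Hypothetical sample data standing in for the lines Get-Content would return.
$lines = @(
    'Directory: \\Server\Share\Folder',
    '',
    'Mode    LastWriteTime    Length Name'
)

# Against an array, -match acts as a filter and returns the matching elements.
$hits = $lines -match 'Directory.*'
$hits.Count   # 1

# No single element contains a newline, so a pattern that spans lines matches nothing.
$none = $lines -match 'Directory.*\n.*'
$none.Count   # 0
```

Because the left-hand side is a collection, -match returns the matching elements instead of a Boolean, which is why the first attempt printed the Directory line and the second printed nothing.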

One workaround here is to flatten that array of strings down into a single string like this:

$var = (Get-Content "file path" | Out-String)
$var.GetType() # Shows 'String' now
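Once the lines are joined, a pattern that crosses a line break has something to match. The following is only a sketch, using hypothetical sample lines in place of a real file:

```powershell
# Hypothetical sample data; in practice these lines would come from Get-Content.
$lines = @(
    'Directory: \\Server\Share\Folder',
    '',
    'Mode    LastWriteTime    Length Name'
)

# Out-String joins the elements into one string, separated by line breaks.
$text = $lines | Out-String
$isString = $text -is [string]

# Against a single string, -match returns a Boolean and fills $Matches.
$found = $text -match 'Directory.*\n.*'
$found        # True
$Matches[0]   # the Directory line plus whatever follows on the next line
```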

In PowerShell it can sometimes be tricky to tell whether you're dealing with a single String object or an array of Strings. If you output them to the console they appear identical. In those cases, GetType() and Out-String can be useful tools.

Edit: As of PowerShell 3.0, the FileSystem provider includes a -Raw switch for Get-Content. That switch instructs Get-Content to read the file all at once without splitting it into chunks. It is significantly quicker than using the Out-String workaround, because it doesn't waste time pulling pieces apart only to put them back together again.
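To compare the two read modes side by side, here is a small sketch. The temp file name and its contents are made up for the demonstration.

```powershell
# Hypothetical temp file used only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'listing-demo.txt'
Set-Content -Path $path -Value 'Directory: \\Server\Share', '', 'Mode  Name'

$asLines = Get-Content -Path $path        # one string per line
$asText  = Get-Content -Path $path -Raw   # one string for the whole file

$asLines.GetType().Name   # Object[]
$asText.GetType().Name    # String

# Clean up the demonstration file.
Remove-Item -Path $path
```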

Why not select the properties you need before writing them to the file?

Get-ChildItem | Select-Object Mode, LastWriteTime, Length, Name | Out-File Result.txt

It's possible that the lines don't end with \n. I believe the standard line terminator in Windows is \r\n. Try rewriting your regex to match that.
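One way to write a pattern that tolerates either line ending is to make the carriage return optional. This is only a sketch, and the sample strings are hypothetical:

```powershell
# Hypothetical sample strings: same content, different line endings.
$crlf = "Directory: \\Server\Share`r`nMode  Name"
$lf   = "Directory: \\Server\Share`nMode  Name"

# \r?\n accepts a Windows (CRLF) or a Unix (LF) line break.
$pattern = 'Directory.*\r?\n.*'
$crlfMatched = $crlf -match $pattern
$lfMatched   = $lf -match $pattern

$crlfMatched   # True
$lfMatched     # True
```

Note that this only helps when the input is a single string; against the array that Get-Content returns by default, neither form of the pattern can match across lines.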
