
complex -replace with Regex in Powershell c#

The following regex:

(?<=href(\s+)?=(\s+)?")(?!(\s+)?http)(?!//).+(?=")

works as expected against these test entries:

href="//www.google-analytics.com/analytics.js">
href="https://www.google-analytics.com/analytics.js">
href="index.html">
href="..\index.html">
href="main.css">
href="..\assets\main.css">
href = " ..\assets\main.css ">

As you can see here: https://t.co/PC0U9br3vn

However:

[string] $string = Get-Content sample.txt

[string] $regex = '(?<=href(\s+)?=(\s+)?")(?!(\s+)?http)(?!(\s+)?//)(?!(\s+)?mailto).+(?=")'

$newString = $string -replace $regex, "..\$&"

$string
$newString

produces the following output:

//www.google-analytics.com/analytics.js">  href=" https://www.google-analytics.com/analytics.js">  href="index.html">  href="..\index.html">  href="  main.css">  href="..\assets\main.css">  href = " ..\assets\main.css ">  href = "mailto://email@domain ">  href = "..\..\..\assets\main.css"
//www.google-analytics.com/analytics.js">  href=" https://www.google-analytics.com/analytics.js">  href="..\index.html">  href="..\index.html">  href="  main.css">  href="..\assets\main.css">  href = " ..\assets\main.css ">  href = "mailto://email@domain ">  href = "..\..\..\assets\main.css"

That is, only the first matching entry is modified.

The same script works elsewhere, where the replacement is a plain string rather than a regex substitution.

The input is of the wrong type:

[string] $string = Get-Content sample.txt

However, an array of strings works:

[string[]] $string = Get-Content sample.txt
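
To illustrate why the type matters, here is a minimal sketch. The two-element `$lines` array is a hypothetical stand-in for the contents of sample.txt, and the regex is a shortened form of the one in the question. Casting an array to `[string]` joins its elements with a space, so the greedy `.+` then runs to the last quote of the whole string and only one replacement happens. With `[string[]]`, `-replace` is applied to each element separately.

```powershell
# Hypothetical stand-in for the lines that Get-Content would return
$lines = @(
    'href="index.html">',
    'href="main.css">'
)

# Casting to [string] joins the elements with a space (the default $OFS)
[string]   $joined = $lines
# Casting to [string[]] keeps one element per line
[string[]] $array  = $lines

$regex = '(?<=href\s*=\s*")(?!\s*http)(?!//).+(?=")'

# Single string: the greedy .+ spans from the first match to the last quote
$joinedResult = $joined -replace $regex, '..\$&'
# -> href="..\index.html"> href="main.css">

# Array: -replace runs on every element in turn
$arrayResult = $array -replace $regex, '..\$&'
# -> href="..\index.html">
# -> href="..\main.css">

$joinedResult
$arrayResult
```

The replacement string is single-quoted here so that PowerShell passes `$&` (the whole match) to the regex engine unchanged.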

All you need is a negated character class [^"]+ (see this post of mine where I explain how [^"]+ works). However, also note that (\s+)? is the same as \s*. There is no need to overstuff your regex with capturing groups if you are not planning to use them.

Use

(?<=href\s*=\s*")(?!\s*http)(?!//)[^"]+

See regex demo

Here is what it matches:

  • (?<=href\s*=\s*") - if the current position is preceded by href, then 0 or more whitespace symbols, then =, then 0 or more whitespace symbols and a " ...
  • (?!\s*http) - and if there is no http (optionally preceded by whitespace) right after the current position, and...
  • (?!//) - if there is no // right after the current position...
  • [^"]+ - match 1 or more characters other than " .
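
As a rough sketch of the pattern described above in use, the snippet below applies it to a single string. The `$text` value is a hypothetical sample assembled from the test entries in the question. Because `[^"]+` cannot cross a quote, each attribute value is matched on its own, even when all entries sit in one string.

```powershell
# Pattern from the answer: lookbehind for href=", skip http and // values,
# then match everything up to (but not including) the next quote
$regex = '(?<=href\s*=\s*")(?!\s*http)(?!//)[^"]+'

# Hypothetical sample: several entries joined into one string
$text = 'href="//www.google-analytics.com/analytics.js"> href="https://www.google-analytics.com/analytics.js"> href="index.html"> href = "main.css">'

# $& in the replacement stands for the whole match
$result = $text -replace $regex, '..\$&'
$result
# -> href="//www.google-analytics.com/analytics.js"> href="https://www.google-analytics.com/analytics.js"> href="..\index.html"> href = "..\main.css">
```

The mailto exclusion from the question's regex is not part of this pattern; it could presumably be added in the same style as another lookahead, (?!\s*mailto).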
