Complex -replace with Regex in PowerShell
The following regex:
(?<=href(\s+)?=(\s+)?")(?!(\s+)?http)(?!//).+(?=")
Works as expected with these test articles:
href="//www.google-analytics.com/analytics.js">
href="https://www.google-analytics.com/analytics.js">
href="index.html">
href="..\index.html">
href="main.css">
href="..\assets\main.css">
href = " ..\assets\main.css ">
As you may see here: https://t.co/PC0U9br3vn
However:
[string] $string = Get-Content sample.txt
[string] $regex = '(?<=href(\s+)?=(\s+)?")(?!(\s+)?http)(?!(\s+)?//)(?!(\s+)?mailto).+(?=")'
$newString = $string -replace $regex, "..\$&"
$string
$newString
Produces the following output:
//www.google-analytics.com/analytics.js"> href=" https://www.google-analytics.com/analytics.js"> href="index.html"> href="..\index.html"> href=" main.css"> href="..\assets\main.css"> href = " ..\assets\main.css "> href = "mailto://email@domain "> href = "..\..\..\assets\main.css"
//www.google-analytics.com/analytics.js"> href=" https://www.google-analytics.com/analytics.js"> href="..\index.html"> href="..\index.html"> href=" main.css"> href="..\assets\main.css"> href = " ..\assets\main.css "> href = "mailto://email@domain "> href = "..\..\..\assets\main.css"
That is, only the first article is being operated on.
The same script works elsewhere, where the replacement string is a simple string that does not use regex substitutions.
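The input is of the wrong type. Casting the output of Get-Content to [string] joins all lines of the file into a single string, so the greedy .+ runs from the first qualifying href value to the last quote:
[string] $string = Get-Content sample.txt
However, an array of strings works, because -replace is then applied to each line separately:
[string[]] $string = Get-Content sample.txt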
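The difference between the two casts can be seen in a minimal sketch. The two sample lines below are hypothetical stand-ins for the contents of sample.txt, and the regex is the shorter form from the top of the question; the replacement is written in single quotes so that PowerShell passes `$&` to the regex engine untouched.

```powershell
# Hypothetical sample lines standing in for what Get-Content would return
$lines = @(
    'href="index.html">',
    'href="main.css">'
)
$regex = '(?<=href(\s+)?=(\s+)?")(?!(\s+)?http)(?!//).+(?=")'

# Casting an array to [string] joins the elements with a space
[string]   $joined = $lines
# Casting to [string[]] keeps one element per line
[string[]] $array  = $lines

# On the joined string, the greedy .+ swallows everything up to the last quote,
# so only the first href value gets the prefix
$joinedResult = $joined -replace $regex, '..\$&'

# On the array, -replace runs once per element
$arrayResult = $array -replace $regex, '..\$&'

$joinedResult
$arrayResult
```

With the array cast each line is matched on its own, so the greedy `.+` cannot run past the end of a single href attribute.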
All you need is a negated character class, [^"]+ (see this post of mine where I explain how [^"]+ works).
However, also note that (\s+)? is the same as \s*. No need to overstuff your regex with capturing groups if you are not planning to use them.
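As a quick check of the claim that an optional `\s+` group and `\s*` accept the same input, the following sketch compares the two forms on a few hypothetical sample strings. The anchors `^` and `$` are added only for this comparison and are not part of the original pattern.

```powershell
# Hypothetical inputs: no whitespace, one space, mixed whitespace, and a non-match
$samples = @('href=', 'href =', "href `t =", 'href')

# For each sample, record whether both patterns agree on match / no match
$agree = foreach ($s in $samples) {
    $withGroup = $s -match '^href(\s+)?=$'   # optional capturing group
    $withStar  = $s -match '^href\s*=$'      # plain quantifier, no capture
    $withGroup -eq $withStar
}

$agree
```

The only practical difference is that the first form creates a capture group that the replacement never uses.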
Use
(?<=href\s*=\s*")(?!\s*http)(?!//)[^"]+
See regex demo
Here is what it matches:
(?<=href\s*=\s*") - if there is href followed by 0 or more whitespace symbols, followed by =, then again 0 or more whitespace symbols and a " before the current position...
(?!\s*http) - and if there is no 0 or more whitespace followed by http right after the current position, and...
(?!//) - if there is no // right after the current position...
[^"]+ - match 1 or more characters other than ".
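To illustrate the breakdown above, here is a minimal sketch that applies the suggested pattern to a single string containing several href attributes. The input string is a hypothetical sample built from the test articles in the question, and the `..\` prefix mirrors the replacement used there.

```powershell
$regex = '(?<=href\s*=\s*")(?!\s*http)(?!//)[^"]+'

# Hypothetical one-line input with four href attributes
$text = 'href="//www.google-analytics.com/analytics.js"> ' +
        'href="https://www.google-analytics.com/analytics.js"> ' +
        'href="index.html"> href="main.css">'

# [^"]+ stops at the closing quote, so each attribute is matched separately
# even though everything sits in one string
$result = $text -replace $regex, '..\$&'

$result
```

Because the negated character class cannot cross a quote, this form does not depend on whether the input is one joined string or an array of lines.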