Parsing an input-string with different quotes via RegEx
I need to convert an input string with multiple words into a string array via PowerShell. Words can be separated by multiple spaces and/or line breaks. Each word can be enclosed in single quotes or double quotes. Some words may start with a hash sign (#); in that case any quoting appears after the hash sign.
Here is a code sample of a possible input and the expected result:
$inputString = @"
test1
#custom1
#"custom2" #'custom3'
#"custom ""four""" #'custom ''five'''
test2 "test3" 'test4'
"@
$result = @(
'test1'
'#custom1'
'#"custom2"'
"#'custom3'"
'#"custom ""four"""'
"#'custom ''five'''"
'test2'
'"test3"'
"'test4'"
)
Is there a way to do this with a clever regex? Or does someone have a parser snippet or function to start with?
Assuming you fully control or implicitly trust the input string, you can use the following approach. It relies on Invoke-Expression, which should normally be avoided:
Assumptions made:
# only appears at the start of embedded strings.
$inputString = @"
test1
#custom1
#"custom2" #'custom3'
#"custom ""four""" #'custom ''five'''
test2 "test3" 'test4'
"@
$embeddedStrings = Invoke-Expression @"
Write-Output $($inputString -replace '\r?\n', ' ' -replace '#', '`#')
"@
Caveat: The outer quoting around the individual strings is lost in the process, and the embedded, escaped quotes are unescaped; outputting $embeddedStrings yields:
test1
#custom1
#custom2
#custom3
#custom "four"
#custom 'five'
test2
test3
test4
The approach relies on the fact that your embedded strings use PowerShell's quoting and quote-escaping rules; the only problem is the leading # characters, which PowerShell would otherwise treat as the start of a comment and which are therefore escaped as `# above. By replacing the embedded newlines (\r?\n) with spaces, the result can be passed as a list of positional arguments to Write-Output, inside a string that is then evaluated with Invoke-Expression.
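If the outer quoting has to be kept, as the question asks, or the input cannot be trusted, a plain regex can tokenize the string without evaluating it. The following is only a sketch, not part of the approach above: the pattern and the variable names $pattern and $tokens are my own, and it assumes that quotes are always balanced and that a quoted part is doubled-quote-escaped the way PowerShell does it.

```powershell
# Sample input from the question.
$inputString = @"
test1
#custom1
#"custom2" #'custom3'
#"custom ""four""" #'custom ''five'''
test2 "test3" 'test4'
"@

# Optional leading '#', followed by one of:
#   a double-quoted run in which "" is an escaped quote,
#   a single-quoted run in which '' is an escaped quote,
#   or a bare word containing no whitespace and no quotes.
# (Hypothetical pattern; single quotes are doubled because the
# pattern itself sits in a single-quoted PowerShell string.)
$pattern = '#?(?:"(?:[^"]|"")*"|''(?:[^'']|'''')*''|[^\s"'']+)'

# Collect the matched text of every token, quoting included.
$tokens = @([regex]::Matches($inputString, $pattern) | ForEach-Object { $_.Value })
$tokens
```

For the sample input this yields nine tokens, with the quoting and the doubled quotes left untouched, for example #"custom ""four""" and 'test4'. Input with unbalanced quotes is not handled gracefully; a stray quote is simply skipped or swallows text up to the next matching quote.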