Understanding sed regex pattern

I'm very new to the Linux world and I'm trying to get the hang of the basic commands. While going through one of the scripts, I came across the line below, which I couldn't understand.

sed -n -e 's|declare -x ||p' -e 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p'

Going through the sed and declare man pages, I got an idea of the flags/options, like -n and -e, but I'm not sure about the regex-like pattern given above, or what exactly the "p" at the end of the command does.

I tried to reproduce the above line on the regex101 site, but with no luck :(

The first expression simply removes declare -x (together with the space after it) from the line and prints what is left.

The second extracts the variable and value from declare -ax variable='value', dropping the single quotes around the value; the quoting is what makes the expression look complicated. The x is optional (strictly speaking, the regex allows zero or more, but you probably don't expect more than one).
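To see both expressions at work, here is a minimal sketch that feeds the command three made-up lines. The variable names and values are assumptions chosen for illustration, not output captured from a real shell.

```shell
# Hypothetical input; the names and values are invented for this example.
sample=$(cat <<'EOF'
declare -x HOME="/home/user"
declare -ax PATHS='([0]="/bin" [1]="/usr/bin")'
declare -- unrelated="1"
EOF
)

result=$(printf '%s\n' "$sample" |
  sed -n -e 's|declare -x ||p' \
         -e 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p')

printf '%s\n' "$result"
# Prints:
# HOME="/home/user"
# PATHS=([0]="/bin" [1]="/usr/bin")
```

The third line matches neither expression, so with -n it produces no output at all.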

In some more detail:

  • s|regex|replacement| just replaces the first match of regex on each line with replacement, using | as the regex delimiter instead of the default /
  • s|regex|replacement|p with the p flag prints the resulting line if the replacement occurred; this is often combined with sed -n to only print the lines where a replacement occurred.
  • 'whatever'\''something'\''more stuff' uses shell quoting to represent literal single quotes in an otherwise single-quoted string. You can't escape single quotes inside single quotes, so this uses a closing single quote, followed by a backslashed literal single quote, followed by another opening single quote to embed single quotes in the quoted string.
  • s/\(something.*\)other/\1/ replaces something or other with something or , where the backslashed parentheses specify grouping, and \1 is a back reference to the text which matched the first parenthesized group. Similarly, \2 refers to the second parenthesized group, etc.
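The delimiter, the p flag and the back reference from the list above can be tried together in a small sketch. The two input lines are made up for the example.

```shell
# Hypothetical two-line input; only the first line matches the regex.
input='something or other
nothing here'

# -n turns off automatic printing, so only the p flag produces output.
# \( \) captures "something or " and \1 puts it back, dropping "other".
result=$(printf '%s\n' "$input" | sed -n 's|\(something.*\)other|\1|p')

# Brackets make the trailing space visible.
printf '[%s]\n' "$result"
# Prints: [something or ]
```

Without -n, sed would also echo the unmatched second line, and the first line would appear twice.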

The .* inside the second pair of parentheses is actually wrong if the intent is to capture a single-quoted string; the regex should only match characters which are not single quotes (or, ideally, an expression which can contain literal single quotes as per the explanation above).
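A rough illustration of the difference, assuming a made-up input line that has another single quote after the quoted value. The stricter variant with [^']* is my own rewrite for comparison, not part of the original script.

```shell
# Hypothetical line with a stray single quote after the quoted value.
line="declare -a A='x' # it's"

# Original expression: the greedy .* runs on to the last single quote.
greedy=$(printf '%s\n' "$line" |
  sed -n 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p')

# Same expression, but the second group only accepts non-quote characters.
strict=$(printf '%s\n' "$line" |
  sed -n 's|^declare -ax* \([^=]*\)='\''\([^'\'']*\)'\''.*$|\1=\2|p')

printf '%s\n' "$greedy"   # A=x' # it
printf '%s\n' "$strict"   # A=x
```

The stricter form stops at the first closing quote, but it still cannot handle values that themselves contain single quotes.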

https://regex101.com/ is not particularly suitable for sed regex. It doesn't support the regex dialect of sed (the closest is probably the ECMAScript dialect, but you have to understand the differences anyway), and it can't tell you what the surrounding script does.

The p is a flag of the s command. On my system, it's not documented in the man page, but in the info page:

'p'
If the substitution was made, then print the new pattern space.

The '\'' dance is just a common way to insert a single quote into a bash parameter. Single quotes are removed during "quote removal" and single quotes can't be nested. So you need to end the quoted string, escape a quote, and start another quoted string. You can also find the alternative '"'"' in the wild.
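Both spellings can be checked quickly in a shell. The variable names and the sample text are just placeholders for this sketch.

```shell
# Close the quoted string, add an escaped quote, reopen the quoted string.
with_backslash='say '\''hi'\'' now'

# Alternative: put the single quote inside a short double-quoted string.
with_dquotes='say '"'"'hi'"'"' now'

printf '%s\n' "$with_backslash"   # say 'hi' now
printf '%s\n' "$with_dquotes"     # say 'hi' now
```

In both cases the shell concatenates the adjacent pieces into a single word before the command ever sees it.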

sed will therefore see this as its parameter (I used the traditional / instead of | as there's no need to use | ):

s/^declare -ax* \([^=]*\)='\(.*\)'.*$/\1=\2/p

which searches for declare at the beginning of a line ( ^ ) followed by a space, -a and possibly x or xx or xxx etc.; followed by a space and anything but = , then = , and then really anything in single quotes. We don't care what follows the last single quote. The two anythings are remembered in \1 and \2 , and the whole line is replaced by \1=\2 , i.e. the declare -axxx is removed from it, as are the outermost single quotes. If the line doesn't match the regex, nothing is printed (because of -n).
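One way to confirm what sed actually receives is to hand the same quoted argument to printf instead. This is only a sketch; the sample line at the end is invented.

```shell
# printf gets the same argument sed would; %s prints it without
# interpreting the backslashes.
seen=$(printf '%s\n' 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p')
printf '%s\n' "$seen"
# Prints: s|^declare -ax* \([^=]*\)='\(.*\)'.*$|\1=\2|p

# The unquoted script behaves the same when passed on as-is.
out=$(printf '%s\n' "declare -ax V='a b'" | sed -n "$seen")
printf '%s\n' "$out"
# Prints: V=a b
```

The '\'' sequences are gone by the time sed runs; only the plain single quotes remain in the script.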
