
Understanding sed regex pattern

I'm very new to the Linux world and I'm trying to get the hang of the basic commands. While going through one of the scripts, I came across the line below, which I couldn't comprehend.

sed -n -e 's|declare -x ||p' -e 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p'

Going through the sed and declare man pages, I got an idea of the flags/options, like -n and -e, but I'm not sure about the regex-like pattern given above, or what exactly the "p" at the end of the command does.

I tried to reproduce the line above on the regex101 site, but with no luck :(

The first expression simply removes the first declare -x (with its trailing space) from each line that contains it, and prints that line.

The second extracts the variable and value from declare -ax variable=value with some complications around quoting. The x is optional (strictly speaking the regex allows zero or more, but you probably don't expect more than one).
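As a rough illustration of the two expressions working together, here is a small sketch. The three input lines are hypothetical, written in the style that some bash versions print for declare -p; they are not taken from the original script.

```shell
# Hypothetical sample input (an assumption, not from the original script).
input=$(cat <<'EOF'
declare -x HOME="/home/me"
declare -ax ARR='([0]="a" [1]="b")'
declare -- LOCAL="1"
EOF
)

# The command from the question, split over two lines for readability.
out=$(printf '%s\n' "$input" |
  sed -n -e 's|declare -x ||p' \
         -e 's|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p')

printf '%s\n' "$out"
# HOME="/home/me"
# ARR=([0]="a" [1]="b")
```

The third line matches neither expression, so with -n it is not printed at all.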

In some more detail,

  • s|regex|replacement| replaces the first match of regex on each line with replacement, using | as the delimiter instead of the default /.
  • s|regex|replacement|p with the p flag prints the resulting line if a replacement occurred; this is often combined with sed -n to print only the lines where a replacement occurred.
  • 'whatever'\''something'\''more stuff' uses shell quoting to represent literal single quotes in an otherwise single-quoted string. You can't escape single quotes inside single quotes, so this uses a closing single quote, followed by a backslashed literal single quote, followed by another opening single quote, to embed single quotes in the quoted string.
  • s/\(something.*\)other/\1/ replaces something or other with something or, where the backslashed parentheses specify grouping, and \1 is a back reference to the text which matched the first parenthesized group. Similarly, \2 refers to the second parenthesized group, etc.
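The points in the list above can be tried out in isolation. This is a minimal sketch on made-up input (the fruit lines are purely illustrative); it shows how -n and the p flag interact, and the back reference example from the last bullet.

```shell
lines='apple pie
banana split'

# No -n, no p: sed prints every line once, changed or not.
plain=$(printf '%s\n' "$lines" | sed 's|apple|cherry|')

# -n with p: only lines where the substitution happened are printed.
only=$(printf '%s\n' "$lines" | sed -n 's|apple|cherry|p')

# p without -n: changed lines are printed twice (once by p, once by default).
twice=$(printf '%s\n' "$lines" | sed 's|apple|cherry|p')

# Back reference: keep what group 1 matched, drop the trailing "other".
ref=$(printf '%s\n' 'something or other' | sed 's/\(something.*\)other/\1/')

printf '[%s]\n' "$plain" "$only" "$twice" "$ref"
```

The brackets in the final printf are only there to make the trailing space in the last result visible.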

.* inside the second pair of parentheses is actually wrong if the intent is to capture a single-quoted string: being greedy, it runs on to the last single quote on the line. The regex should only match characters which are not single quotes (or, ideally, an expression which also allows embedded literal single quotes as per the explanation above).
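To see the difference, here is a sketch on a deliberately contrived line with two quoted strings (real declare output may never look like this). The sed scripts are put in double quotes here only so that the single quotes need no escaping; \$ keeps the shell from treating the $ specially.

```shell
# Contrived input with more than one single-quoted string on the line.
line="declare -a A='one' B='two'"

# Original pattern: greedy .* captures up to the LAST single quote.
greedy=$(printf '%s\n' "$line" |
  sed -n "s/^declare -ax* \([^=]*\)='\(.*\)'.*\$/\1=\2/p")

# Stricter pattern: [^']* stops at the first closing single quote.
strict=$(printf '%s\n' "$line" |
  sed -n "s/^declare -ax* \([^=]*\)='\([^']*\)'.*\$/\1=\2/p")

printf '%s\n' "$greedy" "$strict"
# A=one' B='two
# A=one
```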

https://regex101.com/ is not particularly suitable for sed regex. It doesn't support the regex dialect of sed (the closest is probably the ECMAScript dialect, but you have to understand the differences anyway), and can't tell you what the surrounding script does.
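One visible dialect difference is how grouping is written. By default sed uses basic regular expressions, where groups are \( and \); most online testers expect bare ( and ). The sketch below assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do) and runs the same substitution in both styles on a made-up line.

```shell
line="declare -ax ARR='(1 2)'"

# Basic regular expressions (the sed default): groups are \( ... \).
bre=$(printf '%s\n' "$line" |
  sed -n "s/^declare -ax* \([^=]*\)='\(.*\)'.*\$/\1=\2/p")

# Extended regular expressions (-E): groups are plain ( ... ).
ere=$(printf '%s\n' "$line" |
  sed -n -E "s/^declare -ax* ([^=]*)='(.*)'.*\$/\1=\2/p")

printf '%s\n' "$bre" "$ere"
# ARR=(1 2)
# ARR=(1 2)
```

Pasting the -E form into a regex tester is usually closer to what the tool expects, though the replacement syntax may still differ.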

The p is a flag of the s command. On my system, it's not documented in the man page, but in the info page.

'p'
If the substitution was made, then print the new pattern space.

The '\'' dance is just a common way to insert a single quote into a single-quoted shell argument. Single quotes are removed during "quote removal", and nothing can be escaped inside single quotes, so they can't be nested. You therefore need to end the quoted string, add an escaped quote, and start another quoted string. You can also find the alternative '"'"' in the wild.

sed will therefore see this as its argument (I used the traditional / instead of |, as there's no need for | here):
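A quick way to convince yourself is to let the shell print what it passes on. In this sketch the variable names a, b, c and arg are just illustrative; printf '%s\n' prints each argument as the shell delivers it, after quote removal.

```shell
a='it'\''s'     # close quote, escaped quote, reopen quote
b='it'"'"'s'    # close quote, double-quoted single quote, reopen quote
c="it's"        # plain double quotes, for comparison

printf '%s\n' "$a" "$b" "$c"
# it's
# it's
# it's

# The second expression from the question, as the shell hands it to sed:
arg='s|^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$|\1=\2|p'
printf '%s\n' "$arg"
# s|^declare -ax* \([^=]*\)='\(.*\)'.*$|\1=\2|p
```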

s/^declare -ax* \([^=]*\)='\(.*\)'.*$/\1=\2/p

which searches for declare at the beginning of a line (^) followed by a space, -a and possibly x or xx or xxx etc.; followed by a space and anything but =, then =, and then really anything in single quotes. We don't care what follows the last single quote. The two anythings are remembered in \1 and \2, and the whole line is replaced by \1=\2, i.e. the declare -axxx is removed from it, as are the outermost single quotes. If the line doesn't match the regex, nothing is printed.
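The walkthrough above can be checked against a few made-up lines (the variable names and values are hypothetical): one without x, one with x and trailing text, one with xx, and one that uses -i and so should not match.

```shell
out=$(sed -n 's/^declare -ax* \([^=]*\)='\''\(.*\)'\''.*$/\1=\2/p' <<'EOF'
declare -a PLAIN='(1 2)'
declare -ax EXPORTED='(3 4)' trailing text
declare -axx ODD='(5)'
declare -i COUNT='7'
EOF
)

printf '%s\n' "$out"
# PLAIN=(1 2)
# EXPORTED=(3 4)
# ODD=(5)
```

The text after the last single quote on the second line is dropped, and the -i line produces no output because the pattern requires -a.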
