
Multi line regex expression

I have the following file, and I'm trying to use /\|\".*\"\|/gm to match the multi-line field that runs from multi line to end here. The regex finds nothing when I run sed -e ':a;N;$!ba;s/\|\".*\"\|/abcxyz/gm' t > t.2 and I can't work out what's wrong with it.

    |2012-10-12 13:41:08.067|2012-10-12 13:45:03.282|f||"multi line star t herer erj
jdkajdkfj 
   end here"|2017

To quote from the GNU sed manual:

The 'M' modifier to regular-expression matching is a GNU 'sed' extension which directs GNU 'sed' to match the regular expression in 'multi-line' mode. The modifier causes '^' and '$' to match respectively (in addition to the normal behavior) the empty string after a newline, and the empty string before a newline. There are special character sequences ('\`' and '\'') which always match the beginning or the end of the buffer. In addition, the period character does not match a new-line character in multi-line mode.

(emphasis added)
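A small sketch of what that last sentence means in practice, assuming GNU sed (the M flag is a GNU extension and other sed implementations may reject it). The two-line input is made up for illustration; N joins both lines into one pattern space before the substitution runs:

```shell
# Assumes GNU sed. After N the pattern space holds "a<newline>b".
plain=$(printf 'a\nb\n' | sed 'N;s/a.b/X/')       # no M flag: . can match the newline
multiline=$(printf 'a\nb\n' | sed 'N;s/a.b/X/M')  # M flag: . refuses to match the newline
printf 'without M: %s\n' "$plain"
printf 'with M: %s\n' "$multiline"
```

Without M the two lines collapse into X; with M the substitution does not fire and both lines come back unchanged.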

You need to add \n to match the newline (or newlines) explicitly, or else do it a different way. I'd suggest using [^"]* in place of .* unless you know there's only one quoted field per record in the file, because a greedy .* runs on to the last quote it can reach.
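A minimal sketch of that suggestion, assuming GNU sed. The printf arguments stand in for a shortened copy of the record from the question, and the M flag is left off, so the negated bracket expression is free to match the newlines in the slurped pattern space:

```shell
# Assumes GNU sed. Slurp all lines, then replace the quoted field,
# newlines included, using [^"]* instead of .*
result=$(printf '%s\n' '|f||"multi line star t herer erj' 'jdkajdkfj ' '   end here"|2017' |
  sed -e ':a;N;$!ba;s/|"[^"]*"|/|"abcxyz"|/g')
printf '%s\n' "$result"
```

On this input the three lines come back as the single line |f||"abcxyz"|2017.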

sed is for simple substitutions on individual lines, that is all. If you are using sed constructs other than s, g, and p (with -n), then you are using functionality that became obsolete in the mid-1970s when awk was invented.

With GNU awk for multi-char RS:

$ awk -v RS='multi line.*end here' 'RT{print RT}' file
multi line star t herer erj
jdkajdkfj
end here

If that's not what you're looking for then edit your question to clarify the expected output and your requirements for matching it (string vs regexp, case sensitive vs insensitive, partial vs full, etc.)

This might work for you (GNU sed):

sed -e ':a;N;$!ba;s/|".*"|/|"abcxyz"|/' file

Neither the | nor the " needs to be escaped. In GNU sed's basic regular expressions an escaped \| acts as the alternation metacharacter, i.e. this\|that means this or that.
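To see the difference, here is a tiny sketch, assuming GNU sed (alternation with \| in basic regular expressions is a GNU feature). The input a|b is made up for illustration:

```shell
# Assumes GNU sed, where \| is alternation in basic regular expressions.
literal=$(printf 'a|b\n' | sed 's/|/-/')          # unescaped | matches the pipe character
alternation=$(printf 'a|b\n' | sed 's/a\|b/X/g')  # escaped \| means "a or b"
printf '%s\n' "$literal"
printf '%s\n' "$alternation"
```

The first command prints a-b and the second prints X|X, which is why the escaped pipes in the question never matched the field delimiters.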

NB: The :a;N;$!ba construct slurps the entire file into memory, and because .* is greedy, the match may span more than one record.
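A short sketch of that caveat, assuming GNU sed and a made-up two-record input where each record has one quoted field. The second command swaps in a negated bracket expression, which is one possible way to keep each match inside its own field:

```shell
# Assumes GNU sed. Two hypothetical records; the first has a multi-line field.
input=$(printf '%s\n' '1|"a' 'b"|x' '2|"c"|y')
# Greedy .* runs from the first |" to the last "| in the whole file.
greedy=$(printf '%s\n' "$input" | sed -e ':a;N;$!ba;s/|".*"|/|"Q"|/')
# [^"]* stops at the next quote, so each field is replaced separately.
bounded=$(printf '%s\n' "$input" | sed -e ':a;N;$!ba;s/|"[^"]*"|/|"Q"|/g')
printf 'greedy:\n%s\n' "$greedy"
printf 'bounded:\n%s\n' "$bounded"
```

The greedy version swallows both records and leaves 1|"Q"|y, while the bounded version keeps two records, 1|"Q"|x and 2|"Q"|y.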
