Match Multiple Strings In Awk Command Using RS And RT

Question

I have the following data:

Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
--Hello Example line 1.7
<tag>
Example line 2
</tag>
--Hello Example line 2.7
<span>Example line 4</span>

Using this command:

awk -v RS='</tag>' 'RT {gsub(/.*?<tag>|\n/, ""); print "<tag>" $0 RT}'

I get:

<tag>Example line 1</tag>
<tag>Example line 2</tag>

However, I want the output to be:

<tag>Example line 1</tag>
--Hello Example line 1.7
<tag>Example line 2</tag>
--Hello Example line 2.7

Question:

How can I add an "or" condition so that the command also matches any line that begins with --Hello ? What would be the proper way to implement this in my code?

Other options:

Alternatively, I could use grep -o '<tag.*tag>\|^--.*', but then I would also need a way to match across newlines (as asked here: Match Anything In Between Strings For Linux Grep Command ).

Any help is highly appreciated.

Answer 1

You can modify your earlier awk command to this:

awk -v RS='</tag>' '/\n--Hello /{print gensub(/.*\n(--Hello [^\n]*).*/, "\\1", "1")}
       RT{gsub(/.*<tag>|\n/, ""); print "<tag>" $0 RT}' file

<tag>Example line 1</tag>
--Hello Example line 1.7
<tag>Example line 2</tag>
--Hello Example line 2.7

Answer 2

$ cat tst.awk
BEGIN { RS="--Hello[^\\n]+|<\\/tag>" }
RT { print (RT~/^--/ ? "" : gensub(/.*(<tag>)/,"\\1",1)) RT }

$ awk -f tst.awk file
<tag>Example line 1</tag>
--Hello Example line 1.7
<tag>
Example line 2
</tag>
--Hello Example line 2.7

The above uses GNU awk for multi-char RS, RT, and gensub().

solution1
2 ACCEPTED 2016-10-14 22:17:35

solution2
0 2016-10-14 22:06:34