Using sed or awk to extract the lines between 2 patterns

I am new to shell scripting. I tried many approaches from older threads to retrieve the message from the log file, but could not get the desired output.

Below is a sample of what the message looks like:

00:31:54.184 MNK  I 4155809232 (monklog:391): The result of the mapping is : S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369
00:31:54.184 MNK  I 4155809232 (monklog:391): .05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT
00:31:54.184 MNK  I 4155809232 (monklog:391): /SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa
00:31:54.184 MNK  I 4155809232 (monklog:391):  AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||
00:31:54.184 MNK  I 4155809232 (monklog:391): ||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
00:31:54.184 MNK  I 4155809232 (monklog:391): S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369
00:31:54.184 MNK  I 4155809232 (monklog:391): .05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT
00:31:54.184 MNK  I 4155809232 (monklog:391): /SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa
00:31:54.184 MNK  I 4155809232 (monklog:391):  AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||
00:31:54.184 MNK  I 4155809232 (monklog:391): ||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
00:31:54.184 MNK  I 4155809232 (monklog:406): ||29/04/2015 01:31:00|||||||||^M

I need to get the message starting from S| and ending before ^M.

I tried these commands:

awk '/S|/{flag=1}/|^M/{flag=0}flag' $Log  > output2.txt
sed -n '/: S|/,/|^M/p' $Log > output.txt

Both give me the same input as output. Please help. Thanks.


Expected output

S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||

Each set should come out on a single line.

This works for your exact specifications of input and output

awk '{$0=substr($0,47)}/^S\|/{x=1}/\^M$/{x=0}x' file

S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369
.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT
/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa
 AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||
||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||

Is this what you want?

awk -F"): " '{$0=$NF} /^S\|/ {f=1} /\^M/ {f=0} f' file
S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369
.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT
/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa
 AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||
||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||

Not sure which S| you want to start with. There is one on line 1 and one on line 6 (both followed by the same data).

sed-based approach:

$ sed -n '/S\|/,/\^M/{

           /S\|/  {s/.*S|/S|/};
                  {s/.*[0-9]\+): //;H}
           /\^M/  {g;s/\n//g;s/\^M.*//p;};

       }' file.log
S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||S|aaaaa|bbbbb|32|D|M|28/04/2015|ccc|33208369.05|28/04/2015|0428|C|105840.|dddd|fffff|9511705558|/CTC/097/eeeeee eee|/PT/SC/TT/12/SN/eee eeeeeee/CeeY/ee -eee aa aaaa S.A.B. DE C.V./DC/aaaaa AND aaaaa aaaa/NA/aaaaa,/SK/aaaaa|D|M|28/04/2015|MXN|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||00|||||||||

Explanation:

  1. For lines between S| and ^M: /S\|/,/\^M/{
  2. If the line contains S|, remove everything up to S|: /S\|/{s/.*S|/S|/};
  3. Remove everything up to <digits>): and append the remaining string to the hold space: {s/.*[0-9]\+): //;H}. This removes prefix text such as 00:31:54.184 MNK I 4155809232 (monklog:391):
  4. For lines matching ^M, copy the entire hold space to the pattern space, remove the newlines (which were added by the H command), then remove everything from ^M onward and print: /\^M/ {g;s/\n//g;s/\^M.*//p;};
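The output shown above has both sets on one line, while the question asks for one line per set. The following is a minimal sketch of the same hold-space idea, adjusted so that each line starting with S| flushes the previous set. It rests on several assumptions: the sample data is a shortened, made-up version of the log in the question, ^M is stored as the two characters ^ and M, the prefix always ends in (monklog:<digits>): , and join_sets_sed is a hypothetical helper name. The pipe is written unescaped because, in GNU sed, \| inside a basic regular expression means alternation rather than a literal pipe.

```shell
#!/bin/sh
# Shortened sample log (assumption: same layout as in the question).
log=$(mktemp)
cat > "$log" <<'EOF'
00:31:54.184 MNK  I 4155809232 (monklog:391): The result of the mapping is : S|aaaaa|32
00:31:54.184 MNK  I 4155809232 (monklog:391): .05|28/04/2015|/PT
00:31:54.184 MNK  I 4155809232 (monklog:391):  AND aaaaa||||
00:31:54.184 MNK  I 4155809232 (monklog:391): S|bbbbb|33
00:31:54.184 MNK  I 4155809232 (monklog:391): .06|gggg||||
00:31:54.184 MNK  I 4155809232 (monklog:406): ||29/04/2015 01:31:00|||||||||^M
EOF

# join_sets_sed FILE: print every S| set on a single line (hypothetical helper).
join_sets_sed() {
    sed -n '
        # Strip the log prefix and the introductory text, if present.
        s/^.*(monklog:[0-9]*): //
        s/^The result of the mapping is : //
        # A line starting with S| opens a new set: print the previous one.
        /^S|/{
            x
            s/\n//g
            /./p
            d
        }
        # A line containing ^M closes the current set and empties the hold space.
        /\^M/{
            s/.*//
            x
            s/\n//g
            /./p
            d
        }
        # Other lines: append only while the hold space contains an open set.
        x
        /^S|/!{
            x
            d
        }
        x
        H
    ' "$1"
}

result=$(join_sets_sed "$log")
printf '%s\n' "$result"
rm -f "$log"
```

A continuation line that happens to begin with S| would be treated as the start of a new set, so this only fits logs where that cannot occur.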

Similar logic using awk :

$ awk -v FS="[0-9]+): " '
     /S\|/ && (!a){  a = a gensub(/.*S\|/,"S|","",$2); next;}
     /\^M/ && a { print a gensub(/\^M.*/,"","",$2); a=0;}
     a{a=a $2};
     ' file.log 
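Note that gensub is a gawk extension, and the script above keeps appending until it reaches ^M, so consecutive sets appear to end up on one line. Below is a sketch of a variant that uses only sub(), which other awk implementations also provide, and prints each set on its own line. The sample data is a shortened, made-up version of the log in the question, join_sets is a hypothetical helper name, and the end marker is assumed to be either the literal characters ^M or a real carriage return at the end of the line.

```shell
#!/bin/sh
# Shortened sample log (assumption: same layout as in the question).
log=$(mktemp)
cat > "$log" <<'EOF'
00:31:54.184 MNK  I 4155809232 (monklog:391): The result of the mapping is : S|aaaaa|32
00:31:54.184 MNK  I 4155809232 (monklog:391): .05|28/04/2015|/PT
00:31:54.184 MNK  I 4155809232 (monklog:391):  AND aaaaa||||
00:31:54.184 MNK  I 4155809232 (monklog:391): S|bbbbb|33
00:31:54.184 MNK  I 4155809232 (monklog:391): .06|gggg||||
00:31:54.184 MNK  I 4155809232 (monklog:406): ||29/04/2015 01:31:00|||||||||^M
EOF

# join_sets FILE: print every S| set on a single line (hypothetical helper).
join_sets() {
    awk '
    {
        # Drop the log prefix, then the introductory text if present.
        sub(/^.*\(monklog:[0-9]+\): /, "")
        sub(/^The result of the mapping is : /, "")
    }
    # A payload starting with S| begins a new set; flush the previous one.
    /^S\|/ {
        if (buf != "") print buf
        buf = $0
        next
    }
    # The end marker closes the current set and is not part of the output.
    /(\^M|\r)$/ {
        if (buf != "") print buf
        buf = ""
        next
    }
    # Continuation lines are appended only while a set is open.
    buf != "" { buf = buf $0 }
    ' "$1"
}

result=$(join_sets "$log")
printf '%s\n' "$result"
rm -f "$log"
```

A set that is still open when the file ends is dropped, since the question only asks for the text before ^M; adding an END rule that prints buf would keep it instead.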
