grep a pattern and return all characters before and after another specific character bash

I want to search a log file for a pattern that is stored in a variable. If the search finds a match, I want everything before the pattern back to the preceding '{' and everything after the pattern up to the next '}'.

To be more precise let's take the following example:

something something {
    entry 1
    entry 2
    name foo
    entry 3
    entry 4
}
something something test
test1 test2
test3 test4

In this case I would search for 'name foo', which is stored in a variable (created earlier, in a separate part of the script), and the expected output would be:

{
        entry 1
        entry 2
        name foo
        entry 3
        entry 4
}

I looked for a solution with grep, awk or sed. I could only come up with ways to find the pattern and then return all lines until '}' is met; I can't find a suitable solution for the lines before the pattern.

I found a Perl-style regex that works with grep, but I'm not able to use the variable in it: I only get output when I replace the variable with the literal 'foo'.

grep -Poz '.*(?s)\{[^}]*name\tfoo.*?\}'
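One likely reason the variable has no effect is that the pattern is single-quoted, so the shell never expands it. Here is a minimal sketch of the same grep approach with the pattern in double quotes. It assumes GNU grep built with -P support, a search string without regex metacharacters, and a space (not a tab) between the words, as in the sample. The names `log`, `search` and `block` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample log, mirroring the example in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
something something {
    entry 1
    entry 2
    name foo
    entry 3
    entry 4
}
something something test
test1 test2
test3 test4
EOF

search='name foo'   # assumed to contain no regex metacharacters

# Double quotes let the shell expand ${search}; \{ and \} reach grep unchanged.
# -z reads the whole file as one record, -o prints only the matched text,
# and tr removes the NUL byte that -z appends to each match.
# [^{}]* also matches newlines and cannot run past a brace.
block=$(grep -Poz "\{[^{}]*${search}[^{}]*\}" "$log" | tr -d '\0')

printf '%s\n' "$block"
rm -f "$log"
```

With the sample above this prints the seven lines from '{' to '}'. Using [^{}]* rather than .* keeps the match inside a single pair of braces.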

The regex is quite simple once the whole file is read into a variable:

use warnings;
use strict;
use feature 'say';

die "Usage: $0 filename\n" if not @ARGV;

my $file_content = do { local $/; <> };  # "slurp" file with given name

my $target = qr{name foo};  # compiled without /x, so the space is literal

# Braces are escaped so that they are always taken as literal characters
while ( $file_content =~ /(\{ .*? $target .*? \})/gsx ) {
    say $1;
}

Since we undefine the input record separator ($/) inside the do block using local, the following read via the null filehandle <> pulls in the whole file at once, as a single string ("slurps" it). That string is returned by the do block and assigned to the variable. The <> reads from the file(s) named in @ARGV, that is, from whatever was given on the command line when the program was invoked.

In the regex pattern, the ? after .* makes the quantifier non-greedy, so .*? matches only up to the first occurrence of the next subpattern: after { the .*? matches up to the first (evaluated) $target, then $target is matched, then .*? matches everything up to the first }. All of that is captured by the enclosing () and is thus later available in $1.

The /s modifier makes . match newlines as well, which it normally doesn't and which is necessary in order to match patterns that span multiple lines. With the /g modifier the regex keeps going through the string, searching for all such matches. With /x, literal whitespace in the pattern is ignored, so we can spread out the pattern for readability (even over several lines, with comments).

The $target is compiled as a proper regex pattern using the qr operator.

See the regex tutorial perlretut, and the full reference perlre.

Here's an Awk attempt which tries to read between the lines to articulate an actual requirement. What I'm guessing you are trying to say is that "if there is an opening brace, print all content between it and the closing brace in case of a match inside the braces. Otherwise, just print the matching line."

We accomplish this by creating a state variable in Awk which keeps track of whether you are in a brace context or not. This simple implementation will not handle nested braces correctly; if that's your requirement, maybe post a new and better question with your actual requirements.

awk -v search="foo" 'n { context[++n] = $0 }
    /{/ { delete context; n=0; matched=0; context[++n] = $0 }
    /}/ && n { if (matched) for (i=1; i<=n; i++) print context[i];
        delete context; n=0 }
    $0 ~ search { if(n) matched=1; else print }' file

The variable n is the number of lines in the collected array context; when it is zero, we are not in a context between braces. If we find a match and are collecting lines into context, defer printing until we have collected the whole context. Otherwise, just print the current line.
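The state-variable idea can also be sketched as a variant that reproduces the question's expected output exactly, by keeping only the '{' (and anything after it) from the opening line. This is an illustrative rewrite under assumptions, not the script above: it takes the search text from a shell variable, does not print matches found outside braces, and still does not handle nested braces. The names `log`, `search` and `block` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample log, mirroring the example in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
something something {
    entry 1
    entry 2
    name foo
    entry 3
    entry 4
}
something something test
test1 test2
test3 test4
EOF

search='name foo'   # used as a regex by awk; here it has no metacharacters

block=$(awk -v search="$search" '
    n { context[++n] = $0 }              # inside a block: keep collecting lines
    index($0, "{") {                     # opening brace: start a fresh block
        n = 0; matched = 0
        context[++n] = substr($0, index($0, "{"))   # drop text before the brace
    }
    n && $0 ~ search { matched = 1 }     # remember a hit inside the block
    n && index($0, "}") {                # closing brace: print or discard
        if (matched) for (i = 1; i <= n; i++) print context[i]
        n = 0
    }
' "$log")

printf '%s\n' "$block"
rm -f "$log"
```

Using index() instead of a regex for the braces sidesteps differences in how awk implementations treat { and } inside patterns.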
