
Grep a result from Hive output log

I have output from a Hive query, stored in a variable called match.

I am isolating the line I need from the log using the command below.

echo $(echo $match | grep "COUNT_TOTAL_MATCH")

0: jdbc:hive2://hiveaddress> . . . . . . . . . . . . . . . . . . . . . . .> +--------------------+-------+--+ | stats | _c1 | +--------------------+-------+--+ | COUNT_TOTAL_MATCH | 1000 | +--------------------+-------+--+ 0: jdbc:hive2://hiveaddress> 0: jdbc:hive2://hiveaddress>

How do I extract the value 1000, given that it could be any other number?

You can treat " | " (space, pipe, space) as the field delimiter and print the sixth field, like this:

awk -F ' \\| ' '{ print $6 }'

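Notice that the pipe has to be escaped twice: awk first processes escape sequences in the -F argument, turning \\| into \|, and the resulting regular expression then treats \| as a literal pipe rather than as alternation.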
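As a quick check of the field count, here is a minimal sketch that runs the command on a flattened log line like the one in the question. The variable names line and count are made up for the example, and the table borders are shortened.

```shell
# Hypothetical sample: the question's output after unquoted echo collapsed it to one line.
line='0: jdbc:hive2://hiveaddress> . . .> +-----+-----+--+ | stats | _c1 | +-----+-----+--+ | COUNT_TOTAL_MATCH | 1000 | +-----+-----+--+ 0: jdbc:hive2://hiveaddress>'

# Split on " | " and print the sixth field.
count=$(printf '%s\n' "$line" | awk -F ' \\| ' '{ print $6 }')
echo "$count"
```

Counting the pieces between each " | " gives: prompt and border, stats, _c1, border, COUNT_TOTAL_MATCH, and then the number, which is why the sixth field is the one printed.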


Side note:

echo $(echo $match | grep "COUNT_TOTAL_MATCH")

can be rewritten as

grep 'COUNT_TOTAL_MATCH' <<< "$match"

No echo, no pipes, and no word splitting in $match. echo "$(command)" produces essentially the same output as just command. (Notice that quoting makes a difference, though: unquoted, $match is word-split and its line breaks collapse into single spaces.)

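This means that you can combine your grep and awk commands into this (the field number assumes $match holds the output on a single line, as shown above; if it still contains line breaks, each table row is a separate record and the field position changes):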

awk -F ' \\| ' '/COUNT_TOTAL_MATCH/ { print $6 }' <<< "$match"
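To use the number later in a script, the combined command can be wrapped in a command substitution. This is a minimal sketch, assuming bash (for the here-string) and a value of match that is already on one line; the variable name total and the sample value are illustrative only.

```shell
# Hypothetical sample value for match, on a single line as in the question.
match='0: jdbc:hive2://hiveaddress> +-----+-----+--+ | stats | _c1 | +-----+-----+--+ | COUNT_TOTAL_MATCH | 1000 | +-----+-----+--+ 0: jdbc:hive2://hiveaddress>'

# Capture the sixth " | "-separated field of the matching line.
total=$(awk -F ' \\| ' '/COUNT_TOTAL_MATCH/ { print $6 }' <<< "$match")

# Guard against an empty or non-numeric result before using it.
case $total in
    ''|*[!0-9]*) echo "could not extract a count" >&2 ;;
    *)           echo "total matches: $total" ;;
esac
```

The case guard is optional, but it catches the situation where the field position does not line up and awk prints an empty string.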

Try:

grep -oP 'COUNT_TOTAL_MATCH\h*\|\h*\K\d+'
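  • \h*\|\h* matches optional spaces/tabs, then a literal |, then optional spaces/tabs
  • \K discards everything matched so far from the reported match, so it acts like a variable-length positive lookbehind: the digits are printed only if COUNT_TOTAL_MATCH\h*\|\h* precedes them
  • \d+ matches the digits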
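A short sketch of this pattern applied to table output that still has its line breaks. It assumes GNU grep built with PCRE support; the sample text and the variable names table and count are invented for the example.

```shell
# Hypothetical sample: the result table with its line breaks preserved.
table='+--------------------+-------+--+
|       stats        |  _c1  |
+--------------------+-------+--+
| COUNT_TOTAL_MATCH  | 1000  |
+--------------------+-------+--+'

# -o prints only the matched text; \K drops everything matched before it.
count=$(printf '%s\n' "$table" | grep -oP 'COUNT_TOTAL_MATCH\h*\|\h*\K\d+')
echo "$count"
```

Because the pattern anchors on the label rather than on a field position, it should behave the same whether the log is flattened onto one line or not.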

From man grep

   -o, --only-matching
          Print  only  the matched (non-empty) parts of a matching line, with each such part on a separate output
          line.

   -P, --perl-regexp
          Interpret  the pattern as a Perl-compatible regular expression (PCRE).  This is highly experimental and
          grep -P may warn of unimplemented features.
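Where grep -P is not available, a similar extraction can be sketched with plain sed. This is a substitute technique, not part of the answer above, and the sample line and variable names are invented for the example.

```shell
# Hypothetical sample: the table row that carries the count.
row='| COUNT_TOTAL_MATCH  | 1000  |'

# In a basic regular expression an unescaped | is a literal pipe.
# -n with the p flag prints only lines where the substitution succeeded.
count=$(printf '%s\n' "$row" | sed -n 's/.*COUNT_TOTAL_MATCH[[:space:]]*|[[:space:]]*\([0-9][0-9]*\).*/\1/p')
echo "$count"
```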
