
How can I get the length of each output line of grep?

I am very new to bash scripting. I have a network trace file I want to parse. Part of the trace file is (two packets):

    [continues...]
    +---------+---------------+----------+
    05:00:00,727,744   ETHER
    |0  
    |00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|

    +---------+---------------+----------+
    05:00:00,727,751   ETHER
    |0  
    |00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|56|00|00|3a|01|

    [continues...]

For each packet, I want to print the timestamp and the length of the packet (the number of hex values on the line that follows the |0 header), so the output will look like:

    05:00:00.727744 20 bytes
    05:00:00.727751 24 bytes

I can get the timestamp lines and the packet lines separately using grep in bash:

times=$(grep  '..\:..\:' $fileName)
packets=$(grep  '..|..|' $fileName)

But I can't work with the separate output lines after that. The whole result is concatenated in the two variables "times" and "packets". How can I get the length of each packet?

PS: A good reference that really explains how to do Bash programming, rather than just showing examples, would be appreciated.

Okay, with plain old shell...

You can get the length of the line like this:

line="|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
wc -c<<<$line
62

wc -c reports sixty-two characters for that line. Think of each byte as |00, where 00 can be any two hex digits. That's three characters per byte, plus one extra | on the end. On top of that, wc -c includes the NL on the end.

So, if we take the value of wc -c and subtract 2, we get 60. If we divide that by 3, we get 20, which is the number of bytes.
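
The same arithmetic can also be done without wc. Here is a minimal sketch using the shell's ${#var} length expansion. Since that length does not include a trailing newline, only the single closing | is subtracted before dividing by 3. The function name count_bytes is made up for this illustration.

```shell
# count_bytes is a hypothetical helper name, not part of the answer above.
# ${#1} is the length of the first argument with no trailing newline,
# so we subtract 1 (the closing "|") instead of 2, then divide by 3.
count_bytes() {
    echo $(( (${#1} - 1) / 3 ))
}

line="|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
count_bytes "$line"    # prints 20
```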

Okay, now we need a little loop that figures out which kind of line it is looking at and then parses it:

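#! /bin/bash

while read line
do
    # Timestamp line: starts with two digits
    if [[ $line =~ ^[[:digit:]]{2} ]]
    then
        # Strip the trailing " ETHER" and stay on the same output line
        echo -n "${line% *}"
    # Packet line: starts with | followed by two digits
    elif [[ $line =~ ^\|[[:digit:]]{2} ]]
    then
        length=$(wc -c<<<$line)   # character count, including the NL
        ((length-=2))             # drop the NL and the closing |
        ((length/=3))             # three characters per byte
        echo "$length bytes"
    fi
done < test.txt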

There's a PURE BASH solution to your problem!

You're a beginning Bash programmer, and you have no idea what's going on...

Let's take this one step at a time:

A common way to loop through a file in Bash is using a while read loop. This combines the while with a read:

while read line
do
   echo "My line is '$line'"
done < test.txt

Each line in test.txt is being read into the $line shell variable.
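
One detail worth knowing: a plain read trims leading and trailing whitespace and treats backslashes as escape characters. A commonly recommended variant is while IFS= read -r line, which keeps each line exactly as it is in the file. Here is a small sketch of that variant. The function name show_lines is made up, and the input comes from a here-document instead of test.txt so the example stands alone.

```shell
# show_lines is a hypothetical helper; it reads its stdin line by line.
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal.
show_lines() {
    while IFS= read -r line
    do
        echo "My line is '$line'"
    done
}

show_lines <<'EOF'
05:00:00,727,744   ETHER
    |0
EOF
```

With a plain read, the second line would lose its leading spaces. For this trace that trimming is actually convenient, which is why the loop above gets away without IFS= and -r.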

Let's take the next one:

if [[ $line =~ ^[[:digit:]]{2} ]]

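This is an if statement. Always use the [[ ... ]] brackets in Bash: unquoted variables inside them don't undergo word splitting or pathname expansion, which avoids a whole class of quoting bugs. Plus, they have a bit more power, such as the =~ operator used here.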

The =~ is a regular expression match. The [[:digit:]] matches any digit. The ^ anchors the regular expression to the beginning of the line, and {2} means I want exactly two of these. This says if I match a line that starts with two digits (which is your timestamp line), execute this if clause.
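
To try the match on its own, here is a small Bash-only sketch. The function name is_timestamp_line is made up for illustration. After a successful =~ match, Bash stores the matched text in the BASH_REMATCH array.

```shell
#!/bin/bash
# Bash-only: [[ ... =~ ... ]] and BASH_REMATCH are not POSIX sh features.
# is_timestamp_line is a hypothetical helper name.
is_timestamp_line() {
    [[ $1 =~ ^[[:digit:]]{2} ]]
}

line="05:00:00,727,744   ETHER"
if is_timestamp_line "$line"
then
    # BASH_REMATCH[0] holds the text matched by the whole pattern.
    echo "matched: ${BASH_REMATCH[0]}"    # prints: matched: 05
fi
```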

${line% *} is a pattern filter (a parameter expansion). The % removes the shortest suffix of $line that matches the glob pattern after it, which here is a space followed by anything. I use this to remove the ETHER from my line. The -n tells echo not to print a NL, so the byte count ends up on the same output line.
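
To see exactly what % strips, it can help to compare it with %%, which removes the longest matching suffix instead of the shortest. A small sketch using the sample timestamp line follows. The square brackets are only there to make trailing spaces visible.

```shell
line="05:00:00,727,744   ETHER"

# Shortest suffix matching " *" is " ETHER"; two spaces are left behind.
echo "[${line% *}]"     # prints [05:00:00,727,744  ]

# Longest suffix matching " *" starts at the first space.
echo "[${line%% *}]"    # prints [05:00:00,727,744]
```

Whether the leftover spaces matter depends on how exact the output has to be. If they do, %% is one way to get rid of them.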

Let's take my elif which is an else if clause.

elif [[ $line =~ ^\|[[:digit:]]{2} ]]

Again, I am matching a regular expression. This one matches lines that start with (the ^) a |. I have to put a backslash in front because | is a magical regular expression character, and \ kills the magic. It's now just a pipe. Then, that's followed by two digits. Note this skips |0 but catches |00.
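
One caveat about this pattern: [[:digit:]] only matches 0 to 9, so a packet line whose first byte contains a hex letter (say |a0|) would not match. Both sample packets start with |00, so the script works on the data shown. If the first byte can be arbitrary hex, [[:xdigit:]] is a possible substitute. Here is a sketch comparing the two. Both function names are made up.

```shell
#!/bin/bash
# Hypothetical helpers: same pattern, once with digit and once with xdigit.
starts_with_digit_byte() { [[ $1 =~ ^\|[[:digit:]]{2} ]]; }
starts_with_hex_byte()   { [[ $1 =~ ^\|[[:xdigit:]]{2} ]]; }

for line in "|00|03|a0|" "|a0|09|5c|" "|0"
do
    if starts_with_digit_byte "$line"; then d=yes; else d=no; fi
    if starts_with_hex_byte "$line";   then x=yes; else x=no; fi
    echo "$line digit=$d xdigit=$x"
done
# prints:
# |00|03|a0| digit=yes xdigit=yes
# |a0|09|5c| digit=no xdigit=yes
# |0 digit=no xdigit=no
```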

Now, we have to do some calculations:

length=$(wc -c<<<$line)

The $(...) says to execute the enclosed command and substitute its output back into the line. The wc -c counts the characters, and <<<$line (a here-string) is what we're counting. This gave us 62 characters. We have to subtract 2, then divide by 3. That's the next two lines:

((length-=2))
((length/=3))

The ((...)) allows me to do integer-based math. The first subtracts 2 from $length and the next divides it by 3. Now, I can echo this out:

echo "$length bytes"
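
A note on the ((...)) arithmetic: it is integer-only, so division truncates rather than rounds. A short sketch of the two steps follows, plus a made-up count (63, not from the trace) that doesn't divide evenly:

```shell
#!/bin/bash
length=62
((length-=2))      # 60
((length/=3))      # 20
echo "$length"     # prints 20

odd=63             # hypothetical count, not taken from the trace
((odd-=2))         # 61
((odd/=3))         # 61/3 truncates to 20
echo "$odd"        # prints 20
```

For well-formed packet lines the subtraction always leaves a multiple of 3, so the truncation never kicks in.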

And that's our pure Bash answer to this question.
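
The loop above prints the timestamp with its original commas (05:00:00,727,744), while the question asked for 05:00:00.727744. Here is a possible variant that rebuilds the timestamp and uses ${#line} instead of calling the external wc. It is a sketch under assumptions: the function name packet_lengths is made up, the timestamp is assumed to always have the HH:MM:SS,mmm,uuu shape from the sample, and packet bytes are assumed to be two hex digits each.

```shell
#!/bin/bash
# packet_lengths is a hypothetical name; it reads a trace on stdin.
packet_lengths() {
    local line stamp
    while read -r line
    do
        if [[ $line =~ ^([0-9]{2}:[0-9]{2}:[0-9]{2}),([0-9]{3}),([0-9]{3}) ]]
        then
            # Rebuild the timestamp as HH:MM:SS.mmmuuu
            stamp="${BASH_REMATCH[1]}.${BASH_REMATCH[2]}${BASH_REMATCH[3]}"
        elif [[ $line =~ ^\|[[:xdigit:]]{2}\| ]]
        then
            # ${#line} has no trailing NL: drop the closing "|", then /3
            echo "$stamp $(( (${#line} - 1) / 3 )) bytes"
        fi
    done
}

packet_lengths <<'EOF'
+---------+---------------+----------+
05:00:00,727,744   ETHER
|0
|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|
EOF
# prints: 05:00:00.727744 20 bytes
```

To run it on a real file, replace the here-document with a redirect such as packet_lengths < test.txt.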

You really don't want to do such things with your shell.

You want to write a real parser that understands the format to output the needed information.

For a quick and dirty hack, you can do something like this:

perl -wne 'print "$& " if /^\d\S*/; print split(/\|/)-2, " bytes\n" if /^\|..\|/'
