
Using awk to count the number of occurrences of a word in a column

03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66

03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF

03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA

I am trying to use awk to count the number of occurrences of the words "block" and "access" above in one command.

I tried the word "block" at first but my counter does not appear to be working. Can anyone see where my code is wrong?

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} end {print $count}' firewall.log

Use an array

awk '{count[$3]++} END {for (word in count) print word, count[word]}' file

If you want "block" specifically: END {print count["BLOCK"]}
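As a rough sketch of the array approach on the three sample lines from the question (the scratch file and the variable names log and result are only assumptions for illustration), both counts can be read out by key in one command:

```shell
# Hypothetical scratch file holding the sample lines from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
EOF

# Tally every value seen in column 3, then print only the two keys of interest.
# Adding 0 forces a numeric 0 instead of an empty string for a key never seen.
result=$(awk '{count[$3]++}
              END {print "BLOCK", count["BLOCK"]+0
                   print "ALLOW", count["ALLOW"]+0}' "$log")
echo "$result"
rm -f "$log"
```

Printing by key keeps the output order fixed, whereas for (word in count) visits the keys in an unspecified order.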

Here is a solution that needs almost no code. You can string the steps together with pipes ("|").

awk '{print $3}' file | sort | uniq -c
  • awk '{print $3}'

    prints the 3rd column; the default field separator in awk is whitespace.

  • sort

    sorts the results so that identical words end up on adjacent lines

  • uniq -c

    collapses adjacent repeated lines and prefixes each with its number of occurrences
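The three steps above can be tried on the sample lines from the question roughly like this (a sketch only; the input comes from printf instead of a file, and the variable name counts is just for illustration):

```shell
# sort has to run before uniq -c because uniq only merges adjacent duplicates.
counts=$(printf '%s\n' \
  '03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66' \
  '03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF' \
  '03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA' |
  awk '{print $3}' | sort | uniq -c)

# Each output line is a count followed by the word; the amount of leading
# padding before the count differs between uniq implementations.
echo "$counts"
```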

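One reason your code is not working is that END is case sensitive. Written in lowercase, end is treated as an ordinary variable used as a pattern; that variable is unset and therefore false, so the last block is never executed. If you change end to END, the block will run (you will also need print count rather than print $count to see the total).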

You also do not need the BEGIN block, because uninitialised variables in awk are treated as 0 in a numeric context.
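Both points can be checked with a small experiment (a minimal sketch on two dummy input lines; the variable names lower, upper and n are made up for the example):

```shell
# Lowercase end is an ordinary unset variable used as a pattern. It is false
# for every record, so the action never runs and nothing is printed.
lower=$(printf 'a\nb\n' | awk 'end {print "ran"}')

# Uppercase END is the special pattern that runs once after the last record.
# n is never initialised, yet n++ counts correctly because unset variables act as 0.
upper=$(printf 'a\nb\n' | awk '{n++} END {print n}')

echo "end: [$lower]  END: [$upper]"
```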

Below I have added an alternative way of doing this that you may want to use instead.

This is similar to glenn's answer but counts only the words you want, so it should use little memory.


Using gawk (required for the third argument of match):

awk 'match($3,/BLOCK|ALLOW/,b){a[b[0]]++}END{for(i in a)print i ,a[i]}' file

This block only executes if BLOCK or ALLOW are contained in the third field.
The match captures what has been matched into the array b.
Then array a is incremented for the matched field.

In the END block each captured word is printed with its count of occurrences.


The output is

ALLOW 1
BLOCK 2
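If gawk is not available, roughly the same thing can be written with the two-argument match, which is part of POSIX awk and sets RSTART and RLENGTH. This is only a sketch of an alternative, not the command from the answer above; the final sort is added here just to make the output order predictable:

```shell
# match() returns the position of the match (0 if none) and sets RSTART/RLENGTH,
# so substr() can recover the matched text without the gawk-only array argument.
portable=$(printf '%s\n' \
  '03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66' \
  '03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF' \
  '03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA' |
  awk 'match($3, /BLOCK|ALLOW/) {a[substr($3, RSTART, RLENGTH)]++}
       END {for (i in a) print i, a[i]}' | sort)
echo "$portable"
```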

I tested your statement

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} end {print $count}' firewall.log

and was able to count BLOCK successfully by making two changes:

  1. end should be in caps
  2. remove $ from print $count

So, it should be:

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} END {print count}' firewall.log 

A simpler statement that works too is:

awk '($3 == "BLOCK") {count++ } END { print count }' firewall.log
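One small caveat with the pattern-action form: if no line matches, count is never set and print count outputs an empty line. A sketch of a common workaround is to print count+0 (DENY below is a made-up value that does not occur in the sample data):

```shell
# No line has DENY in column 3, so count is never incremented.
# count+0 forces numeric context and prints 0 instead of an empty line.
none=$(printf '%s\n' \
  '03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66' \
  '03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF' |
  awk '$3 == "DENY" {count++} END {print count+0}')
echo "$none"
```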

The error in your awk invocation is that, in your "END" block, you have print $count. That takes the content of the count variable, assumes it is an integer, and attempts to find the corresponding field in the last line of input. What you really want is just print count, as that just prints the value in the count variable. It's sometimes easy to mix up different variable referencing schemes between bash, awk, python, etc., so it's an easy mistake to make.
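The difference is easy to see on a single line (a minimal sketch; count is set to 3 by hand here purely to make the field lookup visible):

```shell
line='03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA'

# count holds 3, so $count means field 3 of the current record.
as_field=$(echo "$line" | awk '{count = 3; print $count}')

# Without the dollar sign, awk prints the value stored in the variable.
as_value=$(echo "$line" | awk '{count = 3; print count}')

echo "\$count -> $as_field, count -> $as_value"
```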

I have something similar.

I am asking GitLab for a list of merge requests:

curl -Ss -k --header "PRIVATE-TOKEN: $at" "https://gitlab/api/v4/projects/111/merge_requests?state=$1&created_after=$date&target_branch=$branch&per_page=100&page=1" | jq -r '.[] | "\(.iid)\t\(.author.username)"'

and I get a list like this as output:

11039 user7
11038 user6
11037 user5
11036 user4
11035 user1
11034 user3
11033 user2
11032 user1

How can I count how many merge requests each user raised, i.e. how many were raised by user1, how many by user2, and so on?

When I store the output of this curl command in a variable:

request=$(curl -Ss -k --header "PRIVATE-TOKEN: $at" "https://gitlab/api/v4/projects/111/merge_requests?state=$1&created_after=$date&target_branch=$branch&per_page=100&page=1" | jq -r '.[] | "\(.iid)\t\(.author.username)"')

and print it like:

    echo "list of $1 requests rise today"
    echo "$request"
    echo
    echo "--------stats--------------"
    echo "\n$request" | awk '/^[0-9]/{a[$2]++}END{for (i in a) print i, a[i]}'
    echo "---------------------------"
    echo

This awk command does not show the correct counts for some options. Is there a simpler option?

Thanks for help.
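One thing that may be worth checking, assuming the script runs under bash: echo "\n$request" does not expand \n there unless -e is given, so the first line would start with a backslash and be skipped by /^[0-9]/. A sketch that sidesteps this with printf (the request value below is a made-up stand-in for the jq output, and the trailing sort is only there to give a stable order):

```shell
# Hypothetical stand-in for the jq output: iid, a tab, then the username.
request=$(printf '11039\tuser7\n11038\tuser6\n11035\tuser1\n11032\tuser1')

# printf passes the data through unchanged, so the first record still starts
# with a digit. -F'\t' splits on the tab that jq put between the two columns.
stats=$(printf '%s\n' "$request" |
  awk -F'\t' '/^[0-9]/ {a[$2]++} END {for (u in a) print u, a[u]}' | sort)
echo "$stats"
```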

The reason is that you need to print count rather than $count. Inside awk, you do not use $ to read a variable; $ selects a field by number. In your case count is 2, so awk tries to print field $2 instead of the count. The code below should work:

awk ' BEGIN {count=0;} { if ($3 == "BLOCK") count+=1} END {print count}' firewall.log

