Count the number of records in 1st column of a file using awk?

Is there a way to count the number of records in the first column of a file using awk?

My file:

abc|87123
cdb|
fgytw|23321
ghft|
|87635

expected output: 4

I tried the command below, but it's not working:

awk -F'|' 'NF==$1{c++}END {print c}' file

You can use

awk -F\| 'length($1){c++} END{print c}'

See the online demo:

#!/bin/bash
s='abc|87123
cdb|
fgytw|23321
ghft|
|87635'
awk -F\| 'length($1){c++} END{print c}' <<< "$s"
# => 4

That is, c is only incremented if the length of field 1 is greater than zero.
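To see why the attempt in the question never increments the counter, it may help to print what each condition evaluates to on every line. This is only a small sketch, assuming a POSIX awk; the function name show_conditions is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each input line, print NF, field 1,
# and the result (1 = true, 0 = false) of both conditions.
show_conditions() {
    awk -F'|' '{ printf "NF=%d f1=[%s] NF==$1:%d length($1)>0:%d\n", NF, $1, (NF == $1), (length($1) > 0) }'
}

printf 'abc|87123\ncdb|\n|87635\n' | show_conditions
# NF=2 f1=[abc] NF==$1:0 length($1)>0:1
# NF=2 f1=[cdb] NF==$1:0 length($1)>0:1
# NF=2 f1=[] NF==$1:0 length($1)>0:0
```

NF==$1 compares the number of fields (2 on every line here) with the content of field 1, so it is never true for this data, whereas length($1) is true exactly when the first column is filled in.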

$ awk -F'|' '$1 != ""{c++} END{print c+0}' file
4

You need the +0 at the end to get numeric 0 output instead of a blank line when no lines match the condition.
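The effect of +0 is easiest to see with input in which no line matches. A minimal sketch, assuming any POSIX awk; the function names count_plain and count_numeric are invented here for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: both count lines whose first |-separated field is non-empty.
count_plain()   { awk -F'|' '$1 != "" {c++} END {print c}'; }
count_numeric() { awk -F'|' '$1 != "" {c++} END {print c+0}'; }

# Input in which every first field is empty, so c is never incremented.
printf '|1\n|2\n' | count_plain     # prints an empty line: c is still uninitialized
printf '|1\n|2\n' | count_numeric   # prints 0: c+0 forces a numeric value
```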

1st solution: With your shown samples, please try the following awk code. It checks that the 1st field contains no whitespace and has a non-zero length, increments a counter for every such line, and, once the whole Input_file has been read, prints the total in the END block.

awk -F'|' '$1!~/[[:space:]]/ && length($1){count++} END{print count}' Input_file

NOTE: [[:space:]] matches spaces, tabs and other whitespace characters (such as carriage returns). Change it to [[:blank:]] if only spaces and tabs in the first column should be excluded.
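One consequence of the whitespace test above is that a first field containing whitespace anywhere, such as "foo bar", is not counted at all. The sketch below contrasts that behaviour with a more lenient variant that only requires one non-whitespace character. The lenient condition and both function names are assumptions added for illustration; they are not part of the answer.

```shell
#!/bin/sh
# strict (the condition from the answer): field 1 must be non-empty
# and must not contain any whitespace character.
count_strict() { awk -F'|' '$1 !~ /[[:space:]]/ && length($1) {c++} END {print c+0}'; }

# lenient (assumed variant): field 1 must contain at least one
# non-whitespace character; inner spaces are allowed.
count_lenient() { awk -F'|' '$1 ~ /[^[:space:]]/ {c++} END {print c+0}'; }

# Four lines: a normal field, a blank-only field, a field with an inner space, an empty field.
sample() { printf 'abc|1\n  |2\nfoo bar|3\n|4\n'; }

sample | count_strict    # 1: only "abc" qualifies
sample | count_lenient   # 2: "abc" and "foo bar" qualify
```

Which of the two is right depends on whether values with embedded spaces are valid in the first column.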



2nd solution: Using a combination of GNU grep and wc.

grep -oP '^\S+\|' Input_file | wc -l


3rd solution: As suggested in the comments by RARE kpoop Manifesto, one could also try the following.

awk -F'^[[:space:]]*[|]' '{ count += NF == 1 } END { print count}' Input_file

What about this:

echo $(( $(cat test.txt | wc -l) - $(grep "^|" test.txt | wc -l) ))

To give you an idea what it means:

cat test.txt | wc -l

This counts the number of lines in the entire file. Don't use wc -l test.txt, because it also outputs the name of the file, which you don't need.

grep "^|" test.txt | wc -l

That's a neat trick: ^ means "start of line". When it is followed by the column separator, the first column is not filled in. So grep "^|" test.txt | wc -l gives the number of lines where the first column is not filled in.

Now, how to combine both? Well, simply with arithmetic expansion, e.g. $((5-1)) for the sample file (5 lines, 1 of them with an empty first column), which performs an integer calculation.

I admit, it looks nasty, but it does the job! :-)
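The same subtraction can be written a little more compactly. The following is only a sketch under a few assumptions: count_first_col is a made-up function name, wc reads from a redirect instead of cat, and grep -c replaces the second wc -l.

```shell
#!/bin/sh
# Hypothetical helper: total number of lines minus the lines that start with the separator.
count_first_col() {
    total=$(wc -l < "$1")                # reading from stdin keeps the file name out of the output
    empty=$(grep -c '^|' "$1" || true)   # grep -c prints the match count; || true ignores its exit status on 0 matches
    echo $(( total - empty ))
}

tmp=$(mktemp)
printf 'abc|87123\ncdb|\nfgytw|23321\nghft|\n|87635\n' > "$tmp"
count_first_col "$tmp"   # 5 lines in total, 1 starts with |, so this prints 4
rm -f "$tmp"
```

Note that wc -l counts newline characters, so a last line without a trailing newline would not be included in the total.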

Another awk solution:

awk '/^[^|]/{++c} END {print c}' file

4
$ wc -l < <(sed '/^|/d' file)
4
$ sed '/^|/d' file|sed -n '$='
4
$ grep -c "^[^|]" file
4

Keep it simple. Three ways of saying the same thing:

{m,g}awk '{ _+=    NF } END { print NR-_+NR }' FS='^[|]'
{m,g}awk '{ _+=!__~NF } END { print   +_    }' FS='^[|]'
{m,g}awk '{ _+=/^\|/  } END { print NR-_    }' FS='^$'

4

If you don't mind loading the file all at once, then it's even easier:

 - single subtraction + gsub()
 - no tracking needed
 - input rows become "fields" in this context

{m,g}awk '$!NF = NF - gsub("(^|\n)[|]|\n$","&")' FS='\n' RS='^$'

4

Or, if you want to do it in reverse order (admittedly overkill for the task):

{m,g}awk '$!NF= gsub("[^|]+","&", $!(NF = NF))'   RS='^$' \
           OFS='|' FS='[|]([^|\n]*[|])*[^|\n]*\n' OFS='|'

4
