Count the number of records in 1st column of a file using awk?

Question

Is there a way to count the number of non-empty values in the first column of a file using awk?

My file:

abc|87123
cdb|
fgytw|23321
ghft|
|87635

Expected output: 4

I tried the command below, but it's not working:

awk -F'|' 'NF==$1{c++}END {print c}' file

Answer 1

You can use

awk -F\| 'length($1){c++} END{print c}'

See the online demo :

#!/bin/bash
s='abc|87123
cdb|
fgytw|23321
ghft|
|87635'
awk -F\| 'length($1){c++} END{print c}' <<< "$s"
# => 4

That is, c is incremented only when the length of field 1 is greater than zero.

Answer 2

$ awk -F'|' '$1 != ""{c++} END{print c+0}' file
4

You need the +0 at the end to get numeric 0 output instead of a blank line when no lines match the condition.
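To see what the +0 changes, here is a minimal sketch that runs both variants on input where every first field is empty. The two-line input is made up for illustration and is not from the question.

```shell
# Hypothetical input: every first field is empty, so the condition never matches.
input='|1
|2'

# Without +0, c is never assigned and prints as an empty string.
without_plus=$(printf '%s\n' "$input" | awk -F'|' '$1 != ""{c++} END{print c}')

# With +0, the uninitialized c is coerced to the number 0.
with_plus=$(printf '%s\n' "$input" | awk -F'|' '$1 != ""{c++} END{print c+0}')

printf 'without +0: [%s]\n' "$without_plus"   # prints: without +0: []
printf 'with +0:    [%s]\n' "$with_plus"      # prints: with +0:    [0]
```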

Answer 3

1st solution: With your shown samples, please try the following awk code. It checks that the 1st field contains no whitespace and has a non-zero length, counts every line of Input_file for which that holds, and prints the total in the END block.

awk -F'|' '$1!~/[[:space:]]/ && length($1){count++} END{print count}' Input_file

NOTE: You can change [[:space:]] to [[:blank:]] if the only whitespace you expect in the first column is spaces or tabs.

2nd solution: Using a combination of GNU grep and wc.

grep -oP '^\S+\|' Input_file | wc -l

3rd solution: As suggested in the comments by RARE kpoop Manifesto, one could also try the following.
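One thing worth checking with the 1st solution is that its regex test rejects a first field that contains whitespace anywhere, not only a field made up entirely of whitespace. The sketch below compares it with a looser test; the sample data and the "lenient" variant are my own assumptions, not part of the answer, and it presumes an awk that supports POSIX character classes.

```shell
# Hypothetical sample: a normal value, a space-only field,
# a value with an inner space, and an empty field.
sample='abc|1
 |2
a b|3
|4'

# The answer's test: the first field must contain no whitespace at all.
strict=$(printf '%s\n' "$sample" |
  awk -F'|' '$1 !~ /[[:space:]]/ && length($1) {n++} END {print n+0}')

# Assumed alternative: the first field needs at least one non-whitespace character.
lenient=$(printf '%s\n' "$sample" |
  awk -F'|' '$1 ~ /[^[:space:]]/ {n++} END {print n+0}')

echo "strict=$strict lenient=$lenient"   # prints: strict=1 lenient=2
```

Which of the two is right depends on whether a value such as "a b" should count as a filled-in first column.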

awk -F'^[[:space:]]*[|]' '{ count += NF == 1 } END { print count}' Input_file

Answer 4

What about this:

echo $(( $(cat test.txt | wc -l) - $(grep "^|" test.txt | wc -l) ))

To give you an idea what it means:

cat test.txt | wc -l

This counts the number of lines in the entire file. Don't use wc -l test.txt, because that also prints the name of the file, which you don't need.

grep "^|" test.txt | wc -l

That's a neat trick: ^ means "start of line". When it is followed by the column separator, the first column is not filled in. So grep "^|" test.txt | wc -l gives the number of lines where the first column is not filled in.

Now, how to combine both? Simply with shell arithmetic expansion, e.g. $((5-1)) for the sample file, which performs an integer calculation.

I admit, it looks nasty, but it does the job! :-)
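The same subtraction can be sketched with fewer processes by redirecting the file into wc (so no file name is printed) and letting grep -c do the counting. The temporary file below just recreates the question's sample; its path is arbitrary.

```shell
# Recreate the question's sample in a temporary file.
file=$(mktemp)
printf '%s\n' 'abc|87123' 'cdb|' 'fgytw|23321' 'ghft|' '|87635' > "$file"

total=$(wc -l < "$file")        # redirect so wc prints only the number
empty=$(grep -c '^|' "$file")   # lines whose first column is empty
result=$(( total - empty ))

echo "$result"                  # prints: 4
rm -f "$file"
```

Note that grep -c still prints 0 when nothing matches, but it exits with status 1, which matters if the script runs under set -e.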

Answer 5

Another awk solution:

awk '/^[^|]/{++c} END {print c}' file

4

Answer 6

$ wc -l < <(sed '/^|/d' file)
4
$ sed '/^|/d' file|sed -n '$='
4
$ grep -c "^[^|]" file
4

Answer 7

Keep it simple. Three ways of saying the same thing:

{m,g}awk '{ _+=    NF } END { print NR-_+NR }' FS='^[|]'
{m,g}awk '{ _+=!__~NF } END { print   +_    }' FS='^[|]'
{m,g}awk '{ _+=/^\|/  } END { print NR-_    }' FS='^$'

4

If you don't mind loading the whole file at once, it is even easier:

 - single subtraction + gsub()
 - no tracking needed
 - input rows become "fields" in this context

{m,g}awk '$!NF = NF - gsub("(^|\n)[|]|\n$","&")' FS='\n' RS='^$'

4

Or, if you want to do it in reverse order (admittedly overkill for the task):

{m,g}awk '$!NF= gsub("[^|]+","&", $!(NF = NF))'   RS='^$' \
           OFS='|' FS='[|]([^|\n]*[|])*[^|\n]*\n' OFS='|'

4

7 answers

solution1 3 ACCEPTED 2022-06-17 08:30:47
solution2 3 2022-06-17 12:36:29
solution3 2 2022-06-17 08:39:28
solution4 1 2022-06-17 08:54:19
solution5 1 2022-06-17 09:51:57
solution6 1 2022-06-17 09:58:24
solution7 0 2022-06-17 10:57:42
