Is there a way to count the number of non-empty records in the 1st column of a file using awk?
My file :-
abc|87123
cdb|
fgytw|23321
ghft|
|87635
expected output: 4
I tried the command below, but it's not working:
awk -F'|' 'NF==$1{c++}END {print c}' file
You can use
awk -F\| 'length($1){c++} END{print c}'
See the online demo:
#!/bin/bash
s='abc|87123
cdb|
fgytw|23321
ghft|
|87635'
awk -F\| 'length($1){c++} END{print c}' <<< "$s"
# => 4
That is, c is only incremented if the length of field 1 is greater than zero.
$ awk -F'|' '$1 != ""{c++} END{print c+0}' file
4
You need the +0 at the end to get a numeric 0 as output instead of a blank line when no lines match the condition.
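To see the difference the +0 makes, here is a minimal sketch using a made-up two-line sample (not the asker's data) in which every first field is empty, so the condition never matches:

```shell
# Hypothetical sample: every line has an empty first field, so nothing matches.
s='|1
|2'

# Without +0, c is never assigned, so awk prints it as an empty string.
without=$(printf '%s\n' "$s" | awk -F'|' '$1 != ""{c++} END{print c}')

# With +0, the uninitialized c is coerced to the number 0.
with=$(printf '%s\n' "$s" | awk -F'|' '$1 != ""{c++} END{print c+0}')

printf 'without +0: [%s]\n' "$without"
printf 'with +0:    [%s]\n' "$with"
```

The brackets in the output are only there to make the empty result visible.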
1st solution: With your shown samples, please try the following awk code. In short, it checks that the 1st field is not empty and contains no whitespace, counts each such line across the whole Input_file, and then prints the total number of matches in the END block.
awk -F'|' '$1!~/[[:space:]]/ && length($1){count++} END{print count}' Input_file
NOTE: You can also change [[:space:]] to [[:blank:]] if the first column may contain only spaces or tabs ([[:blank:]] matches just those two characters).
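As a rough illustration of why whitespace matters here, the sketch below uses a hypothetical sample (not the asker's file) in which some first fields hold only spaces or a tab. The second condition, $1 ~ /[^[:blank:]]/, is a variant assumed for this example rather than the exact test from the answer above; it accepts a field as soon as it contains one non-blank character.

```shell
# Hypothetical sample: line 2 has a first field made only of spaces,
# line 3 has a first field made only of a tab, line 4 has an empty one.
s=$(printf 'abc|1\n   |2\n\t|3\n|4\nxyz|5\n')

# length($1) alone also counts the whitespace-only fields.
by_length=$(printf '%s\n' "$s" | awk -F'|' 'length($1){c++} END{print c+0}')

# Requiring at least one non-blank character skips them.
by_nonblank=$(printf '%s\n' "$s" | awk -F'|' '$1 ~ /[^[:blank:]]/{c++} END{print c+0}')

echo "length(\$1):     $by_length"
echo "non-blank test: $by_nonblank"
```

With this sample the first count is 4 and the second is 2, so which test is right depends on whether a whitespace-only column should count as filled in.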
2nd solution: Using a GNU grep + wc combination.
grep -oP '^\S+\|' Input_file | wc -l
3rd solution: As suggested in the comments by RARE kpoop Manifesto, one could also try the following.
awk -F'^[[:space:]]*[|]' '{ count += NF == 1 } END { print count}' Input_file
What about this:
echo $(( $(cat test.txt | wc -l) - $(grep "^|" test.txt | wc -l) ))
To give you an idea of what it means:
cat test.txt | wc -l
This counts the number of lines in the entire file. Don't use wc -l test.txt because this also outputs the name of the file, which you don't need.
grep "^|" test.txt | wc -l
That's a neat trick: ^ means "start of line". When it is immediately followed by the column separator, the first column is not filled in. So, grep "^|" test.txt | wc -l gives the number of lines where the first column is not filled in.
Now, how to combine both? Simply with $(( ... )), which performs integer arithmetic; for the sample file this works out to $((5-1)).
I admit, it looks nasty, but it does the job! :-)
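A slightly more readable sketch of the same subtraction, spelled out step by step. The temporary file and the variable names total, empty and result are made up for this illustration, and grep -c is used here in place of grep | wc -l:

```shell
# Recreate the sample data in a temporary file (the path is illustrative).
file=$(mktemp)
printf '%s\n' 'abc|87123' 'cdb|' 'fgytw|23321' 'ghft|' '|87635' > "$file"

total=$(wc -l < "$file")        # all lines; reading from stdin omits the file name
empty=$(grep -c '^|' "$file")   # lines whose first column is empty
result=$(( total - empty ))     # integer arithmetic: 5 - 1

echo "$result"
# => 4
rm -f "$file"
```

Redirecting the file into wc -l has the same effect as cat test.txt | wc -l, without the extra process.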
Another awk solution:
awk '/^[^|]/{++c} END {print c}' file
4
$ wc -l < <(sed '/^|/d' file)
4
$ sed '/^|/d' file|sed -n '$='
4
$ grep -c "^[^|]" file
4
Keep it simple: 3 ways of saying the same thing:
{m,g}awk '{ _+= NF } END { print NR-_+NR }' FS='^[|]'
{m,g}awk '{ _+=!__~NF } END { print +_ }' FS='^[|]'
{m,g}awk '{ _+=/^\|/ } END { print NR-_ }' FS='^$'
4
If you don't mind loading the whole file at once, then it's even easier:
- single subtraction + gsub()
- no tracking needed
- input rows become "fields" in this context
{m,g}awk '$!NF = NF - gsub("(^|\n)[|]|\n$","&")' FS='\n' RS='^$'
4
Or, if you want to do it in reverse order (admittedly overkill for the task):
{m,g}awk '$!NF= gsub("[^|]+","&", $!(NF = NF))' RS='^$' \
OFS='|' FS='[|]([^|\n]*[|])*[^|\n]*\n' OFS='|'
4