I want to count how many times a word that starts with a special character appears in an entire file.
My word is: $name
I need a count of how many times this exact $name appears in the file.
The commands below do not give me that count:
grep "$name" /path/demo.txt | wc -w
grep "$name" /path/demo.txt | wc -l
My data in the demo.txt file:
Abc $name -> 1
name city
villagename
abczyz$name -> 1
raj
nameee
Rahul$nameeee
123name1
$namename
The count I am expecting is: 2 [exact match]
Double quotes don't protect the string from variable interpolation by the shell. If name is not a defined variable, you are actually running grep "" demo.txt after the shell replaces $name with the variable's (nonexistent) value.
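A quick way to see the difference for yourself; this is just a sketch which assumes name is unset (as it apparently is in your shell) and that you are not running with set -u. The variable names double, single and all_lines are made up for the illustration.

```shell
# Assumption: "name" is not defined, as in the question.
unset name

double="$name"   # the shell expands $name to an empty string
single='$name'   # single quotes pass the five characters through verbatim

printf 'double quotes: [%s]\n' "$double"
printf 'single quotes: [%s]\n' "$single"

# With the empty pattern, grep matches every input line.
all_lines=$(printf '%s\n' 'raj' 'nameee' | grep -c "$name")
printf 'lines matched by the empty pattern: %s\n' "$all_lines"
```

That is why the original pipelines ended up counting every word or line in the file.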
The $ character is a regex metacharacter which needs to be escaped from the regex engine, too, or you can use the -F flag to turn off regex matching and only select literal matches.
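As a small illustration of the two options, the following sketch feeds a few lines from the question to grep once with a backslash-escaped pattern and once with -F. The variable names sample, escaped and fixed are only there for the demonstration.

```shell
# Three lines borrowed from the sample data in the question.
sample=$(printf '%s\n' 'Abc $name' 'name city' 'villagename')

# Backslash-escape the dollar sign so the regex engine treats it literally ...
escaped=$(printf '%s\n' "$sample" | grep -c '\$name')

# ... or turn off regex matching altogether with -F.
fixed=$(printf '%s\n' "$sample" | grep -cF '$name')

printf 'escaped: %s, fixed: %s\n' "$escaped" "$fixed"
```

Both variants should select the same single line here; note the single quotes in both cases, so that the shell leaves the pattern alone.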
It's not clear what you mean by "word"; the requirement that $nameeee should not count as a match suggests the use of the -w option, but its exact definition of a "word" may differ from yours.
grep -c (typically) reports the number of matching lines; if a line which contains the pattern twice or more should count as multiple matches, you need a different approach.
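To make the distinction concrete, here is a sketch with a made-up input line which contains the string twice. It assumes your grep has the -o option (it is not in POSIX, but GNU and BSD grep both have it); the tr is only there because some wc implementations pad their output with spaces.

```shell
# Hypothetical input line with two occurrences of the string.
line='$name and $name again'

# -c counts matching lines, so two hits on one line still count as one.
lines=$(printf '%s\n' "$line" | grep -cF '$name')

# -o prints each match on its own line, so wc -l counts occurrences.
hits=$(printf '%s\n' "$line" | grep -oF '$name' | wc -l | tr -d ' ')

printf 'matching lines: %s, occurrences: %s\n' "$lines" "$hits"
```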
grep -woF '$name' demo.txt | wc -l
prints every match on a separate line (-o) and only searches for literal matches (-F) in isolated words (-w); the pattern is within single quotes, so that it is passed on verbatim to grep; and we count the number of generated output lines with a pipe to wc -l.
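If you need this more than once, the pipeline could be wrapped in a small function. The name count_word and its argument order are my own invention, not anything standard. Also keep in mind the caveat about -w: as far as I can tell from the GNU grep documentation, the match must be both preceded and followed by a non-word character (or the line boundary), so a line like abczyz$name would probably not be counted with this approach.

```shell
# Hypothetical helper: count_word FILE WORD
# Prints how many times WORD occurs in FILE as a literal, isolated word.
count_word() {
  # -w: whole words only, -o: one match per output line, -F: literal string.
  # "--" ends option parsing in case WORD starts with a dash.
  grep -woF -- "$2" "$1" | wc -l | tr -d ' '
}
```

You would call it like count_word demo.txt '$name', again with single quotes around the search string.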
Alternatively, you could specify a regex with an exact boundary condition. The following assumes counting the number of matching lines is sufficient, and focuses on demonstrating how to write a regex which matches $name only if it is not immediately followed by an alphabetic character or a dollar sign.
grep -E '\$name($|[^a-zA-Z$])' demo.txt
The -E option selects extended regular expression syntax, which enables some features which were not supported in the original traditional grep. (With many implementations, including GNU grep, you could equivalently backslash | and the parentheses to enable their use as alternation and grouping characters with plain grep, though POSIX only specifies the backslashed parentheses; I find this convention to be weird, and the resulting regex will be harder to read.) The first backslash changes $ from a regex metacharacter which matches end of line to an expression which simply matches a literal dollar sign. The parentheses allow either end of line ($, now with its metacharacter meaning) or a character which is neither a letter nor a dollar sign after name.
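To try the regex against the sample data from the question, something like the following should work. It copies the data verbatim into a temporary file (the variable names demo and matching_lines are arbitrary) and adds -c to get the number of matching lines instead of the lines themselves.

```shell
# Recreate the sample data; the quoted 'EOF' keeps the shell from expanding $name.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Abc $name -> 1
name city
villagename
abczyz$name -> 1
raj
nameee
Rahul$nameeee
123name1
$namename
EOF

# Count lines where $name is followed by end of line,
# or by something other than a letter or a dollar sign.
matching_lines=$(grep -cE '\$name($|[^a-zA-Z$])' "$demo")
echo "$matching_lines"   # prints 2 for this sample

rm -f "$demo"
```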
The same is moderately easy in Awk, too. Split the line on the search regex and count the number of resulting fields, minus one (if there is no separator, there will be a single field, if it occurs once, the line will be split in two fields, etc).
awk '{ n = split($0, a, /\$name($|[^a-zA-Z$])/); total += n-1 }
END { print 0+total }' demo.txt
(With GNU Awk, you could set the built-in field separator to the regex. Anyway, I went for a solution which should be portable to regular traditional / POSIX Awk.)
This is mildly more complex, but saves one external process compared to the first attempt above. That will only matter if you are running this in a really tight loop, but then you should probably optimize further to pass in a list of search strings, and search for them all in a single pass, anyway.
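For what it's worth, a single-pass search for several strings might look something like the sketch below. The function name count_all and its calling convention are hypothetical, and it assumes the search strings contain no whitespace. It sidesteps regex escaping by looking for literal strings with index(), and then applies the same boundary rule as above (not followed by a letter or a dollar sign). The strings are passed in through the environment so that Awk does not process backslashes in them.

```shell
# Hypothetical helper: count_all FILE WORD...
# Assumption: the words contain no whitespace.
count_all() {
  file=$1
  shift
  WORDS="$*" awk '
    # Count occurrences of the literal string word in line which are not
    # immediately followed by a letter or a dollar sign.
    function count(line, word,    c, p) {
      c = 0
      while ((p = index(line, word)) > 0) {
        line = substr(line, p + length(word))
        if (substr(line, 1, 1) !~ /[a-zA-Z$]/) c++
      }
      return c
    }
    BEGIN { n = split(ENVIRON["WORDS"], w, " ") }
    { for (i = 1; i <= n; i++) total[w[i]] += count($0, w[i]) }
    END { for (i = 1; i <= n; i++) print w[i], total[w[i]] + 0 }
  ' "$file"
}
```

Calling count_all demo.txt '$name' '$city' would print one line per search string with its count. It needs an Awk with user-defined functions and ENVIRON, which any POSIX Awk should have.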
Find the instances that end with $name, then count the lines:
$ grep -oE '\$name\b' file | wc -l