grep not counting characters accurately when they are clearly in file

Question

I'm trying to count the number of times '(' appears in a file. I get a number back but it's never accurate.

Why won't grep accurately count the occurrences of this character? The count has to cover the whole file, across all lines, and include every occurrence.

I imagine my regex is off, but it's so simple.

log.txt:

(eRxîó¬Pä^oË'AqŠêêÏ-04ây9Í&ñÖbèaïÄ®h0FºßôÊ$&Ð>0dÏ“ ²ˆde^áäÖÚƒíZÝ*ö¨tM
variable        1
paren )
(¼uC¼óµr\=Œ"J§ò<ƒu³ÓùËP
<åÐ#ô{ô
½ÊªÆÏglTµ¥>¦³éùoÏWÛz·ób(ÈIH|TT]
variable        0
paren )

Output:

$ grep -o "(" log.txt | wc -l

1

EDIT:

The file had a weird mix of encodings, so I dumped it as hex and counted the hex values instead:

hexdump -C hex.txt | grep "28" | wc -l

Answer 1

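You are probably running into an encoding issue: the file contains bytes from a single-byte encoding, but they are being interpreted in a multibyte locale (such as UTF-8), where many of them are not valid characters. Here's an approach that switches to a single-byte locale, deletes every byte except (, and then counts what is left: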

LC_ALL=C <log.txt tr -c -d '(' | wc -c
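If you would rather stay with grep, the same idea might be sketched like this. It assumes GNU grep, since -a (treat the input as text even if it looks binary) is a GNU extension, and the function name count_parens is only an illustrative helper, not something from the question:

```shell
# count_parens FILE: print the number of '(' bytes in FILE.
# Hypothetical helper; assumes GNU grep for the -a option.
count_parens() {
    # LC_ALL=C makes every byte a valid single-byte character.
    # -a treats the file as text, -o prints each match on its own line,
    # -F takes '(' as a literal string rather than a regex.
    LC_ALL=C grep -a -o -F '(' "$1" | wc -l
}
```

Called as count_parens log.txt, it prints one number: the total count of ( across all lines.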

Answer 2

Dump the file, whatever its encoding, as hex and count the hex values.

hexdump -C hex.txt | grep "28" | wc -l
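One caveat: grep "28" | wc -l counts the lines of the dump that contain 28, not the individual bytes, and the offset and ASCII columns of hexdump -C can contain 28 as well. A more careful variant, as a rough sketch, prints one hex byte per line and counts exact matches. It uses od instead of hexdump, and the function name count_byte_28 is made up for illustration:

```shell
# count_byte_28 FILE: print how many bytes in FILE have the value 0x28, i.e. '('.
# Hypothetical helper name; uses POSIX od rather than hexdump.
count_byte_28() {
    # -An  : omit the offset column
    # -v   : do not collapse repeated lines into '*'
    # -tx1 : print each byte as a two-digit hex value
    # tr puts every hex value on its own line; grep -c -x counts the
    # lines that are exactly "28".
    od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep -c -x '28'
}
```

Called as count_byte_28 log.txt, it counts every ( byte no matter which encoding the surrounding bytes are in.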

Answer 3

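Using sed, then counting with wc (doing the counting in sed alone is a bit heavy). This collects the whole file in the hold space, strips every byte that is not (, and counts what remains. LC_ALL=C keeps sed from skipping bytes that are invalid in a multibyte locale, and tr drops the newlines so they are not counted:

LC_ALL=C sed -e '1h;1!H;$!d' -e 'x;s/[^(]//g' yourfile | tr -d '\n' | wc -c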

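Using awk, with ( as the field separator, so each line contributes its number of fields minus one. Empty lines (NF = 0) are skipped so they do not subtract from the total:

awk -F '(' 'NF { Total += NF - 1 } END { print Total + 0 }' YourFile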
