grep not counting characters accurately when they are clearly in file

I'm trying to count the number of times '(' appears in a file. I get a number back but it's never accurate.

Why won't grep accurately count the occurrences of this character? It needs to work across multiple lines and count every occurrence.

I imagine my regex is off, but it's so simple.

log.txt:

(eRxîó¬Pä^oË'AqŠêêÏ-04ây9Í&ñ­ÖbèaïÄ®h0FºßôÊ$&Ð>0dÏ“ ²ˆde^áä­ÖÚƒíZÝ*ö¨tM
variable        1
paren )
(¼uC¼óµr\=Œ"J§ò<ƒu³ÓùËP
<åÐ#ô{ô
½ÊªÆÏglTµ¥>¦³éùoÏWÛz·ób(ÈIH|TT]
variable        0
paren )

Output:

$ grep -o "(" log.txt | wc -l

1

EDIT:

I had a weird mix of encodings, so I dumped the file and counted the hex values.

hexdump -C hex.txt | grep "28" | wc -l

You might have an encoding issue: the file uses a single-byte encoding but is being interpreted in a multibyte locale. Here's an approach that, in a single-byte locale, deletes everything except ( and then counts the remaining characters:

LC_ALL=C <log.txt tr -c -d '(' | wc -c
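To see why the locale matters, here is a minimal sketch, assuming a POSIX shell with printf, tr and wc. The helper name count_parens and the file name sample.bin are made up for illustration; the bytes \351 and \374 are Latin-1 letters that are not valid UTF-8 on their own.

```shell
# Hypothetical helper: count '(' bytes in a file, regardless of its encoding.
count_parens() {
    # In the C locale every byte is one character, so tr does not trip over
    # invalid multibyte sequences. -c complements the set, -d deletes it.
    # The final tr strips the padding some wc implementations print.
    LC_ALL=C tr -c -d '(' < "$1" | wc -c | tr -d ' '
}

# Build a small sample (assumed name) with Latin-1 bytes and three '('.
printf '(\351abc(\374\n(x)\n' > sample.bin

count_parens sample.bin   # prints 3
```

With GNU grep, a likely reason for the original result of 1 is that grep decides the file is binary and prints a single "Binary file ... matches" line, which wc -l then counts. The -a (--text) option tells grep to treat the input as text, but the tr approach avoids the question entirely.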

Dump the file of unknown encoding and count the hex values.

hexdump -C hex.txt | grep "28" | wc -l
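One caveat with the pipeline above: hexdump -C prints 16 bytes per line along with an offset column and an ASCII column, so grep "28" | wc -l counts lines that contain the text 28 rather than individual 0x28 bytes. A minimal sketch that puts each byte on its own line first, assuming POSIX od, tr and grep (the helper name count_byte_hex and the sample file are made up for illustration):

```shell
# Hypothetical helper: count occurrences of one byte value in a file.
# $1 = byte as two lowercase hex digits, $2 = file name.
count_byte_hex() {
    # od -An drops the offset column, -v disables "*" squeezing of repeated
    # lines, -tx1 prints every byte as two hex digits.
    # tr then puts each hex byte on its own line, and grep -c counts the
    # lines that consist of exactly the requested byte.
    od -An -v -tx1 "$2" | tr -s ' ' '\n' | grep -c "^$1\$"
}

# Sample with three '(' (0x28) bytes, two of them on the same dump line.
printf '(aaaaaaaaaaaaaaaa((\351\n' > sample.bin

count_byte_hex 28 sample.bin   # prints 3
```

Counting per byte also avoids false matches when 28 happens to appear inside the offset column.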

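Using sed, then counting with wc (doing the counting in sed alone would be a bit heavy). The whole file is collected in the hold space (H appends, h would overwrite), everything except ( is deleted, and the trailing newline is stripped so that wc -c counts only the parentheses:

LC_ALL=C sed -e '1h;1!H;$!d' -e 'x;s/[^(]//g' yourfile | tr -d '\n' | wc -c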

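Using awk, with ( as the field separator so that each line contributes NF - 1 occurrences (empty lines have NF = 0 and are skipped):

awk -F '(' 'NF { Total += NF - 1 } END { print Total + 0 }' YourFile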
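A variant of the awk idea, offered as a sketch rather than a drop-in replacement: gsub returns the number of substitutions it made, so summing its return value counts every ( directly, with no dependence on how fields are split. LC_ALL=C is an assumption here so that awk treats the input as plain bytes; count_parens_awk and sample.txt are illustrative names.

```shell
# Hypothetical helper: count '(' characters with awk's gsub return value.
count_parens_awk() {
    # gsub replaces every '(' in the current line and returns how many
    # replacements it made; "+ 0" prints 0 instead of an empty line when
    # the file contains no '(' at all.
    LC_ALL=C awk '{ total += gsub(/[(]/, "") } END { print total + 0 }' "$1"
}

# Sample with an empty line in the middle and three '(' overall.
printf '(a(\n\n(b)\n' > sample.txt

count_parens_awk sample.txt   # prints 3
```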
