Counting lines with non-printable characters on BSD

Question

I am trying to sort out some bad data in a file on a BSD-style system, which means that I do not have the -P option in grep. I have 7 million lines of data, and a subset has some strange characters. If you run less on the file, you'll see something like this:

290437430@89
9^@0333465@88
290348389@87
290342818@8^@

The ^@ marks a bad, non-ASCII character that showed up due to noise on the serial line when the characters were sent. These lines are corrupt, and I want to count the number of corrupt data strings.

Any suggestions would be greatly appreciated.

Answer 1

As per Chepner's suggestion, adding the following solution here:

grep -c '\x00' Input_file

The following two commands match only the literal characters shown.

If you only want to count lines containing @, a simple grep will do:

grep -c "@"  Input_file

Or, to count lines containing the literal two-character sequence ^@, the following may help:

grep -c "\^@"  Input_file
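If the goal is to count every line that contains any non-printable byte, not just one specific character, a rough sketch using only tr and grep might look like the following. The function name count_corrupt is made up for illustration. It assumes that anything outside printable ASCII (other than the newline) counts as corruption, so tabs and carriage returns would be counted too. tr rewrites every such byte to the control character \001, and grep then counts the lines containing \001, so the pattern never has to contain a NUL.

```shell
# count_corrupt FILE  (hypothetical helper name)
# Prints the number of lines in FILE that contain at least one byte
# outside the printable ASCII range. Newlines are left untouched.
count_corrupt() {
    # LC_ALL=C makes both tools work byte by byte.
    # tr -c complements the set: every byte that is NOT printable and
    # NOT a newline is replaced by \001, which is itself non-printable,
    # so clean lines can never contain the marker.
    LC_ALL=C tr -c '[:print:]\n' '\001' < "$1" |
        LC_ALL=C grep -c "$(printf '\001')"
}
```

With the function defined in the current shell, count_corrupt Input_file prints the count. grep -c still prints 0 when nothing matches, though its exit status is then non-zero.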

Answer 1
2 Accepted 2018-03-22 13:21:04
