Counting the number of 10-digit numbers in a file

I need to count the total number of instances in which a 10-digit number appears within a file. All of the numbers have leading zeros, e.g.:

This is some text. 0000000001

Returns:

1

If the same number appears more than once, it is counted again, e.g.:

0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.

Returns:

3

Sometimes there are no spaces between the numbers, but each continuous string of 10 digits should be counted:

00000000010000000010000000000100000000010000000001

Returns:

5

How can I determine the total number of 10-digit numbers appearing in a file?

Try this:

grep -o '[0-9]\{10\}' inputfilename | wc -l
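A minimal sketch of why this works, assuming a grep that supports `-o` (GNU and BSD grep both do): `-o` prints each non-overlapping match on its own line, so `wc -l` ends up counting matches rather than matching lines. The function name `count_ten_digit` and the sample file are made up for illustration.

```shell
# Hypothetical helper: count non-overlapping 10-digit runs in a file.
count_ten_digit() {
    # -o prints every match on its own line; wc -l then counts those lines.
    # tr strips the padding that some wc implementations add.
    grep -o '[0-9]\{10\}' "$1" | wc -l | tr -d ' '
}

# Sample input taken from the question: one number on the first line,
# five numbers run together on the second.
sample=$(mktemp)
printf '%s\n' \
    '0000000001 This is some text.' \
    '00000000010000000010000000000100000000010000000001' > "$sample"

count_ten_digit "$sample"   # prints 6
rm -f "$sample"
```

Because matches do not overlap, a run of 11 to 19 digits still counts as a single number here.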

The last requirement, that you need to count multiple numbers per line, excludes grep; as far as I know, it can only count per line.

Edit: Obviously, I stand corrected by Nate :) grep's -o option is what I was looking for.

You can, however, do this easily with sed:

$ cat mkt.sh 
sed -r -e 's/[^0-9]/./g' -e 's/[0-9]{10}/num /g' -e 's/[0-9.]//g' "$1"
$ for i in *.txt; do echo "--- $i"; cat "$i"; echo "--- number count"; ./mkt.sh "$i" | wc -w; done
--- 1.txt
This is some text. 0000000001

--- number count
1
--- 2.txt
0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.

--- number count
3
--- 3.txt
00000000010000000010000000000100000000010000000001

--- number count
5
--- 4.txt
1 2 3 4 5 6 6 7 9 0
11 22 33 44 55 66 77 88 99 00
123456789 0

--- number count
0
--- 5.txt
1.2.3.4.123
1234567890.123-AbceCMA-5553///q/\1231231230
--- number count
2
$ 
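The three sed expressions in mkt.sh can be followed one step at a time. Here is a rough sketch on the last line of 5.txt, assuming a sed that accepts `-E` for extended regular expressions (GNU sed treats it the same as `-r`); the variable names are only for illustration.

```shell
line='1234567890.123-AbceCMA-5553///q/\1231231230'

# Step 1: turn every non-digit into a dot, so only digits and dots remain.
step1=$(printf '%s\n' "$line" | sed -E 's/[^0-9]/./g')

# Step 2: replace each run of 10 digits with the word "num ".
step2=$(printf '%s\n' "$step1" | sed -E 's/[0-9]{10}/num /g')

# Step 3: delete the leftover digits and dots; only the "num" words survive.
step3=$(printf '%s\n' "$step2" | sed -E 's/[0-9.]//g')

# Show each intermediate result, then count the surviving words.
printf '%s\n' "$step1" "$step2" "$step3"
printf '%s\n' "$step3" | wc -w
```

The final `wc -w` reports 2 for this line, which matches the count shown for 5.txt above.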

"I need to count the total number of instances in which a 10-digit number appears within a file. All of the numbers have leading zeros"

So I think this may be more accurate:

$ grep -o '0[0-9]\{9\}' filename | wc -l
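To see how this pattern differs from a plain `[0-9]\{10\}`, here is a small sketch. The input line is invented for illustration: it holds one 10-digit number without a leading zero and one with.

```shell
line='1234567890 0000000001'

# Any 10 consecutive digits: both numbers match.
any=$(printf '%s\n' "$line" | grep -o '[0-9]\{10\}' | wc -l | tr -d ' ')

# A zero followed by 9 more digits: only the second number matches.
zero=$(printf '%s\n' "$line" | grep -o '0[0-9]\{9\}' | wc -l | tr -d ' ')

echo "any=$any leading_zero=$zero"   # prints: any=2 leading_zero=1
```

If every number in the file really does start with a zero, both patterns give the same count; the stricter one only matters when other 10-digit runs can appear.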

This might work for you:

cat <<! >test.txt
0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.
00000000010000000010000000000100000000010000000001
1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 0 j
12345 67890 12 34 56 78 90
!
sed 'y/X/ /;s/[0-9]\{10\}/\nX\n/g' test.txt | sed '/X/!d' | sed '$=;d'
8
