How can I count the number of five-letter words in a txt file using grep?
I'm not very good at Linux, and I'm trying to count five-letter words using grep.
Use the -c flag to count, and look for lines containing exactly five letters:
$ cat file
some text file containing many words and sentences.
$ tr ' ' '\n' < file | grep -c '^[[:blank:]]*[a-zA-Z]\{5\}[[:blank:]]*$'
1
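Splitting on spaces alone means a word followed by punctuation, such as "words,", is not counted. The following is a minimal sketch of one possible variation, not taken from the answer above: it assumes a hypothetical input string and splits on every run of non-letters instead of on spaces.

```shell
# Hypothetical input where the five-letter word is followed by a comma.
text='many words, and sentences.'

# tr -c complements the set and -s squeezes repeats, so every run of
# non-letters becomes a single newline and each word lands on its own line.
count=$(printf '%s\n' "$text" | tr -cs 'A-Za-z' '\n' | grep -c '^[A-Za-z]\{5\}$')

echo "$count"
```

With this input, only "words" has exactly five letters, so the count should be 1.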
You can use:
grep -o -w "\w\{5\}" your_file | wc -w
With -o, only the matched words are printed; -w requires the regex to match a whole word; and \w\{5\} is the regex itself (it matches 5 consecutive word characters). So, with your_file containing
word1 word2 word3
long_word 123 word4
the output of grep -o -w "\w\{5\}" your_file will be
word1
word2
word3
word4
Piping the result into wc -w just counts these matches.
Note: \w matches letters, digits, and underscores. If you don't want to match all of these, replace the \w meta-character with something more specific. For example, [a-z] matches only lowercase English letters.
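As a small sketch of that note, assuming GNU grep and a hypothetical sample file written to a temporary location, the two patterns can be compared side by side:

```shell
# Hypothetical sample data, written to a temporary file for illustration.
tmp=$(mktemp)
printf 'word1 words apple\nlong_word 123 12345\n' > "$tmp"

# \w also matches digits and underscores, so word1 and 12345 are counted.
any=$(grep -o -w '\w\{5\}' "$tmp" | wc -l)

# [a-z] matches lowercase letters only, so word1 and 12345 are skipped.
lower=$(grep -o -w '[a-z]\{5\}' "$tmp" | wc -l)

echo "word characters: $any, lowercase letters: $lower"
rm -f "$tmp"
```

On this sample, the \w version should count 4 matches (word1, words, apple, 12345) and the [a-z] version 2 (words, apple). wc -l is used here because grep -o prints one match per line.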
This GNU awk command counts how many words have 5 letters. GNU awk is needed because the record separator (RS) contains multiple characters and is therefore treated as a regex. It ignores punctuation such as . and , between words.
awk -v RS="[ .,?!]|\n" 'length($0)==5 {a++} END {print a+0}' file
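To see what the record separator does, here is a small sketch that feeds the sample sentence from the first answer through the same program. It assumes an awk that treats a multi-character RS as a regex (GNU awk does). Printing a+0 is an addition here, so that 0 is printed instead of an empty line when nothing matches.

```shell
# Sample sentence from the first answer, fed on standard input.
text='some text file containing many words and sentences.'

# Each space, punctuation mark, or newline ends a record, so every word
# becomes its own record and the trailing period is dropped.
count=$(printf '%s\n' "$text" | awk -v RS='[ .,?!]|\n' 'length($0)==5 {a++} END {print a+0}')

echo "$count"
```

Only "words" is five characters long, so the count should be 1. Note that length($0)==5 counts any five-character record, including ones made of digits.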