How do I count the number of occurrences of a string in an entire file?
Is there an inbuilt command to do this, or has anyone had any luck with a script that does it?
I am looking to count the number of times a certain string (not word) appears in a file. This can include multiple occurrences per line, so the count should include every occurrence rather than counting 1 for each line that contains the string 2 or more times.
For example, with this sample file:
blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)
If I am looking to count the occurrences of the string (*), I would expect the count to be 6, i.e. 2 from the first line, 1 from the second line and 3 from the third line. Note how the one across lines 2-3 does not count because there is an LF character separating them.
Update: great responses so far! Can I ask that the script handle the conversion of (*) to \(*\), etc.? That way I could just pass any desired string as an input parameter without worrying about what conversion needs to be done to put it in the correct format.
You can use basic tools such as grep and wc:
grep -o '(\*)' input.txt | wc -l
Use Perl's "Eskimo kiss" operator (}{) with the -n switch to print a total at the end, and use \Q...\E to ignore any metacharacters.
perl -lnwe '$a+=()=/\Q(*)/g; }{ print $a;' file.txt
Script:
use strict;
use warnings;
my $count = 0;  # start at 0 so input without any match prints 0
my $text = shift;
while (<>) {
$count += () = /\Q$text/g;
}
print "$count\n";
Usage:
perl script.pl "(*)" file.txt
This loops over the lines of the file, and on each line finds all occurrences of the string "(*)". Each time that string is found, $c is incremented. When there are no more lines to loop over, the value of $c is printed.
perl -ne'$c++ while /\(\*\)/g;END{print"$c\n"}' filename.txt
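The same loop (walk the lines, bump a counter for every occurrence, print the total at the end) can also be sketched in plain POSIX shell without Perl. This is only an illustration: the function name count_occurrences is made up here, the search string is treated literally, and overlapping matches are not counted.

```shell
# Hypothetical helper: count literal, non-overlapping occurrences of $1 in file $2.
count_occurrences() {
  needle=$1
  file=$2
  count=0
  # An empty search string would never be consumed, so reject it up front.
  [ -n "$needle" ] || { echo "empty search string" >&2; return 1; }
  # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    while :; do
      case $line in
        *"$needle"*)
          # Drop everything up to and including the first occurrence.
          line=${line#*"$needle"}
          count=$((count + 1))
          ;;
        *) break ;;
      esac
    done
  done < "$file"
  printf '%s\n' "$count"
}
```

Because the search string is quoted inside the patterns, characters such as ( * ) need no escaping: count_occurrences '(*)' filename.txt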
Update: Regarding your comment asking that this be converted into a solution that accepts a regex as an argument, you might do it like this (the \Q makes the argument match literally, so it needs no escaping):
perl -ne'BEGIN{$re=shift;}$c++ while /\Q$re/g;END{print"$c\n"}' 'regex' filename.txt
That ought to do the trick. If I felt inclined to skim through perlrun again I might see a more elegant solution, but this should work.
You could also eliminate the explicit inner while loop in favor of an implicit one by providing list context to the regexp:
perl -ne'BEGIN{$re=shift}$c+=()=/\Q$re/g;END{print"$c\n"}' 'regex' filename.txt
You can use the basic grep command:
Example: to count the lines of a file that contain the word "hello" (grep -c counts matching lines, so a line with several matches still counts once):
grep -c "hello" filename
If you want to do the same for a pattern:
grep -c -P "Your Pattern" filename
Pattern examples: hell.w, \d+, etc.
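To see how counting matching lines differs from counting occurrences, here is a small sketch run against the sample file from the question. The temporary file and the variable names are only for illustration, and -o is a widely available grep extension (GNU and BSD) rather than part of POSIX.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)
EOF

# -F treats the search string literally; -c counts matching lines.
lines=$(grep -c -F '(*)' "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
# tr strips the padding that some wc implementations add.
occurrences=$(grep -o -F '(*)' "$sample" | wc -l | tr -d '[:space:]')

echo "matching lines: $lines"        # matching lines: 3
echo "occurrences:    $occurrences"  # occurrences:    6
rm -f "$sample"
```

Only the second number matches the count of 6 that the question asks for.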
I have used the command below to find the count of a particular string in a file (it counts the lines that contain the string):
grep search_String fileName | wc -l
text="(\*)"
grep -o "$text" file | wc -l
You can make it into a script which accepts arguments like this:
Script count:
#!/bin/bash
text="$1"
file="$2"
grep -o "$text" "$file" | wc -l
Usage:
./count "(\*)" file_path
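If the goal is to pass any string without escaping it first, as the update to the question asks, one possible variation is to let grep treat the argument as a fixed string with -F. The sketch below is an assumption about how that could look, not part of the answer above; the function name count_fixed is hypothetical, and -o is assumed to be supported by the installed grep.

```shell
# Hypothetical helper: count occurrences of the literal string $1 in file $2.
count_fixed() {
  # -F: fixed string, no regex metacharacters; -e: safe even if $1 starts with "-";
  # --: end of options, so a file name starting with "-" is not taken as a flag.
  n=$(grep -o -F -e "$1" -- "$2" | wc -l | tr -d '[:space:]')
  printf '%s\n' "$n"
}
```

With this, the usage becomes count_fixed "(*)" file_path, with no backslashes in the search string.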