I'm looking for the string foo=
in text files in a directory tree. It's on a common Linux machine, I have bash shell:
grep -ircl "foo=" *
In the directories are also many binary files which match "foo="
. As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images). How would I do that?
I know there are the --exclude=PATTERN
and --include=PATTERN
options, but what is the pattern format? The man page of grep says:
--include=PATTERN Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN Recurse in directories skip file matching PATTERN.
Searching on grep include , grep include exclude , grep exclude and variants did not find anything relevant
If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to do with common tools (like grep or the suggested find ).
Use the shell globbing syntax :
grep pattern -r --include=\*.cpp --include=\*.h rootdir
The syntax for --exclude
is identical.
Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.cpp"
, would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.cpp rootdir
, which would only search files named foo.cpp
and bar.cpp
, which is quite likely not what you wanted.
Update 2021-03-04
I've edited the original answer to remove the use of brace expansion , which is a feature provided by several shells such as Bash and zsh to simplify patterns like this; but note that brace expansion is not POSIX shell-compliant.
The original example was:
grep pattern -r --include=\*.{cpp,h} rootdir
to search through all .cpp
and .h
files rooted in the directory rootdir
.
If you just want to skip binary files, I suggest you look at the -I
(upper case i) option. It ignores binary files. I regularly use the following command:
grep -rI --exclude-dir="\.svn" "pattern" *
It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.
Please take a look at ack , which is designed for exactly these situations. Your example of
grep -ircl --exclude=*.{png,jpg} "foo=" *
is done with ack as
ack -icl "foo="
because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do
ack -icl --cpp "foo="
很长时间后我发现了这一点,您可以添加多个包含和排除,例如:
grep "z-index" . --include=*.js --exclude=*js/lib/* --exclude=*.min.js
The suggested command:
grep -Ir --exclude="*\.svn*" "pattern" *
is conceptually wrong, because --exclude works on the basename. Put in other words, it will skip only the .svn in the current directory.
在 grep 2.5.1 中,您必须将此行添加到 ~/.bashrc 或 ~/.bash 配置文件
export GREP_OPTIONS="--exclude=\*.svn\*"
I find grepping grep's output to be very helpful sometimes:
grep -rn "foo=" . | grep -v "Binary file"
Though, that doesn't actually stop it from searching the binary files.
On CentOS 6.6/Grep 2.6.3, I have to use it like this:
grep "term" -Hnir --include \*.php --exclude-dir "*excluded_dir*"
Notice the lack of equal signs "=" (otherwise --include
, --exclude
, include-dir
and --exclude-dir
are ignored)
If you are not averse to using find
, I like its -prune
feature:
find [directory] \
-name "pattern_to_exclude" -prune \
-o -name "another_pattern_to_exclude" -prune \
-o -name "pattern_to_INCLUDE" -print0 \
| xargs -0 -I FILENAME grep -IR "pattern" FILENAME
On the first line, you specify the directory you want to search. .
(current directory) is a valid path, for example.
On the 2nd and 3rd lines, use "*.png"
, "*.gif"
, "*.jpg"
, and so forth. Use as many of these -o -name "..." -prune
constructs as you have patterns.
On the 4th line, you need another -o
(it specifies "or" to find
), the patterns you DO want, and you need either a -print
or -print0
at the end of it. If you just want "everything else" that remains after pruning the *.gif
, *.png
, etc. images, then use -o -print0
and you're done with the 4th line.
Finally, on the 5th line is the pipe to xargs
which takes each of those resulting files and stores them in a variable FILENAME
. It then passes grep
the -IR
flags, the "pattern"
, and then FILENAME
is expanded by xargs
to become that list of filenames found by find
.
For your particular question, the statement may look something like:
find . \
-name "*.png" -prune \
-o -name "*.gif" -prune \
-o -name "*.svn" -prune \
-o -print0 | xargs -0 -I FILES grep -IR "foo=" FILES
git grep
Use git grep
which is optimized for performance and aims to search through certain files.
By default it ignores binary files and it is honoring your .gitignore
. If you're not working with Git structure, you can still use it by passing --no-index
.
Example syntax:
git grep --no-index "some_pattern"
For more examples, see:
I'm a dilettante, granted, but here's how my ~/.bash_profile looks:
export GREP_OPTIONS="-orl --exclude-dir=.svn --exclude-dir=.cache --color=auto" GREP_COLOR='1;32'
Note that to exclude two directories, I had to use --exclude-dir twice.
In the directories are also many binary files. I can't search only certain directories (the directory structure is a big mess). Is there's a better way of grepping only in certain files?
ripgrep
This is one of the quickest tools designed to recursively search your current directory. It is written in Rust , built on top of Rust's regex engine for maximum efficiency. Check the detailed analysis here .
So you can just run:
rg "some_pattern"
It respect your .gitignore
and automatically skip hidden files/directories and binary files.
You can still customize include or exclude files and directories using -g
/ --glob
. Globbing rules match .gitignore
globs. Check man rg
for help.
For more examples, see: How to exclude some files not matching certain extensions with grep?
On macOS, you can install via brew install ripgrep
.
find and xargs are your friends. Use them to filter the file list rather than grep's --exclude
Try something like
find . -not -name '*.png' -o -type f -print | xargs grep -icl "foo="
The advantage of getting used to this, is that it is expandable to other use cases, for example to count the lines in all non-png files:
find . -not -name '*.png' -o -type f -print | xargs wc -l
To remove all non-png files:
find . -not -name '*.png' -o -type f -print | xargs rm
etc.
As pointed out in the comments, if some files may have spaces in their names, use -print0
and xargs -0
instead.
Try this one:
$ find . -name "*.txt" -type f -print | xargs file | grep "foo=" | cut -d: -f1
Founded here: http://www.unix.com/shell-programming-scripting/42573-search-files-excluding-binary-files.html
If you search non-recursively you can use glop patterns to match the filenames.
grep "foo" *.{html,txt}
includes html and txt. It searches in the current directory only.
To search in the subdirectories:
grep "foo" */*.{html,txt}
In the subsubdirectories:
grep "foo" */*/*.{html,txt}
those scripts don't accomplish all the problem...Try this better:
du -ha | grep -i -o "\./.*" | grep -v "\.svn\|another_file\|another_folder" | xargs grep -i -n "$1"
this script is so better, because it uses "real" regular expressions to avoid directories from search. just separate folder or file names with "\\|"on the grep -v
enjoy it! found on my linux shell! XD
看看@这个。
grep --exclude="*\.svn*" -rn "foo=" * | grep -v Binary | grep -v tags
The --binary-files=without-match
option to GNU grep
gets it to skip binary files. (Equivalent to the -I
switch mentioned elsewhere.)
(This might require a recent version of grep
; 2.5.3 has it, at least.)
suitable for tcsh .alias file:
alias gisrc 'grep -I -r -i --exclude="*\.svn*" --include="*\."{mm,m,h,cc,c} \!* *'
Took me a while to figure out that the {mm,m,h,cc,c} portion should NOT be inside quotes. ~Keith
To ignore all binary results from grep
grep -Ri "pattern" * | awk '{if($1 != "Binary") print $0}'
The awk part will filter out all the Binary file foo matches lines
Try this:
--F
" under currdir ..(or link another folder there renamed to " --F
" ie double-minus-F
.#> grep -i --exclude-dir="\\-\\-F" "pattern" *
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.