
Use grep --exclude/--include syntax to not grep through certain files

I'm looking for the string foo= in text files in a directory tree. It's on a common Linux machine, and I have a bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=" . As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images). How would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? The man page of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching on grep include, grep include exclude, grep exclude and variants did not find anything relevant.

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to make do with common tools (like grep or the suggested find).

Use the shell globbing syntax:

grep pattern -r --include=\*.cpp --include=\*.h rootdir

The syntax for --exclude is identical.

Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.cpp" , would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.cpp rootdir , which would only search files named foo.cpp and bar.cpp , which is quite likely not what you wanted.
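As a quick sanity check, here is a small sketch that builds a throwaway tree and runs the quoted form of the command above. The file names and the scratch directory are made up for the demo, and it assumes GNU grep and mktemp are available:

```shell
# Scratch tree; every file name below is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
printf 'int foo=1;\n' > "$demo/src/main.cpp"
printf 'int foo=2;\n' > "$demo/src/sub/util.h"
printf 'foo=3\n'      > "$demo/src/notes.txt"

# The single quotes pass the globs to grep unchanged;
# grep then compares them against each file's name while recursing.
matches=$(cd "$demo" && grep -rl --include='*.cpp' --include='*.h' 'foo=' src | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Only the .cpp and .h files should be listed; notes.txt also contains foo= but is never opened.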

Update 2021-03-04

I've edited the original answer to remove the use of brace expansion , which is a feature provided by several shells such as Bash and zsh to simplify patterns like this; but note that brace expansion is not POSIX shell-compliant.

The original example was:

grep pattern -r --include=\*.{cpp,h} rootdir

to search through all .cpp and .h files rooted in the directory rootdir .
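To see what grep actually receives from the brace form, you can let bash print the expanded words. This is only a sketch; it calls bash explicitly because brace expansion is not available in every POSIX shell:

```shell
# Print each word bash produces from the brace pattern, one per line.
# The backslash keeps the star literal, so no file name globbing happens.
expanded=$(bash -c 'printf "%s\n" --include=\*.{cpp,h}')
printf '%s\n' "$expanded"
```

The output is two separate --include options, one per extension, which is exactly what the edited answer now spells out by hand.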

If you just want to skip binary files, I suggest you look at the -I (upper case i) option. It ignores binary files. I regularly use the following command:

grep -rI --exclude-dir="\.svn" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.
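A minimal sketch of what -I changes, assuming GNU grep. The two files are invented for the demo; the "image" is just a few bytes with a NUL in it so that grep classifies it as binary:

```shell
demo=$(mktemp -d)
printf 'foo=bar\n'         > "$demo/config.txt"
printf 'foo=\000\001\002\n' > "$demo/image.png"   # NUL byte: treated as binary

# Without -I the binary file is searched and listed as a match.
without_I=$(cd "$demo" && grep -rl  'foo=' . | sort)
# With -I binary files are treated as if they contained no match.
with_I=$(cd "$demo" && grep -rIl 'foo=' . | sort)

printf 'without -I:\n%s\n' "$without_I"
printf 'with -I:\n%s\n' "$with_I"

rm -rf "$demo"
```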

Please take a look at ack , which is designed for exactly these situations. Your example of

grep -ircl --exclude=*.{png,jpg} "foo=" *

is done with ack as

ack -icl "foo="

because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do

ack -icl --cpp "foo="

grep 2.5.3 introduced the --exclude-dir parameter which will work the way you want.

grep -rI --exclude-dir=\.svn PATTERN .
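Here is a small sketch of --exclude-dir at work, assuming a GNU grep new enough to have the option. The tree is made up; note that the .svn directory is skipped at every depth, not just at the top:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/lib/.svn"
printf 'foo=1\n' > "$demo/.svn/entries"
printf 'foo=2\n' > "$demo/lib/.svn/entries"
printf 'foo=3\n' > "$demo/lib/code.c"

# Any directory whose base name is .svn is not descended into.
kept=$(cd "$demo" && grep -rIl --exclude-dir=.svn 'foo=' . | sort)
printf '%s\n' "$kept"

rm -rf "$demo"
```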

You can also set an environment variable: GREP_OPTIONS="--exclude-dir=\\.svn"

I'll second Andy's vote for ack though, it's the best.

I found this after a long time: you can add multiple includes and excludes, for example:

grep -r "z-index" . --include="*.js" --exclude="*js/lib/*" --exclude="*.min.js"
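A sketch of how several --include and --exclude options combine, assuming a reasonably recent GNU grep (older releases may resolve conflicting options differently). The files are invented for the demo:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/js"
printf '.a { z-index: 1 }\n' > "$demo/js/app.js"
printf '.a{z-index:1}\n'     > "$demo/js/app.min.js"
printf '.a { z-index: 2 }\n' > "$demo/style.css"

# app.min.js matches both options; the later --exclude takes precedence here.
# style.css matches neither and is skipped because an --include was given.
found=$(cd "$demo" && grep -rl --include='*.js' --exclude='*.min.js' 'z-index' . | sort)
printf '%s\n' "$found"

rm -rf "$demo"
```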

The suggested command:

grep -Ir --exclude="*\.svn*" "pattern" *

is conceptually wrong, because --exclude works on the basename. Put in other words, it will skip only the .svn in the current directory.

In grep 2.5.1 you have to add this line to your ~/.bashrc or ~/.bash_profile:

export GREP_OPTIONS="--exclude=\*.svn\*"

I find grepping grep's output to be very helpful sometimes:

grep -rn "foo=" . | grep -v "Binary file"

Though, that doesn't actually stop it from searching the binary files.

On CentOS 6.6/Grep 2.6.3, I have to use it like this:

grep "term" -Hnir --include \*.php --exclude-dir "*excluded_dir*"

Notice the lack of equal signs "=" (otherwise --include, --exclude and --exclude-dir are ignored).

If you are not averse to using find , I like its -prune feature:


find [directory] \
        -name "pattern_to_exclude" -prune \
     -o -name "another_pattern_to_exclude" -prune \
     -o -name "pattern_to_INCLUDE" -print0 \
| xargs -0 -I FILENAME grep -IR "pattern" FILENAME

On the first line, you specify the directory you want to search. . (current directory) is a valid path, for example.

On the 2nd and 3rd lines, use "*.png" , "*.gif" , "*.jpg" , and so forth. Use as many of these -o -name "..." -prune constructs as you have patterns.

On the 4th line, you need another -o (it specifies "or" to find ), the patterns you DO want, and you need either a -print or -print0 at the end of it. If you just want "everything else" that remains after pruning the *.gif , *.png , etc. images, then use -o -print0 and you're done with the 4th line.

Finally, on the 5th line is the pipe to xargs, which reads the file names produced by find. Because of -I FILENAME, xargs runs grep once per file, replacing FILENAME in the command with that file's name; grep is passed the -IR flags, the "pattern", and the file name.

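For your particular question, the statement may look something like this (the -type f keeps directories such as . itself from being handed to grep -R, which would otherwise recurse straight back into the files you just pruned):


find . \
     -name "*.png" -prune \
     -o -name "*.gif" -prune \
     -o -name "*.svn" -prune \
     -o -type f -print0 | xargs -0 -I FILES grep -IR "foo=" FILES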
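To try the -prune approach without touching real data, here is a sketch on a throwaway tree. It assumes find and xargs support -print0 and -0 (GNU and BSD versions do); all file names are hypothetical:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/img"
printf 'foo=1\n' > "$demo/app.conf"
printf 'foo=2\n' > "$demo/img/logo.png"
printf 'foo=3\n' > "$demo/.svn/entries"

# Prune by name first, then hand only the remaining regular files to grep.
hits=$(cd "$demo" && find . \
         -name '*.png' -prune \
      -o -name '.svn' -prune \
      -o -type f -print0 | xargs -0 grep -Il 'foo=' | sort)
printf '%s\n' "$hits"

rm -rf "$demo"
```

Here xargs is used without -I, so grep is started once for a whole batch of files rather than once per file, which is usually faster on big trees.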

git grep

Use git grep which is optimized for performance and aims to search through certain files.

By default it ignores binary files and honors your .gitignore. If you're not working inside a Git repository, you can still use it by passing --no-index.

Example syntax:

git grep --no-index "some_pattern"

For more examples, see:

I'm a dilettante, granted, but here's how my ~/.bash_profile looks:

export GREP_OPTIONS="-orl --exclude-dir=.svn --exclude-dir=.cache --color=auto" GREP_COLOR='1;32'

Note that to exclude two directories, I had to use --exclude-dir twice.
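Newer GNU grep releases treat GREP_OPTIONS as obsolete, so a small shell function is a safer home for default options. A minimal sketch carrying the two --exclude-dir options from above; the function name grepsrc and the scratch files are hypothetical:

```shell
# Hypothetical wrapper: default options live here instead of in GREP_OPTIONS.
grepsrc() {
    grep -rI --exclude-dir=.svn --exclude-dir=.cache "$@"
}

demo=$(mktemp -d)
mkdir -p "$demo/.cache" "$demo/.svn"
printf 'foo=1\n' > "$demo/settings.ini"
printf 'foo=2\n' > "$demo/.cache/old.ini"
printf 'foo=3\n' > "$demo/.svn/entries"

# Extra flags such as -l are simply appended through "$@".
listed=$(cd "$demo" && grepsrc -l 'foo=' . | sort)
printf '%s\n' "$listed"

rm -rf "$demo"
```

Unlike an exported variable, the function only affects interactive use where it is defined, so scripts that call plain grep keep their normal behavior.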

In the directories are also many binary files. I can't search only certain directories (the directory structure is a big mess). Is there a better way of grepping only in certain files?

ripgrep

This is one of the quickest tools designed to recursively search your current directory. It is written in Rust , built on top of Rust's regex engine for maximum efficiency. Check the detailed analysis here .

So you can just run:

rg "some_pattern"

It respects your .gitignore and automatically skips hidden files/directories and binary files.

You can still customize include or exclude files and directories using -g / --glob . Globbing rules match .gitignore globs. Check man rg for help.

For more examples, see: How to exclude some files not matching certain extensions with grep?

On macOS, you can install via brew install ripgrep .

find and xargs are your friends. Use them to filter the file list rather than grep's --exclude

Try something like

find . -type f -not -name '*.png' -print | xargs grep -icl "foo="

The advantage of getting used to this is that it is expandable to other use cases, for example to count the lines in all non-png files:

find . -type f -not -name '*.png' -print | xargs wc -l

To remove all non-png files:

find . -type f -not -name '*.png' -print | xargs rm

etc.

As pointed out in the comments, if some files may have spaces in their names, use -print0 and xargs -0 instead.
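A sketch of the NUL-separated variant, using a made-up file whose name contains a space. It assumes find and xargs that understand -print0 and -0:

```shell
demo=$(mktemp -d)
printf 'foo=1\n' > "$demo/my notes.txt"
printf 'foo=2\n' > "$demo/picture.png"

# NUL separators keep "my notes.txt" together as a single argument.
safe=$(cd "$demo" && find . -type f -not -name '*.png' -print0 | xargs -0 grep -l 'foo=' | sort)
printf '%s\n' "$safe"

rm -rf "$demo"
```

With plain -print, xargs would split that name at the space and grep would be asked to open two files that do not exist.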

Try this one:

$ find . -name "*.txt" -type f -print | xargs file | grep -i text | cut -d: -f1 | xargs grep -l "foo="

Found here: http://www.unix.com/shell-programming-scripting/42573-search-files-excluding-binary-files.html

If you search non-recursively you can use glob patterns to match the filenames.

grep "foo" *.{html,txt}

includes html and txt. It searches in the current directory only.

To search in the subdirectories:

   grep "foo" */*.{html,txt}

In the subsubdirectories:

   grep "foo" */*/*.{html,txt}

Those scripts don't solve the whole problem. Try this instead:

du -ha | grep -i -o "\./.*" | grep -v "\.svn\|another_file\|another_folder" | xargs grep -i -n "$1"

This script is better because it uses "real" regular expressions to keep directories out of the search. Just separate folder or file names with "\|" in the grep -v pattern.

enjoy it! found on my linux shell! XD

Look at this one.

grep --exclude="*\.svn*" -rn "foo=" * | grep -v Binary | grep -v tags

The --binary-files=without-match option to GNU grep gets it to skip binary files. (Equivalent to the -I switch mentioned elsewhere.)

(This might require a recent version of grep ; 2.5.3 has it, at least.)

suitable for tcsh .alias file:

alias gisrc 'grep -I -r -i --exclude="*\.svn*" --include="*\."{mm,m,h,cc,c} \!* *'

Took me a while to figure out that the {mm,m,h,cc,c} portion should NOT be inside quotes. ~Keith

To ignore all binary results from grep

grep -Ri "pattern" * | awk '{if($1 != "Binary") print $0}'

The awk part will filter out all the "Binary file foo matches" lines.

Try this:

  1. Create a folder named "--F" under the current directory (or link another folder there under the name "--F", i.e. double-minus-F).
  2. #> grep -i --exclude-dir="\\-\\-F" "pattern" *


 