How to grep for a file extension

I am currently trying to make a script that greps its input to see whether something is of a certain file type (zip, for instance). The text before the extension could be anything, so for instance

something.zip
this.zip
that.zip

would all fall under the category. I am trying to grep for these using a wildcard, and so far I have tried this

grep ".*.zip"

This finds the .zip files just fine, but it still displays output when there are additional characters after the .zip, so for instance .zippppppp or .zipdsjdskjc would still be picked up by grep. What should I do to prevent grep from displaying matches that have additional characters after the .zip?

Test for the end of the line with $ and escape the second . with a backslash so it only matches a period and not any character.

grep ".*\.zip$"

However, ls *.zip is a more natural way to do this if you want to list all the .zip files in the current directory, or use find . -name "*.zip" for all .zip files in the sub-directories starting from (and including) the current directory.
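
One way to see what the escaped dot and the $ anchor change is to run both patterns over the same input. This is only a small sketch: the sample names and the variable names (sample, loose, strict) are made up for illustration.

```shell
# Sample input: two real .zip names and three near misses (hypothetical names).
sample='something.zip
this.zip
that.zipdsjdskjc
archive.zippppppp
notazip'

# Pattern from the question: the unescaped "." matches any character,
# and nothing forces the match to stop at "zip".
loose=$(printf '%s\n' "$sample" | grep '.*.zip')

# Escaped dot plus end-of-line anchor.
strict=$(printf '%s\n' "$sample" | grep '\.zip$')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

With this input the loose pattern keeps all five lines (even notazip, because the dot matches the "a"), while the strict pattern keeps only something.zip and this.zip.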

On UNIX, try:

find . -type f -name \*.zip
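
To try this safely, the command can be run against a throwaway directory. The sketch below assumes mktemp -d is available; every file and directory name in it is hypothetical.

```shell
# Build a throwaway tree (all names here are made up).
demo=$(mktemp -d)
mkdir -p "$demo/sub" "$demo/folder.zip"
touch "$demo/a.zip" "$demo/sub/b.zip" "$demo/sub/c.zipx" "$demo/notes.txt"

# Quote (or escape) the glob so that find, not the shell, expands it.
# -type f skips the directory that happens to be named folder.zip.
found=$(cd "$demo" && find . -type f -name '*.zip' | sort)
printf '%s\n' "$found"

# Clean up the temporary tree.
rm -rf "$demo"
```

Only ./a.zip and ./sub/b.zip are listed: c.zipx fails the -name test, and folder.zip is excluded by -type f.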

You can also use grep to find all files with a specific extension:

find .|grep -e "\.gz$"

The . means the current folder. If you want to specify a folder other than the current folder, just replace the . with the path of the folder. Here is an example: Let's find all files that end with .gz and are in the folder /var/log

  find /var/log/ |grep -e "\.gz$"

The output is something similar to the following:

$ find /var/log/ | grep -e "\.gz$"

/var/log//mail.log.1.gz
/var/log//mail.log.0.gz
/var/log//system.log.3.gz
/var/log//system.log.7.gz
/var/log//system.log.6.gz
/var/log//system.log.2.gz
/var/log//system.log.5.gz
/var/log//system.log.1.gz
/var/log//system.log.0.gz
/var/log//system.log.4.gz

The $ sign anchors the match to the end of the line, so only paths that end with .gz are shown.

You need to do a couple of things. It should look like this:

grep '.*\.zip$'

You need to escape the second dot, so it will just match a dot, and not any character. Using single quotes makes the escaping a bit easier.

You need the dollar sign at the end of the line to indicate that you want the "zip" to occur at the end of the line.

I use this to get a listing of the file types inside a folder.

find . -type f | grep -i -E -o "\.\w*$" | sort -su

Outputs for example:

.DS_Store
.MP3
.aif
.aiff
.asd
.doc
.flac
.jpg
.m4a
.m4p
.m4r
.mp3
.pdf
.png
.txt
.wav
.wma
.zip

BONUS: with

find . -type f | grep -i -E -o "\.\w*$" | sort | uniq -c

You'll get the file count:

    106 .DS_Store
     35 .MP3
     89 .aif
      5 .aiff
    525 .asd
      1 .doc
     60 .flac
     48 .jpg
    149 .m4a
     11 .m4p
      1 .m4r
  12844 .mp3
      1 .pdf
      5 .png
      9 .txt
    108 .wav
     44 .wma
      2 .zip
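
The counting pipeline can be tried on a small throwaway tree. This sketch assumes a grep that supports -o and \w (GNU and BSD grep do; neither is required by POSIX). The file names are hypothetical, and the case-folding variant with tr is an addition that is not part of the answer above.

```shell
# Throwaway directory with a handful of empty files (names are made up).
demo=$(mktemp -d)
touch "$demo/a.mp3" "$demo/b.mp3" "$demo/c.MP3" "$demo/d.zip" "$demo/e.tar.gz"

# -o prints only the matched text, i.e. the last dot and what follows it.
# LC_ALL=C makes the sort order independent of the user's locale.
counts=$(cd "$demo" && find . -type f | grep -E -o '\.\w*$' | LC_ALL=C sort | uniq -c)
printf '%s\n' "$counts"

# Variant (an addition): fold case first so .MP3 and .mp3 are counted together.
folded=$(cd "$demo" && find . -type f | grep -E -o '\.\w*$' \
  | tr '[:upper:]' '[:lower:]' | LC_ALL=C sort | uniq -c)
printf '%s\n' "$folded"

rm -rf "$demo"
```

In the first listing .MP3 and .mp3 are separate rows, just as in the output above, because -i affects matching and not what -o prints. Note that e.tar.gz is counted under .gz, since only the last dotted part is matched.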

Try: grep -o -E "(\\.([A-Za-z])+)+"

I used this to get multi-dotted/multiple extensions. So if the input was hello.tar.gz, then it would output .tar.gz. For single dotted, use grep -o -E "\\.([A-Za-z])+$". Tested on Cygwin/MingW+MSYS.

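One more fix/add-on to the above example:

# multi-dotted/multiple extensions
grep -oEi "(\\.([a-z0-9])+)+" file.txt

# single dotted
grep -oEi "\\.([a-z0-9])+$" file.txt

This extracts file extensions such as '.mp3'. With -i the range a-z already covers both cases; a range like A-z would also match the punctuation characters that sit between Z and a in ASCII.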
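
To compare the multi-dotted and single-dotted forms side by side, both can be run over the same list of names. This is a sketch with hypothetical input; it uses explicit A-Za-z0-9 ranges, and the trailing $ on the multi-dotted pattern is an addition so that only a run of dotted parts at the end of the name counts.

```shell
# Hypothetical file names, one per line.
names='hello.tar.gz
song.mp3
README'

# Every trailing dotted part, e.g. ".tar.gz".
multi=$(printf '%s\n' "$names" | grep -oE '(\.[A-Za-z0-9]+)+$')

# Only the last dotted part, e.g. ".gz".
single=$(printf '%s\n' "$names" | grep -oE '\.[A-Za-z0-9]+$')

printf 'multi:\n%s\n\nsingle:\n%s\n' "$multi" "$single"
```

For this input, multi holds .tar.gz and .mp3, single holds .gz and .mp3, and README contributes nothing because it contains no dot.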

If you only want to look in the current folder, why not use this simple command without grep?

ls *.zip 

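Just reviewing some of the other answers. The .* isn't necessary, and if you're looking for a certain file extension, it's best to include -i so that the match is case-insensitive, in case the file is HELLO.ZIP, for example. The quotes are necessary, though: without them the shell strips the backslash and grep receives .zip$, in which the dot matches any character.

grep -i '\.zip$'

To search file contents only in files with a given extension, use --include:

grep -r pattern --include="*.txt" /path/to/dir/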
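
Whether the quotes matter for the \.zip$ pattern can be checked directly by looking at what the shell hands to grep. The sketch below uses made-up variable names and sample lines.

```shell
# Unquoted, the shell removes the backslash before grep ever sees it.
unquoted=\.zip$
# Single quotes pass the backslash through unchanged.
quoted='\.zip$'
printf 'unquoted pattern: %s\nquoted pattern:   %s\n' "$unquoted" "$quoted"

# Hypothetical input: an upper-case name, a word ending in "zip", a normal name.
sample='HELLO.ZIP
gzip
data.zip'

# With the backslash gone, "." matches any character, so "gzip" slips through.
with_unquoted=$(printf '%s\n' "$sample" | grep -i "$unquoted")
with_quoted=$(printf '%s\n' "$sample" | grep -i "$quoted")
printf '%s\n---\n%s\n' "$with_unquoted" "$with_quoted"
```

The unquoted form reaches grep as .zip$ and matches all three lines; the quoted form matches only HELLO.ZIP and data.zip, with -i taking care of the upper-case name.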

Simply do:

grep ".*\.zip$"

The "$" anchors the match to the end of the line, and the backslash makes the dot match only a literal period.
