
Script for getting the file extensions in a folder

I need to get all the file extension types in a folder. For instance, if the directory's ls gives the following:

a.t  
b.t.pg  
c.bin  
d.bin  
e.old  
f.txt  
g.txt  

I should get this by running the script

.t  
.t.pg  
.bin  
.old  
.txt  

I have a bash shell.

Thanks a lot!

See the BashFAQ entry on ParsingLS for a description of why many of these answers are evil.

The following approach avoids this pitfall (and, by the way, completely ignores files with no extension):

shopt -s nullglob
for f in *.*; do
  printf '%s\n' ".${f#*.}"
done | sort -u

Among the advantages:

  • Correctness: parsing the output of ls is unreliable, because ls behaves inconsistently and can produce wrong results for unusual filenames. See the link at the top.
  • Efficiency: minimizes the number of subprocesses invoked (only one, sort -u, and even that could be removed by using Bash 4's associative arrays to store the results).

Things that still could be improved:

  • Correctness: newlines in a filename before the first . are discarded correctly along with the rest of the prefix (some other answers get this wrong), but a newline after the first . makes sort treat the extension as separate entries. This could be fixed by using NUL as the delimiter, or by the Bash 4 associative-array storage approach mentioned above.
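
The associative-array variant mentioned above could look roughly like the sketch below. It is not part of the original answer: list_extensions is a hypothetical helper name, it assumes Bash 4 or newer, and it prints NUL-delimited output so that an extension containing a newline stays one entry.

```shell
# Requires Bash 4+ (associative arrays). list_extensions is a hypothetical helper name.
list_extensions() (
  # The parentheses run the body in a subshell, so nullglob does not leak to the caller.
  shopt -s nullglob
  declare -A seen=()
  for f in "${1:-.}"/*.*; do
    name=${f##*/}          # drop the directory part, which may itself contain dots
    ext=".${name#*.}"      # everything from the first dot onwards
    if [[ -z ${seen[$ext]+set} ]]; then
      seen[$ext]=1
      printf '%s\0' "$ext" # NUL-delimited: safe even if the extension has a newline
    fi
  done
)

# Example: show the extensions of the current directory, one per line.
# list_extensions . | tr '\0' '\n'
```

Extensions come out in the order they are first seen rather than sorted, and no external process is needed for the de-duplication itself.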

Try this:

ls -1 | sed 's/^[^.]*\(\..*\)$/\1/' | sort -u
  • ls lists files in your folder, one file per line
  • sed magic extracts extensions
  • sort -u sorts extensions and removes duplicates

sed magic reads as:

  • s/ / / : substitutes whatever is between first and second / by whatever is between second and third /
  • ^ : match beginning of line
  • [^.] : match any character that is not a dot
  • * : match it as many times as possible
  • \( and \) : remember whatever is matched between these two parentheses
  • \. : match a dot
  • . : match any character
  • * : match it as many times as possible
  • $ : match end of line
  • \1 : this is what has been matched between parentheses
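
To see the expression at work without involving ls, the sample names from the question can be fed in with printf. This is only an illustrative sketch: the extra name README and the variable exts are assumptions added here. It also uses sed -n with the p flag, so names that contain no dot are dropped instead of being passed through unchanged, and LC_ALL=C to make the sort order predictable.

```shell
# Run the same substitution on a fixed list of names (README is an added test case).
exts=$(printf '%s\n' a.t b.t.pg c.bin d.bin e.old f.txt g.txt README |
  sed -n 's/^[^.]*\(\..*\)$/\1/p' |   # -n ... p: print only lines where the match succeeded
  LC_ALL=C sort -u)                   # byte-wise sort, duplicates removed
printf '%s\n' "$exts"
```

Without -n and p, sed prints lines it could not match as they are, so README would appear in the result as if it were an extension.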

People are really over-complicating this - particularly the regex:

ls | grep -o "\..*" | uniq

ls - get all the files
grep -o "\..*" - -o only shows the match; "\..*" matches from the first "." to the end of the line
uniq - drop repeated lines while keeping the same order (only adjacent duplicates are removed)

You can also sort if you like, but sorting doesn't match the example.

This is what happens when you run it:

> ls -1
a.t
a.t.pg
c.bin
d.bin
e.old
f.txt
g.txt

> ls | grep -o "\..*" | uniq
.t
.t.pg
.bin
.old
.txt
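
One caveat: uniq only collapses duplicates that sit next to each other, so the pipeline above relies on files with the same extension being listed consecutively. The sketch below, with made-up file names, shows the difference and one common order-preserving alternative, the awk '!seen[$0]++' idiom. The variable names are illustrative only.

```shell
# a.bin and c.bin share an extension but are not adjacent in the listing.
with_uniq=$(printf '%s\n' a.bin b.txt c.bin | grep -o '\..*' | uniq)
with_awk=$(printf '%s\n' a.bin b.txt c.bin | grep -o '\..*' | awk '!seen[$0]++')

printf 'uniq:\n%s\n' "$with_uniq"   # .bin appears twice
printf 'awk:\n%s\n' "$with_awk"     # each extension once, in first-seen order
```

sort -u also removes every duplicate, at the cost of changing the order.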
