
Script for getting the extensions of files

I need to get all the file extension types in a folder. For instance, if running ls in the directory gives the following:

a.t  
b.t.pg  
c.bin  
d.bin  
e.old  
f.txt  
g.txt  

I should get this by running the script:

.t  
.t.pg  
.bin  
.old  
.txt  

I am using a bash shell.

Thanks a lot!

See the BashFAQ entry on ParsingLS for a description of why many of these answers are evil.

The following approach avoids this pitfall (and, by the way, completely ignores files with no extension):

shopt -s nullglob
for f in *.*; do
  printf '%s\n' ".${f#*.}"
done | sort -u
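
The loop above relies on the `${f#*.}` parameter expansion. As a small sketch of what it does (the function names `full_ext` and `last_ext` are hypothetical, introduced only for illustration), compare the shortest-match form `#` with the longest-match form `##`:

```bash
# full_ext and last_ext are hypothetical helper names, not part of the answer.

# "#" removes the shortest prefix matching "*.", i.e. everything up to and
# including the first dot, so the whole multi-part extension is kept.
full_ext() { printf '%s\n' ".${1#*.}"; }

# "##" removes the longest matching prefix, so only the last part survives.
last_ext() { printf '%s\n' ".${1##*.}"; }

full_ext b.t.pg   # prints .t.pg
last_ext b.t.pg   # prints .pg
```

The question expects `.t.pg` for `b.t.pg`, which is why the answer uses the single `#` form.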

Among the advantages:

  • Correctness: ls behaves inconsistently and can produce inappropriate results. See the link at the top.
  • Efficiency: Minimizes the number of subprocesses invoked (only one, sort -u, and that could also be removed if we wanted to use Bash 4's associative arrays to store results).

Things that still could be improved:

  • Correctness: this will correctly discard newlines in filenames before the first . (which some other answers won't), but filenames with newlines after the first . will be treated as separate entries by sort. This could be fixed by using NULs as the delimiter, or by the aforementioned Bash 4 associative-array storage approach.
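
One possible way to combine both suggested fixes is sketched below. It is not the answer's own code: `list_extensions` is a hypothetical helper name, and it assumes Bash 4 or later. Duplicates are dropped with an associative array instead of sort -u, and each result is terminated with a NUL byte so that an extension containing a newline stays a single entry.

```bash
# list_extensions is a hypothetical helper name, not part of the original answer.
# Prints each distinct extension found in a directory once, NUL-terminated,
# in order of first appearance. Requires Bash 4+ for associative arrays.
list_extensions() {
  local dir=${1:-.} f name ext
  local -A seen=()
  for f in "$dir"/*.*; do
    [[ -e $f || -L $f ]] || continue   # skip the unexpanded pattern when nothing matches
    name=${f##*/}                      # strip the directory part
    ext=.${name#*.}                    # everything after the first dot
    if [[ -z ${seen[$ext]+set} ]]; then
      seen[$ext]=1
      printf '%s\0' "$ext"
    fi
  done
}

# Usage (NULs converted to newlines for display only):
#   list_extensions . | tr '\0' '\n'
```

Unlike the sort -u version, this keeps the order in which extensions are first seen rather than sorting them, and a consumer that cares about odd filenames should read the NUL-delimited output directly instead of converting it to lines.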

Try this:

ls -1 | sed 's/^[^.]*\(\..*\)$/\1/' | sort -u
  • ls lists files in your folder, one file per line
  • sed magic extracts extensions
  • sort -u sorts extensions and removes duplicates

The sed magic reads as:

  • s/ / / : substitutes whatever matches between the first and second / with whatever is between the second and third /
  • ^ : match beginning of line
  • [^.] : match any character that is not a dot
  • * : match it as many times as possible
  • \( and \) : remember whatever is matched between these two parentheses
  • \. : match a dot
  • . : match any character
  • * : match it as many times as possible
  • $ : match end of line
  • \1 : this is what has been matched between the parentheses
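
To see the expression at work without involving ls, here is a small sketch that feeds it the sample names from the question through printf. The wrapper name `extensions_via_sed` is hypothetical, and `LC_ALL=C` is added only to make the sort order predictable. Note that a name containing no dot does not match the pattern, so sed passes it through unchanged.

```bash
# extensions_via_sed is a hypothetical wrapper name; it reads one file name
# per line on stdin and applies the same sed expression as the answer.
extensions_via_sed() {
  sed 's/^[^.]*\(\..*\)$/\1/' | LC_ALL=C sort -u
}

printf '%s\n' a.t b.t.pg c.bin d.bin e.old f.txt g.txt | extensions_via_sed
# prints:
# .bin
# .old
# .t
# .t.pg
# .txt

# A name without a dot is not rewritten, so it shows up in the output as-is.
printf '%s\n' c.bin README | extensions_via_sed
# prints:
# .bin
# README
```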

People are really over-complicating this, particularly the regex:

ls | grep -o "\..*" | uniq

ls - get all the files
grep -o "\..*" - -o shows only the match; "\..*" matches the first "." and everything after it
uniq - drop duplicate lines that are adjacent, keeping the original order

You can also sort if you like, but sorted output doesn't match the order in the example.

This is what happens when you run it:

> ls -1
a.t
a.t.pg
c.bin
d.bin
e.old
f.txt
g.txt

> ls | grep -o "\..*" | uniq
.t
.t.pg
.bin
.old
.txt
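
One caveat worth illustrating: uniq only collapses duplicates that sit next to each other, which happens to be enough for the listing above. The sketch below shows a case where it is not, and one possible order-preserving alternative using awk. The helper name `dedup_keep_order` is hypothetical and the awk idiom is a swapped-in technique, not part of the answer.

```bash
# dedup_keep_order is a hypothetical helper: print each input line the first
# time it is seen, regardless of where later duplicates appear.
dedup_keep_order() { awk '!seen[$0]++'; }

# The two .bin names are separated by a .txt name, so uniq keeps both.
printf '%s\n' a.bin b.txt c.bin | grep -o '\..*' | uniq
# prints:
# .bin
# .txt
# .bin

printf '%s\n' a.bin b.txt c.bin | grep -o '\..*' | dedup_keep_order
# prints:
# .bin
# .txt
```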
