
How to count the number of files in each directory?

I am able to list all the directories with:

find ./ -type d

I attempted to list the contents of each directory and count the number of files in each directory by using the following command:

find ./ -type d | xargs ls -l | wc -l

But this summed the total number of lines returned by:

find ./ -type d | xargs ls -l

Is there a way I can count the number of files in each directory?

This prints the file count per directory for the current directory level:

du -a | cut -d/ -f2 | sort | uniq -c | sort -nr

Assuming you have GNU find, let it find the directories and let bash do the rest:

find . -type d -print0 | while read -d '' -r dir; do
    files=("$dir"/*)
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done
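
One caveat with the loop above: for an empty directory the glob "$dir"/* stays unexpanded, so the array holds one literal entry and the count reads 1; names starting with a dot are skipped as well. A minimal sketch of a variant, assuming bash and GNU find. The function name count_entries is hypothetical; nullglob and dotglob are the bash options that change both behaviors. As in the original, subdirectories are counted as entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of entries in each directory under $1.
count_entries() {
    local root=${1:-.}
    find "$root" -type d -print0 |
    (
        # Explicit subshell so the shopt changes never leak into the caller.
        shopt -s nullglob dotglob   # empty dirs expand to nothing; dotfiles included
        while IFS= read -r -d '' dir; do
            files=("$dir"/*)
            printf '%5d files in directory %s\n' "${#files[@]}" "$dir"
        done
    )
}
```

Called as count_entries some/path, it prints one line per directory in the same format as the loop above.
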

find . -type f | cut -d/ -f2 | sort | uniq -c

  • find . -type f to find all items of type file in the current folder and subfolders
  • cut -d/ -f2 to cut out their specific folder
  • sort to sort the list of folder names
  • uniq -c to return the number of times each folder name has been counted

You could arrange to find all the files, remove the file names, leaving you a line containing just the directory name for each file, and then count the number of times each directory appears:

find . -type f |
sed 's%/[^/]*$%%' |
sort |
uniq -c

The only gotcha in this is if you have any file names or directory names containing a newline character, which is fairly unlikely. If you really have to worry about newlines in file names or directory names, I suggest you find them, and fix them so they don't contain newlines (and quietly persuade the guilty party of the error of their ways).
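
If renaming is not an option, one possible workaround is to keep the records NUL-terminated through the whole pipeline. This is only a sketch and assumes GNU find, sort and uniq (for -printf, -z and -zc); the function name count_per_dir_nul is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count regular files per directory under $1,
# using NUL-terminated records so a newline in a name cannot split a record.
count_per_dir_nul() {
    find "${1:-.}" -type f -printf '%h\0' |   # GNU find: parent directory + NUL
        sort -z |                             # GNU sort: NUL-terminated records
        uniq -zc |                            # GNU uniq: count NUL-terminated records
        tr '\0' '\n'                          # back to newlines, for display only
}
```

The counts stay correct; a directory whose name contains a newline is still displayed across two lines after the final tr.
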


If you're interested in the count of the files in each sub-directory of the current directory, counting any files in any sub-directories along with the files in the immediate sub-directory, then I'd adapt the sed command to print only the top-level directory:

find . -type f |
sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' |
sort |
uniq -c

The first pattern captures the start of the name (the dot, the slash, the name up to the next slash, and that slash) and replaces the line with just this first part, so:

./dir1/dir2/file1

is replaced by

./dir1/

The second replacement handles the files directly in the current directory; they have no further slash after the leading ./, and those lines are replaced by ./ . The sort and count then work on just the directory names.
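
To see the two substitutions in isolation, here is a small sketch that feeds a few made-up paths through the same sed expressions. The function name top_level and the sample paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce each path on stdin to its top-level directory.
top_level() {
    sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\./[^/]*$%./%'
}

# Made-up sample paths, counted per top-level directory.
printf '%s\n' ./dir1/dir2/file1 ./dir1/file2 ./dir3/file3 ./file4 |
    top_level | sort | uniq -c
```

Both paths under ./dir1/ collapse to the same line, so uniq -c reports a count of 2 for it, while ./file4 is counted under ./ .
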

Here's one way to do it, but probably not the most efficient.

find -type d -print0 | xargs -0 -n1 bash -c 'echo -n "$1:"; ls -1 "$1" | wc -l' --

This gives output like the following, with each directory name followed by the count of entries in that directory. Note that the count also includes subdirectory entries, which may not be what you want.

./c/fa/l:0
./a:4
./a/c:0
./a/a:1
./a/a/b:0

Slightly modified version of Sebastian's answer using find instead of du (to exclude file-size-related overhead that du has to perform and that is never used):

 find ./ -mindepth 2 -type f | cut -d/ -f2 | sort | uniq -c | sort -nr

The -mindepth 2 parameter is used to exclude files in the current directory. If you remove it, you'll see a bunch of lines like the following:

  234 dir1
  123 dir2
    1 file1
    1 file2
    1 file3
      ...
    1 fileN

(much like the du-based variant does)

If you do need to count the files in the current directory as well, use this enhanced version:

{ find ./ -mindepth 2 -type f | cut -d/ -f2 | sort && find ./ -maxdepth 1 -type f | cut -d/ -f1; } | uniq -c | sort -nr

The output will be like the following:

  234 dir1
  123 dir2
   42 .

Everyone else's solution has one drawback or another.

find -type d -readable -exec sh -c 'printf "%s " "$1"; ls -1UA "$1" | wc -l' sh {} ';'

Explanation:

  • -type d : we're interested in directories.
  • -readable : we only want them if it's possible to list the files in them. Note that find will still emit an error when it tries to search for more directories in them, but this prevents calling -exec for them.
  • -exec sh -c BLAH sh {} ';' : for each directory, run this script fragment, with $0 set to sh and $1 set to the directory name.
  • printf "%s " "$1" : portably and minimally print the directory name, followed by only a space, not a newline.
  • ls -1UA : list the files, one per line, in directory order (to avoid stalling the pipe), excluding only the special directories . and ..
  • wc -l : count the lines
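
The command above starts one sh per directory. A possible variation, sketched here on the assumption that GNU find and GNU ls are available (which -readable and -U already imply), hands many directories to each sh with {} + and loops inside the script. The function name count_readable is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same output as the one-liner, but with fewer sh processes.
count_readable() {
    find "${1:-.}" -type d -readable -exec sh -c '
        for dir do                      # iterate over all directories passed in
            printf "%s " "$dir"         # directory name, then a space
            ls -1UA "$dir" | wc -l      # number of entries, dotfiles included
        done' sh {} +
}
```

Whether this is noticeably faster depends on the tree; it mainly saves process start-up cost when there are many directories.
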

This can also be done by looping over ls instead of find:

for f in */; do echo "$f -> $(ls "$f" | wc -l)"; done

Explanation:

for f in */; - loop over all directories

do echo "$f -> - print out each directory name

$(ls "$f" | wc -l) - call ls for this directory and count lines (the quotes keep names with spaces intact)

This should return the directory name followed by the number of files in the directory.

findfiles() {
    echo "$1" $(find "$1" -maxdepth 1 -type f | wc -l)
}

export -f findfiles

find ./ -type d -exec bash -c 'findfiles "$0"' {} \;

Example output:

./ 6
./foo 1
./foo/bar 2
./foo/bar/bazzz 0
./foo/bar/baz 4
./src 4

The export -f is required because the -exec argument of find does not allow executing a bash function unless you invoke bash explicitly, and you need to export the function defined in the current scope to the new shell explicitly.
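
If exporting a function is undesirable, the same logic can be passed inline to bash -c. This is a sketch rather than a drop-in replacement; the wrapper name count_direct_files is hypothetical, and -maxdepth is assumed to be supported by the installed find, as in the original.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count regular files directly inside each directory
# under $1, without relying on export -f.
count_direct_files() {
    find "${1:-./}" -type d -exec bash -c '
        # $1 is the directory handed over by find; _ fills the $0 slot.
        printf "%s %s\n" "$1" "$(find "$1" -maxdepth 1 -type f | wc -l)"
    ' _ {} \;
}
```

Passing the directory as a positional parameter rather than splicing {} into the script text keeps unusual directory names from being interpreted as shell code.
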

My answer is a little different: thanks to the options of find, you can actually be much more flexible. Just try:

find . -type f -printf "%h\n" | sort | uniq -c

With the "%h" directive of "-printf", find prints only the directory of each file it found. Then sort and count with "uniq -c". The result is, for each directory, the number of matched files it contains.

Using further options on find, you can be much more flexible. For example, to get an overview of how many files in which directory have been modified on which date, use:

find . -newermt "2022-01-01 00:00:00" -type f -printf "%TY-%Tm-%Td %h\n" | sort | uniq -c

This finds all files that have been modified since 1 January 2022, prints (with "-printf") the modification date and the directory, then sorts and counts them. In this example, each line in the result has the number of files, the date of modification (without time), and the directory.

Note that "-printf" may not be available in all versions of find, I think.
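
Where -printf is missing, the directory part can be derived in a small sh loop instead. The following is a sketch under the assumption that the start point is a directory, so every path find prints contains a slash; the function name count_per_dir_portable is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: per-directory file counts without find -printf.
count_per_dir_portable() {
    find "${1:-.}" -type f -exec sh -c '
        for f do
            printf "%s\n" "${f%/*}"   # strip the last path component
        done' sh {} + |
        sort | uniq -c
}
```

The parameter expansion ${f%/*} plays the role of %h here; like the original pipeline, it still assumes that names do not contain newlines.
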

I combined @glenn jackman's answer and @pcarvalho's answer (the latter is in the comment list, where it is broken because the backtick character ` is treated as a formatting control character).

My script can accept a path as an argument and sorts the directory list the way ls -l does; it also handles the problem of spaces in file names.

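#!/bin/bash
OLD_IFS="$IFS"
IFS=$'\n'   # split find's output on newlines only, so spaces in names survive
for dir in $(find $1 -maxdepth 1 -type d | sort);
do
    files=("$dir"/*)
    printf "%5d,%s\n" "${#files[@]}" "$dir"
done
IFS="$OLD_IFS"   # restore the original field separator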

My first answer on Stack Overflow, and I hope it can help someone ^_^

This could be another way to walk the directory structure and report results at every depth.

find . -type d  | awk '{print "echo -n \""$0"  \";ls -l "$0" | grep -v total | wc -l" }' | sh 

find . -type f -printf '%h\n' | sort | uniq -c

gives for example:

  5 .
  4 ./aln
  5 ./aln/iq
  4 ./bs
  4 ./ft
  6 ./hot

I tried some of the others here but ended up with subfolders included in the file count when I only wanted the files. This prints ./folder/path<tab>nnn with the number of files, not including subfolders, for each subfolder in the current folder.

for d in `find . -type d -print` 
do 
  echo -e "$d\t$(find $d -maxdepth 1 -type f -print | wc -l)"
done

An easy way to recursively count files of a given type. In this case, .jpg files in all folders under the current directory:

find . -name '*.jpg' -print | wc -l
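
The pattern is best quoted so that the shell does not expand it before find sees it. To tie this back to per-directory counts, here is a small sketch, assuming GNU find for -printf. The function name count_jpg_per_dir is hypothetical, and -iname is used so that .JPG also matches.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of .jpg files (any letter case) per directory under $1.
count_jpg_per_dir() {
    find "${1:-.}" -type f -iname '*.jpg' -printf '%h\n' | sort | uniq -c
}
```

Directories that contain no matching file simply do not appear in the output.
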

A super fast miracle command, which recursively traverses files to count the number of images in a directory and organizes the output by image extension:

find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -n | grep -Ei '(tiff|bmp|jpeg|jpg|png|gif)$'

Credits: https://unix.stackexchange.com/a/386135/354980

I edited the script in order to exclude all node_modules directories inside the analyzed one.

This can be used to check whether the number of files in the project exceeds the maximum number that the file watcher can handle.

find . -type d ! -path "*node_modules*" -print0 | while read -d '' -r dir; do
    files=("$dir"/*)
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done

To check the maximum number of files that your system can watch:

cat /proc/sys/fs/inotify/max_user_watches

On slow systems the node_modules folder should be added to your IDE/editor excluded paths, and the count of the remaining files should ideally not exceed the maximum (which can be changed, though).

omg why the complex commands. Just use something like

find whatever_folder | wc -l

Easy method:

find ./ | grep "Search_file.txt" | cut -d"/" -f2 | sort | uniq -c

In my case I needed the count at subfolder level, so I did:

du -a | cut -d/ -f3 | sort | uniq -c | sort -nr

This will give the overall count.

for file in */; do echo "$file -> $(ls $file | wc -l)"; done | cut -d ' ' -f 3| py --ji -l 'numpy.sum(l)'
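
The py command in that pipeline is not a standard tool, so it may not be installed. A rough equivalent that sums the same per-directory counts with awk could look like this; the function name total_entries is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the entry counts of the immediate
# subdirectories of $1, using awk instead of py/numpy.
total_entries() {
    local root=${1:-.} d
    for d in "$root"/*/; do
        [ -d "$d" ] || continue        # the glob matched nothing
        ls "$d" | wc -l                # entries in this subdirectory
    done | awk '{ sum += $1 } END { print sum + 0 }'
}
```

As in the original loop, files that sit directly in the top-level directory are not part of the total.
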
