Recursively counting files in a Linux directory

Question

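How can I recursively count the files in a Linux directory?

I found this:

find DIR_NAME -type f ¦ wc -l

But when I run it, it returns the following error.

find: paths must precede expression: ¦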

Answer 1

This should work:

find DIR_NAME -type f | wc -l

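Explanation:

-type f includes only files.
| (and not ¦) redirects the standard output of the find command to the standard input of the wc command.
wc (short for word count) counts newlines, words and bytes in its input (docs).
-l counts only newlines.

Notes:

Replace DIR_NAME with . to run the command in the current folder.
You can also remove -type f to include directories (and symlinks) in the count.
This command may overcount if file names can contain newline characters.

Why your example does not work:

In the command you showed, you do not use the "pipe" (|) to connect the two commands, but the broken bar (¦), which the shell does not recognize as a command or anything similar. That is why you get that error message.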
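
The overcounting caveat in the notes can be avoided by counting NUL terminators instead of lines. The following is a minimal sketch, assuming a find that supports -print0 (GNU and BSD find both do). The function name count_files is made up for illustration.

```bash
# count_files is a hypothetical helper name, not part of any tool.
# Counts regular files under a directory (default: current directory).
count_files() {
    # -print0 ends each path with a NUL byte, which cannot occur in a file name;
    # tr -cd '\0' deletes every other byte, and wc -c counts the NULs left over.
    find "${1:-.}" -type f -print0 | tr -cd '\0' | wc -c
}
```

Called as count_files DIR_NAME, it prints one number, and a file whose name contains a newline is counted once.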

Answer 2

For the current directory:

find -type f | wc -l

Answer 3

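If you want a breakdown of how many files are in each directory under the current directory:

for i in */ .*/ ; do
    printf '%s: ' "$i"
    (find "$i" -type f | wc -l)
done

This can all go on one line, of course. The parentheses clarify whose output wc -l is supposed to be counting (find "$i" -type f in this case).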

Answer 4

On my computer, rsync is a little faster than the find | wc -l in the accepted answer:

$ rsync --stats --dry-run -ax /path/to/dir /tmp

Number of files: 173076
Number of files transferred: 150481
Total file size: 8414946241 bytes
Total transferred file size: 8414932602 bytes

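The second line has the number of files, 150,481 in the example above. As a bonus, you also get the total size (in bytes).

Remarks:

The first line is a count of files, directories, symlinks, etc. all together, which is why it is larger than the second line.
The --dry-run (or -n for short) option is important so that no files are actually transferred!
I used the -x option to "not cross filesystem boundaries", which means that if you run it for / and have external hard disks attached, it will only count the files on the root partition.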

Answer 5

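You can use

$ tree

after installing the tree package with

$ sudo apt-get install tree

(on a Debian / Mint / Ubuntu Linux machine).

The command shows not only the number of files but also, separately, the number of directories. The -L option can be used to specify the maximum display level (which, by default, is the maximum depth of the directory tree).

Hidden files can be included too by supplying the -a option.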

Answer 6

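Since file names in UNIX may contain newlines (yes, newlines), wc -l might count too many files. I would print a dot for every file and then count the dots:

find DIR_NAME -type f -printf "." | wc -c

Note: the -printf option only works with find from GNU findutils. You may need to install it, on a Mac for instance.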
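
Where GNU find is not available, the same dot-counting idea can be sketched with POSIX find features only, by running the external printf utility through -exec. The function name count_files_portable is hypothetical.

```bash
# count_files_portable is a hypothetical name chosen for this sketch.
count_files_portable() {
    # printf reuses its format for every argument; '%.0s' prints zero characters
    # of the file name, so each file contributes exactly one dot.
    # '{} +' lets find pass many names to each printf invocation.
    find "${1:-.}" -type f -exec printf '%.0s.' {} + | wc -c
}
```

Batching with {} + starts far fewer processes than running one printf per file with \; would.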

Answer 7

Combining several of the answers here, the most useful solution seems to be:

find . -maxdepth 1 -type d -print0 |
xargs -0 -I {} sh -c 'echo -e $(find "{}" -printf "\n" | wc -l) "{}"' |
sort -n

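It can handle odd things like file names that contain spaces, parentheses and even newlines. It also sorts the output by the number of files.

You can increase the number after -maxdepth to get subdirectories counted too. Keep in mind that this can take a long time, particularly if you have a highly nested directory structure combined with a high -maxdepth number.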

Answer 8

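If you want to know how many files and subdirectories exist in the present working directory, you can use this one-liner:

find . -maxdepth 1 -type d -print0 | xargs -0 -I {} sh -c 'echo -e $(find {} | wc -l) {}' | sort -n

This works on GNU systems; on BSD-flavored systems (e.g. OS X), just omit -e from the echo command.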

Answer 9

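You can use the command ncdu. It recursively counts how many files a Linux directory contains.

It shows a progress bar, which is handy if you have many files.

To install it on Ubuntu:

sudo apt-get install -y ncdu

Benchmark: I used https://archive.org/details/cv_corpus_v1.tar (380390 files, 11 GB) as the folder whose files had to be counted.

find . -type f | wc -l: took around 1m20s to complete
ncdu: took around 1m20s to complete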

Answer 10

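If you need to recursively count files of a specific type, you can do:

find YOUR_PATH -name '*.html' -type f | wc -l

-l just makes wc display the number of lines in the output.

If you need to exclude certain folders, use -not -path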

find . -not -path './node_modules/*' -name '*.js' -type f | wc -l
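
With -not -path, find still walks through everything under node_modules before discarding it. A sketch using -prune, which skips the directory entirely, could look like this under the same assumed layout. count_js is a hypothetical name.

```bash
# count_js is a hypothetical name; node_modules is the folder from the example above.
count_js() {
    # Run from inside the target directory so the ./node_modules path matches.
    ( cd "${1:-.}" &&
      find . -path ./node_modules -prune -o -name '*.js' -type f -print | wc -l )
}
```

The explicit -print matters here: without it, find would also print the pruned directory itself.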

Answer 11

tree $DIR_PATH | tail -1

Sample output:

5309 directories, 2122 files

Answer 12

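If you want to avoid error cases, don't let wc -l see file names that contain newlines (each of which it would count as 2+ files).

For example, consider a case where we have a single file with a single EOL character in its name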

> mkdir emptydir && cd emptydir
> touch $'file with EOL(\n) character in it'
> find -type f
./file with EOL(?) character in it
> find -type f | wc -l
2

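Since at least GNU wc does not appear to have an option to read/count a null-terminated list (except from a file), the easiest solution is to not pass it file names at all, but a static output each time a file is found, e.g. in the same directory as above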

> find -type f -exec printf '\n' \; | wc -l
1

Or, if your find supports it

> find -type f -printf '\n' | wc -l
1

Answer 13

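To determine how many files there are in the current directory, put in ls -1 | wc -l. This uses wc to count the number of lines (-l) in the output of ls -1. It doesn't count dotfiles. Please note that ls -l (that's an "L" rather than a "1" as in the previous examples), which I used in previous versions of this HOWTO, will actually give you a file count one greater than the actual count. Thanks to Kam Nejad for this point.

Relative speed: "ls -1 /usr/bin/ | wc -l" takes about 1.03 seconds on an unloaded 486SX25 (/usr/bin/ on this machine has 355 files). "ls -l /usr/bin/ | grep -v ^l | wc -l" takes about 1.19 seconds.

Source: http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x700.html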
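
Since ls -1 skips dotfiles, a variant sketch using ls -A (which lists hidden entries but not . and ..) counts them as well. Like the original, it counts directories too and miscounts names containing newlines. count_entries is a hypothetical name.

```bash
# count_entries is a hypothetical name; counts entries directly in one directory.
count_entries() {
    # ls writes one entry per line when its output is a pipe, so -1 is optional.
    ls -A "${1:-.}" | wc -l
}
```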

Answer 14

Using bash:

Create an array of entries with ( ) and get the count with #.

FILES=(./*); echo ${#FILES[@]}

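OK, that doesn't count files recursively, but I wanted to show the simple option first. A common use case might be creating rollover backups of a file. This will create logfile.1, logfile.2, logfile.3, etc.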

CNT=(./logfile*); mv logfile logfile.${#CNT[@]}

Recursive count with bash 4+ globstar enabled (as mentioned by @tripleee)

FILES=(**/*); echo ${#FILES[@]}

To get the count of files recursively, we can still use find in the same way.

FILES=(`find . -type f`); echo ${#FILES[@]}
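
Two caveats apply to the glob-based array: ** only recurses once shopt -s globstar is set, and the glob also matches directories while skipping dotfiles by default. The sketch below handles these, assuming bash 4 or later. count_files_glob is a hypothetical name.

```bash
# The body runs in a subshell ( ... ) so the shopt changes do not leak out.
count_files_glob() (
    shopt -s globstar nullglob dotglob
    cd "${1:-.}" || exit 1
    n=0
    for f in **/*; do
        # Keep regular files only, skipping symlinks as find -type f does.
        if [[ -f $f && ! -L $f ]]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)
```

The find-based array form splits find's output on whitespace, so a file name containing spaces is counted more than once; the glob loop does not have that problem.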

Answer 15

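For directories with spaces in the name ... (based on various answers above), recursively print each directory name together with the number of files in it: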

find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done

Example (formatted for readability):

pwd
  /mnt/Vancouver/Programming/scripts/claws/corpus

ls -l
  total 8
  drwxr-xr-x 2 victoria victoria 4096 Mar 28 15:02 'Catabolism - Autophagy; Phagosomes; Mitophagy'
  drwxr-xr-x 3 victoria victoria 4096 Mar 29 16:04 'Catabolism - Lysosomes'

ls 'Catabolism - Autophagy; Phagosomes; Mitophagy'/ | wc -l
  138

## 2 dir (one with 28 files; other with 1 file):
ls 'Catabolism - Lysosomes'/ | wc -l
  29

The directory structure is better visualized with tree:

tree -L 3 -F .
  .
  ├── Catabolism - Autophagy; Phagosomes; Mitophagy/
  │   ├── 1
  │   ├── 10
  │   ├── [ ... SNIP! (138 files, total) ... ]
  │   ├── 98
  │   └── 99
  └── Catabolism - Lysosomes/
      ├── 1
      ├── 10
      ├── [ ... SNIP! (28 files, total) ... ]
      ├── 8
      ├── 9
      └── aaa/
          └── bbb

  3 directories, 167 files

man find | grep mindep
  -mindepth levels
    Do not apply any tests or actions at levels less than levels
    (a non-negative integer).  -mindepth 1 means process all files
    except the starting-points.

ls -p | grep -v / (used below) is from answer 2 at https://unix.stackexchange.com/questions/48492/list-only-regular-files-but-not-directories-in-current-directory

find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done
./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
./Catabolism - Lysosomes: 28
./Catabolism - Lysosomes/aaa: 1

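Application: I want to find the maximum number of files among several hundred directories (all at depth = 1) [the output below is again formatted for readability]: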

date; pwd
    Fri Mar 29 20:08:08 PDT 2019
    /home/victoria/Mail/2_RESEARCH - NEWS

time find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done > ../../aaa
    0:00.03

[victoria@victoria 2_RESEARCH - NEWS]$ head -n5 ../../aaa
    ./RNA - Exosomes: 26
    ./Cellular Signaling - Receptors: 213
    ./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
    ./Stress - Physiological, Cellular - General: 261
    ./Ancient DNA; Ancient Protein: 34

[victoria@victoria 2_RESEARCH - NEWS]$ sed -r 's/(^.*): ([0-9]{1,8}$)/\2: \1/g' ../../aaa | sort -V | (head; echo ''; tail)

    0: ./Genomics - Gene Drive
    1: ./Causality; Causal Relationships
    1: ./Cloning
    1: ./GenMAPP 2
    1: ./Pathway Interaction Database
    1: ./Wasps
    2: ./Cellular Signaling - Ras-MAPK Pathway
    2: ./Cell Death - Ferroptosis
    2: ./Diet - Apples
    2: ./Environment - Waste Management

    988: ./Genomics - PPM (Personalized & Precision Medicine)
    1113: ./Microbes - Pathogens, Parasites
    1418: ./Health - Female
    1420: ./Immunity, Inflammation - General
    1522: ./Science, Research - Miscellaneous
    1797: ./Genomics
    1910: ./Neuroscience, Neurobiology
    2740: ./Genomics - Functional
    3943: ./Cancer
    4375: ./Health - Disease

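sort -V is a natural sort. ... So the largest number of files in any of those (Claws Mail) directories is 4375. If I left-pad (https://stackoverflow.com/a/55409116/1904943) those file names (they are all named numerically, starting with 1, in each directory) to 5 total digits, I should be OK.

Addendum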

$ date; pwd
Tue 14 May 2019 04:08:31 PM PDT
/home/victoria/Mail/2_RESEARCH - NEWS

$ ls | head; echo; ls | tail
Acoustics
Ageing
Ageing - Calorie (Dietary) Restriction
Ageing - Senescence
Agriculture, Aquaculture, Fisheries
Ancient DNA; Ancient Protein
Anthropology, Archaeology
Ants
Archaeology
ARO-Relevant Literature, News

Transcriptome - CAGE
Transcriptome - FISSEQ
Transcriptome - RNA-seq
Translational Science, Medicine
Transposons
USACEHR-Relevant Literature
Vaccines
Vision, Eyes, Sight
Wasps
Women in Science, Medicine

$ find . -type f | wc -l
70214    ## files

$ find . -type d | wc -l
417      ## subdirectories

Answer 16

Lots of correct answers here. Here's another!

find . -type f | sort | uniq -w 10 -c

where . is the folder to look in and 10 is the number of characters by which to group the directories.
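
Because uniq -w 10 compares only the first 10 characters of each path, the grouping depends on how long the directory names happen to be. A sketch that groups on the first path component instead is shown below. count_by_top_dir is a hypothetical name, and files directly in the starting folder are left out via -mindepth 2 (assumed available, as in GNU and BSD find).

```bash
# count_by_top_dir is a hypothetical name; prints "COUNT TOP_LEVEL_DIR" lines.
count_by_top_dir() {
    ( cd "${1:-.}" &&
      find . -mindepth 2 -type f |
        cut -d/ -f2 |      # ./top/rest/of/path -> top
        sort | uniq -c )
}
```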

Answer 17

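I wrote ffcnt to speed up recursive file counting under specific circumstances: rotational disks and filesystems that support extent mapping.

It can be an order of magnitude faster than ls or find based approaches, but YMMV.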

Answer 18

We can use the tree command to display all files and folders recursively. It also shows the number of folders and files in the last line of its output.

$ tree path/to/folder/
path/to/folder/
├── a-first.html
├── b-second.html
├── subfolder
│   ├── readme.html
│   ├── code.cpp
│   └── code.h
└── z-last-file.html

1 directories, 6 files

To get only the last line of the tree command's output, we can pipe it through tail

$ tree path/to/folder/ | tail -1
1 directories, 6 files

To install tree, we can use the command below

$ sudo apt-get install tree

Answer 19

This alternative, which filters by file format, counts all available GRUB kernel modules:

ls -l /boot/grub/*.mod | wc -l

Answer 20

Suppose you want the total number of files per directory; try:

for d in `find YOUR_SUBDIR_HERE -type d`; do 
   printf "$d - files > "
   find $d -type f | wc -l
done

For the current directory, try this:

for d in `find . -type d`; do printf "$d - files > "; find $d -type f | wc -l; done;

If you have long names with spaces in them, you need to change IFS, like this:

OIFS=$IFS; IFS=$'\n'
for d in `find . -type d`; do printf "$d - files > "; find $d -type f | wc -l; done
IFS=$OIFS
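
Changing IFS can be avoided altogether by letting find hand the directory names to a small inline script as arguments. This is a sketch using only POSIX find and sh features; count_files_per_dir is a hypothetical name.

```bash
# count_files_per_dir is a hypothetical name; output mirrors the loops above.
count_files_per_dir() {
    find "${1:-.}" -type d -exec sh -c '
        for d do
            # "$d" is quoted, so spaces in names need no IFS changes.
            printf "%s - files > %s\n" "$d" "$(find "$d" -type f | wc -l)"
        done
    ' sh {} +
}
```

Passing the name as an argument to printf, rather than embedding it in the format string, also keeps a % in a directory name from being treated as a format specifier.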

Answer 21

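Based on the responses and comments given above, I came up with the following file count listing. In particular, it combines the solution provided by @Greg Bell with comments from @Arch Stanton and @Schneems.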

function countit { find . -maxdepth 1000000 -type d -print0 | while IFS= read -r -d '' i ; do file_count=$(find "$i" -type f | wc -l) ; echo "$file_count: $i" ; done }; countit | sort -n -r >file-count.txt

Count all files with a given name in the current directory and its subdirectories

function countit { find . -maxdepth 1000000 -type d -print0 | while IFS= read -r -d '' i ; do file_count=$(find "$i" -type f | grep <enter_filename_here> | wc -l) ; echo "$file_count: $i" ; done }; countit | sort -n -r >file-with-name-count.txt
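
In the second function, grep matches its pattern anywhere in the path, so a directory whose name contains the pattern makes every file below it match. A sketch that applies find's -name test to the file name only is shown below. count_named and its argument order are hypothetical, and the name is treated as a shell glob pattern by find.

```bash
# Usage (hypothetical helper): count_named NAME [DIR]
count_named() {
    find "${2:-.}" -type d -exec sh -c '
        name=$1; shift
        for dir do
            n=$(find "$dir" -type f -name "$name" | wc -l)
            printf "%d: %s\n" "$n" "$dir"
        done
    ' sh "$1" {} + | sort -n -r
}
```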

Answer 22

You can try:

find "$(pwd)" -type f -exec ls -l {} \; | wc -l

Answer 23

find -type f | wc -l

find . -type f | wc -l

Answer 24

This works perfectly well, and it is short and simple, if you want to count the number of files present in a folder.

ls | wc -l

Answer 25

ls -l | grep -e -x -e -dr | wc -l

Long listing
Filter files and directories
Count the filtered lines
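
As written, the grep patterns match any line that contains -x or -dr rather than selecting entries by type. If the goal is to count the regular files in the listing, a sketch keyed on the first character of each ls -l line (- for a regular file) might look like this. count_regular is a hypothetical name, and names containing newlines would still be miscounted.

```bash
# count_regular is a hypothetical name; counts regular files directly in one directory.
count_regular() {
    # grep -c prints the number of matching lines; the "total" line and
    # entries starting with d (directory) or l (symlink) do not match ^-.
    ls -l "${1:-.}" | grep -c '^-'
}
```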


Answer 1: 1695, accepted, 2012-02-06 08:02:30
Answer 2: 125, 2014-04-07 10:16:35
Answer 3: 93, 2014-10-20 05:03:44
Answer 4: 83, 2016-01-22 07:23:00
Answer 5: 57, 2014-12-19 09:36:14
Answer 6: 41, 2018-04-24 19:01:15
Answer 7: 22, 2015-08-29 14:53:33
Answer 8: 16, 2015-03-12 07:20:43
Answer 9: 15, 2018-04-24 18:55:01
Answer 10: 14, 2018-10-18 14:16:06
Answer 11: 10, 2018-05-19 09:56:58
Answer 12: 9, 2015-03-16 01:31:19
Answer 13: 5, 2014-07-02 02:58:41
Answer 14: 4, 2017-08-19 23:30:32
Answer 15: 3, 2019-03-29 23:24:39
Answer 16: 2, 2016-10-31 09:53:26
Answer 17: 2, 2017-01-27 21:47:25
Answer 18: 1, 2022-10-19 10:03:03
Answer 19: 0, 2015-10-12 16:08:19
Answer 20: 0, 2022-07-29 19:40:24
Answer 21: 0, 2022-07-31 15:03:39
Answer 22: -1, 2015-07-09 03:52:47
Answer 23: -1, 2017-11-02 10:55:44
Answer 24: -2, 2018-04-09 06:37:55
Answer 25: -3, 2016-06-01 17:16:00
