How to recursively traverse a directory tree and find only files?

I am working on an scp call to download a folder from a remote system. The downloaded folder has subfolders, and within these subfolders there are a bunch of files that I want to pass as arguments to a Python script, like this:

scp -r researcher@192.168.150.4:SomeName/SomeNameElse/$folder_name/ $folder_name/
echo "File downloaded successfully"
echo "Running BD scanner"
for d in $folder_name/*; do
        if [[ -d $d ]]; then
                echo "It is a directory"
        elif [[ -f $d ]]; then
                echo "It is a file"
                echo "Running the scanner :"
                python bd_scanner_new.py /home/nsadmin/Some/bash_script_run_files/$d
        else
                echo "$d is invalid file"
                exit 1
        fi
done

I have added logic to detect directories and exclude them. However, I don't traverse down those directories recursively.

Partial results below:

File downloaded successfully
Running BD scanner
It is a directory
It is a directory
It is a directory
Exiting

I want to improve this code so that it traverses all directories and picks up all files. Any suggestions would be appreciated.

You can use shopt -s globstar in Bash 4.0+:

#!/bin/bash

shopt -s globstar nullglob
cd _your_base_dir || exit 1
for file in **/*; do
  # **/* matches directories as well as files, so skip anything that is not a regular file
  [[ -f $file ]] || continue
  # filenames with whitespace or other special characters are handled correctly
  python bd_scanner_new.py "$file"
done

The Bash manual says this about globstar:

If set, the pattern '**' used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a '/', only directories and subdirectories match.
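
To illustrate the manual's wording, here is a minimal sketch, assuming Bash 4.0 or newer. The function name list_regular_files is hypothetical and not part of the original answer. Because '**/*' expands to directories as well as files, the loop keeps only paths that pass the -f test. Without dotglob, hidden files are not matched.

```shell
#!/bin/bash
# Sketch: print every regular file under a base directory using globstar.
# list_regular_files is a hypothetical helper name chosen for this example.
list_regular_files() {
  local base=$1
  (
    # The subshell keeps the shopt and cd changes from leaking into the caller.
    shopt -s globstar nullglob
    cd "$base" || exit 1
    for path in **/*; do
      # '**/*' also expands to directories, so keep regular files only.
      if [[ -f $path ]]; then
        printf '%s\n' "$path"
      fi
    done
  )
}
```

Calling list_regular_files "$folder_name" would print one path per line, relative to the base directory. The line-per-path output is only for display; a filename containing a newline would be split across lines, so pass "$path" directly to the scanner inside the loop when processing files.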

More globstar discussion here: https://unix.stackexchange.com/questions/117826/bash-globstar-matching

Why go through the trouble of globbing for file matching when find is meant for exactly this? Use it with a while loop fed by process substitution ( <() ):

#!/bin/bash

while IFS= read -r -d '' file; do
    # single filename is in $file
    python bd_scanner_new.py "$file"
done < <(find "$folder_name" -type f -print0)

Here, find searches recursively for all regular files under the given path, at any depth of subdirectory. Filenames can contain spaces, tabs, and even newlines. To handle them safely, find is run with -print0: each filename is printed exactly as is, control characters included, and terminated with a NUL byte. The read command then consumes the names using that same delimiter (-d '').
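
The same NUL-delimited pattern can also collect the filenames into an array before processing them. This is a minimal sketch rather than part of the original answer: the function name collect_files and the global array FOUND_FILES are hypothetical, and it assumes Bash with a find that supports -print0 (GNU and BSD find both do).

```shell
#!/bin/bash
# Sketch: gather every regular file under a directory into a Bash array.
# collect_files and FOUND_FILES are hypothetical names chosen for this example.
collect_files() {
  local dir=$1 file
  FOUND_FILES=()
  # read -d '' splits on NUL, matching the terminator written by -print0,
  # so names containing spaces or newlines arrive intact.
  while IFS= read -r -d '' file; do
    FOUND_FILES+=("$file")
  done < <(find "$dir" -type f -print0)
}

# Possible usage (commented out so the block has no side effects):
#   collect_files "$folder_name"
#   for f in "${FOUND_FILES[@]}"; do python bd_scanner_new.py "$f"; done
```

Feeding the loop through process substitution rather than a pipe keeps the while loop in the current shell, which is why the array is still populated after the loop ends.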

Note: always double-quote variable expansions in Bash to prevent word splitting and glob expansion by the shell.
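
A small demonstration of why the quotes matter, assuming the default IFS. The helper count_args and the filename in demo_name are hypothetical and exist only for this sketch.

```shell
#!/bin/bash
# Sketch: show how an unquoted expansion is split into several arguments.
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() {
  echo "$#"
}

demo_name="my report.txt"   # hypothetical filename containing a space

unquoted=$(count_args $demo_name)     # word splitting produces two arguments
quoted=$(count_args "$demo_name")     # quoting keeps the name as one argument

echo "unquoted: $unquoted, quoted: $quoted"   # prints "unquoted: 2, quoted: 1"
```

In the scanner loop, an unquoted $file would hand the Python script two nonexistent paths instead of one real one.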
