How to loop through a directory recursively to delete files with certain extensions

I need to loop through a directory recursively and remove all files with the extensions .pdf and .doc. I can loop through the directory recursively, but I can't work out how to filter the files by those extensions.

My code so far:

#!/bin/sh

SEARCH_FOLDER="/tmp/*"

for f in $SEARCH_FOLDER
do
    if [ -d "$f" ]
    then
        for ff in $f/*
        do      
            echo "Processing $ff"
        done
    else
        echo "Processing file $f"
    fi
done

I need help to complete the code, since I'm not getting anywhere.

As a followup to mouviciel's answer, you could also do this as a for loop, instead of using xargs. I often find xargs cumbersome, especially if I need to do something more complicated in each iteration.

for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm $f; done

As a number of people have commented, this will fail if there are spaces in filenames. You can work around this by temporarily setting IFS (the internal field separator) to the newline character. This also fails if there are wildcard characters such as \[?* in the file names. You can work around that by temporarily disabling wildcard expansion (globbing).

IFS=$'\n'; set -f
for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm "$f"; done
unset IFS; set +f

If you have newlines in your filenames, then that won't work either. You're better off with an xargs based solution:

find /tmp \( -name '*.pdf' -or -name '*.doc' \) -print0 | xargs -0 rm

(The escaped parentheses are required here so that -print0 applies to both -or clauses.)
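To see why the grouping matters, here is a small sketch run against a throwaway directory (the sandbox and its file names are made up for illustration). It uses -o, the POSIX spelling of -or, and plain -print so the output is readable. Without the parentheses, the implicit "and" binds tighter than "or", so the action attaches only to the last -name test.

```shell
#!/bin/sh
# Hypothetical sandbox so that nothing real is touched.
sandbox=$(mktemp -d)
touch "$sandbox/a.pdf" "$sandbox/b.doc" "$sandbox/c.txt"

# Ungrouped: parsed as  -name '*.pdf'  -o  ( -name '*.doc' -a -print ),
# so only the .doc file is printed.
ungrouped=$(find "$sandbox" -name '*.pdf' -o -name '*.doc' -print)

# Grouped: -print applies to both name tests.
grouped=$(find "$sandbox" \( -name '*.pdf' -o -name '*.doc' \) -print | sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"

rm -rf "$sandbox"
```

The same reasoning applies to any action placed after the tests, such as -print0, -exec or -delete.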

GNU and *BSD find also have a -delete action, which would look like this:

find /tmp \( -name '*.pdf' -or -name '*.doc' \) -delete
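If each file needs more than a single rm, one possible pattern that stays safe for awkward names is to let find pass the matches to a small inline sh script. The sketch below is only an illustration: the sandbox directory and the printf line are assumptions standing in for whatever per-file work is needed.

```shell
#!/bin/sh
# Hypothetical sandbox with a name containing spaces.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/sub dir"
touch "$sandbox/a.pdf" "$sandbox/sub dir/b c.doc" "$sandbox/keep.txt"

# find hands batches of matching paths to the inline script as "$@";
# the names never go through word splitting or globbing.
find "$sandbox" -type f \( -name '*.pdf' -o -name '*.doc' \) -exec sh -c '
    for f do
        printf "removing %s\n" "$f"
        rm -- "$f"
    done
' sh {} +

remaining=$(find "$sandbox" -type f)
echo "left over: $remaining"
rm -rf "$sandbox"
```

The loop body is ordinary shell, so it can grow to several commands per file without the quoting problems of the for f in $(find ...) form.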

find is just made for that.

find /tmp -name '*.pdf' -or -name '*.doc' | xargs rm

Without find:

for f in /tmp/* /tmp/**/* ; do
  ...
done;

/tmp/* matches the files directly in the directory and /tmp/**/* matches the files in subfolders. You may have to enable the globstar option (shopt -s globstar). So for the question, the code should look like this:

shopt -s globstar
for f in /tmp/*.pdf /tmp/*.doc /tmp/**/*.pdf /tmp/**/*.doc ; do
  rm "$f"
done

Note that this requires bash ≥4.0 (or zsh, which needs no shopt -s globstar, or ksh with set -o globstar instead of shopt -s globstar). Furthermore, in bash <4.3, this traverses symbolic links to directories as well as directories, which is usually not desirable.
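A minimal sketch of the globstar loop, assuming bash ≥4.0 and run in a throwaway directory rather than /tmp itself (the sandbox and file names are hypothetical). With globstar enabled, **/ also matches zero directories, so one pattern per extension already covers the top level; nullglob is added so that a pattern with no matches expands to nothing instead of being passed to rm literally.

```shell
#!/bin/bash
# Requires bash >= 4.0 for globstar.
shopt -s globstar nullglob

# Hypothetical sandbox with files at three depths.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/sub/deeper"
touch "$sandbox/top.pdf" "$sandbox/sub/mid.doc" \
      "$sandbox/sub/deeper/low.pdf" "$sandbox/keep.txt"

for f in "$sandbox"/**/*.pdf "$sandbox"/**/*.doc; do
    [ -f "$f" ] || continue    # skip directories whose names happen to match
    rm -- "$f"
done

remaining=$(find "$sandbox" -type f)
echo "left over: $remaining"
rm -rf "$sandbox"
```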

If you want to do something recursively, I suggest you use recursion (yes, you can do it using stacks and so on, but hey).

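recursiverm() {
  for d in *; do
    if [ -d "$d" ]; then
      (cd -- "$d" && recursiverm)
    fi
  done
  # remove the matching files once per directory, after visiting its subdirectories
  rm -f -- *.pdf
  rm -f -- *.doc
}

# use && so that nothing is deleted from the current directory if cd fails
(cd /tmp && recursiverm)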

That said, find is probably a better choice, as has already been suggested.

This doesn't answer your question directly, but you can solve your problem with a one-liner:

find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -exec rm {} +

Some versions of find (GNU, BSD) have a -delete action which you can use instead of calling rm:

find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -delete
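Before running either destructive form, it can help to preview what would be removed by swapping the action for -print. The sketch below does this in a throwaway directory (all names are hypothetical) and also shows what -type f buys: a directory whose name ends in .pdf is left alone.

```shell
#!/bin/sh
# Hypothetical sandbox: two files plus a directory named like a PDF.
sandbox=$(mktemp -d)
mkdir "$sandbox/old.pdf"
touch "$sandbox/a.pdf" "$sandbox/b.doc"

# Dry run: same tests, but -print instead of -exec rm {} +
preview=$(find "$sandbox" \( -name "*.pdf" -o -name "*.doc" \) -type f -print | sort)
echo "$preview"

# Once the list looks right, run the real command.
find "$sandbox" \( -name "*.pdf" -o -name "*.doc" \) -type f -exec rm {} +

# The directory is still there because of -type f.
if [ -d "$sandbox/old.pdf" ]; then survived=yes; else survived=no; fi
echo "directory survived: $survived"
rm -rf "$sandbox"
```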

Here is an example using shell (bash):

#!/bin/bash

# loop & print a folder recursively
print_folder_recurse() {
    for i in "$1"/*;do
        if [ -d "$i" ];then
            echo "dir: $i"
            print_folder_recurse "$i"
        elif [ -f "$i" ]; then
            echo "file: $i"
        fi
    done
}


# try get path from param
path=""
if [ -d "$1" ]; then
    path=$1;
else
    path="/tmp"
fi

echo "base path: $path"
print_folder_recurse "$path"
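The same traversal can be adapted to the original question by filtering on the extension with a case statement. This is only a sketch: remove_docs_recurse is a hypothetical name, the demo runs in a throwaway directory, and, like the function above, it does not visit hidden files because * does not match names starting with a dot. The extra -L test is an assumption added here so that symbolic links to directories are not followed.

```shell
#!/bin/sh
# Hypothetical helper: walks "$1" and deletes .pdf and .doc files.
remove_docs_recurse() {
    for i in "$1"/*; do
        if [ -d "$i" ] && [ ! -L "$i" ]; then
            remove_docs_recurse "$i"
        elif [ -f "$i" ]; then
            case "$i" in
                *.pdf|*.doc) rm -- "$i" ;;   # only these two extensions
            esac
        fi
    done
}

# Demo in a sandbox with a space in a directory name.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/sub dir"
touch "$sandbox/sub dir/x.pdf" "$sandbox/y.doc" "$sandbox/z.txt"
remove_docs_recurse "$sandbox"

remaining=$(find "$sandbox" -type f)
echo "left over: $remaining"
rm -rf "$sandbox"
```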

This method handles spaces well.

files="$(find -L "$dir" -type f)"
echo "Count: $(echo -n "$files" | wc -l)"
echo "$files" | while IFS= read -r file; do
  echo "$file"
done

Edit: fixes the off-by-one error.

function count() {
    files="$(find -L "$1" -type f)";
    if [[ "$files" == "" ]]; then
        echo "No files";
        return 0;
    fi
    file_count=$(echo "$files" | wc -l)
    echo "Count: $file_count"
    echo "$files" | while IFS= read -r file; do
        echo "$file"
    done
}
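Counting lines still miscounts when a file name itself contains a newline. One possible alternative, sketched below, is to have find emit a NUL byte after each name and count those instead. count_files is a hypothetical helper name, and -print0 is assumed to be available (it is in GNU and BSD find).

```shell
#!/bin/sh
# Hypothetical helper: prints the number of regular files under "$1".
count_files() {
    # One NUL per file; drop every other byte, then count what is left.
    find -L "$1" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' '
}

# Demo in a sandbox that includes a name with an embedded newline.
sandbox=$(mktemp -d)
touch "$sandbox/plain.txt" "$sandbox/with space.txt"
touch "$sandbox/$(printf 'two\nlines').txt"

total=$(count_files "$sandbox")
echo "Count: $total"
rm -rf "$sandbox"
```

An empty directory yields 0 with no special case, since no NUL bytes are produced.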

For bash (since version 4.0):

shopt -s globstar nullglob dotglob
echo **/*".ext"

That's all.
The trailing extension ".ext" is there to select files (or dirs) with that extension.

Option globstar activates ** (recursive search).
Option nullglob removes a pattern when it matches no file/dir.
Option dotglob includes files that start with a dot (hidden files).

Beware that before bash 4.3, **/ also traverses symbolic links to directories, which is not desirable.
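As a rough sketch of how these three options combine for the deletion task, the matches can be collected into an array first, which makes it easy to report the count and to skip rm when nothing matched. The sandbox and its file names are assumptions for illustration; bash ≥4.0 is required.

```shell
#!/bin/bash
shopt -s globstar nullglob dotglob

# Hypothetical sandbox with a hidden directory and a hidden file.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/.hidden"
touch "$sandbox/a.pdf" "$sandbox/.hidden/.b.pdf" "$sandbox/c.txt"

# nullglob leaves the array empty when there is no match;
# dotglob lets both ** and * match names starting with a dot.
matches=( "$sandbox"/**/*.pdf )
echo "found ${#matches[@]} file(s)"

if (( ${#matches[@]} > 0 )); then
    rm -- "${matches[@]}"
fi

remaining=$(find "$sandbox" -type f)
echo "left over: $remaining"
rm -rf "$sandbox"
```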

The following function recursively iterates through all the directories under /home/ubuntu (the whole directory structure under ubuntu) and applies the necessary checks in the else block.

function check {
        for file in "$1"/*
        do
        if [ -d "$file" ]
        then
                check "$file"
        else
               ## check whether the file starts with the PDF signature
               if [ "$(head -c 4 "$file")" = "%PDF" ]; then
                         rm -- "$file"
               fi
        fi
        done
}
domain=/home/ubuntu
check "$domain"

There is no reason to pipe the output of find into another utility. find has a -delete flag built into it.

find /tmp \( -name '*.pdf' -or -name '*.doc' \) -delete

This is the simplest way I know to do this: rm **/@(*.doc|*.pdf) (in bash this needs shopt -s globstar extglob)

** makes this work recursively.

@(*.doc|*.pdf) matches a file name ending in .doc or .pdf.

It is easy to test safely by replacing rm with ls.

The other answers provided will not include files or directories that start with a . (dot). The following worked for me:

#!/bin/sh
getAll()
{
  local fl1="$1"/*;
  local fl2="$1"/.[!.]*; 
  local fl3="$1"/..?*;
  for inpath in "$1"/* "$1"/.[!.]* "$1"/..?*; do
    if [ "$inpath" != "$fl1" -a "$inpath" != "$fl2" -a "$inpath" != "$fl3" ]; then 
      stat --printf="%F\0%n\0\n" -- "$inpath";
      if [ -d "$inpath" ]; then
        getAll "$inpath"
      #elif [ -f $inpath ]; then
      fi;
    fi;
  done;
}
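For comparison, the find-based commands do not need this extra work, since find's -name patterns treat a leading dot like any other character. A quick check in a throwaway directory (all names are hypothetical):

```shell
#!/bin/sh
# Hypothetical sandbox with a hidden file and a hidden directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/.cache"
touch "$sandbox/.secret.pdf" "$sandbox/.cache/report.doc" "$sandbox/visible.pdf"

# Hidden files and files inside hidden directories are listed too.
found=$(find "$sandbox" -type f \( -name '*.pdf' -o -name '*.doc' \))
echo "$found"

rm -rf "$sandbox"
```

So the remark about missing dot files applies to the glob-based loops, which skip such names unless an option like dotglob is set.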

I think the most straightforward solution is to use recursion. In the following example, I print all the file names in the directory and its subdirectories.

You can modify it according to your needs.

#!/bin/bash    
printAll() {
    for i in "$1"/*;do # for all in the root 
        if [ -f "$i" ]; then # if a file exists
            echo "$i" # print the file name
        elif [ -d "$i" ];then # if a directory exists
            printAll "$i" # call printAll inside it (recursion)
        fi
    done 
}
printAll "$1" # e.g.: ./printAll.sh .

OUTPUT:

> ./printAll.sh .
./demoDir/4
./demoDir/mo st/1
./demoDir/m2/1557/5
./demoDir/Me/nna/7
./TEST

It works fine with spaces as well!

Note: You can use echo "$(basename "$i")" to print the file name without its path.

Or use echo "${i##*/}", which runs much faster because it does not call the external basename.
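A short sketch of the parameter expansions involved, using a made-up path. These forms are part of POSIX sh, so no external command is started; the variable names are chosen only for this example.

```shell
#!/bin/sh
# Hypothetical path with a space in one component.
i="./demoDir/mo st/1.pdf"

name=${i##*/}     # remove the longest prefix ending in "/": the file name
dir=${i%/*}       # remove the shortest suffix starting at the last "/": the directory
ext=${name##*.}   # remove everything up to the last ".": the extension

printf 'name=%s\ndir=%s\next=%s\n' "$name" "$dir" "$ext"
```

The extension form is one way to filter files inside a loop, for example by comparing "$ext" against pdf and doc.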

Just do:

find . -name '*.pdf'|xargs rm

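The following loops over the entries directly under the given directory and lists the contents of each one (a single level; it does not descend further):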

for d in /home/ubuntu/*; do echo "listing contents of dir: $d"; ls -l "$d"/; done

If you can change the shell used to run the command, you can use zsh to do the job.

#!/usr/bin/zsh

for file in /tmp/**/*
do
    echo $file
done

This will recursively loop through all files/folders.
