
bash: recursively find the subdirectory which contains the largest number of immediate child files

I'm looking for a shell command to run in Bash which finds the subdirectory which contains the largest number of files. Execution time isn't a huge concern; it's clear that there will need to be a big trawl/sort operation to determine this result. The question is, how to compute this?

My first thought was to use a command of the form find -type d -exec find {} -maxdepth 1 -type f | wc -l, but it turns out that you can't pipe within a find command like that: the shell splits the pipeline before find ever runs, so wc -l receives the combined output of the whole find command instead of running once per directory.

A find-based option could work, and you can still use a pipe as long as the command you -exec is a shell.

For example, perhaps something like this, to get a list:

find /path -type d -exec sh -c 'find "$0" -maxdepth 1 -type f | wc -l' {} \; -print | paste - -
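The one-liner above starts a new shell for every directory. As a rough sketch of a variant, the function below hands directories to sh in batches with -exec ... {} + and prints each count and path on one tab-separated line, so paste is not needed. The name count_files_per_dir is hypothetical, and like the original it assumes a find that supports -maxdepth (GNU and BSD do) and file names without embedded newlines.

```shell
# Hypothetical helper: print "<count><TAB><dir>" for every directory under $1.
# Assumes find supports -maxdepth and that no file name contains a newline.
count_files_per_dir() {
    find "${1:-.}" -type d -exec sh -c '
        # Each positional parameter is one directory handed over by find.
        for dir do
            # tr removes the padding that some wc implementations add.
            n=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d " ")
            printf "%s\t%s\n" "$n" "$dir"
        done
    ' sh {} +
}
```

Something like count_files_per_dir /path | sort -nr | head -n 1 would then show the directory with the most immediate files.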

But I'd probably do this in pure bash:

shopt -s globstar nullglob

for d in **/; do
  # In a subshell: count all entries in $d, then subtract the subdirectories.
  printf '%s\t%s\n' "$( cd "$d" || exit; a=(*); b=(*/); echo $((${#a[@]}-${#b[@]})) )" "$d"
done

In both of these cases, the result can be sorted numerically and trimmed with a pipe:

  | sort -nr | head -n 1

or if you're sensitive to too many pipes, with a tiny awk script:

  | awk '$1>n{n=$1;line=$0} END {print line}'

I'm not sure which of these is simpler, find or bash. I would expect that the find solution will run faster, but I'd love to hear of your results with each.

Note that globstar requires bash 4 or above.
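For what it's worth, here is a minimal sketch of the pure-bash loop and the awk filter combined into one function. The name busiest_dir is hypothetical. It makes a few assumptions beyond the loop above: it also counts the starting directory itself (the ./ entry), it ignores dotfiles because the globs do, and it treats every non-directory entry as a file. The body runs in a subshell so that the shopt settings and the cd don't leak into the calling shell.

```shell
# Hypothetical wrapper: print "<count><TAB><dir>" for the directory under $1
# (default: current directory) holding the most non-directory entries.
# Requires bash 4+ for globstar; dotfiles are not counted.
busiest_dir() (
    shopt -s globstar nullglob
    cd -- "${1:-.}" || exit 1
    for d in ./ **/; do
        entries=("$d"*)     # every non-hidden entry in $d
        subdirs=("$d"*/)    # only the directories among them
        printf '%s\t%s\n' "$(( ${#entries[@]} - ${#subdirs[@]} ))" "$d"
    done | awk -F'\t' '$1 > n { n = $1; line = $0 } END { print line }'
)
```

Paths are printed relative to the starting directory, with a trailing slash.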

It turns out that the way to do this is with a bash script. This should produce the intended results:

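# Read NUL-delimited paths so directory names containing spaces survive intact.
find . -type d -print0 |
    while IFS= read -r -d '' path ; do
        # command substitution strips the trailing newline from wc's output
        files=$(find "$path" -maxdepth 1 -type f | wc -l)
        printf '%s %s\n' "$files" "$path"
    done | sort -rn | head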
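If GNU find is available, the per-directory find calls can probably be avoided altogether. The sketch below is an alternative technique, not the script above: it walks the tree once, prints the parent directory of every regular file with -printf '%h\n' (a GNU extension), and lets sort and uniq -c do the counting. The name dir_with_most_files is hypothetical, and it assumes that directory names contain no newlines. Directories with no files never appear in the output, which doesn't matter when looking for the maximum.

```shell
# Hypothetical helper: single-pass count of files per directory.
# Assumes GNU find (-printf is not POSIX) and no newlines in directory names.
dir_with_most_files() {
    find "${1:-.}" -type f -printf '%h\n' |   # one line per file: its parent dir
        sort |                                # group identical directories
        uniq -c |                             # prefix each with its file count
        sort -rn |                            # largest count first
        head -n 1
}
```

The output has the form "count directory", with the count left-padded by uniq -c.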
