Finding directories with find in bash using an exclude list

Now, before you think "this has been done before", please read on.

Like most people writing a find script in bash, you end up hard-coding it as a single-line command, then edit it so often over the following months or years that you wish you had done it right the first time.

I am writing a little backup program to back up directories, and I need to find them while checking against a list of directories that must be excluded. Easier said than done. Let me set the stage:

#!/bin/bash
BasePath="/home/adesso/baldar"
declare -a Iggy
Iggy=( "/cgi-bin" 
    "/tmp" 
    "/test" 
    "/html" 
    "/icons" )
IggySubdomains=$(printf ",%s" "${Iggy[@]}")
IggySubdomains=${IggySubdomains:1}
echo "$IggySubdomains"
exit 0

At the end of this you get /cgi-bin,/tmp,/test,/html,/icons. This proves that the concept works, but to take it a bit further I need to use find to search BasePath only one level deep for all subdirectories, excluding the subdirectories listed in the array...

If I type this by hand it would be:

find /var/www/* \( -path '*/cgi-bin' -o -path '*/tmp' -o -path '*/test' -o -path '*/html' -o -path '*/icons' \) -prune -type d

And I might want to loop into each subdirectory and do the same... I hope you get my point.

So what I am trying to do seems possible, but I have a bit of a problem: printf ",%s" doesn't like me using all those find -path or -o options. Does this mean I have to use eval again?

I am trying to use the power of bash here, not some for loop. Any constructive input would be appreciated.

Try something like

find /var/www/* \( -path "*${Iggy[0]}" $(printf -- '-o -path "*%s" ' "${Iggy[@]:1}") \) -prune -type d

and see what happens.

EDIT: added the leading * to each path as in your example.

And here's a complete solution based on your description.

#!/usr/bin/env bash
basepath="/home/adesso/baldar"
ignore=("/cgi-bin" "/tmp" "/test" "/html" "/icons")

find "${basepath}" -maxdepth 1 -not \( -path "*${ignore[0]}" $(printf -- '-o -path "*%s" ' "${ignore[@]:1}") \) -not -path "${basepath}" -type d

This lists the subdirectories of $basepath, excluding those listed in $ignore. It presumes at least two entries in $ignore (fixing that is not hard).
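One way to lift the two-entry assumption is to build the exclusion tests in an array, so that any number of entries (including none) works. The following is a minimal sketch rather than the answer's own method: list_subdirs is a hypothetical helper name, the demo directories are made up, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the immediate subdirectories of a base path,
# skipping any whose path ends in one of the given suffixes.
# Usage: list_subdirs BASE [SUFFIX...]
list_subdirs() {
    local base=$1
    shift
    local -a exclude=()
    local name
    for name in "$@"; do
        # Each entry contributes one test: ! -path "*<suffix>"
        # (-path treats the suffix as a glob pattern, not a literal string).
        exclude+=( ! -path "*${name}" )
    done
    # An empty exclude array expands to nothing, so zero entries is fine.
    find "$base" -mindepth 1 -maxdepth 1 -type d "${exclude[@]}"
}

# Demo on a throwaway directory (all names here are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/site one" "$demo/tmp" "$demo/html" "$demo/shop"
list_subdirs "$demo" "/tmp" "/html" | sort
```

Because every argument is passed to find as its own array element, a suffix containing spaces stays intact without any eval.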

The existing answers are buggy when given directory names that contain literal whitespace. The safe and robust practice is to use a loop. If your concern is leveraging "the power of bash" -- I'd argue that a robust solution is more powerful than a buggy one. :)

BasePath="/home/adesso/baldar"
declare -a Iggy=( "/cgi-bin" "/tmp" "/test" "/html" "/icons" )

find_cmd=( find "$BasePath" '(' )

## This is the conventional approach:
# for x in "${Iggy[@]}"; do
#   find_cmd+=( -path "*${x}" -o )
# done
# # ...then drop the trailing -o that the loop leaves behind:
# find_cmd=( "${find_cmd[@]:0:${#find_cmd[@]} - 1}" )

## This is the unconventional, only-barely-safe approach
## ...used only to avoid looping:
printf -v find_cmd_str ' -path "*"%q -o ' "${Iggy[@]}"
find_cmd_str=${find_cmd_str%" -o "}   # strip the trailing -o
eval "find_cmd+=( $find_cmd_str )"

# and add the suffix
find_cmd+=( ')' -prune -type d )

# ...finally, to run the command:
"${find_cmd[@]}"

FIND="$(which find --skip-alias)"
BasePath="/home/adesso/baldar"
Iggy=( "/cgi-bin" 
    "/tmp" 
    "/test" 
    "/html" 
    "/icons" )
SubDomains=( $(${FIND} ${BasePath}/* -maxdepth 0 -not \( -path "*${Iggy[0]}" $(printf -- '-o -path "*%s" ' "${Iggy[@]:1}") \) -type d) )
echo "${SubDomains[1]}"

Thanks to @Sorpigal I have a solution. I ended up nesting the command substitution so that I can use the script in a cron job, and finally wrapped the whole thing in an array definition. A known problem would be a directory with a space in its name; that, however, has been solved, so to keep things simple, I think this answers my question.
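For reference, one common way to keep names with spaces intact when filling an array is to have find emit NUL-terminated paths and read them back with read -d ''. This is a sketch under assumptions, not the code used above: the temporary directory and its contents are stand-ins for the real BasePath, and it assumes a find that supports -print0, -mindepth and -maxdepth (GNU and BSD find both do).

```shell
#!/usr/bin/env bash
# Stand-in data: a throwaway base directory instead of the real BasePath.
BasePath=$(mktemp -d)
mkdir -p "$BasePath/my site" "$BasePath/tmp" "$BasePath/shop"
Iggy=( "/tmp" )

# Build one '! -path "*<suffix>"' test per excluded entry.
exclude=()
for x in "${Iggy[@]}"; do
    exclude+=( ! -path "*${x}" )
done

# Read NUL-delimited paths so spaces (or even newlines) in names survive.
SubDomains=()
while IFS= read -r -d '' dir; do
    SubDomains+=( "$dir" )
done < <(find "$BasePath" -mindepth 1 -maxdepth 1 -type d "${exclude[@]}" -print0)

# Print one entry per line; each array element is a complete path.
printf '%s\n' "${SubDomains[@]}"
```

The process substitution keeps the loop in the current shell, so SubDomains is still populated after the loop ends.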

