
Finding directories with find in bash using an exclude list

Now, before you think "this has been done before", please read on.

Like most people who write a find-based bash script, you end up hard-coding it as a single-line command, then edit it so often over the following months or years that you wish you had done it right the first time.

I am currently writing a small backup program that backs up directories, and I need to find them while checking against a list of directories that must be excluded. Easier said than done. Let me set the stage:

#!/bin/bash
BasePath="/home/adesso/baldar"
declare -a Iggy
Iggy=( "/cgi-bin" 
    "/tmp" 
    "/test" 
    "/html" 
    "/icons" )
IggySubdomains=$(printf ",%s" "${Iggy[@]}")
IggySubdomains=${IggySubdomains:1}
echo $IggySubdomains
exit 0

At the end of this you get /cgi-bin,/tmp,/test,/html,/icons. This proves that the concept works, but to take it a bit further I need to use find to search BasePath only one level deep for all subdirectories, excluding the subdirectories listed in the array.

If I typed this by hand it would be:

find /var/www/* \( -path '*/cgi-bin' -o -path '*/tmp' -o -path '*/test' -o -path '*/html' -o -path '*/icons' \) -prune -type d

And I might want to loop into each subdirectory and do the same... I hope you get my point.

So what I am trying to do seems possible, but I have a bit of a problem: printf ",%s" doesn't like me using all those find -path or -o options. Does this mean I have to use eval again?

I am trying to use the power of bash here, and not some for loop. Any constructive input would be appreciated.

Try something like

find /var/www/* \( -path "*${Iggy[0]}" $(printf -- '-o -path "*%s" ' "${Iggy[@]:1}") \) -prune -type d

and see what happens.

EDIT: added the leading * to each path, as in your example.
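
The expression above leans on two bash behaviours: printf reuses its format string until every argument has been consumed, and "${Iggy[@]:1}" expands to all array elements except the first. The `--` stops printf from treating the leading `-o` of the format as an option. A minimal sketch to make that visible (the variable name `joined` and the shortened array are only for illustration):

```shell
#!/usr/bin/env bash
Iggy=( "/cgi-bin" "/tmp" "/test" )

# The format is applied once per remaining argument, so two
# arguments yield two "-o -path ..." fragments.
joined=$(printf -- '-o -path *%s ' "${Iggy[@]:1}")

# Quoted, so the * characters are printed rather than glob-expanded.
echo "$joined"
```

The first element is handled separately in the find command because it must not be preceded by `-o`.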

And here's a complete solution based on your description.

#!/usr/bin/env bash
basepath="/home/adesso/baldar"
ignore=("/cgi-bin" "/tmp" "/test" "/html" "/icons")

find "${basepath}" -maxdepth 1 -not \( -path "*${ignore[0]}" $(printf -- '-o -path "*%s" ' "${ignore[@]:1}") \) -not -path "${basepath}" -type d

This prints the subdirectories of $basepath, excluding those listed in $ignore, presuming there are at least two entries in $ignore (fixing that is not hard).
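
One caveat with the command-substitution approach: the shell does not remove quotes that come out of an expansion, so the `"` characters emitted by printf reach find as part of each pattern, and the unquoted result is also subject to word splitting. A sketch of one alternative collects the tests in an array instead, at the cost of a short loop. It assumes a throwaway directory created with mktemp in place of the real base path; the names `demo`, `args` and `result` are hypothetical.

```shell
#!/usr/bin/env bash
# Throwaway tree standing in for the real base path (assumption for the demo).
demo=$(mktemp -d)
mkdir "$demo/cgi-bin" "$demo/tmp" "$demo/site one" "$demo/blog"

ignore=("/cgi-bin" "/tmp" "/test" "/html" "/icons")

# Collect one "-path PATTERN" test per entry, each preceded by -o.
args=()
for dir in "${ignore[@]}"; do
    args+=( -o -path "*${dir}" )
done
args=( "${args[@]:1}" )   # drop the leading -o

# -mindepth 1 skips the base directory itself; -maxdepth 1 stays one level deep.
result=$(find "$demo" -mindepth 1 -maxdepth 1 -type d -not \( "${args[@]}" \) | sort)
echo "$result"

rm -rf "$demo"
```

Because every pattern is its own array element, no quoting tricks or eval are needed. The sketch still expects at least one entry in `ignore`, since an empty pair of parentheses is an error for find.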

The existing answers are buggy when given directory names that contain literal whitespace. The safe and robust practice is to use a loop. If your concern is leveraging "the power of bash", I'd argue that a robust solution is more powerful than a buggy one. :)

BasePath="/home/adesso/baldar"
declare -a Iggy=( "/cgi-bin" "/tmp" "/test" "/html" "/icons" )

find_cmd=( find "$BasePath" '(' )

## This is the conventional approach:
# for x in "${Iggy[@]}"; do
#   find_cmd+=( -path "*${x}" -o )
# done
# # ...then drop the trailing -o left behind by the loop:
# find_cmd=( "${find_cmd[@]:0:${#find_cmd[@]} - 1}" )

## This is the unconventional, only-barely-safe approach
## ...used only to avoid looping:
printf -v find_cmd_str ' -path "*"%q -o ' "${Iggy[@]}"
find_cmd_str=${find_cmd_str%" -o "}   # strip the trailing -o
eval "find_cmd+=( $find_cmd_str )"

# and add the suffix
find_cmd+=( ')' -prune -type d )

# ...finally, to run the command:
"${find_cmd[@]}"

FIND="$(which find --skip-alias)"
BasePath="/home/adesso/baldar"
Iggy=( "/cgi-bin" 
    "/tmp" 
    "/test" 
    "/html" 
    "/icons" )
SubDomains=( $(${FIND} ${BasePath}/* -maxdepth 0 -not \( -path "*${Iggy[0]}" $(printf -- '-o -path "*%s" ' "${Iggy[@]:1}") \) -type d) )
echo ${SubDomains[1]}

Thanks to @Sorpigal I have a solution. I ended up nesting the command substitution so I can use the script in cron, and finally wrapped all of it in an array definition. A known problem is a directory with a space in its name. That, however, has been solved, so to keep things simple, I think this answers my question.
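
For the whitespace problem mentioned above, one option (a sketch, not the code used in the script above) is to have find emit NUL-terminated names and read them back one at a time. Both -print0 and `sort -z` are extensions found in GNU and BSD tools rather than POSIX. The temporary directory and the name `demo` are assumptions made for the demo.

```shell
#!/usr/bin/env bash
# Throwaway tree with a name containing a space (assumption for the demo).
demo=$(mktemp -d)
mkdir "$demo/my site" "$demo/blog" "$demo/tmp"

# Read NUL-delimited names so spaces (and even newlines) survive intact.
SubDomains=()
while IFS= read -r -d '' dir; do
    SubDomains+=( "$dir" )
done < <(find "$demo" -mindepth 1 -maxdepth 1 -type d -not -path '*/tmp' -print0 | sort -z)

# Print only the last path component of each result.
printf '%s\n' "${SubDomains[@]##*/}"

rm -rf "$demo"
```

Unlike `SubDomains=( $(find ...) )`, this keeps "my site" as a single array element instead of splitting it into two.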
