Bash shell script to find the closest parent directory of several files

Suppose the input arguments are the full paths of several files, say:

/abc/def/file1
/abc/def/ghi/file2
/abc/def/ghi/file3
  1. How can I obtain the directory name /abc/def in a bash shell script?
  2. How can I obtain only file1, /ghi/file2, and /ghi/file3?

Given the answer to part 1 (the common prefix), part 2 is straightforward: you slice the prefix off each name, which could be done with sed, among other options.
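As a minimal sketch of that slicing step with sed: the helper name strip_prefix is hypothetical, and the sketch assumes the prefix is already known, contains no regex metacharacters or `|`, and that no name contains a newline.

```shell
# strip_prefix PREFIX NAME... (hypothetical helper, not from the answer)
# Prints each NAME with a leading "PREFIX/" removed; names that do not
# start with the prefix are printed unchanged.
strip_prefix() {
    prefix=$1
    shift
    # "|" is used as the s-command delimiter so slashes in the prefix
    # need no escaping.
    printf '%s\n' "$@" | sed "s|^$prefix/||"
}

strip_prefix /abc/def /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

With the question's input this prints file1, ghi/file2 and ghi/file3, one per line.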

The interesting part, then, is finding the common prefix. The minimum common prefix is / (for /etc/passwd and /bin/sh, for example). The maximum common prefix is (by definition) present in all the strings, so we simply need to split one of the strings into segments and compare possible prefixes against the other strings. In outline:

split name A into components
known_prefix="/"
for each extra component from A
do
    possible_prefix="$known_prefix$extra/"
    for each name
    do
        if $possible_prefix is not a prefix of $name
        then ...all done...break outer loop...
        fi
    done
    ...got here...possible prefix is a prefix!
    known_prefix=$possible_prefix
done
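
The outline can be turned into a short function more or less directly. The following is only a sketch under stated assumptions: the name common_dir_prefix is hypothetical, all arguments are absolute paths without repeated slashes, and the last component of the first name is treated as a file name rather than a directory.

```shell
# common_dir_prefix NAME... (hypothetical name; a sketch of the outline)
common_dir_prefix() {
    local known rest component candidate name
    known="/"
    rest=${1%/*}        # directory part of the first name
    rest=${rest#/}      # drop the leading slash
    while [ -n "$rest" ]; do
        component=${rest%%/*}           # next component of the first name
        case $rest in
            */*) rest=${rest#*/} ;;     # more components remain
            *)   rest= ;;               # that was the last one
        esac
        candidate="$known$component/"
        for name in "$@"; do
            case $name in
                "$candidate"*) ;;       # still a prefix of this name
                *) break 2 ;;           # not a prefix: leave both loops
            esac
        done
        known=$candidate
    done
    # Drop the trailing slash, except for the root directory itself.
    [ "$known" = "/" ] || known=${known%/}
    printf '%s\n' "$known"
}

common_dir_prefix /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

Because the candidate always ends in a slash, a name such as /abc/defg/x is not mistaken for a match against /abc/def.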

There are some administrivial details to deal with, such as spaces in names. Also, what is the permitted weaponry? The question is tagged bash, but which external commands are allowed (Perl, for example)?

One undefined issue: suppose the list of names was:

/abc/def/ghi
/abc/def/ghi/jkl
/abc/def/ghi/mno

Is the longest common prefix /abc/def or /abc/def/ghi? I'm going to assume that the longest common prefix here is /abc/def. (If you really wanted it to be /abc/def/ghi, then use /abc/def/ghi/. for the first of the names.)
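
One way to see why the trailing /. helps, assuming the candidate prefixes are derived from the first name with dirname (as the function further down does): dirname drops only the last component, so the extra . is what gets dropped and ghi survives as a candidate.

```shell
# Without the trailing "/.", ghi itself is treated as the last component.
dirname /abc/def/ghi      # prints /abc/def

# With it, only the "." is removed and ghi stays in the result.
dirname /abc/def/ghi/.    # prints /abc/def/ghi
```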

Also, there are invocation details:

  • How is this function or command invoked?
  • How are the values returned?
  • Is this one or two functions or commands (longest_common_prefix and path_without_prefix)?

Two commands are easier:

  • prefix=$(longest_common_prefix name1 [name2 ...])
  • suffix=$(path_without_prefix /pre/fix /pre/fix/to/file [...])

The path_without_prefix command removes the prefix if it is present, leaving the argument unchanged if the name does not start with the prefix.

longest_common_prefix

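longest_common_prefix()
{
    declare -a names
    declare -a parts
    declare i=0
    declare name x prefix

    names=("$@")
    name="$1"
    # Collect the ancestor directories of the first name, longest first.
    # Stopping at "." as well avoids looping forever on a relative name.
    while x=$(dirname "$name"); [ "$x" != "/" ] && [ "$x" != "." ]
    do
        parts[$i]="$x"
        i=$(($i + 1))
        name="$x"
    done

    # Try each candidate, then "/"; print the first that starts every name.
    for prefix in "${parts[@]}" /
    do
        for name in "${names[@]}"
        do
            # The inner quotes make glob characters in the prefix literal;
            # ${prefix%/}/ lets the root "/" match absolute names too.
            if [ "${name#"${prefix%/}/"}" = "${name}" ]
            then continue 2
            fi
        done
        echo "$prefix"
        break
    done
}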

Test:

set -- "/abc/def/file 0" /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 "/abc/def/ghi/file 4"
echo "Test: $@"
longest_common_prefix "$@"
echo "Test: $@" abc/def
longest_common_prefix "$@" abc/def
set --  /abc/def/ghi/jkl /abc/def/ghi /abc/def/ghi/mno
echo "Test: $@"
longest_common_prefix "$@"
set -- /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
echo "Test: $@"
longest_common_prefix "$@"
set -- "/a c/d f/file1" "/a c/d f/ghi/file2" "/a c/d f/ghi/file3"
echo "Test: $@"
longest_common_prefix "$@"

Output:

Test: /abc/def/file 0 /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 /abc/def/ghi/file 4
/abc/def
Test: /abc/def/file 0 /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 /abc/def/ghi/file 4 abc/def
Test: /abc/def/ghi/jkl /abc/def/ghi /abc/def/ghi/mno
/abc/def
Test: /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
/abc/def
Test: /a c/d f/file1 /a c/d f/ghi/file2 /a c/d f/ghi/file3
/a c/d f

path_without_prefix

path_without_prefix()
{
    local prefix="$1/"
    shift
    local arg
    for arg in "$@"
    do
        echo "${arg#"$prefix"}"
    done
}

Test:

for name in /pre/fix/abc /pre/fix/def/ghi /usr/bin/sh
do
    path_without_prefix /pre/fix $name
done

Output:

abc
def/ghi
/usr/bin/sh

A more "portable" solution, in the sense that it doesn't use bash-specific features. First define a function to compute the longest common prefix of two paths:

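common_path()
{
  lhs=$1
  rhs=$2
  path=
  OLD_IFS=$IFS; IFS=/
  for w in $rhs; do
    # Skip the empty field produced by the leading (or a doubled) slash.
    test -z "$w" && continue
    try="$path/$w"
    # Match whole components only, so /abc/def is not taken as a
    # prefix of /abc/defg.
    case $lhs in
      "$try"|"$try"/*) ;;
      *) break ;;
    esac
    path=$try
  done
  IFS=$OLD_IFS
  echo "${path:-/}"
}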

Then apply it across a longer list of paths:

common_path_all()
{
  local sofar=$1
  shift
  for arg
  do
    sofar=$(common_path "$sofar" "$arg")
  done
  echo "${sofar:-/}"
}

With your input, it gives

$ common_path_all /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
/abc/def

As Jonathan Leffler pointed out, once you have that, the second question is trivial.

Here's one that's been shown to work with arbitrarily complex file names (containing newlines, backspaces and the like):

path_common() {
    if [ $# -ne 2 ]
    then
        return 2
    fi

    # Remove repeated slashes
    for param
    do
        param="$(printf %s. "$1" | tr -s "/")"
        set -- "$@" "${param%.}"
        shift
    done

    common_path="$1"
    shift

    for param
    do
        while case "${param%/}/" in "${common_path%/}/"*) false;; esac; do
            new_common_path="${common_path%/*}"
            if [ "$new_common_path" = "$common_path" ]
            then
                return 1 # Dead end
            fi
            common_path="$new_common_path"
        done
    done
    printf %s "$common_path"
}

It seems to me that the solution below is much simpler.

As mentioned previously, only part 1 is tricky. Part 2 is straightforward with sed.

Part 1 can be cut into 2 subparts:

  1. Finding the longest common prefix of all strings
  2. Making sure this prefix is a directory and, if it is not, trimming it to get the corresponding directory

It can be done with the following code. For the sake of clarity, this example uses only 2 strings, but a while loop extends it to n strings.

LONGEST_PREFIX=$(printf "%s\n%s\n" "$file_1" "$file_2" | sed -e 'N;s/^\(.*\).*\n\1.*$/\1/')
CLOSEST_PARENT=$(echo "$LONGEST_PREFIX" | sed 's/\(.*\)\/.*/\1/')

which can of course be rewritten in just one line:

CLOSEST_PARENT=$(printf "%s\n%s\n" "$file_1" "$file_2" | sed -e 'N;s/^\(.*\).*\n\1.*$/\1/'  | sed 's/\(.*\)\/.*/\1/')
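
The extension to n strings mentioned above could look roughly like the following. This is a sketch, not the answerer's code: the function name closest_parent is hypothetical, it uses a for loop over the arguments rather than a while loop, and it assumes no name contains a newline. It folds the same two-string sed expression over the arguments and trims to the last slash only once, at the end.

```shell
# closest_parent NAME... (hypothetical name)
closest_parent() {
    longest=$1
    shift
    for next in "$@"; do
        # Longest common character prefix of the running result and the
        # next name, using the same sed expression as the two-string case.
        longest=$(printf '%s\n%s\n' "$longest" "$next" |
            sed -e 'N;s/^\(.*\).*\n\1.*$/\1/')
    done
    # Cut everything from the last slash on, leaving a directory name.
    printf '%s\n' "$longest" | sed 's/\(.*\)\/.*/\1/'
}

closest_parent /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

Since the names only ever appear as sed input and never inside the regular expression, characters such as spaces or dots in the paths need no escaping. When the only common directory is the root, the final trim leaves an empty string, so a caller may want to substitute /.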

To get the parent directory:

  dirname /abc/def/file1

will give /abc/def

And to get the file name:

   basename /abc/def/file1

will give file1

And, per your question, to get only the name of the closest parent directory, use

basename "$(dirname /abc/def/file1)"

which will give def
