Using bash to loop through nested folders to run script in current working directory

I've got (what feels like) a fairly simple problem but my complete lack of experience in bash has left me stumped. I've spent all day trying to synthesize a script from many different SO threads explaining how to do specific things with unintuitive commands, but I can't figure out how to make them work together for the life of me.

Here is my situation: I've got a directory full of nested folders, each containing a file with extension .7 and another file with extension .pc, plus a whole bunch of unrelated stuff. It looks like this:

Folder A
   Folder 1
      Folder x
        data_01.7
        helper_01.pc
        ...
      Folder y
        data_02.7
        helper_02.pc
        ...
   ...
   Folder 2
      Folder z
        data_03.7
        helper_03.pc
      ...
   ...
Folder B
...

I've got a script that I need to run in each of these folders that takes the name of the .7 file as an input.

pc_script -f data.7 -flag1 -other_flags

The current working directory needs to be the folder with the .7 file when running the script, and the helper .pc file also needs to be present in it. After the script is finished running, there are a ton of new files and directories. However, I need to take just one of those output files, result.h5, and copy it to a new directory maintaining the same folder structure but with a new name:

Result Folder/Folder A/Folder 1/Folder x/new_result1.h5

I then need to run the same script again with a different flag, flag2, and copy the new version of that output file to the same result directory with a different name, new_result2.h5. The folders all have pretty arbitrary names, though there aren't any spaces or special characters beyond underscores.

Here is an example of what I've tried:

#!/bin/bash

DIR=".../project/data"
for d in */ ; do
    for e in */ ; do
        for f in */ ; do
            for PFILE in *.7 ; do
                echo "$d/$e/$f/$PFILE"
                cd "$DIR/$d/$e/$f"
                echo "Performing operation 1"
                pc_script -f "$PFILE" -flag1
                mkdir -p ".../results/$d/$e/$f"
                mv "results.h5" ".../project/results/$d/$e/$f/new_results1.h5"
                echo "Performing operation 2"
                pc_script -f "$PFILE" -flag 2
                mv "results.h5" ".../project/results/$d/$e/$f/new_results2.h5"
            done
        done
    done
done

Obviously, this didn't work. I've also tried using find with -execdir but then I couldn't figure out how to insert the name of the file into the script flag. I'd appreciate any help or suggestions on how to carry this out.

If there's only one .7 file in each directory then you can try this:

#!/bin/bash
shopt -s globstar nullglob

saveroot=project/results
dataroot=project/data

for filepath in "${dataroot}"/**/*.7
do
    dirpath="${filepath%/*}"
    filename=${filepath#"$dirpath"/}

    pushd "$dirpath" > /dev/null || continue

    echo "$filepath"
    echo "Performing operation 1"
    #pc_script -f "$filename" -flag1
    touch results.h5
    mv results.h5 results_1.h5

    echo "Performing operation 2"
    #pc_script -f "$filename" -flag2
    touch results.h5
    mv results.h5 results_2.h5

    popd > /dev/null

    savepath="$saveroot/${dirpath#"$dataroot"}"
    mkdir -p "${savepath}"
    mv "${dirpath}"/results_*.h5 "$savepath"/
done

The script doesn't check for the existence of the .pc file, but if the naming of your files is like in the question then adding such a check is feasible.
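If the files really follow the data_NN.7 / helper_NN.pc naming shown in the question, the loop could derive the companion name from the .7 name and skip directories where it is missing. This is only a minimal sketch under that naming assumption; companion_pc is a hypothetical function name, and the scratch directory stands in for the real data:

```bash
#!/bin/bash
# Hypothetical helper: map a .7 filename to the .pc name expected next to it,
# ASSUMING the data_NN.7 / helper_NN.pc naming from the question.
companion_pc() {
    stem=${1%.7}                              # data_01.7 -> data_01
    printf 'helper_%s.pc\n' "${stem#data_}"   # data_01   -> helper_01.pc
}

# Scratch directory so the demo touches nothing real.
demo=$(mktemp -d)
touch "$demo/data_01.7" "$demo/helper_01.pc" "$demo/data_02.7"

for f in "$demo"/*.7; do
    name=${f##*/}                  # strip the directory part
    pc=$(companion_pc "$name")
    if [ -f "$demo/$pc" ]; then
        echo "$name: found $pc"
    else
        echo "$name: missing $pc, skipping" >&2
    fi
done
```

In the answer's loop, the same test could sit right after pushd, with a popd and continue when the companion file is absent.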

Another, perhaps more flexible, approach to the problem is to use the find command with the -exec option to run a short "helper-script" for each file ending in ".7" found below a directory path. The -name option allows find to locate all files ending in ".7" below a given directory using simple file-globbing (wildcards). The helper-script then performs the same operation on each file found by find and handles copying result.h5 to the proper directory.

The form of the command will be:

find /path/to/search -type f -name "*.7" -exec /path/to/helper-script '{}' \;

Where -type f tells find to return only files (not directories), and -name "*.7" restricts the matches to names ending in ".7". Your helper-script needs to be executable (e.g. chmod +x helper-script) and, unless it is in your PATH, you must provide the full path to the script in the find command. The '{}' will be replaced by the path of each file found (starting with the search path you gave find) and passed as an argument to your helper-script. The \; simply terminates the command executed by -exec.
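To see what the helper-script would actually receive as its argument, echo can stand in for it. The sketch below builds a throwaway tree (the folder and file names are made up to mirror the question) and runs find once from inside the tree and once with an absolute start point:

```bash
#!/bin/bash
# Scratch tree standing in for the real data; all names are hypothetical.
root=$(mktemp -d)
mkdir -p "$root/Folder_A/Folder_1/Folder_x"
touch "$root/Folder_A/Folder_1/Folder_x/data_01.7"
touch "$root/Folder_A/Folder_1/Folder_x/helper_01.pc"

# Relative start point: '{}' expands to a path beginning with ./
relative=$(cd "$root" && find . -type f -name '*.7' -exec echo '{}' \;)

# Absolute start point: '{}' expands to an absolute path
absolute=$(find "$root" -type f -name '*.7' -exec echo '{}' \;)

echo "relative: $relative"
echo "absolute: $absolute"
```

The difference matters for a helper that rebuilds the folder structure from its argument: with a relative start point the directory part can be appended to a results root as-is, while an absolute start point would drag the whole absolute path along.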

(note there is another form of -exec called -execdir, which runs the command from the directory containing each file found, and another terminator '+' that passes many files to a single invocation of the command. -execdir is a bit safer, but has additional PATH requirements for the command being run. Since you have only one ".7" file per directory, there isn't much benefit here)
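For completeness, here is one way the -execdir form could pass the filename into a command, which is the part the question got stuck on. It is a sketch rather than a tested replacement for the helper-script: the inline sh script and the scratch tree are illustrative, and the real pc_script call is left as a comment.

```bash
#!/bin/bash
# Scratch tree; names are hypothetical and mirror the question's layout.
root=$(mktemp -d)
mkdir -p "$root/Folder_A/Folder_1/Folder_x"
touch "$root/Folder_A/Folder_1/Folder_x/data_01.7"

# -execdir runs the command from the directory holding each match, so the
# inline script sees only the filename (GNU find prefixes it with ./).
# The word "sh" after the script fills $0; the filename arrives as $1.
out=$(find "$root" -type f -name '*.7' -execdir sh -c '
    name=${1#./}                      # drop the leading ./ if present
    echo "cwd=${PWD##*/} file=$name"
    # pc_script -f "$name" -flag1     # the real call would go here
' sh '{}' \;)

echo "$out"
```

Because the working directory is already the folder with the .7 file, no cd is needed; the trade-off is that the path relative to the search root is no longer available inside the command, so the destination directory has to be worked out some other way.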

The helper-script just does what you need to do in each directory. Based on your description it could be something like the following:

#!/bin/bash

dir="${1%/*}"     ## trim file.7 from end of path
cd "$dir" || {    ## change to directory or handle error
  printf "unable to change to directory %s\n" "$dir" >&2
  exit 1
}

destdir="/Result_Folder/$dir"   ## set destination dir for result.h5
mkdir -p "$destdir" || {        ## create with all parent dirs or exit
  printf "unable to create directory %s\n" "$destdir" >&2
  exit 1
}

ls *.pc >/dev/null 2>&1 || exit 1   ## check .pc file exists or exit (listing discarded)

file7="${1##*/}"  ## trim path from file.7 name

pc_script -f "$file7" -flag1 -other_flags    ## first run

## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result1.h5"

pc_script -f "$file7" -flag2 -other_flags    ## second run

## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result2.h5"

This essentially stores the path part of the file.7 argument in dir and changes to that directory. If unable to change to the directory (due to permissions, etc.) the error is handled and the script exits. Next the full directory structure is created below your Result_Folder with mkdir -p, with the same error handling if the directory cannot be created.
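The two parameter expansions used in the helper-script can be tried on their own. The path below is a made-up example of the kind of argument find would pass:

```bash
#!/bin/bash
path="./Folder_A/Folder_1/Folder_x/data_01.7"   # example argument from find

dir="${path%/*}"      # remove the shortest suffix matching /*  -> directory part
file7="${path##*/}"   # remove the longest prefix matching */   -> filename part

echo "dir=$dir"
echo "file7=$file7"

# Caveat: with no slash in the value, % has nothing to strip and the
# "directory" comes back as the filename itself.
bare="data_01.7"
bare_dir="${bare%/*}"
echo "bare_dir=$bare_dir"
```

Paths produced by find always contain at least one slash, so the caveat only bites if the helper-script is called by hand with a bare filename.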

ls is used as a simple check to verify that a file ending in ".pc" exists in that directory. There are other ways to do this by piping the results to wc -l, but that spawns additional subshells that are best avoided.
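The same check can also be done without running ls at all, by letting the shell expand the glob and testing the first result. This is an alternative sketch, not what the helper-script above does; has_pc_file is a hypothetical name:

```bash
#!/bin/bash
# Succeeds if at least one *.pc file exists in the given directory
# (default: the current directory).
has_pc_file() {
    set -- "${1:-.}"/*.pc   # positional parameters become the glob matches
    [ -e "$1" ]             # an unmatched glob stays literal (or empty), so -e fails
}

# Scratch directories for the demo.
demo=$(mktemp -d)
mkdir "$demo/with" "$demo/without"
touch "$demo/with/helper_01.pc"

if has_pc_file "$demo/with"; then
    echo "with: .pc file present"
fi
if ! has_pc_file "$demo/without"; then
    echo "without: no .pc file"
fi
```

Since set -- is used inside a function, it only changes that function's own positional parameters and leaves the caller's arguments alone.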

(also note that Linux and Mac have files ending in ".pc" for use by pkg-config when building programs from source -- they should not conflict with your files -- but be aware they exist in case you start chasing why weird ".pc" files are found)

After all tests are performed, the path is trimmed from the current ".7" filename, storing just the filename in file7. The file7 variable is then used in your pc_script command (which should also include the full path to the script if not in your PATH). After pc_script is run, [ -s "result.h5" ] is used to verify that result.h5 exists and is non-empty before copying that file to your Result_Folder location.
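The -s test and copy could be wrapped in a small function so that a missing or empty result is reported rather than silently skipped. A sketch only; copy_if_nonempty is a hypothetical name and the files are created just for the demo:

```bash
#!/bin/bash
# Copy src to dest only when src exists and has a size greater than zero.
copy_if_nonempty() {
    src=$1
    dest=$2
    if [ -s "$src" ]; then
        cp -a "$src" "$dest"
    else
        printf 'skipping %s: missing or empty\n' "$src" >&2
        return 1
    fi
}

# Demo files in a scratch directory.
demo=$(mktemp -d)
: > "$demo/empty.h5"                    # zero-byte file
printf 'payload' > "$demo/result.h5"    # non-empty file

copy_if_nonempty "$demo/result.h5" "$demo/new_result1.h5" && echo "copied result.h5"
copy_if_nonempty "$demo/empty.h5" "$demo/new_empty.h5" || echo "empty.h5 was not copied"
```

One thing the -s test cannot tell is whether result.h5 is fresh: if the second pc_script run fails without overwriting it, the file left by the first run would still pass. Removing result.h5 after each copy is one way to guard against that.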

That should get you started. Using find to locate all .7 files is a simple way to let the tool designed to find the files for you do its job -- rather than trying to hand-roll your own solution. That way you only have to concentrate on what should be done for each file found. (note: I don't have pc_script or the files, so I have not tested this end-to-end, but it should be very close if not right-on-the-money)

There is nothing wrong with writing your own routine, but using find eliminates many of the places where bugs can hide in your own solution.

Let me know if you have further questions.
