
Shell script: iterate through directories and split filenames

I need to extract two things from filenames: the extension and a number.

I have a folder "/var/www/html/MyFolder/" that contains several subfolders, each of which holds some files. The filenames have the following structure: "a_X_mytest.jpg" or "a_X_mytest.png". The "a_" prefix is fixed and the same in every folder; I need the "X" and the file extension.

My script looks like this:

#!/bin/bash
for dir in /var/www/html/MyFolder/*/
do
  dir=${dir%*/}
  find "/var/www/html/MyFolder/${dir##*/}/a_*.*" -maxdepth 1 -mindepth 1 -type f
done

That's only the beginning of my script.

It currently fails with these errors:

find: `/var/www/html/MyFolder/first/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/sec/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/test/a_*.*': No such file or directory

Does anybody know where the mistake is? The next step, once the lines above work, is to split the names of the found files and get the two parts.

To split, I would use this:

arrFIRST=(${IN//_/ })
echo ${arrFIRST[1]}
arrEXT=(${IN//./ })
echo ${arrEXT[1]}

Can anybody help me with my problem?

tl;dr:

Your script can be simplified to the following:

for file in /var/www/html/MyFolder/*/a_*.*; do
  [[ -f $file ]] || continue
  [[ "${file##*/}" =~ _(.*)_.*\.(.*)$ ]] && 
    x=${BASH_REMATCH[1]} ext=${BASH_REMATCH[2]}
  echo "$x"
  echo "$ext"
done
  • A single glob (filename pattern, wildcard pattern) is sufficient in your case, because a glob can have multiple wildcards across levels of the hierarchy: /var/www/html/MyFolder/*/a_*.* finds files matching a_*.* in any immediate subfolder ( */ ) of folder /var/www/html/MyFolder .
    You only need find to match files located on different levels of a subtree (but you may also need it for more complex matching needs).
  • [[ -f $file ]] || continue ensures that only files are considered. It also effectively skips the loop body if NO matches are found: by default bash then leaves the pattern unexpanded, so the single loop value is the literal pattern, which is not an existing file.
  • [[ ... =~ ... ]] uses bash's regex-matching operator, =~ , to extract the tokens of interest from the filename part of each matching file ( ${file##*/} ).
  • The results of the regex matching are stored in the reserved array variable BASH_REMATCH : element 0 holds the entire match, element 1 holds what the 1st parenthesized subexpression ( (...) , aka capture group) captured, and so on.

    • Alternatively, you could have used read with an array to parse matching filenames into their components (for a_X_mytest.jpg the tokens are a, X, mytest and jpg, so X is at index 1):

       IFS='_.' read -ra tokens <<<"${file##*/}"
       x="${tokens[1]}" ext="${tokens[@]: -1}"
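
As a minimal sketch of the two parsing approaches above, the following wraps each one in a function. The function names split_regex and split_read and the sample filename are hypothetical, chosen only for illustration. The regex is anchored a little more strictly than the one above, on the assumption that X contains no underscore and the extension contains no dot.

```bash
#!/usr/bin/env bash

# Hypothetical helper: parse "a_X_name.ext" with bash's =~ operator.
# Prints "X ext" on success; returns 1 if the name does not match.
split_regex() {
  local name="${1##*/}"                # strip any leading directories
  local re='^a_([^_]+)_.*\.([^.]+)$'   # assumption: X has no "_", ext has no "."
  [[ $name =~ $re ]] || return 1
  printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}

# Hypothetical helper: same result, by splitting on "_" and "." with read.
split_read() {
  local name="${1##*/}" tokens
  IFS='_.' read -ra tokens <<<"$name"
  (( ${#tokens[@]} >= 3 )) || return 1   # need at least: a, X, ext
  printf '%s %s\n' "${tokens[1]}" "${tokens[@]: -1}"
}

split_regex /var/www/html/MyFolder/first/a_7_mytest.jpg
split_read  /var/www/html/MyFolder/first/a_7_mytest.jpg
```

Both calls print "7 jpg". Keeping the regex in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises; returning a non-zero status on a non-match lets the caller skip unexpected names instead of reusing values from a previous iteration.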

As for why what you tried didn't work:

  • find does NOT expand globs in its path arguments, and because the argument is quoted the shell does not expand it either, so "/var/www/html/MyFolder/${dir##*/}/a_*.*" is interpreted literally as a path.
  • Also, you have to separate the root folder for your search from the filename pattern to look for on any level of the root folder's subtree:
    • the root folder becomes the path argument
    • the filename pattern is passed (always quoted) via the -name or -iname (for case-insensitive matching) options
    • Ergo: find "/var/www/html/MyFolder/${dir##*/}" -name 'a_*.*' ... , analogous to @konsolebox's answer.
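
To make the corrected per-directory find call concrete, here is a minimal sketch of the original loop with the search root and the filename pattern separated. The function name list_matches and the temporary demo tree are assumptions for illustration; in the question the root would be /var/www/html/MyFolder.

```bash
#!/usr/bin/env bash

# Hypothetical helper: list a_*.* files directly inside each subfolder of "$1".
list_matches() {
  local root="$1" dir
  for dir in "$root"/*/; do
    [[ -d $dir ]] || continue          # skip the literal pattern if nothing matched
    dir=${dir%/}                       # drop the trailing slash
    # The directory is the starting point; the quoted pattern goes to -name,
    # so find (not the shell) does the matching.
    find "$dir" -mindepth 1 -maxdepth 1 -type f -name 'a_*.*'
  done
}

# Demo on a throwaway tree (assumed layout, mirroring the question).
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/sec"
touch "$demo/first/a_1_mytest.jpg" "$demo/sec/a_2_mytest.png" "$demo/sec/other.txt"
list_matches "$demo"
rm -rf "$demo"
```

The demo lists the two a_*.* files and skips other.txt. The -mindepth and -maxdepth options are taken from the question's own find call; they are supported by GNU and BSD find but are not part of POSIX.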

I'm not sure how much complexity you need, but perhaps what you want is

find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*'

Thus:

while IFS= read -r FILE; do
    # Do something with "$FILE"... (a loop body must contain at least one command)
    printf '%s\n' "$FILE"
done < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')

Or

readarray -t FILES < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')
for FILE in "${FILES[@]}"; do
    # Do something with "$FILE"... (a loop body must contain at least one command)
    printf '%s\n' "$FILE"
done
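
Both loops above read newline-delimited output, which is fine unless a filename itself contains a newline. As a hedged sketch, the variant below has find emit NUL-terminated paths via -print0 (available in GNU and BSD find) and splits each name using parameter expansion only. The function name scan_tree, the tightened pattern 'a_*_*.*' and the demo directory are assumptions made for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print "X ext" for each a_X_*.ext file located exactly
# one directory level below "$1".
scan_tree() {
  local root="$1" file name rest
  while IFS= read -r -d '' file; do
    name=${file##*/}        # basename, e.g. a_7_mytest.jpg
    rest=${name#a_}         # drop the fixed "a_" prefix, e.g. 7_mytest.jpg
    # X is everything before the next "_"; ext is everything after the last "."
    printf '%s %s\n' "${rest%%_*}" "${name##*.}"
  done < <(find "$root" -mindepth 2 -maxdepth 2 -type f -name 'a_*_*.*' -print0)
}

# Demo on a throwaway tree (assumed layout, mirroring the question).
demo=$(mktemp -d)
mkdir "$demo/first"
touch "$demo/first/a_7_mytest.jpg"
scan_tree "$demo"
rm -rf "$demo"
```

For the demo file this prints "7 jpg". Parameter expansion avoids both a regex and an extra array, at the cost of assuming every matched name really has the a_X_name.ext shape.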


 