Bash Scripting - REGEX to dump a file list

I have 4 file extensions, produced by earlier processing steps, stored in the SEARCH array as follows:

declare -a SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")

I want to produce one file list for each of the 4 extension patterns above, as shown below. The lines marked "NO" (files with 2 dots and 2 extensions) are what I currently get but do not want in that list:

################################################################################
1 - SEARCH FOR toggled in /media
regex   : ([^\/]+)(\.)(toggled)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(toggled)$
################################################################################
/media/myfile_1.jtr.toggled --> NO
/media/myfile_1.toggled
/media/myfile_2.jtr.toggled --> NO
/media/myfile_2.toggled
/media/myfile_3.jtr.toggled --> NO
/media/myfile_3.toggled


################################################################################
2 - SEARCH FOR jtr in /media
regex   : ([^\/]+)(\.)(jtr)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(jtr)$
################################################################################
/media/myfile_1.jtr
/media/myfile_2.jtr
/media/myfile_3.jtr


################################################################################
3 - SEARCH FOR jtr.toggled in /media
regex   : ([^\/]+)(\.)(jtr.toggled)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(jtr.toggled)$
################################################################################
/media/myfile_1.jtr.toggled
/media/myfile_2.jtr.toggled
/media/myfile_3.jtr.toggled


################################################################################
4 - SEARCH FOR cupp in /media
regex   : ([^\/]+)(\.)(cupp)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(cupp)$
################################################################################
/media/myfile_1.cupp
/media/myfile_2.cupp
/media/myfile_3.cupp

I spent hours on regex101 without success. I also tried to reach the same result with other methods, but they do not fit with the rest of the code.

Here is a code extract :

for ext in "${SEARCH[@]}"
do

    COUNTi=$((COUNTi+1))

    REGEX="([^\/]+)(\.)("$ext")$" #
    # Ideally, the Regex should come from a pattern array

    printf '%*s' "$len" | tr ' ' "$mychar"
    echo -e "\n$COUNTi - SEARCH FOR $ext in $BASEDIR"
    echo "regex   : $REGEX"
    echo "command : find $BASEDIR -type f | grep --color -P $REGEX"
    printf '%*s' "$len" | tr ' ' "$mychar" && echo

    find $BASEDIR -type f | grep --color -P $REGEX 
    # Problem with this regex: files with double-dot extensions are not filtered out correctly.

    echo -e "\n"

done

So I have 2 questions about the same piece of code:

  1. REGEX: what would be a correct regex to list the files by extension family (please see the 4 SEARCH patterns and the related listings above)?

  2. ARRAYS: once the point above is solved, how can I store the pattern in an array, with a placeholder for $ext, and use it as the REGEX inside the loop?

      PATTERN+=( "([^\\/]+)(\\.)($ext)$" )
      # All of these below : CAVEATS escaping $ or not...
      # REGEX=${PATTERN[5]}
      # REGEX=$(eval "${PATTERN[5]}" )
      # echo "pattern : ${PATTERN[5]}"
      # eval "$REGEX=\\$REGEX"
      # eval "$REGEX=\\"\\$REGEX\\""
      # REGEX=$(echo "${REGEX}")
      # REGEX=${!PATTERN[5]}

Notes: I read regex documentation for hours and tried hundreds of patterns without success, as I can't understand the rationale behind these regexes.
I also tried other approaches, for example find / -name "sayONEnameinmysearchpattern" ! -iname "theothernamesfromtehsearchpattern". This is not what I'm looking for.

Thanks

Change the REGEX line in your code to:

REGEX='^(.*\/|)[^\/\.]+\.'"$ext\$"

The Perl-compatible regular expression that matches the basename of the file is in single quotes. This prevents the shell from trying to expand it. The $ext is in double quotes, so the shell expands it. The trailing $ is escaped with a backslash just for form.

The leading ^(.*\/|) matches either a leading directory (ending with /) or nothing at all. The [^\/\.]+ matches one or more characters that are NOT '.' or '/'. That must then be followed by a '.' and your extension, followed by the end of the file name ($), for the line to match.
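To see how these pieces behave, here is a minimal sketch that feeds a few of the sample paths from the question through the regex, without touching the file system. The helper name match_ext is made up for illustration, and the sketch assumes a grep built with -P (PCRE) support, as in the question.

```shell
#!/usr/bin/env bash
# Sample paths copied from the listings in the question (no real files needed).
paths=(
    /media/myfile_1.toggled
    /media/myfile_1.jtr
    /media/myfile_1.jtr.toggled
    /media/myfile_1.cupp
)

# Hypothetical helper: print the paths read from stdin whose basename is
# "<name without dots>.<ext>".
match_ext() {
    local ext=$1
    local regex='^(.*\/|)[^\/\.]+\.'"$ext\$"
    grep -P -- "$regex"
}

for ext in toggled jtr jtr.toggled cupp; do
    echo "== $ext"
    printf '%s\n' "${paths[@]}" | match_ext "$ext"
done
```

With this input, the "toggled" pass should no longer list myfile_1.jtr.toggled, because "myfile_1.jtr" contains a dot and so cannot be matched by [^\/\.]+.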

The key here is to anchor your match at both ends (^ and $) and not to allow any dots '.' except the ones you really want.
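One caveat on "the dots you really want": when $ext itself contains a dot, as in "jtr.toggled", that dot is still a regex metacharacter and matches any character. A possible refinement, sketched here as an assumption rather than as part of the original fix, is to escape the dots in the extension before building the regex. The helper name build_regex is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the anchored regex with the dots inside the
# extension turned into literal dots.
build_regex() {
    local ext=$1
    local escaped=${ext//./\\.}   # "jtr.toggled" becomes "jtr\.toggled"
    printf '%s' '^(.*\/|)[^\/\.]+\.'"$escaped"'$'
}

regex=$(build_regex "jtr.toggled")
printf '%s\n' "$regex"

# Only the first path has a real dot between "jtr" and "toggled".
printf '%s\n' /media/a.jtr.toggled /media/a.jtrXtoggled | grep -P -- "$regex"
```

For the extensions in the question this makes no visible difference; it only matters if a file name such as a.jtrXtoggled could exist.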

You might also want to quote $REGEX, i.e. write "$REGEX", in the grep command near the end of your code extract, so that the shell does not apply word splitting or pathname expansion to it.
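Putting it together, the loop could look roughly like the sketch below. It runs against a throwaway directory created with mktemp instead of /media, so the file names, the list_by_ext helper and the sort at the end are illustrative assumptions, not part of the original script. It again assumes a grep with -P support.

```shell
#!/usr/bin/env bash
# Throwaway tree standing in for /media (assumption: mktemp -d is available).
BASEDIR=$(mktemp -d)
trap 'rm -rf "$BASEDIR"' EXIT
touch "$BASEDIR"/myfile_{1,2}.{toggled,jtr,jtr.toggled,cupp}

declare -a SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")

# Hypothetical helper: list the files under a directory for one extension.
list_by_ext() {
    local basedir=$1 ext=$2
    local regex='^(.*\/|)[^\/\.]+\.'"$ext\$"
    # Both expansions are quoted, so the shell passes them to find and grep
    # unchanged even if they contain spaces or glob characters.
    find "$basedir" -type f | grep -P -- "$regex" | sort
}

for ext in "${SEARCH[@]}"; do
    echo "== SEARCH FOR $ext in $BASEDIR"
    list_by_ext "$BASEDIR" "$ext"
    echo
done
```

Each extension should list exactly two files here, including the "toggled" pass, which skips the two .jtr.toggled files.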
