I'm writing a bash script to automate a few tasks. One of the things I have to do is search for a pattern among filenames in a directory, then loop through the results.
When I run this script:
data=$(ls $A_PATH_VAR/*.ext | grep -o '201601[0-9]\{2\}\|201602[0-9]\{2\}')
echo $data
I get the expected result: a list of all the matches that were found among the filenames in $A_PATH_VAR/ with the extension .ext. However, when I store said pattern in a variable and then use it, like this:
startmo=201601
endmo=201602
mo=$((startmo+1))
grepstr="'$startmo[0-9]\{2\}"
while [ $mo -le $endmo ]
do
grepstr="$grepstr\|$mo[0-9]\{2\}"
mo=$((mo+1))
done
grepstr="$grepstr'"
echo $grepstr # correct
data=$(ls $A_PATH_VAR/*.ext | grep -o $grepstr)
echo $data
The pattern in $grepstr is correctly echoed (that is, it contains the value '201601[0-9]\{2\}\|201602[0-9]\{2\}'), but $data is empty. Why is this?
My solution:
mo=$((startmo+1))
grepstr="($startmo[0-9][0-9]"
while [ $mo -le $endmo ]
do
grepstr="$grepstr|$mo[0-9][0-9]"
mo=$((mo+1))
done
grepstr="$grepstr)"
files=$(ls $A_PATH_VAR/*.ext)
setopt shwordsplit
for file in $files
do
if [[ $file =~ $grepstr ]]
then
date=$BASH_REMATCH
fi
...
done
In what follows, I'm ignoring that your input source is ls, beyond this opening note that ls should not be used in this manner, and that find (which, in GNU-extended forms, offers a -regex operator) should be considered instead.
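As a minimal sketch of avoiding ls altogether, the loop below iterates over a glob and extracts the date with bash's own regex matching. It assumes bash; the helper name list_matching_dates, the temporary directory, and the sample filenames are made up for illustration, and the regex is the question's pattern written in ERE form.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the date embedded in each *.ext filename in a
# directory, without parsing the output of ls.
list_matching_dates() {
  local dir=$1 regex=$2 file
  for file in "$dir"/*.ext; do
    [[ -e $file ]] || continue          # the glob matched nothing
    if [[ ${file##*/} =~ $regex ]]; then
      printf '%s\n' "${BASH_REMATCH[0]}"
    fi
  done
}

# Example with throwaway files (names are assumptions, not from the question).
demo_dir=$(mktemp -d)
touch "$demo_dir/report_20160105.ext" "$demo_dir/report_20160217.ext" \
      "$demo_dir/report_20160301.ext"
glob_dates=$(list_matching_dates "$demo_dir" '(201601|201602)[0-9]{2}')
printf '%s\n' "$glob_dates"
rm -rf "$demo_dir"
```

Because the glob hands each filename to the loop as a single word, names containing spaces or glob characters are not split or re-expanded the way ls output would be.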
In:
pattern="'pattern'"
grep $pattern
...the double quotes (") are syntactic: they're consumed by the shell during its parsing phase. The single quotes inside them are literal: the outer, syntactic quotes specify that everything inside them is to be considered part of the string (except where the rules for parsing double-quoted content differ), so the single quotes end up as part of the variable's value.
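The difference can be seen with a small sketch like the following. The helper show_args and the sample filename are assumptions for illustration; the point is only that quote characters stored inside a value are passed to grep as part of the pattern.

```bash
#!/usr/bin/env bash
# Hypothetical helper: show each argument a command would receive, one per line.
show_args() { printf '<%s>\n' "$@"; }

quoted="'2016'"     # value is six characters, including the two single quotes
plain="2016"        # value is four characters

show_args "$quoted"   # the single quotes are part of the argument
show_args "$plain"

# grep looks for the quote characters too, so only the plain pattern matches.
# "|| true" keeps a zero-match grep from counting as a failure.
with_literal=$(printf '%s\n' 'file_20160105.ext' | grep -c "$quoted" || true)
without_literal=$(printf '%s\n' 'file_20160105.ext' | grep -c "$plain" || true)
echo "matches with literal quotes: $with_literal, without: $without_literal"
```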
Thus, when you run grep $pattern, the following happens:
First, the contents of $pattern are broken into words on any characters within IFS. By default, IFS contains only whitespace; however, if you had IFS=a, then this would be broken into a word 'p and a word ttern'.
Second, each resulting word is expanded as a glob. If pattern had contained "hello * world" (literal quotes included), and you had a default value of IFS splitting on whitespace, it would have been broken into the words "hello, *, and world", and the * would then be replaced with a list of files in the current directory.
Obviously, you don't want this. Thus, use only syntactic quotes if your goal is to prevent string-splitting and glob expansion, and quote the expansion itself:
pattern="pattern"
grep "$pattern"
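Applied to the loop from the question, this might look like the sketch below. It is an illustration under assumptions, not the asker's final code: the helper count_args, the temporary directory, and the sample filenames are invented, and grep -E with a plain | is swapped in for the \| and \{2\} basic-regex forms so the pattern needs no backslashes. The first part only demonstrates the splitting and globbing described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it was called with.
count_args() { echo "$#"; }

# Part 1: unquoted vs. quoted expansion, run inside a directory holding
# exactly two files so the glob result is predictable.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"
spaced='hello * world'
unquoted_count=$(cd "$demo_dir" && count_args $spaced)    # split, then globbed
quoted_count=$(cd "$demo_dir" && count_args "$spaced")    # passed as one word
rm -rf "$demo_dir"
echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count"

# Part 2: build the pattern with no literal quote characters in the value,
# then quote the expansion when handing it to grep.
startmo=201601
endmo=201602
grepstr="${startmo}[0-9]{2}"
mo=$((startmo + 1))
while [ "$mo" -le "$endmo" ]; do
  grepstr="$grepstr|${mo}[0-9]{2}"
  mo=$((mo + 1))
done

data=$(printf '%s\n' a_20160105.ext b_20160217.ext c_20160301.ext |
  grep -E -o "$grepstr")
printf '%s\n' "$data"
```

Note that the plain mo+1 increment is kept from the question and does not roll over from December to January, so this sketch only suits ranges within one year.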
BTW, if I had this task, I might write it as follows [to avoid needing to hand-build a regex for each possible date range]:
startmo=201601
endmo=201705
currmo=$startmo
# this requires GNU date
# on macOS, you can install this via MacPorts and invoke it as gdate
next_month() {
date -d "+1 month ${1:0:4}-${1:4:2}-15" +%Y%m
}
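# [[ ]] has no <= operator, so compare numerically with (( ))
while (( currmo <= endmo )); do
files=( *"$currmo"* )
if [[ -e ${files[0]} ]]; then
printf '%s\n' "${files[@]}"
else
echo "No files found for month $currmo" >&2
fi
# advance only after the current month has been handled, so startmo is not skipped
currmo=$(next_month "$currmo")
done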
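If GNU date is not available, the month step could also be done with shell arithmetic alone. The function below is a hypothetical stand-in for next_month, not part of the answer above, and assumes its argument is always a six-digit YYYYMM string.

```bash
#!/usr/bin/env bash
# Hypothetical alternative to the date-based helper: advance a YYYYMM value
# by one month using shell arithmetic only.
next_month_arith() {
  # 10# forces base 10 so that months such as "08" and "09" are not
  # rejected as invalid octal numbers.
  local year=$((10#${1:0:4})) month=$((10#${1:4:2}))
  month=$((month + 1))
  if (( month > 12 )); then
    month=1
    year=$((year + 1))
  fi
  printf '%04d%02d\n' "$year" "$month"
}

next_month_arith 201612
```

Swapping this in for next_month in the loop above would remove the dependency on GNU date, at the cost of not validating the input.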