I'm trying to iterate over the files in a directory and pick out pairs of files that share the same id (the part extracted with ${f: 5: 10}) but do not have identical filenames. The outer loop runs as expected, but the inner comparison never identifies any suitable results, and the script exits without error. Manually passing in two suitable filenames runs without error. I'm not really sure what's wrong here.
FILES="data/*"
for f in $FILES; do
for g in $FILES; do
if [[ ${f: 5: 10} == ${g: 5: 10} ]]; then
if [[ ${g: -2} != ${f: -2} ]]; then
echo "$f"
echo "$g"
fi
fi
done
done
For example, if data/ contained:
data/wordA_ln
data/wordB_ln
data/wordA_ap
data/wordB_ap
The script would output:
data/wordA_ln
data/wordA_ap
data/wordB_ln
data/wordB_ap
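The difficulty here is that you are looping twice over the same files, and ${var: 5: 10} takes up to 10 characters starting at offset 5 (the second number is a length, not an end position). For data/wordA_ln that yields wordA_ln, and for data/wordA_ap it yields wordA_ap, suffix included. The two values are therefore equal only when f and g are the same file, and in that case the second test on the last two characters fails, so nothing is ever printed.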
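To see this for yourself, it may help to print what the expansion actually returns. The following is only a small sketch using two of the sample names from the question; the variables id_f and id_g are illustrative names, not part of the original script.

```shell
#!/bin/bash
f="data/wordA_ln"
g="data/wordA_ap"

# Offset 5, length 10: runs to the end of the string, suffix included
echo "${f:5:10}"   # wordA_ln
echo "${g:5:10}"   # wordA_ap

# Offset 5, length 5: only the id, but this hard-codes the id length
echo "${f:5:5}"    # wordA

# Strip the directory, then everything from the last underscore on;
# this does not depend on how long the id is
id_f="${f##*/}"; id_f="${id_f%_*}"
id_g="${g##*/}"; id_g="${id_g%_*}"
echo "$id_f $id_g" # wordA wordA
```

The last form assumes every filename has the shape id_suffix, as in the example listing.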
Anyway, I took another approach. First, get a list of the portion of each filename that you want to compare:
find data -type f -exec basename {} \; | cut -d'_' -f1 | sort | uniq
For example:
find locates "data/wordA_ln".
basename {} returns "wordA_ln".
| cut -d'_' -f1 returns "wordA".
| sort | uniq removes duplicates.
From that list of possible filename prefixes, do another find in the data directory to list the matching files.
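The pipeline can be tried without touching the file system by feeding it the sample names directly. This is a minimal sketch, assuming the four names from the question; printf stands in for the output of find ... -exec basename, and the variable names are only illustrative.

```shell
#!/bin/bash
# Sample basenames from the question, in place of the find output
names=$(printf '%s\n' wordA_ln wordB_ln wordA_ap wordB_ap)

# Keep the text before the first underscore, then drop duplicates
ids=$(printf '%s\n' "$names" | cut -d'_' -f1 | sort | uniq)

printf '%s\n' "$ids"
# wordA
# wordB
```

sort -u would give the same result as sort | uniq here.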
Complete script:
#!/bin/bash
while IFS= read -r line
do
find data -type f -name "*${line}*" -print
echo ""
done < <(find data -type f -exec basename {} \; | cut -d'_' -f1 | sort | uniq)
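For comparison, the nested loop from the question can probably be made to work as well, by comparing only the part of the name before the last underscore. The sketch below is one possible variant, not the approach used above. It assumes names of the form id_suffix, builds a throwaway directory with the sample files so it can be run anywhere, and uses illustrative variable names (tmp, bf, bg, pairs).

```shell
#!/bin/bash
# Build a temporary directory holding the sample files from the question
tmp=$(mktemp -d)
mkdir "$tmp/data"
touch "$tmp/data/wordA_ln" "$tmp/data/wordB_ln" \
      "$tmp/data/wordA_ap" "$tmp/data/wordB_ap"

pairs=""
for f in "$tmp"/data/*; do
    for g in "$tmp"/data/*; do
        # Skip identical names and the mirrored pair (g,f after f,g)
        [[ "$f" < "$g" ]] || continue
        bf="${f##*/}"   # basename of f
        bg="${g##*/}"   # basename of g
        # Same id (text before the last underscore) means a match
        if [[ "${bf%_*}" == "${bg%_*}" ]]; then
            pairs+="$bf $bg"$'\n'
        fi
    done
done

# One line per matching pair
printf '%s' "$pairs"
rm -rf "$tmp"
```

Comparing ids for exact equality avoids a side effect of matching with -name "*${line}*", which would also pick up a longer id that merely contains the shorter one.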