
Bash: Comparing strings in a directory

Hi, I am trying to compare two strings in a directory. The format is as follows:

{sametext}difference{sametext}

Note: {sametext} is not the same for every file.

For example:

myfile_1_exercise.txt compared to myfile_2_exercise.txt

Can you tell me how I would match the above strings in an if statement?

Basically, I need to know how to ignore the number in the two strings so that they compare as equal.

My example code is shown below:

for g in `ls -d */`;
do                      
  if [ -d $g ]; then 
    cd $g                 # down 1 directories
    for h in `ls *root`;
    do
      printf "${Process[${count}]} = ${basedir}/${f}${g}${h}\n"
      h1=${h}

      if [ "${h1}" = "${h2}" ]; then # NEED to MATCH SOME HOW??????????
        echo we have a match
      fi

      h2=${h1}
      let count+=1
    done

    cd ../
    #printf "\n\n\n\n"      
  fi
done

What should be the test to determine this instead of "${h1}" = "${h2}" ?

Cheers,

Mike

sed comes in handy here.

This basically goes through every file in the directory, extracts the two strings from the filename, and keeps a list of all the unique combinations thereof.

Then, it walks through this list, and uses bash's wildcard expansion to allow you to loop over each collection.

EDIT: Got rid of an ugly hack.

i=0
for f in *_*_*.txt
do
    a=$(echo "$f" | sed 's/\(.*\)_.*_\(.*\)\.txt/\1/')
    b=$(echo "$f" | sed 's/\(.*\)_.*_\(.*\)\.txt/\2/')

    tmp=${all[@]}
    expr match "$tmp" ".*$a:$b.*" >/dev/null
    if [ "$?" == "1" ]
    then
      all[i]="$a:$b"
      let i+=1
    fi
done

for f in "${all[@]}"
do
    a=$(echo "$f" | sed 's/\(.*\):\(.*\)/\1/')
    b=$(echo "$f" | sed 's/\(.*\):\(.*\)/\2/')
    echo "$a - $b"
    # Braces are required: a bare $a_ would be read as the variable "a_".
    for f2 in "${a}"_*_"${b}".txt
    do
        echo "  $f2"
        # ...
    done
done

Of course, this assumes that all the files you care about follow the *_*_*.txt pattern.
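
For what it is worth, the same grouping can probably be done without sed, using only bash parameter expansion. This is just a sketch: the helper names group_key and unique_keys are made up for illustration and are not part of the answer above. It assumes names of the form prefix_anything_suffix.txt and keeps the greedy behaviour of the sed version, so the prefix is everything before the second-to-last underscore.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (illustration only, not from the original answer).

# Print "PREFIX:SUFFIX" for a name like myfile_1_exercise.txt.
group_key() {
    local name=${1%.txt}     # drop the extension
    local rest=${name%_*}    # drop the last "_suffix" part
    local prefix=${rest%_*}  # drop the middle part, leaving the prefix
    local suffix=${name##*_} # text after the last underscore
    printf '%s:%s\n' "$prefix" "$suffix"
}

# Print each distinct key once, in first-seen order, for the names given.
unique_keys() {
    local seen=$'\n' f key
    for f in "$@"; do
        key=$(group_key "$f")
        case $seen in
            *$'\n'"$key"$'\n'*) ;;   # key already recorded, skip it
            *) seen+="$key"$'\n'
               printf '%s\n' "$key" ;;
        esac
    done
}
```

Running unique_keys *_*_*.txt in the directory should print one prefix:suffix line per group; each line can then be split on the colon and used in the inner loop.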

"myfile_1_exercise.txt" == "myfile_2_exercise.txt"

You mean the above test should return true (ignoring the numbers), right?
This is what I would have done:

h1="myfile_1_exercise.txt"
h2="myfile_2_exercise.txt"
if [ "$(echo "${h1}" | sed 's/[0-9]//g')" = "$(echo "${h2}" | sed 's/[0-9]//g')" ]; then
    # do something here.
fi
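
The same idea can be sketched without spawning sed, using bash's ${var//pattern/} expansion to delete every digit. The function name same_ignoring_digits is hypothetical and only used here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only).
# Succeeds (exit status 0) when both names are equal once all digits are removed.
same_ignoring_digits() {
    local a=${1//[0-9]/}   # first name with every digit deleted
    local b=${2//[0-9]/}   # second name with every digit deleted
    [[ $a == "$b" ]]       # quoted right side: literal comparison, no globbing
}
```

In the loop from the question this would read: if same_ignoring_digits "$h1" "$h2"; then echo "we have a match"; fi. Keep in mind that it strips digits everywhere, so names that differ only by a digit in the {sametext} part would also count as a match.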

Disclaimer:

  1. Mileage can vary and you may have to adjust and debug the script for corner cases.
  2. You may be better off using Perl for your task.
  3. There could be better solutions even in Bash. This one is not very efficient but it seems to work.

That said, here is a script that compares two strings according to your requirements. I am sure you can figure out how to use it in your directory listing script (for which you may want to consider find, by the way).

This script takes two strings and prints "match!" if they match:

$ bash compare.sh myfile_1_exercise.txt myfile_2_exercise.txt
match!
$ bash compare.sh myfile_1_exercise.txt otherfile_2_exercise.txt
$

The script:

#!/bin/bash
fname1=$1
fname2=$2

findStartMatch() {
  match=""
  rest1=$1 ;
  rest2=$2 ;
  char1=""
  char2=""
  while [[  "$rest1" != "" && "$rest2" != "" && "$char1" == "$char2" ]] ; do
    char1=$(echo "$rest1" | sed 's/\(.\).*/\1/')
    rest1=$(echo "$rest1" | sed 's/.\(.*\)/\1/')
    char2=$(echo "$rest2" | sed 's/\(.\).*/\1/')
    rest2=$(echo "$rest2" | sed 's/.\(.*\)/\1/')
    if [[ "$char1" == "$char2" ]] ; then
      match="${match}${char1}"
    fi
  done
}

findEndMatch() {
  match=""
  rest1=$1 ;
  rest2=$2 ;
  char1=""
  char2=""
  while [[  "$rest1" != "" && "$rest2" != "" && "$char1" == "$char2" ]] ; do
    char1=$(echo "$rest1" | sed 's/.*\(.\)/\1/')
    rest1=$(echo "$rest1" | sed 's/\(.*\)./\1/')
    char2=$(echo "$rest2" | sed 's/.*\(.\)/\1/')
    rest2=$(echo "$rest2" | sed 's/\(.*\)./\1/')
    if [[ "$char1" == "$char2" ]] ; then
      match="${char1}${match}"
    fi
  done
}

findStartMatch "$fname1" "$fname2"
startMatch=$match
findEndMatch "$fname1" "$fname2"
endMatch=$match

if [[ "$startMatch" != "" && "$endMatch" != "" ]] ; then
  echo "match!"
fi
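
Since the script above calls sed several times per character, here is a rough sketch of the same prefix/suffix idea using only bash substring expansion. The names common_prefix, common_suffix and names_match are hypothetical, and the final test mirrors the script above: both the shared start and the shared end must be non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (illustration only).

# Print the longest leading run of characters shared by both arguments.
common_prefix() {
    local a=$1 b=$2 i=0
    while (( i < ${#a} && i < ${#b} )) && [[ ${a:i:1} == "${b:i:1}" ]]; do
        i=$((i + 1))
    done
    printf '%s\n' "${a:0:i}"
}

# Print the longest trailing run of characters shared by both arguments.
common_suffix() {
    local a=$1 b=$2 n=0
    while (( n < ${#a} && n < ${#b} )) &&
          [[ ${a:${#a}-n-1:1} == "${b:${#b}-n-1:1}" ]]; do
        n=$((n + 1))
    done
    printf '%s\n' "${a:${#a}-n}"   # the last n characters of the first name
}

# Succeed when the names share a non-empty start and a non-empty end.
names_match() {
    [[ -n $(common_prefix "$1" "$2") && -n $(common_suffix "$1" "$2") ]]
}
```

It can be used as if names_match "$h1" "$h2"; then echo "match!"; fi. Like the script above, it only checks that some start and some end are shared, so corner cases may still need adjusting.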

If you are actually comparing the contents of two files, as you mentioned, you can probably use the diff command:

diff myfile_1_exercise.txt myfile_2_exercise.txt
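
To use that in an if statement, the exit status of diff is enough: it is 0 when the files are identical and non-zero otherwise. A minimal sketch, where same_content is a hypothetical wrapper name:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only).
# Succeeds when the two files have identical contents; diff's own output
# is discarded because only the exit status is needed.
same_content() {
    diff -- "$1" "$2" >/dev/null 2>&1
}
```

For example: if same_content myfile_1_exercise.txt myfile_2_exercise.txt; then echo "same contents"; fi. Note that a missing or unreadable file also gives a non-zero status, so it is reported as "not the same".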
