
Bash: Comparing strings in a directory

Hi, I am trying to compare two strings in a directory. The format is as follows.

{sametext}difference{sametext} . {sametext}difference{sametext}

Note: {sametext} is not static for each file.

For example:

myfile_1_exercise.txt compared to myfile_2_exercise.txt

Can you tell me how I would match the above strings in an if statement?

Basically, I need to know how to ignore the number in the two strings so that they compare as the same.

My example code looks like this:

for g in `ls -d */`;
do                      
  if [ -d $g ]; then 
    cd $g                 # down 1 directories
    for h in `ls *root`;
    do
      printf "${Process[${count}]} = ${basedir}/${f}${g}${h}\n"
      h1=${h}

      if [ "${h1}" = "${h2}" ]; then # NEED to MATCH SOME HOW??????????
        echo we have a match
      fi

      h2=${h1}
      let count+=1
    done

    cd ../
    #printf "\n\n\n\n"      
  fi
done

What should the test be, instead of "${h1}" = "${h2}", to determine this?

Cheers,

Mike

sed comes in handy here.

This basically goes through every file in the directory, extracts the two strings from the filename, and keeps a list of all the unique combinations thereof.

Then, it walks through this list, and uses bash's wildcard expansion to allow you to loop over each collection.

EDIT: Got rid of an ugly hack.

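i=0
for f in *_*_*.txt
do
    # Split "prefix_N_suffix.txt" into the prefix (a) and the suffix (b).
    a=$(echo "$f" | sed 's/\(.*\)_.*_\(.*\)\.txt/\1/')
    b=$(echo "$f" | sed 's/\(.*\)_.*_\(.*\)\.txt/\2/')

    # Add the pair only if it is not in the list yet
    # (expr exits with status 1 when nothing matched).
    tmp=${all[@]}
    expr match "$tmp" ".*$a:$b.*" >/dev/null
    if [ "$?" == "1" ]
    then
      all[i]="$a:$b"
      let i+=1
    fi
done

for f in "${all[@]}"
do
    a=$(echo "$f" | sed 's/\(.*\):\(.*\)/\1/')
    b=$(echo "$f" | sed 's/\(.*\):\(.*\)/\2/')
    echo "$a - $b"
    # The braces are required: $a_ would be read as a variable named "a_".
    for f2 in "${a}"_*_"${b}".txt
    do
        echo "  $f2"
        # ...
    done
done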

Of course, this assumes that all the files you care about follow the *_*_*.txt pattern.
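The same grouping idea (collect the prefix and suffix of each name, then list the files that share them) can also be sketched with a bash regex and an associative array instead of sed. This is only a sketch under a few assumptions: it needs bash 4 or later, group_names is a hypothetical helper name, the names are passed as arguments rather than read from the directory, and the varying part and the suffix are assumed to contain no underscore.

```shell
#!/bin/bash
# Hypothetical helper: prints one "prefix:suffix -> files" line per group.
group_names() {
    local -A groups=()      # prefix:suffix -> space-separated file names
    local -a order=()       # keys in first-seen order, for stable output
    local f key
    # prefix _ varying-part _ suffix .txt (last two parts without underscores)
    local re='^(.+)_([^_]+)_([^_]+)\.txt$'
    for f in "$@"; do
        if [[ "$f" =~ $re ]]; then
            key="${BASH_REMATCH[1]}:${BASH_REMATCH[3]}"
            # Record the key the first time it is seen.
            [[ -n "${groups[$key]+x}" ]] || order+=("$key")
            # Append the file name, with a separating space after the first one.
            groups[$key]+="${groups[$key]:+ }$f"
        fi
    done
    for key in "${order[@]}"; do
        echo "$key -> ${groups[$key]}"
    done
}

group_names myfile_1_exercise.txt myfile_2_exercise.txt other_1_exercise.txt
```

To run it over real files, something like group_names *_*_*.txt should work from inside the directory, assuming no file names contain spaces.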

"myfile_1_exercise.txt" == "myfile_2_exercise.txt"

You mean the above test should return true (ignoring the numbers), right?
This is what I would have done:

h1="myfile_1_exercise.txt"
h2="myfile_2_exercise.txt"
# Strip all digits from both names, then compare what is left.
if [ "$(echo "${h1}" | sed 's/[0-9]*//g')" = "$(echo "${h2}" | sed 's/[0-9]*//g')" ] ; then
    : # do something here.
fi
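The same digit-stripping comparison can be sketched without spawning sed, using bash's own ${var//pattern/} expansion. This is only a sketch: same_ignoring_digits is a hypothetical helper name, and like the sed version it treats any two names that differ only in their digits as equal.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the two names are equal once digits are removed.
same_ignoring_digits() {
    local a=${1//[0-9]/}   # delete every digit from the first name
    local b=${2//[0-9]/}   # delete every digit from the second name
    [[ "$a" == "$b" ]]
}

if same_ignoring_digits "myfile_1_exercise.txt" "myfile_2_exercise.txt"; then
    echo "we have a match"
fi
```

In the loop from the question, the test would then read: if same_ignoring_digits "${h1}" "${h2}"; then ...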

Disclaimer:

  1. Mileage can vary, and you may have to adjust and debug the script for corner cases.
  2. You may be better off using Perl for your task.
  3. There could be better solutions even in Bash. This one is not very efficient, but it seems to work.

That said, here is a script that compares two strings according to your requirements. I am sure you can figure out how to use it in your directory listing script (for which you may want to consider find, by the way).

This script takes two strings and prints match! if they match:

$ bash compare.sh myfile_1_exercise.txt myfile_2_exercise.txt
match!
$ bash compare.sh myfile_1_exercise.txt otherfile_2_exercise.txt
$

The script:

#!/bin/bash
fname1=$1
fname2=$2

# Collect the common leading characters of $1 and $2 into $match.
findStartMatch() {
  match=""
  rest1=$1
  rest2=$2
  char1=""
  char2=""
  while [[ "$rest1" != "" && "$rest2" != "" && "$char1" == "$char2" ]] ; do
    # Peel the first character off each string.
    char1=$(echo "$rest1" | sed 's/\(.\).*/\1/')
    rest1=$(echo "$rest1" | sed 's/.\(.*\)/\1/')
    char2=$(echo "$rest2" | sed 's/\(.\).*/\1/')
    rest2=$(echo "$rest2" | sed 's/.\(.*\)/\1/')
    if [[ "$char1" == "$char2" ]] ; then
      match="${match}${char1}"
    fi
  done
}

# Collect the common trailing characters of $1 and $2 into $match.
findEndMatch() {
  match=""
  rest1=$1
  rest2=$2
  char1=""
  char2=""
  while [[ "$rest1" != "" && "$rest2" != "" && "$char1" == "$char2" ]] ; do
    # Peel the last character off each string.
    char1=$(echo "$rest1" | sed 's/.*\(.\)/\1/')
    rest1=$(echo "$rest1" | sed 's/\(.*\)./\1/')
    char2=$(echo "$rest2" | sed 's/.*\(.\)/\1/')
    rest2=$(echo "$rest2" | sed 's/\(.*\)./\1/')
    if [[ "$char1" == "$char2" ]] ; then
      match="${char1}${match}"
    fi
  done
}

findStartMatch "$fname1" "$fname2"
startMatch=$match
findEndMatch "$fname1" "$fname2"
endMatch=$match

# Report a match when the names share a non-empty start and a non-empty end.
if [[ "$startMatch" != "" && "$endMatch" != "" ]] ; then
  echo "match!"
fi
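One corner case worth knowing about: because the script only requires a non-empty common start and a non-empty common end, any two .txt names that begin with the same letter (say mother_2_x.txt and myfile_1_exercise.txt) are reported as a match. A stricter variant of the same prefix/suffix idea might also check that whatever differs in the middle consists of digits only. The sketch below uses bash substring expansion instead of sed; differ_only_by_number is a hypothetical helper name, and treating identical names as a match is an assumption on my part.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the two names share a common prefix and
# suffix and the parts in between contain nothing but digits.
differ_only_by_number() {
    local s1=$1 s2=$2
    local i=0 j=0

    # i = length of the common prefix.
    while (( i < ${#s1} && i < ${#s2} )) && [[ "${s1:i:1}" == "${s2:i:1}" ]]; do
        i=$((i + 1))
    done

    # What is left once the common prefix is removed.
    local r1=${s1:i} r2=${s2:i}

    # j = length of the common suffix of the remainders.
    while (( j < ${#r1} && j < ${#r2} )) &&
          [[ "${r1:${#r1}-j-1:1}" == "${r2:${#r2}-j-1:1}" ]]; do
        j=$((j + 1))
    done

    # The differing middle parts.
    local mid1=${r1:0:${#r1}-j} mid2=${r2:0:${#r2}-j}

    # Accept only if both middles are made of digits (possibly empty).
    [[ "$mid1$mid2" =~ ^[0-9]*$ ]]
}

if differ_only_by_number "myfile_1_exercise.txt" "myfile_2_exercise.txt"; then
    echo "match!"
fi
```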

If you are actually comparing two files, as you mentioned, you can probably use the diff command, like this:

diff myfile_1_exercise.txt myfile_2_exercise.txt
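Since the question asks for something usable in an if statement, here is a rough sketch of how diff's exit status could serve as the test: diff exits with 0 when the two files have the same content and with 1 when they differ. The helper name files_identical and the throwaway demo files are assumptions made for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when both files have identical content.
files_identical() {
    # Discard diff's output; only its exit status is of interest here.
    diff "$1" "$2" >/dev/null
}

# Demo with temporary files so the sketch is self-contained.
tmpdir=$(mktemp -d)
printf 'same\n' > "$tmpdir/myfile_1_exercise.txt"
printf 'same\n' > "$tmpdir/myfile_2_exercise.txt"

if files_identical "$tmpdir/myfile_1_exercise.txt" "$tmpdir/myfile_2_exercise.txt"; then
    echo "contents are identical"
fi

rm -rf "$tmpdir"
```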
