
Compare strings in Java and remove the part of the string where they are identical

I have two strings:

s1="MICROSOFT"
s2="APPLESOFT"

I need to compare the strings and remove the duplicate part (always towards the end) from the second string. So I should get "MICROSOFT" and "APPLE" as output.

I have compared the two strings character by character:

               String s1 = "MICROSOFT";
               String s2 = "APPLESOFT";

               for(int j=0; j<s1.length(); j++)
               {
                   char c1 = s1.charAt(j);
                   char c2 = s2.charAt(j);

                   if(c1==c2)
                       System.out.println("Match found!!!");
                   else
                       System.out.println("No match found!");
               }

It should check the strings, and if the two strings have the same characters all the way to the end of the string, then I need to remove that redundant part (SOFT in this case) from the second string. But I can't think of how to proceed from here.

There can be more duplicates, but we have to remove only those that are continuously identical up to the end. If I have APPWWSOFT and APPLESOFT, I should again get APPLE for the second string, since LE differs from WW in between.

Can you please help me out here?

Read up on the Longest Common Subsequence (LCS) problem; there are efficient algorithms for finding the LCS of two input strings. Once you have the LCS of the inputs, manipulating them is easy. For example, in your case an LCS algorithm will find "SOFT" as the LCS of these two strings; you can then check whether the LCS is the final part of the second input and, if so, remove it. I hope this idea helps.

An example LCS implementation in Java is here, try it: http://introcs.cs.princeton.edu/java/96optimization/LCS.java.html

Example scenario (pseudocode):

input1: "MICROSOFT";
input2: "APPLESOFT";

execute LCS(input1, input2);
store the result in lcs, now lcs = "SOFT";

check whether input2 ends with lcs,
if it does then remove that trailing part from input2.
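The pseudocode above could be sketched in Java roughly as follows. This is a minimal sketch, not the linked Princeton code: the class name `LcsSketch` and the methods `lcs` and `stripTrailingLcs` are names made up for illustration, and the LCS is computed with the usual dynamic-programming table.

```java
final class LcsSketch {
    // Longest common subsequence of a and b via a dynamic-programming table.
    // len[i][j] holds the LCS length of a.substring(i) and b.substring(j).
    static String lcs(String a, String b) {
        int[][] len = new int[a.length() + 1][b.length() + 1];
        for (int i = a.length() - 1; i >= 0; i--) {
            for (int j = b.length() - 1; j >= 0; j--) {
                if (a.charAt(i) == b.charAt(j)) {
                    len[i][j] = len[i + 1][j + 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i + 1][j], len[i][j + 1]);
                }
            }
        }
        // Walk the table from the top-left corner to rebuild one LCS.
        StringBuilder out = new StringBuilder();
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            if (a.charAt(i) == b.charAt(j)) {
                out.append(a.charAt(i));
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return out.toString();
    }

    // Removes the LCS from input2 only when input2 actually ends with it.
    static String stripTrailingLcs(String input1, String input2) {
        String common = lcs(input1, input2);
        if (!common.isEmpty() && input2.endsWith(common)) {
            return input2.substring(0, input2.length() - common.length());
        }
        return input2;
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingLcs("MICROSOFT", "APPLESOFT")); // APPLE
    }
}
```

Note that a subsequence may skip characters, so for "APPWWSOFT" and "APPLESOFT" the LCS is "APPSOFT" rather than "SOFT". The `endsWith` check then fails and this sketch leaves the second string unchanged.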

As far as I understand, you want to remove any identical characters from the two strings. By identical I mean: same position and same character (code). I think the following linear-complexity solution is the simplest:

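    // Returns the characters of s1 that differ from s2 at the same index.
    // Only the first min(s1.length(), s2.length()) positions are compared.
    static String removeIdentical(String s1, String s2) {
        StringBuilder sb1 = new StringBuilder();
        // To strip the matches from s2 as well, fill a second StringBuilder the same way.
        int n = Math.min(s1.length(), s2.length());
        for (int i = 0; i < n; i++) {
            char c = s1.charAt(i);
            if (c != s2.charAt(i)) {
                sb1.append(c);
            }
        }
        return sb1.toString();
    }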

Try this algorithm: build character sequences (substrings) from your first string and look for them in the second string.

Performance: average case = (s1.length() - 1)^2

public class SeqFind {
    // Looks for a piece of s1 (at least 4 characters long) inside s2
    // and returns s2 with that piece removed.
    public static String searchReplace(String s1, String s2) {
        String s3 = s2; // returned unchanged when nothing matches
        boolean brk = false;
        for (int j = s1.length(); j > 0 && !brk; j--) {
            for (int i = j - 4; i >= 0; i--) {
                String string = s1.substring(i, j);
                if (s2.contains(string)) {
                    s3 = s2.replace(string, "");
                    System.out.println(s2 + " - " + string + " - " + s3);
                    brk = true;
                    break;
                }
            }
        }
        return s3;
    }

    public static void main(String[] args) {
        String s1 = "MICROSOFT";
        String s2 = "APPLESOFT";
        String s3 = searchReplace(s1, s2);
    }
}

Output: APPLESOFT - SOFT - APPLE
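One thing to watch with this approach: `String.replace` removes every occurrence of the matched piece, not just the one at the end. A minimal sketch of removing only a trailing occurrence, assuming the common piece has already been found (the class `TrailingRemove`, the helper `removeTrailing`, and the input "SOFTBALLSOFT" are made up for illustration):

```java
final class TrailingRemove {
    // Removes 'piece' from 'text' only when it sits at the very end.
    static String removeTrailing(String text, String piece) {
        if (piece.isEmpty() || !text.endsWith(piece)) {
            return text;
        }
        return text.substring(0, text.length() - piece.length());
    }

    public static void main(String[] args) {
        String s2 = "SOFTBALLSOFT"; // hypothetical input with the piece twice
        System.out.println(s2.replace("SOFT", ""));     // BALL
        System.out.println(removeTrailing(s2, "SOFT")); // SOFTBALL
    }
}
```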

You should use a StringBuffer if you want your String to be modified.

In this case, you can use one extra StringBuffer to which you keep appending the non-matching characters:

    StringBuffer s1 = new StringBuffer("MICROSOFT");
    StringBuffer s2 = new StringBuffer("APPLESOFT");
    StringBuffer s3 = new StringBuffer();

    for(int j=0; j<s1.length(); j++)
    {
        char c1 = s1.charAt(j);
        char c2 = s2.charAt(j);

        if(c1==c2) {
            System.out.println("Match found!!!");
        } else {
            System.out.println("No match found!");
            s3.append(c1);
        }
    }
    s1 = s3;
    System.out.println(s1);    // Prints "MICRO"

public class Match {

    public static void main(String[] args) {
        String s1 = "MICROSOFT";
        String s2 = "APPLESOFT";
        // Try each suffix of s2, longest first, until one also occurs in s1.
        for (int k = 0; k < s2.length(); k++) {
            String suffix = s2.substring(k);
            if (s1.contains(suffix)) {
                // Everything in s2 before the shared suffix is kept.
                String s3 = s2.substring(0, k);
                System.out.println(s1 + " " + s3); // Prints "MICROSOFT APPLE"
                return;
            }
        }
    }
}

I have edited the code; you can give it another try.

Try this; note that it is not tested.

    String s1 = "MICROSOFT";
    String s2 = "APPLESOFT";
    String s3 = "";
    for (int j = 0; j < s1.length(); j++)
    {
        if (s1.charAt(j) == s2.charAt(j)) {
            s3 += s1.charAt(j);
        }
    }
    System.out.println(s1.replace(s3, " ") + " \n" + s2.replace(s3, " "));

I have solved my problem after racking my brain. Please feel free to correct/improve/refine my code. The code works not only for the "MICROSOFT" and "APPLESOFT" inputs, but also for inputs like "APPWWSOFT" and "APPLESOFT" (I needed to remove the continuous duplicates from the end, which is SOFT for both pairs of inputs). I'm in the learning stage and I'll appreciate any valuable input.

public class test
    {           
        public static void main(String[] args)
        {
            String s1 = "MICROSOFT";
            String s2 = "APPLESOFT";

            int counter1=0;
            int counter2=0;

            String[] test = new String[100];
            test[0]="";

            for(int j=0; j<s1.length(); j++)
            {
                char c1 = s1.charAt(j);
                char c2 = s2.charAt(j);

                if(c1==c2)
                {
                    if(counter1==counter2)
                    {
                        //System.out.println("Match found!!!");
                        test[0]=test[0]+c2;
                        counter2++;
                        //System.out.println("Counter 2: "+counter2);
                    }
                    else
                        test[0]="";
                }
                else
                {
                    //System.out.print("No match found!");
                    //System.out.println("Counter 2: "+counter2);
                    counter2=counter1+1;
                    test[0]="";
                }

                counter1++;
                //System.out.println("Counter 1: "+counter1);
            }

             System.out.println(test[0]);
             System.out.println(s2.replaceAll(test[0]," "));
        }
    }
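As one possible refinement of the code above, the shared ending can be measured by walking both strings backwards from their last characters. This is only a sketch under the assumption that "duplicate part" means the longest common suffix; the class `CommonSuffix` and its method names are made up for illustration. Unlike the index-based loop, it does not require the strings to have the same length, and it avoids `replaceAll`, which treats its first argument as a regular expression and replaces every match.

```java
final class CommonSuffix {
    // Counts how many characters s1 and s2 share, walking backwards from their ends.
    static int commonSuffixLength(String s1, String s2) {
        int n = 0;
        while (n < s1.length() && n < s2.length()
                && s1.charAt(s1.length() - 1 - n) == s2.charAt(s2.length() - 1 - n)) {
            n++;
        }
        return n;
    }

    // Returns s2 without the ending it shares with s1.
    static String stripCommonSuffix(String s1, String s2) {
        return s2.substring(0, s2.length() - commonSuffixLength(s1, s2));
    }

    public static void main(String[] args) {
        System.out.println(stripCommonSuffix("MICROSOFT", "APPLESOFT")); // APPLE
        System.out.println(stripCommonSuffix("APPWWSOFT", "APPLESOFT")); // APPLE
    }
}
```

The loop stops at the first mismatch from the end, so the matching "APP" at the start of "APPWWSOFT" and "APPLESOFT" is left alone.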
