Common Substring of two strings

This particular interview question stumped me:

Given two Strings S1 and S2, find the longest substring which is a prefix of S1 and a suffix of S2.

Through Google, I came across the following solution, but didn't quite understand what it was doing.

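// Requires java.util.ArrayList, java.util.Collections and java.util.List.
public String findLongestSubstring(String s1, String s2) {
        if (s2.isEmpty()) {
            return null; // guard: there is no last character to look for
        }

        // Collect every position in s1 that holds the last character of s2.
        List<Integer> occurs = new ArrayList<>();
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) == s2.charAt(s2.length()-1)) {
                occurs.add(i);
            }
        }

        // Try the largest candidate end position (longest prefix) first.
        Collections.reverse(occurs);

        for(int index : occurs) {
            if (index >= s2.length()) {
                continue; // guard: this prefix would be longer than s2
            }
            // Compare s1[0..index] with the last index+1 characters of s2.
            boolean equals = true;
            for(int i = index; i >= 0; i--) {
                if (s1.charAt(index-i) != s2.charAt(s2.length() - i - 1)) {
                    equals = false;
                    break;
                }
            }
            if(equals) {
                return s1.substring(0,index+1);
            }
        }

        // No non-empty prefix of s1 is a suffix of s2.
        return null;
    }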

My questions:

  1. How does this solution work?
    • And how would you arrive at this solution?
  2. Is there a more intuitive / easier solution?

Part 2 of your question

Here is a shorter variant:

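public String findLongestPrefixSuffix(String s1, String s2) {

   // Start with the longest possible prefix of s1 and shrink it step by step.
   // No loop condition is needed: for i == 0 the prefix is "", and
   // endsWith("") is always true, so the loop always returns.
   for( int i = Math.min(s1.length(), s2.length()); ; i--) {
      if(s2.endsWith(s1.substring(0, i))) {
         return s1.substring(0, i);
      }
   }
}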

I am using Math.min to find the length of the shorter String, as I don't need to, and cannot, compare more characters than that.

someString.substring(x, y) returns the part of someString that starts at index x and stops just before index y. I go backwards from the biggest possible substring (as long as the shorter of s1 and s2) to the smallest possible one, the empty string. This way, the first time my condition is true, it holds for the biggest possible substring that fulfills it.
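To make the half-open behaviour of substring and the termination of the loop concrete, here is a small sketch. The class name SubstringDemo and the sample strings are made up for illustration; the method body is the same descending loop as in the variant above.

```java
class SubstringDemo {
    // Same descending loop as the shorter variant, made static so main can call it.
    static String findLongestPrefixSuffix(String s1, String s2) {
        for (int i = Math.min(s1.length(), s2.length()); ; i--) {
            // For i == 0 the prefix is "" and endsWith("") is true,
            // so the loop is guaranteed to return.
            if (s2.endsWith(s1.substring(0, i))) {
                return s1.substring(0, i);
            }
        }
    }

    public static void main(String[] args) {
        // substring(x, y) includes index x and excludes index y.
        System.out.println("abcdef".substring(1, 3));                        // bc
        // "bc" is a prefix of "bcd" and a suffix of "abc".
        System.out.println(findLongestPrefixSuffix("bcd", "abc"));           // bc
        // Nothing in common: only the empty string qualifies.
        System.out.println(findLongestPrefixSuffix("xyz", "abc").isEmpty()); // true
    }
}
```

Note that the argument order matters: the first argument supplies the prefix, the second the suffix.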

If you prefer, you can go the other way round, but then you have to introduce a variable that saves the length of the longest substring found so far that fulfills the condition:

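public static String findLongestPrefixSuffix(String s1, String s2) {

   if (s1.equals(s2)) { // this part is optional and will 
      return s1;        // speed things up if s1 is equal to s2
   }                    //

   int max = 0;
   // Use <= so that the longest candidate (the full length of the shorter
   // string) is tested as well, e.g. s1 = "ab", s2 = "cab".
   for (int i = 0; i <= Math.min(s1.length(), s2.length()); i++) {
      if (s2.endsWith(s1.substring(0, i))) {
         max = i; // remember the longest matching prefix length so far
      }
   }
   return s1.substring(0, max);
}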

For the record: you could start with i = 1 in the latter example for a tiny bit of extra performance. On top of this, the starting value of i lets you specify the minimum length the match must have. ;) If you write Math.min(s1.length(), s2.length()) - x, you can use x to limit how long the found substring may be at most. Both of these things are possible with the first solution, too, but the minimum length is a bit more involved there. ;)
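One possible reading of these two knobs, sketched on the descending loop. The class name BoundedPrefixSuffix and the parameters minLen and maxLen are hypothetical and not part of the original answer. Returning null when nothing of at least minLen characters matches is also an assumption. The explicit loop condition is what makes the minimum length "more involved" in the first solution, which otherwise has no condition at all.

```java
class BoundedPrefixSuffix {
    // minLen / maxLen are hypothetical parameters added for illustration.
    // Returns the longest prefix of s1 that is a suffix of s2 and whose
    // length lies in [minLen, maxLen], or null if there is none.
    static String find(String s1, String s2, int minLen, int maxLen) {
        // Never look at more characters than the shorter string has.
        int start = Math.min(Math.min(s1.length(), s2.length()), maxLen);
        for (int i = start; i >= Math.max(minLen, 0); i--) {
            String prefix = s1.substring(0, i);
            if (s2.endsWith(prefix)) {
                return prefix;
            }
        }
        return null; // the loop can now run out without a match
    }

    public static void main(String[] args) {
        System.out.println(find("abab", "xxabab", 0, 10)); // abab
        System.out.println(find("abab", "xxabab", 0, 3));  // ab
        System.out.println(find("abab", "xxabab", 3, 3));  // null
    }
}
```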


Part 1 of your question

In the part above Collections.reverse, the author of the code searches for all positions in s1 that hold the last letter of s2 and saves them.

What follows is essentially what my algorithm does. The difference is that he doesn't check every prefix, but only those that end with the last letter of s2.

This is an optimization to speed things up. If speed is not that important, my naive implementation should suffice. ;)
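The pre-filtering idea can be sketched in isolation as follows. This is not the quoted code itself: the class and method names are made up, the candidate positions are capped at the length of the shorter string, and an empty string is returned instead of null when nothing matches. All three are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CandidateDemo {
    // End positions in s1 that hold the last character of s2, largest first.
    // Only prefixes ending at one of these positions can be a suffix of s2.
    static List<Integer> candidateEnds(String s1, String s2) {
        List<Integer> ends = new ArrayList<>();
        if (s2.isEmpty()) {
            return ends;
        }
        char last = s2.charAt(s2.length() - 1);
        for (int i = Math.min(s1.length(), s2.length()) - 1; i >= 0; i--) {
            if (s1.charAt(i) == last) {
                ends.add(i);
            }
        }
        return ends;
    }

    // Checks only the candidate prefixes instead of every possible length.
    static String longest(String s1, String s2) {
        for (int end : candidateEnds(s1, s2)) {
            String prefix = s1.substring(0, end + 1);
            if (s2.endsWith(prefix)) {
                return prefix;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // 'b' is the last letter of "xxabab"; it occurs in "ababc" at 3 and 1.
        System.out.println(candidateEnds("ababc", "xxabab")); // [3, 1]
        System.out.println(longest("ababc", "xxabab"));       // abab
    }
}
```

For these inputs the naive loop would test six lengths (5 down to 0), while the filtered version tests only two prefixes.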

Where did you find that solution? Was it written by a credible, well-respected coder? If you're not sure of that, then it might not be worth reading. One could write really complex and inefficient code to accomplish something really simple, and such an algorithm is not worth understanding.

Rather than trying to understand somebody else's solution, it might be easier to come up with one on your own. I think you understand the problem much better that way, and the logic becomes your own. With time and practice, the thought process will start to come more naturally. Practice makes perfect.

Anyway, I put a simpler implementation in Python here (spoiler alert!). I suggest you first figure out the solution on your own, and compare it to mine later.

Apache Commons Lang 3, StringUtils.getCommonPrefix()

Java is really bad at providing useful stuff via its stdlib. On the plus side, there's almost always some reasonable tool from Apache.

I converted @TheMorph's answer to JavaScript. Hope this helps JS developers.

if (typeof String.prototype.endsWith !== 'function') {
    String.prototype.endsWith = function(suffix) {
        return this.indexOf(suffix, this.length - suffix.length) !== -1;
    };
}

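// Same parameter order as the Java version: s1 supplies the prefix, s2 the suffix.
function findLongestPrefixSuffix(s1, s2) {

   // Try the longest possible prefix of s1 first; i = 0 (empty string) always matches.
   for( var i = Math.min(s1.length, s2.length); ; i--) {
      if(s2.endsWith(s1.substring(0, i))) {
         return s1.substring(0, i);
      }
   }
}

console.log(findLongestPrefixSuffix('bcd', 'abc')); // result: 'bc'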
