Creating a string from two strings that overlap by N or more characters?

Say I have two strings:

str1 = "abcdefg"
str2 = "cdefghijkl"

If a user specifies an integer amount, say n = 4, how do I find out whether the two strings overlap by 4 or more characters? In this case they overlap on "cdefg" (5 characters), which gives "abcdefghijkl".

If n = 6, they would not overlap, because "bcdefg" != "cdefgh".

The issue I am having is handling the case where there is more overlap than the user specified.

That's where the loop comes in: start at the overlap length specified by the user, and keep incrementing it until (1) you detect an overlap or (2) you run out of characters in one of the strings.

Pseudo-code would look like this:

int userOverlap = ... // Get user overlap
int minLength = Math.min(stringOne.length(), stringTwo.length());
for (int overlap = userOverlap; overlap <= minLength; overlap++) {
    if (testOverlap(stringOne, stringTwo, overlap)) {
        return true;
    }
}
return false;

private static boolean testOverlap(String a, String b, int overlap) {
    ... // This is your method that tests for one specific overlap
}
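
For reference, here is one way the pseudo-code above could be filled in. This is only a sketch under an assumption taken from the question's example: an "overlap" means the last characters of the first string equal the first characters of the second. The class name OverlapLoop and the findOverlap and merge helpers are hypothetical names for illustration, and regionMatches is just one possible way to write the single-overlap test.

```java
class OverlapLoop {
    // Assumption: the last `overlap` characters of `a` must equal
    // the first `overlap` characters of `b`.
    static boolean testOverlap(String a, String b, int overlap) {
        return a.regionMatches(a.length() - overlap, b, 0, overlap);
    }

    // Returns the first overlap length >= userOverlap that matches, or -1 if none.
    // Values below 1 are clamped to 1, since an overlap of 0 would match trivially.
    static int findOverlap(String a, String b, int userOverlap) {
        int minLength = Math.min(a.length(), b.length());
        for (int overlap = Math.max(userOverlap, 1); overlap <= minLength; overlap++) {
            if (testOverlap(a, b, overlap)) {
                return overlap;
            }
        }
        return -1;
    }

    // Joins the two strings on the detected overlap, or returns null if there is none.
    static String merge(String a, String b, int userOverlap) {
        int overlap = findOverlap(a, b, userOverlap);
        return overlap < 0 ? null : a + b.substring(overlap);
    }

    public static void main(String[] args) {
        System.out.println(merge("abcdefg", "cdefghijkl", 4)); // abcdefghijkl
        System.out.println(merge("abcdefg", "cdefghijkl", 6)); // null
    }
}
```

Because the loop counts upward, it stops at the smallest matching overlap that is at least the user's value, which is 5 for the strings in the question.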

The first step is to see if they overlap at all. You could use the following routine to check that:

static int st1Pos; // index in string 1 where the overlap starts
static int st2Pos; // index in string 2 where the overlap starts

/**
 * Determine if string 2 overlaps string 1 at all.
 * @param st1 The first string to check.
 * @param st2 The second string to check.
 * @param n The minimum number of characters.
 * @return The character index in string 2 that
 *         overlaps with something in string 1;
 *         -1 indicates no overlap.
 */
static int getOverlap(String st1, String st2, int n)
{
    st1Pos = -1;
    st2Pos = -1;
    if ((n <= 0) || (st1.length() < n) || (st2.length() < n))
    {
        return -1;
    }
    // Try every n-character window of string 2, including the last one.
    for (int i = 0; i <= st2.length() - n; i++)
    {
        String sub = st2.substring(i, i + n);
        int index = st1.indexOf(sub); // search string 1 for the window
        if (index >= 0)
        {
            st1Pos = index;
            st2Pos = i;
            return i;
        }
    }
    return -1;
}

Once you do this, st1Pos and st2Pos indicate where the overlap starts. You can then use toCharArray to convert the two strings into character arrays, start at those locations, and compare as shown below:

/**
 * Return the number of characters that the two strings overlap.
 * @param str1 The first string.
 * @param str2 The second string.
 * @param min The minimum required overlap.
 * @return The number of characters they actually overlap at the first
 *         detected overlap, or -1 if there is no overlap.
 */
static int numOverlap(String str1, String str2, int min)
{
    int pos = getOverlap(str1, str2, min);
    if (pos < 0)
    {
        return -1;
    }
    char[] a = str1.toCharArray();
    char[] b = str2.toCharArray();
    int i = 0;
    // Check the bounds before each read, then compare character by character.
    while ((i + st1Pos < a.length) && (i + st2Pos < b.length)
            && (a[i + st1Pos] == b[i + st2Pos]))
    {
        i++;
    }
    return i;
}
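
The counting step can also be written so that the start positions are passed in as arguments instead of being read from static fields. The following is a rough sketch of that idea; the class name OverlapCount and the method commonRunLength are hypothetical, and it assumes both positions are non-negative.

```java
class OverlapCount {
    // Hypothetical helper: counts how many consecutive characters match,
    // starting at aPos in a and bPos in b. Bounds are checked before each read.
    // Assumes aPos and bPos are non-negative.
    static int commonRunLength(String a, int aPos, String b, int bPos) {
        int i = 0;
        while (aPos + i < a.length() && bPos + i < b.length()
                && a.charAt(aPos + i) == b.charAt(bPos + i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // "cdefg" starts at index 2 of the first string and index 0 of the second.
        System.out.println(commonRunLength("abcdefg", 2, "cdefghijkl", 0)); // 5
    }
}
```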

A few things to note about this implementation. First, I made everything static for ease of explanation; you probably want to make this a class and keep the two position variables as fields of that class. Second, this implementation only finds the first such overlap. There may be more than one overlap, and later overlaps could be longer. You would have to try a variety of search orders to find the different overlaps (for example, starting from the back and working forward).
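
To illustrate the point about longer overlaps, here is a small sketch that tries the largest possible overlap first and works down to the minimum, so the first match it finds is the longest one. It assumes the question's reading of "overlap" (the end of the first string matches the start of the second), which is narrower than the general search above. The names LongestOverlap and longestOverlap are made up for this example.

```java
class LongestOverlap {
    // Returns the length of the longest overlap of at least `min` characters
    // between the end of `a` and the start of `b`, or -1 if there is none.
    static int longestOverlap(String a, String b, int min) {
        int max = Math.min(a.length(), b.length());
        // Start from the largest candidate so the first hit is the longest.
        for (int len = max; len >= Math.max(min, 1); len--) {
            if (a.regionMatches(a.length() - len, b, 0, len)) {
                return len;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Overlaps of 2 ("ab") and 4 ("abab") both exist; the longer one is reported.
        System.out.println(longestOverlap("abab", "ababc", 2));         // 4
        System.out.println(longestOverlap("abcdefg", "cdefghijkl", 4)); // 5
    }
}
```

A search that counts upward from the minimum would report 2 for the first pair, which is why the search direction matters when several overlaps exist.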

Anyway, good luck with this one.
