Longest common subsequence for 3+ sequences in C

I have already written the LCS part.
I want to know how to handle an input of N strings (N > 3), where N is the number of strings in the set.
For example:
Input :
4 ab abc abcd abcde
Output :
3
The answer is the longest of the LCSs taken over each group of three consecutive sequences:
ab abc abcd->ab->2
abc abcd abcde->abc->3
3>2
My idea is to apply the three-sequence method to every such group in the set and then take the biggest result.
But I don't know how to do it. Is there a better way?
This is part of my code:

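#include <stdio.h>
#include <string.h>

#define N 101   /* assumed buffer size; the original value was not shown */
#define EQUAL(x,y,z) ((x)==(y)&&(y)==(z))

char c1[N], c2[N], c3[N];   /* the three sequences, used by LCS() */
int lcs[N][N][N];           /* DP table, used by LCS() */

int LCS(int c1_length, int c2_length, int c3_length);

int main(void){

int set;
int longest;

while (scanf("%d", &set) == 1){
    /* three set-- per pass only stop the loop at exactly 0, so count in threes */
    while (set >= 3){
        if (scanf("%100s %100s %100s", c1, c2, c3) != 3)
            return 1;
        set -= 3;
        /* longest is overwritten on every pass; only the last group survives */
        longest = LCS(strlen(c1), strlen(c2), strlen(c3));
    }
}
return 0;
}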

LCS:

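/* Largest of three ints; C has no built-in three-argument max. */
static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

int LCS(int c1_length, int c2_length, int c3_length)
{
    int i, j, k;
    /* memset counts bytes, so clear the whole table with sizeof */
    memset(lcs, 0, sizeof lcs);
    for (i = 1; i <= c1_length; i++)
        for (j = 1; j <= c2_length; j++)
            for (k = 1; k <= c3_length; k++)
            {
                /* lcs[i][j][k] covers the first i, j, k characters, so compare index - 1 */
                if (EQUAL(c1[i - 1], c2[j - 1], c3[k - 1]))
                    lcs[i][j][k] = lcs[i - 1][j - 1][k - 1] + 1;
                else
                    lcs[i][j][k] = max3(lcs[i - 1][j][k], lcs[i][j - 1][k], lcs[i][j][k - 1]);
            }
    /* use the lengths directly: j and k are never set if an outer loop does not run */
    return lcs[c1_length][c2_length][c3_length];
}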

Thanks, everybody! I have solved this question by using a 2D array to store the sequences.
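The asker's final code was not posted, so the following is only a sketch of one way the idea could look: keep all strings in a 2D char array, run the three-sequence LCS on every window of three consecutive strings, and keep the largest result. The names lcs3, best_window_lcs and MAXLEN are hypothetical, and MAXLEN is an assumed limit on the string length.

```c
#include <string.h>

#define MAXLEN 32   /* assumed maximum string length, for illustration only */

/* table[i][j][k] = LCS length of the first i, j and k characters. */
static int table[MAXLEN + 1][MAXLEN + 1][MAXLEN + 1];

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Length of the longest common subsequence of three strings
 * (each assumed to be at most MAXLEN characters long). */
static int lcs3(const char *a, const char *b, const char *c)
{
    int la = (int)strlen(a), lb = (int)strlen(b), lc = (int)strlen(c);

    memset(table, 0, sizeof table);
    for (int i = 1; i <= la; i++)
        for (int j = 1; j <= lb; j++)
            for (int k = 1; k <= lc; k++) {
                if (a[i - 1] == b[j - 1] && b[j - 1] == c[k - 1])
                    table[i][j][k] = table[i - 1][j - 1][k - 1] + 1;
                else
                    table[i][j][k] = max3(table[i - 1][j][k],
                                          table[i][j - 1][k],
                                          table[i][j][k - 1]);
            }
    return table[la][lb][lc];
}

/* Largest three-way LCS over every window of three consecutive strings.
 * Returns 0 when fewer than three strings are given. */
int best_window_lcs(char seq[][MAXLEN + 1], int count)
{
    int best = 0;
    for (int i = 0; i + 2 < count; i++) {
        int len = lcs3(seq[i], seq[i + 1], seq[i + 2]);
        if (len > best)
            best = len;
    }
    return best;
}
```

With the four strings from the question stored in seq, best_window_lcs(seq, 4) compares the windows (ab, abc, abcd) and (abc, abcd, abcde) and returns the larger of their two LCS lengths.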

An iterative procedure may be a way to solve your problem (the procedure below works on contiguous substrings, since it relies on strstr()). But the common substring of maximum length can start anywhere in the first string. As a new string is introduced in the procedure, keeping only the current maximum is not sufficient. Here is a way to store an array of strings:

char s[nb][N]; //nb strings of max length N-1

You may try to keep track of an array int seqlen[j], with one entry per character of the first string s[0], storing the length of the longest common substring starting at position j in s[0].

Initialization: if s[0] is the only string, then the length of the longest common substring starting at position j is strlen(s[0])-j.

Introducing a new string s[i]: seqlen[j] needs to be updated (for all j). Create a copy temp of the current substring of s[0], starting at s[0][j] and of length seqlen[j]. This is where strstr(s[i],temp) may be used; note that strstr() takes the string to search first and the substring to look for second. While strstr() returns NULL and seqlen[j]>0, shorten temp by writing a null terminator '\0' over its last character and decrease seqlen[j]. At the end, seqlen[j] is the length of the longest substring starting at position j in s[0] that is common to all strings introduced so far.

The final step is to take the maximum of seqlen[j], which is the length of the longest common substring. This substring starts at the corresponding position j in s[0].
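The steps above (initialization, update for each new string, final maximum) can be sketched as follows. This is an illustration under assumptions, not the answerer's code: the function name longest_common_substring, the limit MAXLEN and the use of an array of pointers instead of char s[nb][N] are all choices made here. It finds a contiguous common substring, as strstr() implies.

```c
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on strlen(s[0]) */

/* Returns the length of the longest substring shared by all nb strings
 * (nb >= 1) and stores its starting position in s[0] in *start. */
int longest_common_substring(const char *s[], int nb, int *start)
{
    int seqlen[MAXLEN];
    char temp[MAXLEN + 1];
    int n0 = (int)strlen(s[0]);
    int best = 0;

    /* Initialization: with s[0] alone, everything from j to the end is common. */
    for (int j = 0; j < n0; j++)
        seqlen[j] = n0 - j;

    /* Introduce the remaining strings one at a time. */
    for (int i = 1; i < nb; i++) {
        for (int j = 0; j < n0; j++) {
            /* temp = substring of s[0] starting at j, of length seqlen[j] */
            memcpy(temp, s[0] + j, (size_t)seqlen[j]);
            temp[seqlen[j]] = '\0';
            /* Shorten temp from the right until it occurs in s[i]. */
            while (seqlen[j] > 0 && strstr(s[i], temp) == NULL) {
                seqlen[j]--;
                temp[seqlen[j]] = '\0';
            }
        }
    }

    /* Final step: the maximum of seqlen[j] and where it starts. */
    *start = 0;
    for (int j = 0; j < n0; j++) {
        if (seqlen[j] > best) {
            best = seqlen[j];
            *start = j;
        }
    }
    return best;
}
```

Each seqlen[j] only ever decreases, so the total number of strstr() calls per new string is bounded by the sum of the current seqlen values.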

Memory footprint and algorithmic refinement: find the shortest string and use it as s[0].
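This helps because seqlen has one entry per character of s[0], and no common substring can be longer than the shortest string. A minimal sketch, assuming the strings are held as an array of pointers (the helper name shortest_first is hypothetical):

```c
#include <string.h>

/* Swap the shortest string into position 0.
 * Only pointers are exchanged; no characters are copied. */
void shortest_first(const char *s[], int nb)
{
    if (nb < 2)
        return;             /* nothing to reorder */

    int shortest = 0;
    for (int i = 1; i < nb; i++)
        if (strlen(s[i]) < strlen(s[shortest]))
            shortest = i;

    const char *tmp = s[0];
    s[0] = s[shortest];
    s[shortest] = tmp;
}
```

With a fixed-size char s[nb][N] array the same idea would need a character copy (or an index to the shortest row) instead of a pointer swap.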

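Algorithmic refinement: the update of seqlen[j] may be done with a binary search on the length, because if a substring of some length starting at j occurs in s[i], every shorter substring starting at j occurs there too.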
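A possible shape for that binary search is sketched below. The function name shrink_by_bisection and the limit MAXLEN are assumptions made for this example; cur is the current value of seqlen[j] and must not exceed MAXLEN or run past the end of s0.

```c
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on the substring length */

/* Largest len in [0, cur] such that the len characters of s0 starting
 * at position j occur somewhere in other. */
int shrink_by_bisection(const char *s0, int j, int cur, const char *other)
{
    char temp[MAXLEN + 1];
    int lo = 0;     /* invariant: a length of lo always matches */
    int hi = cur;   /* no length above hi can match */

    while (lo < hi) {
        /* Upper middle, so that lo strictly increases on a match. */
        int mid = lo + (hi - lo + 1) / 2;

        memcpy(temp, s0 + j, (size_t)mid);
        temp[mid] = '\0';

        if (strstr(other, temp) != NULL)
            lo = mid;        /* length mid matches; try something longer */
        else
            hi = mid - 1;    /* mid fails, so every longer length fails too */
    }
    return lo;
}
```

This replaces up to cur calls to strstr() per position with about log2(cur) calls, at the cost of rebuilding temp on each probe.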

Memory refinement: allocate memory for the array of strings using malloc(), taking into account the exact length of each string.
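One way this could look, as a sketch only: each string gets exactly strlen + 1 bytes instead of a fixed N. The helper names copy_strings and free_strings are hypothetical, and nb is assumed to be greater than zero.

```c
#include <stdlib.h>
#include <string.h>

/* Copy nb strings (nb > 0) into freshly allocated storage, each block
 * sized to its string. Returns NULL if any allocation fails. */
char **copy_strings(const char *src[], int nb)
{
    char **s = malloc((size_t)nb * sizeof *s);
    if (s == NULL)
        return NULL;

    for (int i = 0; i < nb; i++) {
        size_t len = strlen(src[i]);
        s[i] = malloc(len + 1);          /* exact length plus terminator */
        if (s[i] == NULL) {
            while (i > 0)                /* undo the earlier allocations */
                free(s[--i]);
            free(s);
            return NULL;
        }
        memcpy(s[i], src[i], len + 1);   /* copies the '\0' as well */
    }
    return s;
}

/* Release everything allocated by copy_strings(). */
void free_strings(char **s, int nb)
{
    if (s == NULL)
        return;
    for (int i = 0; i < nb; i++)
        free(s[i]);
    free(s);
}
```

The caller owns the result and should pass it to free_strings() with the same nb once the strings are no longer needed.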
