I have two sequences, AACAGTTACC and TAAGGTCA, and I'm trying to find a global sequence alignment.
I managed to create a 2D array for the matrix, and I even filled it using a semi-dynamic approach:
void process() {
    for (int i = 1; i <= sequenceA.length; i++) {
        for (int j = 1; j <= sequenceB.length; j++) {
            int scoreDiag = opt[i-1][j-1] + equal(i, j);
            int scoreLeft = opt[i][j-1] - 1;
            int scoreUp = opt[i-1][j] - 1;
            opt[i][j] = Math.max(Math.max(scoreDiag, scoreLeft), scoreUp);
        }
    }
}

private int equal(int i, int j) {
    if (sequenceA[i - 1] == sequenceB[j - 1]) {
        return 1;
    } else {
        return -1;
    }
}
There are several things that you need to modify. For one, the sequences in your example are not AACAGTTACC and TAAGGTCA, but their reversals, CCATTGACAA and ACTGGAAT. The full solution is:
// Note that these sequences are reversed!
String sequenceA = "CCATTGACAA";
String sequenceB = "ACTGGAAT";

// The penalties to apply
int gap = 2, substitution = 1, match = 0;

int[][] opt = new int[sequenceA.length() + 1][sequenceB.length() + 1];

// First of all, compute insertions and deletions at 1st row/column
for (int i = 1; i <= sequenceA.length(); i++)
    opt[i][0] = opt[i - 1][0] + gap;
for (int j = 1; j <= sequenceB.length(); j++)
    opt[0][j] = opt[0][j - 1] + gap;

for (int i = 1; i <= sequenceA.length(); i++) {
    for (int j = 1; j <= sequenceB.length(); j++) {
        int scoreDiag = opt[i - 1][j - 1] +
                (sequenceA.charAt(i - 1) == sequenceB.charAt(j - 1) ?
                    match :         // same symbol
                    substitution);  // different symbol
        int scoreLeft = opt[i][j - 1] + gap; // insertion
        int scoreUp = opt[i - 1][j] + gap;   // deletion
        // we take the minimum
        opt[i][j] = Math.min(Math.min(scoreDiag, scoreLeft), scoreUp);
    }
}

for (int i = 0; i <= sequenceA.length(); i++) {
    for (int j = 0; j <= sequenceB.length(); j++)
        System.out.print(opt[i][j] + "\t");
    System.out.println();
}
The result is just as in the example you gave us (but reversed, remember!):
0 2 4 6 8 10 12 14 16
2 1 2 4 6 8 10 12 14
4 3 1 3 5 7 9 11 13
6 4 3 2 4 6 7 9 11
8 6 5 3 3 5 7 8 9
10 8 7 5 4 4 6 8 8
12 10 9 7 5 4 5 7 9
14 12 11 9 7 6 4 5 7
16 14 12 11 9 8 6 5 6
18 16 14 13 11 10 8 6 6
20 18 16 15 13 12 10 8 7
So the final alignment score is found at opt[sequenceA.length()][sequenceB.length()], which is 7 here. If you really need to show the reversed matrix as in the image, do this:
for (int i = sequenceA.length(); i >= 0; i--) {
    for (int j = sequenceB.length(); j >= 0; j--)
        System.out.print(opt[i][j] + "\t");
    System.out.println();
}
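The matrix only gives you the score. To get the alignment itself you still have to walk back from the bottom-right cell to the top-left one. Below is a minimal sketch of that traceback, assuming the same penalties as above (gap 2, substitution 1, match 0). The class name AlignmentTraceback and the methods fill, cost and traceback are names I made up for illustration, and when several moves are equally good this version simply prefers the diagonal, then up, then left.

```java
class AlignmentTraceback {
    // Same penalties as in the solution above.
    static final int GAP = 2, SUBSTITUTION = 1, MATCH = 0;

    // Cost of pairing a[i-1] with b[j-1].
    static int cost(String a, String b, int i, int j) {
        return a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : SUBSTITUTION;
    }

    // Fills the cost matrix with the same recurrence as the loops above.
    static int[][] fill(String a, String b) {
        int[][] opt = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) opt[i][0] = opt[i - 1][0] + GAP;
        for (int j = 1; j <= b.length(); j++) opt[0][j] = opt[0][j - 1] + GAP;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int diag = opt[i - 1][j - 1] + cost(a, b, i, j);
                int left = opt[i][j - 1] + GAP;
                int up = opt[i - 1][j] + GAP;
                opt[i][j] = Math.min(diag, Math.min(left, up));
            }
        }
        return opt;
    }

    // Walks back from opt[a.length()][b.length()] to opt[0][0] and
    // returns the two gapped rows of one optimal alignment.
    static String[] traceback(String a, String b, int[][] opt) {
        StringBuilder rowA = new StringBuilder();
        StringBuilder rowB = new StringBuilder();
        int i = a.length(), j = b.length();
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && opt[i][j] == opt[i - 1][j - 1] + cost(a, b, i, j)) {
                // diagonal move: match or substitution
                rowA.append(a.charAt(--i));
                rowB.append(b.charAt(--j));
            } else if (i > 0 && opt[i][j] == opt[i - 1][j] + GAP) {
                // move up: symbol of a against a gap
                rowA.append(a.charAt(--i));
                rowB.append('-');
            } else {
                // move left: gap against symbol of b
                rowA.append('-');
                rowB.append(b.charAt(--j));
            }
        }
        // The rows were built back to front, so reverse them.
        return new String[] { rowA.reverse().toString(), rowB.reverse().toString() };
    }

    public static void main(String[] args) {
        String a = "CCATTGACAA", b = "ACTGGAAT";
        int[][] opt = fill(a, b);
        String[] rows = traceback(a, b, opt);
        System.out.println(rows[0]);
        System.out.println(rows[1]);
        System.out.println("score: " + opt[a.length()][b.length()]);
    }
}
```

Since the sequences are reversed here, reverse both printed rows if you want the alignment in the original reading direction.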
This example shows how to find the minimum and maximum values in the score matrix. Below is a JavaScript implementation that computes the score matrix for global alignment, together with its minimum and maximum values. The entire alignment process was described in Paul A. Gagniuc, Algorithms in Bioinformatics: Theory and Implementation, John Wiley & Sons, Hoboken, NJ, USA, 2021, ISBN: 9781119697961, and is available here:
https://github.com/gagniuc/Local-sequence-alignment-in-JS
or here:
https://bcs.wiley.com/he-bcs/Books?action=index&itemId=1119697964&bcsId=12108
// Variable statement
var Match = +2;
var Mismatch = -1;
var gap = -2;

var s0 = 'AACAGTTACC';
var s1 = 'TAAGGTCA';

var MMax = 0;
var MMin = 0;
var x, y; // position of the maximum score

var m = [];
var s = [];

// Matrix initialization and completion
s[0] = s0.split('');
s[1] = s1.split('');

var n_0 = s[0].length + 1;
var n_1 = s[1].length + 1;

for (var i = 0; i <= n_0; i++) {
    m[i] = [];
    for (var j = 0; j <= n_1; j++) {
        m[i][j] = 0;

        // Gap penalties along the first score row and column
        if (i == 1 && j > 1) { m[i][j] = m[i][j-1] + gap; }
        if (j == 1 && i > 1) { m[i][j] = m[i-1][j] + gap; }

        // Row 0 and column 0 hold the sequence letters as headers
        if (i > 1) { m[i][0] = s[0][i-2]; }
        if (j > 1) { m[0][j] = s[1][j-2]; }

        if (i > 1 && j > 1) {
            var A = m[i-1][j-1] + f(m[i][0], m[0][j]); // diagonal
            var B = m[i-1][j] + gap;                   // up
            var C = m[i][j-1] + gap;                   // left
            m[i][j] = Math.max(A, B, C);
            if (m[i][j] > MMax) { MMax = m[i][j]; x = i; y = j; }
            if (m[i][j] < MMin) { MMin = m[i][j]; }
        }
    }
}

document.write('Max:' + MMax + '<br>');
document.write('Min:' + MMin + '<hr>');
document.write('Score matrix:' + SMC(m));

// Matching function
function f(a1, a2) {
    if (a1 === a2) { return Match; } else { return Mismatch; }
}

// SHOW MATRIX CONTENT
function SMC(m) {
    var r = "<table border=1>";
    for (var i = 0; i < m.length; i++) {
        r += "<tr>";
        for (var j = 0; j < m[i].length; j++) {
            r += "<td>" + m[i][j] + "</td>";
        }
        r += "</tr>";
    }
    r += "</table>";
    return r;
}
body {
    padding: 1rem;
    font-family: monospace;
    font-size: 18px;
    font-style: normal;
    font-variant: normal;
    line-height: 20px;
}
Note that the implementation uses the two DNA sequences you indicated, namely s0 = AACAGTTACC and s1 = TAAGGTCA.
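Since the question is in Java, here is a rough Java counterpart of the same idea, offered as a sketch rather than a port of the book's code. It assumes the same scoring (match +2, mismatch -1, gap -2) and, like the JavaScript, starts both the minimum and the maximum at 0 and only inspects cells that pair two letters. The class name ScoreMatrixRange and the methods fill and minMax are hypothetical names, and unlike the JavaScript the matrix here has no header row or column for the letters.

```java
class ScoreMatrixRange {
    // Scoring assumed to match the JavaScript snippet above.
    static final int MATCH = 2, MISMATCH = -1, GAP = -2;

    // Builds the global alignment score matrix (maximizing variant).
    // Row 0 and column 0 hold the accumulated gap penalties.
    static int[][] fill(String a, String b) {
        int[][] score = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) score[i][0] = score[i - 1][0] + GAP;
        for (int j = 1; j <= b.length(); j++) score[0][j] = score[0][j - 1] + GAP;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int pair = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                int diag = score[i - 1][j - 1] + pair;
                int up = score[i - 1][j] + GAP;
                int left = score[i][j - 1] + GAP;
                score[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        return score;
    }

    // Returns {min, max} over the cells that pair two letters.
    // Both values start at 0, mirroring MMin and MMax in the JavaScript.
    static int[] minMax(int[][] score) {
        int min = 0, max = 0;
        for (int i = 1; i < score.length; i++) {
            for (int j = 1; j < score[i].length; j++) {
                min = Math.min(min, score[i][j]);
                max = Math.max(max, score[i][j]);
            }
        }
        return new int[] { min, max };
    }

    public static void main(String[] args) {
        int[][] score = fill("AACAGTTACC", "TAAGGTCA");
        int[] range = minMax(score);
        System.out.println("Max: " + range[1]);
        System.out.println("Min: " + range[0]);
    }
}
```

If you want the range over the whole matrix, including the gap-only border, start the loops in minMax at 0 instead of 1.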
Have a look at the Wikipedia article on the longest common substring problem, http://en.wikipedia.org/wiki/Longest_common_substring . The code there can be copied almost as-is for several languages, and it is easily adapted to also tell you the alignment index. I had to do a similar thing and ended up with https://github.com/Pomax/DOM-diff/blob/rewrite/rewrite/rewrite.html#L103
(The SubsetMapping it returns is basically a simple struct that gives the index for both contexts: https://github.com/Pomax/DOM-diff/blob/rewrite/rewrite/rewrite.html#L52 )
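To make that concrete, here is a small Java sketch of the usual dynamic-programming approach to the longest common substring, adapted to also report where the shared run starts in each input. It is not the code from either link; the class LongestCommonSubstring and its nested Match holder are hypothetical names that loosely play the role of the SubsetMapping struct mentioned above. Keep in mind that this finds one contiguous shared run, which is a different problem from a full global alignment.

```java
class LongestCommonSubstring {
    // Hypothetical result holder: start index in each input plus the shared length.
    static final class Match {
        final int startA, startB, length;

        Match(int startA, int startB, int length) {
            this.startA = startA;
            this.startB = startB;
            this.length = length;
        }
    }

    // Classic O(|a| * |b|) table, kept to two rows. curr[j] is the length of
    // the longest common suffix of a[0..i) and b[0..j).
    static Match find(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int bestLen = 0, endA = 0, endB = 0;
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > bestLen) {
                        // Remember where the longest run seen so far ends.
                        bestLen = curr[j];
                        endA = i;
                        endB = j;
                    }
                }
            }
            prev = curr;
        }
        return new Match(endA - bestLen, endB - bestLen, bestLen);
    }

    public static void main(String[] args) {
        String a = "AACAGTTACC", b = "TAAGGTCA";
        Match m = find(a, b);
        System.out.println("length " + m.length
                + ", starts at " + m.startA + " in A and " + m.startB + " in B: "
                + a.substring(m.startA, m.startA + m.length));
    }
}
```

When several runs share the maximum length, this version keeps the first one it encounters while scanning the first string from left to right.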