DP for 3 sequences

Alignment of 3 sequences can be solved by 3 dimensional DP. The gap should be carefully treated.

DP for 3 sequences with linear gap cost

General equation

The iteration for alignment of 3 sequences with linear gap costs is written as:
\[
\mbox{ for } i=1, \ldots \ell;
\mbox{ for } j=1, \ldots m;
\mbox{ for } k=1, \ldots n; \\
F_{i,j,k}= \max
\left[ \begin{array}{l}
F_{i-1,j-1,k-1} + s(x_i,y_j,z_k),\\
F_{i,j-1,k-1} + s(-,y_j,z_k),\\
F_{i-1,j,k-1} + s(x_i,-,z_k),\\
F_{i-1,j-1,k} + s(x_i,y_j,-),\\
F_{i-1,j,k} + s(x_i,-,-),\\
F_{i,j-1,k} + s(-,y_j,-),\\
F_{i,j,k-1} + x(-,-,z_k)\\
\end{array} \right.
\]

Sum of Pairs Score (SPS)

The above equation implies that the gap costs are linear, but they are not explicitly shown.
The gap costs are implicitly included in \( s(,,) \)’s. We have to give the gap costs by defining the \( s(,,) \)’s including ‘-‘.
For example, \( s(-,y_j,z_k) \) in the score that \(y_j\) is aligned to \(z_k\) and a gap is inserted in sequence \(x\).
For multiple sequence alignment problem (alignment for more than 2 sequences), ‘Sum of Pairs Score (SPS)’ is often used.
The SPS for an aligned column including \(n\) sequences is defined as the sum of all the pairwise scores:
\[
\newcommand{\sum}{\mathop{\rm \Sigma}\limits}
s(x^1,\ldots,x^n) = \sum_{i \ne j} s(x^i,x^j)
\]
If SPS and the linear gap cost are used,
\[
s(-,y_j,z_k) = s(y_j,z_k) + s(-,z_k) + s(-,y_j) = s(y_j,z_k) – 2d.
\]
Similarly, \(s(x_i,-,-)\) can be written as:
\[
s(x_i,-,-) = s(-,-) + s(x_i,-) + s(x_i,-) = -2d,
\]
because \(s(-,-)\) is the score for aligning nothing and it should be zero.
Based on the above analysis, the iterative equation of sequence alignment of 3 sequences with SPS score and linear gap is:
\[
\mbox{ for } i=1, \ldots \ell;
\mbox{ for } j=1, \ldots m;
\mbox{ for } k=1, \ldots n; \\
F_{i,j,k}= \max
\left[ \begin{array}{l}
F_{i-1,j-1,k-1} + s(y_j,z_k) + s(x_i,z_k) + s(x_i,y_j),\\
F_{i,j-1,k-1} + s(y_j,z_k) – 2d,\\
F_{i-1,j,k-1} + s(x_i,z_k) – 2d,\\
F_{i-1,j-1,k} + s(x_i,y_j) – 2d,\\
F_{i-1,j,k} – 2d,\\
F_{i,j-1,k} – 2d,\\
F_{i,j,k-1} – 2d\\
\end{array} \right.
\]

Initialization

The initial values on the 3 axes, x-axe, y-axe, z-axe, are easy:
\[
\begin{array}{l}
F_{0,0,0} = 0;\\
F_{i,0,0} = -2id \mbox{ for } i=1, \ldots \ell;\\
F_{0,j,0} = -2jd \mbox{ for } j=1, \ldots m;\\
F_{0,0,k} = -2kd \mbox{ for } k=1, \ldots n;\\
\end{array}
\]
The initial values on the 3 planes, y-z-play, x-z-play, x-y-play, are complicated and require DPs.
For example \(F_{0,j,k}\)’s are calculated by the following iterations.
\[
\mbox{ for } j=1, \ldots m;
\mbox{ for } k=1, \ldots n; \\
F_{0,j,k}= \max
\left[ \begin{array}{l}
F_{0,j-1,k-1} + s(y_j,z_k),\\
F_{0,j-1,k} – 2d,\\
F_{0,j,k-1} – 2d,\\
\end{array} \right.
\]