Nussinov Algorithm and its associated SCFG

Nussinov Algorithm for maximum number of possible base-pairs

Initialization:
for $i=1$ to $L$
$M(i,i) = 0,$
Recurtion:
for $\ell=1$ to $L-1$, for $i=1$ to $L-\ell$, $j=i+\ell$
$M(i,j) = \max \left[ \begin{array}{l} M(i+1,j) \\ M(i,j-1) \\ M(i+1,j-1) + \delta(i,j)\\ \max_{h} M(i,h) + M(h+1,j) \end{array} \right.$

Traceback of Nussinov Algorithm

— pop $(i,j)$
— if $i>=j$ continue;
— else if $\gamma(i+1,j) = \gamma(i,j)$， $(i+1,j)$ を push;
— else if $\gamma(i,j-1) = \gamma(i,j)$， $(i,j-1)$ を push;
— else if $\gamma(i+1,j-1)+\delta_{i,j} = \gamma(i,j)$,
—— 塩基対$(i,j)$を記録．
—— $(i+1,j-1)$ を push
else for $k=i+1$ to $j-1$: if $\gamma(i,k)+\gamma(k+1,j)=\gamma(i,j)$:
—— $(k+1,j)$ を push
—— $(i,k)$ を push
—— break

Production rules of the SCFG associated to the above Nassinov Algorithm

$\begin{eqnarray} M &\longrightarrow xM & : P(M\rightarrow xM), & x\in\{a,c,g,u\}\\ M &\longrightarrow Mx & : P(M\rightarrow Mx), & x\in\{a,c,g,u\}\\ M &\longrightarrow xM\bar{x} & : P(M\rightarrow xM\bar{x}), &(x,\bar{x})\in \{(a,u),(c,g),(g,c),(u,a)\}\\ M &\longrightarrow MM & : P(M\rightarrow MM)\\ M &\longrightarrow x & : P(M\rightarrow x), &x\in\{a,c,g,u\} \end{eqnarray}$

CYK Algorithm of the SCFG

Initialization:
for $i=1$ to $L-1$
$\gamma(i,i,M) = P(M \rightarrow x_i)$
Recurtion:
for $\ell=1$ to $L-1$, for $i=1$ to $L-\ell$, $j=i+\ell$
$\gamma(i,j,M) = \max \left[ \begin{array}{l} \gamma(i+1,j,M) + \log P(M \rightarrow x_i M) \\ \gamma(i,j-1,M) + \log P(M \rightarrow M x_j) \\ \gamma(i+1,j-1,M) + \log P(M \rightarrow x_i M x_j) \\ \max_{h} \gamma(i,h) + \gamma(h+1,j) + \log P(M \rightarrow MM) \end{array} \right.$

Inside algorithm of the SCFG

Initialization:
for $i=1$ to $L-1$
$\alpha(i,i,M) = P(M \rightarrow x_i)$
Recurtion:
for $\ell=1$ to $L-1$, for $i=1$ to $L-\ell$, $j=i+\ell$
$\alpha(i,j,M) = \left[ \begin{array}{l} \alpha(i+1,j,M) P(M \rightarrow x_i M) \\ + \alpha(i,j-1,M) P(M \rightarrow M x_j) \\ + \alpha(i+1,j-1,M) P(M \rightarrow x_i M x_j) \\ + \sum_{h=1}^{L-1} \alpha(i,h,M) \alpha(h+1,j,M) P(M \rightarrow MM) \end{array} \right.$

Outside Algorithm of the SCFG

Initialization:
$\beta(1,L,M) = 1$
Recurtion:
for $\ell=L-1$ to $1$, for $i=1$ to $L-\ell$, $j=i+\ell$
$\beta(i,j,M) = \left[ \begin{array}{l} \beta(i-1,j,M) P(M \rightarrow x_{i-1} M) \\ + \beta(i,j+1,M) P(M \rightarrow M x_{j+1}) \\ + \beta(i-1,j+1,M) P(M \rightarrow x_{i-1} M x_{j+1}) \\ + \sum_{h=1}^{i-1} \alpha(h,i-1,M) \beta(h,j,M) P(M \rightarrow MM) \\ + \sum_{h=j+1}^{L} \alpha(j+1,h,M) \beta(i,h,M) P(M \rightarrow MM) \end{array} \right.$

Partition Function on number of base-pairs

２次構造$\sigma$に含まれる塩基対の数$N_\sigma$を2次構造$\sigma$の自由エネルギーの符号を反転した値の乱暴な近似と考えると、２次構造$\sigma$の熱力学的確率は以下のボルツマン分布で与えられる。
$\begin{eqnarray} P(\sigma |x) =\frac{1}{Z_N(x,\sigma)} e^{N_\sigma/k_N T}\\ Z_N(x,\sigma) = \sum_{\sigma \in S(x)} e^{N_\sigma/k_N T} \end{eqnarray}$
このボルツマン分布に対応するSCFGがどのようになるか、考察せよ