# Conditional Log-linear Model (CLLM)

A generative stochastic grammar comprises a set of stochastic production rules.
Given a sequence $x$, the conditional probability of a parse $\sigma$ is
$P(\sigma|x) = \frac{P(x,\sigma)}{\sum_{\sigma' \in \Omega(x)}P(x,\sigma')}$
where $\Omega(x)$ denotes the set of all parses of $x$.
If the grammar is ambiguous, meaning that a structure $y$ may correspond to multiple parses,
the conditional probability of a structure is written as
$P(y|x) = \sum_{\sigma \in y} P(\sigma|x) = \frac{\sum_{\sigma \in y}P(x,\sigma)}{\sum_{\sigma' \in \Omega(x)}P(x,\sigma')}$
Let $F_i(x,\sigma)$ be the number of occurrences of production rule $i$ in parse $\sigma$.
The joint probability of $x$ and $\sigma$ is then written as
$P(x,\sigma) = \prod_{i=1}^{n} p_i^{F_i(x,\sigma)} = \exp \left[ \sum_{i=1}^{n} F_i(x,\sigma) \log p_i \right] = \exp ({\bf w}^T {\bf F}(x,\sigma))$
where $n$ is the number of production rules, $p_i$ is the probability of the $i$-th production rule,
and $w_i = \log p_i$ is the corresponding coefficient (weight) of this log-linear model.
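
As a small numeric check of the identity above, the following Python sketch compares the product form with the exponential form. The rule probabilities and rule counts are made-up illustrative values, not taken from any particular grammar.

```python
import math

# Hypothetical rule probabilities p_i and rule counts F_i(x, sigma) for one parse.
p = [0.5, 0.3, 0.2]
F = [2, 1, 3]

# Product form: prod_i p_i ** F_i(x, sigma)
joint_product = math.prod(pi ** fi for pi, fi in zip(p, F))

# Log-linear form: exp(w^T F(x, sigma)) with w_i = log p_i
w = [math.log(pi) for pi in p]
joint_loglinear = math.exp(sum(wi * fi for wi, fi in zip(w, F)))

print(joint_product, joint_loglinear)
```

Both expressions should agree up to floating-point rounding, which is all the log-linear rewriting claims.
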
The conditional probability is rewritten as
$P(y|x) = \frac{1}{Z(x)}\sum_{\sigma \in y} \exp ({\bf w}^T {\bf F}(x,\sigma))$
where $Z(x) = \sum_{\sigma' \in \Omega(x)} \exp ({\bf w}^T {\bf F}(x,\sigma'))$ is the partition function of this Boltzmann distribution.
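
To make the sum over parses concrete, here is a minimal sketch that enumerates a tiny, hypothetical parse set $\Omega(x)$ in which one structure has two parses. The structure labels, rule counts, and weights are assumptions chosen only for illustration. Real grammars would need dynamic programming rather than enumeration.

```python
import math

# Hypothetical weights w_i = log p_i for three production rules.
w = [math.log(0.5), math.log(0.3), math.log(0.2)]

# Hypothetical parse set Omega(x): (structure label y, rule counts F(x, sigma)).
parses = [
    ("y1", [1, 1, 0]),
    ("y1", [0, 2, 0]),  # a second parse of the same structure (ambiguity)
    ("y2", [1, 0, 1]),
]

def score(F):
    """Unnormalized weight exp(w^T F(x, sigma)) of a single parse."""
    return math.exp(sum(wi * fi for wi, fi in zip(w, F)))

def structure_posterior(parses):
    """Return P(y|x) for each structure y by summing over its parses."""
    Z = sum(score(F) for _, F in parses)  # partition function Z(x)
    posterior = {}
    for y, F in parses:
        posterior[y] = posterior.get(y, 0.0) + score(F) / Z
    return posterior

posterior = structure_posterior(parses)
print(posterior)
```

Here the two parses of `y1` contribute jointly to its probability, which is why the numerator sums over $\sigma \in y$ rather than using a single parse.
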

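In general, a conditional log-linear model with feature functions $F_j$ and weights $w_j$, $j = 1, \dots, J$, takes the form
\begin{align} p(y|x;w) = \frac{\exp\left(\sum_{j=1}^{J}w_jF_j(x,y)\right)}{Z(x,w)} \label{eq:general-log-linear} \end{align}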
Here the denominator is the partition function
\begin{align} Z(x,w) = \sum_{y'}\exp\left(\sum_{j=1}^{J}w_jF_j(x,y')\right) \end{align}
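
When the scores $\sum_j w_j F_j(x,y')$ are large, evaluating the partition function naively can overflow. A common remedy is the log-sum-exp (max-shift) trick, sketched below. The function names and the score values are hypothetical, and the candidate outputs are assumed to be few enough to enumerate.

```python
import math

def log_partition(scores):
    """log Z(x, w) = log sum_{y'} exp(s_{y'}), computed with the max-shift trick."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def conditional_probs(scores):
    """p(y|x; w) for each candidate y, given scores s_y = sum_j w_j F_j(x, y)."""
    log_z = log_partition(scores)
    return [math.exp(s - log_z) for s in scores]

# Hypothetical scores; math.exp(1000.0) alone would raise OverflowError.
scores = [1000.0, 1001.0, 999.0]
probs = conditional_probs(scores)
print(probs)
```

Subtracting the maximum score before exponentiating leaves the probabilities unchanged, because the shift cancels between numerator and denominator.
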

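Because $Z(x,w)$ does not depend on $y$ and $\exp$ is monotonically increasing, the most probable output is obtained by maximizing the linear score alone:
\begin{align} \hat{y}^{MLE} = \operatorname*{arg\,max}_{y}\, p(y|x;w) = \operatorname*{arg\,max}_{y}\sum_{j=1}^{J}w_jF_j(x,y) \label{eq:mle_general_llm} \end{align}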
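
The following sketch checks this on a toy candidate set: the output with the highest linear score is also the output with the highest normalized probability. The weights, candidate names, and feature vectors are hypothetical, and brute-force enumeration is used only for illustration.

```python
import math

# Hypothetical weights w_j and feature vectors F(x, y) for three candidate outputs.
w = [0.5, -1.0, 2.0]
features = {
    "y_a": [1, 0, 1],
    "y_b": [2, 1, 0],
    "y_c": [0, 0, 1],
}

def linear_score(F):
    """Unnormalized score sum_j w_j F_j(x, y)."""
    return sum(wj * fj for wj, fj in zip(w, F))

scores = {y: linear_score(F) for y, F in features.items()}

# Normalized probabilities p(y|x; w), computed only for comparison.
Z = sum(math.exp(s) for s in scores.values())
probs = {y: math.exp(s) / Z for y, s in scores.items()}

best_by_score = max(scores, key=scores.get)
best_by_prob = max(probs, key=probs.get)
print(best_by_score, best_by_prob)
```

Decoding therefore never needs $Z(x,w)$. The partition function matters only when actual probabilities are required, for example during training.
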