# Log Linear Model (Exponential Family)

In the log linear model (exponential family), the probability density $P(\vec{x},\vec{\theta})$ of a random vector $\vec{x}$ with parameter vector $\vec{\theta}$ is written as
$P(\vec{x},\vec{\theta}) = \exp \left[ \vec{\theta} \cdot \vec{x} - \psi(\vec{\theta}) \right]$
From now on, arrows indicating vectors will be omitted.
Because $\int P(x,\theta)\,dx = 1$ and $\psi(\theta)$ is constant with respect to $x$,
$1 = \int \exp \left[ \sum_i \theta^i x_i - \psi(\theta)\right] dx = \frac{1}{\exp \psi(\theta)} \int \exp \left( \sum_i \theta^i x_i \right) dx$
Hence $\psi(\theta)$, the logarithm of the normalization constant (the log-partition function), is written as
$\psi(\theta) = \log \int \exp \left( \sum_i \theta^i x_i \right) dx$
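As a concrete check of this formula, consider the Bernoulli distribution written in exponential-family form: the support is $x \in \{0,1\}$, so the integral becomes a sum and $\psi(\theta) = \log(1 + e^{\theta})$ in closed form. The sketch below (an illustrative example not taken from the text above) compares the closed form against the direct sum over the support:

```python
import math

# Bernoulli as a one-parameter exponential family:
#   P(x | theta) = exp(theta * x - psi(theta)),  x in {0, 1}
# so psi(theta) = log sum_x exp(theta * x) = log(1 + e^theta).
# (hypothetical worked example; the Bernoulli case is an assumption here)

def psi_numeric(theta):
    # psi(theta) = log of the normalization sum over the support {0, 1}
    return math.log(sum(math.exp(theta * x) for x in (0, 1)))

def psi_closed(theta):
    # closed form of the log-partition function for the Bernoulli family
    return math.log(1.0 + math.exp(theta))

theta = 0.7
assert abs(psi_numeric(theta) - psi_closed(theta)) < 1e-12
```

The same pattern works for any discrete exponential family: replace the support `(0, 1)` with the appropriate set of sufficient-statistic values.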
Assuming the order of the partial derivative with respect to $\theta^i$ and the integral over $x$ can be exchanged, and writing the dummy summation index as $j$ to keep it distinct from $i$, the partial derivative of $\psi(\theta)$ with respect to $\theta^i$ is
\begin{align} \frac{\partial \psi(\theta)}{\partial \theta^i} &= \frac{ \frac {\partial} {\partial \theta^i} \int \exp \left( \sum_j \theta^j x_j \right) dx} {\int \exp \left( \sum_j \theta^j x_j \right) dx} = \frac{\int \frac{\partial}{\partial \theta^i} \exp \left( \sum_j \theta^j x_j \right) dx} {\exp \psi(\theta)} = \frac{\int x_i \exp \left( \sum_j \theta^j x_j \right) dx} {\exp \psi(\theta)} \\ &= \int x_i \exp \left[ \sum_j \theta^j x_j - \psi(\theta) \right] dx = \int x_i P(x,\theta)\, dx = E \left[ x_i \mid \theta \right] \end{align}
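The identity $\partial \psi / \partial \theta^i = E[x_i \mid \theta]$ can be verified numerically. Sticking with the Bernoulli family as a minimal example (an assumption for illustration, not part of the derivation above), a central finite difference of $\psi$ should match the mean computed directly from the density:

```python
import math

# Verify dpsi/dtheta = E[x | theta] for the Bernoulli family
#   P(x | theta) = exp(theta * x - psi(theta)),  psi(theta) = log(1 + e^theta).
# (illustrative sketch; the Bernoulli choice is an assumption, not from the text)

def psi(theta):
    return math.log(1.0 + math.exp(theta))

def mean(theta):
    # E[x | theta] = sum_x x * exp(theta * x - psi(theta)) over x in {0, 1}
    return sum(x * math.exp(theta * x - psi(theta)) for x in (0, 1))

theta, h = 0.3, 1e-6
# central finite difference approximation of dpsi/dtheta
dpsi = (psi(theta + h) - psi(theta - h)) / (2 * h)
assert abs(dpsi - mean(theta)) < 1e-8
```

For the Bernoulli case both sides equal the sigmoid $e^{\theta}/(1+e^{\theta})$, which is exactly the expectation of the sufficient statistic $x$.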