Week 4, Thursday

Lattice Simulation Results

We now have everything prepared; all that remains is to pick an observable, evaluate it on sample field configurations, and average over the samples to approximate the path integral with that observable inserted. The simplest observable would be the average field over the $N\times N$ lattice sites: \begin{equation} \bar\phi = \frac{1}{N^2} \sum_{n\in (\mathbb{Z}/N)^2} \phi(n). \end{equation} While the distribution of samples shows some interesting patterns, the average $\langle\bar\phi\rangle$ alone is always going to be zero because of the $\phi\to -\phi$ symmetry of our theory. A more interesting observable would be, for example, \begin{equation} |\bar\phi| = \bigg| \frac{1}{N^2} \sum_{n\in (\mathbb{Z}/N)^2} \phi(n) \bigg|. \end{equation} By plotting its expectation value as a function of $m_L^2$, $\lambda_L$ we make the following observations:

  • There seems to be a phase transition between an “unbroken phase” with $\langle|\bar\phi|\rangle=0$ for large $m_L^2$ and a “broken phase” with $\langle|\bar\phi|\rangle>0$ for small (large negative) $m_L^2$.

  • The phase transition takes place at a finite negative value of $m_L^2$; for example, at $\lambda_L\approx 1.0$ it occurs at $m_L^2\approx -1.3$.

  • In the broken phase the actual value of $\langle|\bar\phi|\rangle>0$ depends on $m_L^2$, $\lambda_L$.

At least the last part is clear: the potential \begin{equation} V(\phi) = \frac{1}{2}m_L^2 \phi^2 + \frac{1}{4!} \lambda_L \phi^4 \end{equation} has two distinct minima at $\phi = \pm\sqrt{-6 m_L^2 / \lambda_L}$ for $m_L^2 < 0$, whose position therefore depends on both $m_L^2$ and $\lambda_L$. It should also not be too surprising that there are qualitative differences between the case where $V(\phi)$ has a single minimum at $V(0)=0$ and the case where it has two separate minima. Though classically the distinction is just whether $m_L^2$ is positive or negative, the fact that the dividing line is not at zero will only find an explanation later on. If we want to numerically find the precise point of the phase transition, it would be nice to have a better observable than $|\bar\phi|$, for example one with a simple peak at the point of the transition. Such an observable is the susceptibility, which you might have seen in the Ising model as the magnetic susceptibility. The only difference is that $\phi\in\{\pm 1\}$ in the Ising model. There, the average $\bar\phi$ is the total magnetization, and the susceptibility is the change of the magnetization when a constant external field is applied. In the $\phi^4$ theory it is not really justified to call $\bar\phi$ a magnetization, but we can still talk about the change in an external field. Technically, this means we add a source term $-E \mapsto -E + J\bar\phi$ to the exponent of the partition function.

Definition: With $Z(J) = \sum_\mu \exp(-E_\mu + J\bar\phi_\mu)$, where $\bar\phi_\mu$ is the average field of the state $\mu$, the susceptibility $\chi$ is the quantity \begin{equation} \chi = \frac{\partial\langle \bar\phi\rangle}{\partial J} \Big|_{J=0}. \end{equation}

We expect that only close to the phase transition will a small change in the external field effect a large change in $\langle\bar\phi\rangle$. However, the definition is not very useful for computing $\chi$ from a lattice simulation. Fortunately, we can rewrite it as \begin{equation} \begin{split} \chi =&\; \frac{\partial}{\partial J} \frac{\sum_\mu \bar\phi_\mu e^{-E_\mu +J\bar\phi_\mu}} {\sum_\mu e^{-E_\mu +J\bar\phi_\mu}} \bigg|_{J=0} = \frac{\partial}{\partial J} \left( \frac{1}{Z(J)} \frac{\partial Z(J)}{\partial J} \right) \bigg|_{J=0} \\ =&\; \left[ \frac{1}{Z(J)} \frac{\partial^2 Z(J)}{\partial J^2} - \left( \frac{1}{Z(J)} \frac{\partial Z(J)}{\partial J} \right)^2 \right]_{J=0} \\ =&\; \langle \bar\phi^2 \rangle - \big( \langle\bar\phi\rangle \big)^2, \end{split} \end{equation} which is just the variance of the $\bar\phi$ observable. And the variance is of course easily approximated using the sample variance. The result is plotted in the figure below; it shows the susceptibility for fixed $\lambda_L=1.0$, varying $-2\leq m_L^2 \leq 0$, and three different lattice sizes.

[Figure: susceptibility as a function of $m_L^2$ for three lattice sizes, at fixed $\lambda_L = 1.0$]

See also the notes about the lattice simulation code if you want to try it out yourself!
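In code, the estimator is essentially one line: the sample variance of the per-configuration lattice averages. The following minimal sketch assumes the samples are stacked in a NumPy array; the function name and array layout are illustrative, not taken from our actual implementation.

```python
import numpy as np

def susceptibility(samples):
    """Estimate chi = <phibar^2> - <phibar>^2 from field samples.

    samples: array of shape (M, N, N) holding M sampled N x N
    field configurations.
    """
    phibar = samples.mean(axis=(1, 2))  # lattice average of each sample
    return phibar.var()                 # sample variance approximates chi

# Toy usage with random data; real samples would come from the
# Metropolis simulation:
samples = np.random.randn(1000, 16, 16)
print(susceptibility(samples))
```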

Week 4, Wednesday

Markov Chains

We ultimately want to compute expectation values in QFT, \begin{equation} \langle\mathcal{O}\rangle = \frac{ \int \mathcal{D}\phi \; \mathcal{O} e^{\frac{i}{\hbar}S} }{ \int \mathcal{D}\phi \; e^{\frac{i}{\hbar}S} }, \end{equation} where we could set the denominator to one by properly normalizing the path integral measure. However, in practice this is too difficult, and we instead divide by the normalization as above. Wick rotation turns this into the normal expectation values that you are familiar with from statistical mechanics, \begin{equation} \langle\mathcal{O}\rangle = \frac{ \sum_\mu \mathcal{O}_\mu e^{-\beta E_\mu} }{ \sum_\mu e^{-\beta E_\mu} } ,\quad \beta = \frac{1}{k_B T}, \end{equation} where now $\mu$ indexes the states of the system, $\mathcal{O}_\mu$ is the value of some observable for the state $\mu$, and $E_\mu$ is the energy of the state $\mu$.

Even for very modest lattices it is impossible to sum over all microstates on a computer, even if we restrict the field values to just $\pm 1$ instead of a floating-point number. We cannot even sample a significant fraction, so naively picking random fields would just yield numerical garbage. The trick is to sample the important states: those with large $e^{-\beta E_\mu}$, that is, small energy. In fact, the ideal solution would be a means to generate $n$ random samples whose probability is not constant but equal to the Boltzmann distribution \begin{equation} p_\mu = \frac{e^{-\beta E_\mu}}{\sum_\nu e^{-\beta E_\nu}}. \end{equation} Then the Boltzmann-weighted average just turns into the normal (arithmetic) average \begin{equation} \langle \mathcal{O} \rangle_n = \frac{ \sum_{j=1}^n p_j^{-1} \mathcal{O}_j e^{-\beta E_j} }{ \sum_{j=1}^n p_j^{-1} e^{-\beta E_j} } = \frac{1}{n} \sum_{j=1}^n \mathcal{O}_j. \end{equation}
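As a quick sanity check, here is a toy illustration for a system small enough to enumerate; the energies and the observable are made up purely for demonstration. Drawing states with Boltzmann probabilities and taking the plain arithmetic average reproduces the exact Boltzmann-weighted expectation value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy system: 5 states with made-up energies and observable O_mu = mu.
energies = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
observable = np.arange(5.0)
beta = 1.0

# Exact Boltzmann-weighted expectation value.
weights = np.exp(-beta * energies)
p = weights / weights.sum()
exact = (observable * p).sum()

# Sample states with probability p and take the arithmetic mean.
samples = rng.choice(5, size=100_000, p=p)
estimate = observable[samples].mean()

print(exact, estimate)  # agree up to sampling noise
```

Of course, for a lattice field theory we cannot enumerate the states to build $p_\mu$ directly; generating Boltzmann-distributed samples without ever computing the normalization is exactly what the method below accomplishes.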

In fact, there is a way, and it uses a random walk in the space of fields. That is, at each step we modify the field configuration according to certain rules, and these steps sample the fields with probability distribution $p_\mu$. The key idea is to use a Markov chain.

Definition: A Markov chain is a process in which the probability $P(\mu\to\nu)$ of making a transition from $\mu$ to $\nu$ depends only on $\mu$ and $\nu$, and not on the history of the chain.

Under the following two conditions, the states visited by a Markov chain sample a prescribed probability distribution $p_\mu$:

  • Ergodicity: it must be possible to reach any state from any other state by a sequence of steps.

  • Detailed balance: $p_\mu P(\mu\to\nu) = p_\nu P(\nu\to \mu)$.

Note that the detailed balance condition is stronger than balance, that is, equilibrium. The latter only requires that the total rate of transitions out of $\mu$ equals the total rate of transitions into $\mu$, that is, \begin{equation} \sum_\nu p_\mu P(\mu\to\nu) = \sum_\nu p_\nu P(\nu\to \mu), \end{equation} where the left-hand side is just $p_\mu$ because the transition probabilities out of $\mu$ sum to one. Detailed balance demands this equality term by term.

Metropolis Algorithm

The Metropolis algorithm is a Markov chain whose states are discretized fields and whose transition probabilities satisfy \begin{equation} \frac{P(\mu\to\nu)}{P(\nu\to\mu)} = \frac{p_\nu}{p_\mu} = e^{-\beta(E_\nu - E_\mu)}, \end{equation} as required for sampling fields with the Boltzmann probability distribution. It works by starting with a random field; in our implementation the initial values are taken to be random in the range $\phi(n) \in [-1.5, 1.5]$. Then repeat the following steps (a code sketch follows the list):

  • Select a random lattice site $n$.
  • Select a random $\Delta \phi$, in our implementation uniformly random in the range $[-1.5, 1.5]$.

  • Let \begin{equation} \phi'(m) = \begin{cases} \phi(n) + \Delta\phi & \text{if }n = m \\ \phi(m) & \text{else}. \end{cases} \end{equation}

  • Accept $\phi'$ as the new field configuration with the acceptance probability \begin{equation} A(\phi\to\phi') = \begin{cases} 1 & \text{if } E(\phi') < E(\phi) \\ e^{-\beta[E(\phi')-E(\phi)]} & \text{else}. \end{cases} \end{equation}
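Putting the list together, a single Metropolis step might look like the following minimal sketch. The nearest-neighbor form of the lattice energy is an assumption here (chosen to match the potential above), $\beta$ is kept explicit although it is usually absorbed into the lattice couplings, and the function names are illustrative rather than those of our actual implementation.

```python
import numpy as np

rng = np.random.default_rng()

def local_energy_diff(phi, i, j, dphi, m2, lam):
    """Energy change when phi[i, j] -> phi[i, j] + dphi.

    Assumes the standard lattice energy with nearest-neighbor kinetic
    term: sum_n [ 1/2 sum_mu (phi(n+mu) - phi(n))^2 + V(phi(n)) ].
    """
    N = phi.shape[0]
    old = phi[i, j]
    new = old + dphi
    nbr = (phi[(i + 1) % N, j] + phi[(i - 1) % N, j]
           + phi[i, (j + 1) % N] + phi[i, (j - 1) % N])
    V = lambda x: 0.5 * m2 * x**2 + lam / 24.0 * x**4
    # Only the four bonds touching the site plus the local potential change.
    return 2.0 * (new**2 - old**2) - (new - old) * nbr + V(new) - V(old)

def metropolis_step(phi, m2, lam, beta=1.0, delta=1.5):
    """One local update: propose a shift at a random site, then
    accept or reject with the Metropolis acceptance probability."""
    N = phi.shape[0]
    i, j = rng.integers(N, size=2)       # random lattice site
    dphi = rng.uniform(-delta, delta)    # random field shift
    dE = local_energy_diff(phi, i, j, dphi, m2, lam)
    if dE < 0 or rng.random() < np.exp(-beta * dE):
        phi[i, j] += dphi                # accept phi'
```

Starting from `phi = rng.uniform(-1.5, 1.5, size=(N, N))` and calling `metropolis_step` repeatedly drives the field toward the Boltzmann distribution.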

The acceptance probability might seem a bit arbitrary, and indeed it is a choice that could have been made differently. Only the ratio \begin{equation} \frac{P(\mu\to\nu)}{P(\nu\to\mu)} = \frac{A(\mu\to\nu)}{A(\nu\to\mu)} = e^{-\beta(E(\phi') - E(\phi))} \end{equation} matters. Also, note that the energy difference $E(\phi') - E(\phi)$ can be computed very efficiently since it only depends on the chosen lattice site and its nearest neighbors.
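To make this locality explicit, suppose the discretized energy takes the standard nearest-neighbor form (the normalization of the kinetic term is an assumption for illustration), \begin{equation} E(\phi) = \sum_n \Big[ \frac{1}{2} \sum_{\mu=1,2} \big(\phi(n+\hat\mu)-\phi(n)\big)^2 + V\big(\phi(n)\big) \Big]. \end{equation} Changing the field only at the site $n$ then gives \begin{equation} E(\phi') - E(\phi) = \frac{1}{2} \sum_{\pm\hat\mu} \Big[ \big(\phi'(n) - \phi(n\pm\hat\mu)\big)^2 - \big(\phi(n) - \phi(n\pm\hat\mu)\big)^2 \Big] + V\big(\phi'(n)\big) - V\big(\phi(n)\big), \end{equation} a sum over the four nearest neighbors only, so each Metropolis step costs $O(1)$ work instead of $O(N^2)$.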

The fields $\phi$, $\phi'$ from a single Metropolis step are by no means statistically independent samples. After all, the field was changed at a single lattice site at most. This does not affect the limiting value of the average, but it does change how quickly the running average approaches that limit as we take more samples. Formally, this can be seen in the formula for the standard deviation of $M$ correlated samples, \begin{equation} \sigma = \sqrt{ \frac{1+2\tau}{M-1} \Big( \langle\mathcal{O}^2\rangle - (\langle\mathcal{O}\rangle)^2 \Big) }, \end{equation} where $\tau$ is the autocorrelation time. Instead of falling off like $\frac{1}{\sqrt{M}}$ for $M\gg1$, as is familiar for the standard error of independent samples, the error behaves as if we had only collected $\frac{M}{1+2\tau}$ samples. Since computing the observable is typically much more expensive than a single Metropolis step, one combines many steps into a “Metropolis sweep” and only computes the observable once per sweep. The field configuration after each sweep is much less correlated with the one before. For example, in our implementation each iteration consists of $5 \times$ {# lattice sites} Metropolis steps. In addition, the implementation also uses the Wolff Algorithm to flip entire clusters of points, a further numerical improvement over the plain Metropolis algorithm.
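As a rough illustration, the autocorrelation time and the corrected error could be estimated along the following lines; the summation window used here is a simple heuristic, not the estimator of our actual implementation.

```python
import numpy as np

def autocorr_time(x):
    """Crude integrated autocorrelation time of a 1D series of
    observable measurements, summing until the first non-positive
    autocorrelation estimate."""
    x = np.asarray(x) - np.mean(x)
    var = np.mean(x**2)
    tau = 0.0
    for t in range(1, len(x) // 2):
        c = np.mean(x[:-t] * x[t:]) / var
        if c <= 0:
            break
        tau += c
    return tau

def corrected_error(x):
    """Standard error including the (1 + 2*tau) correlation penalty."""
    x = np.asarray(x)
    tau = autocorr_time(x)
    return np.sqrt((1 + 2 * tau) / (len(x) - 1) * np.var(x))
```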