---
title: The Falicov-Kimball Model - Introduction
excerpt: The Falicov-Kimball model is about the simplest possible testbed we could have for the many electron problem.
layout: none
image:
---
The Falicov-Kimball Model - Introduction
{% include header.html %}

Contributions

The material in this chapter expands on work presented in

Hodson, T., Willsher, J. & Knolle, J. One-dimensional long-range Falicov-Kimball model: Thermal phase transition and disorder-free localization. Phys. Rev. B 104, 4 (2021).

Johannes had the initial idea to use a long-range Ising term to stabilise order in a one-dimensional Falicov-Kimball model. Josef developed a proof of concept during a summer project at Imperial. The three of us brought the project to fruition.


Introduction

Localisation

The discovery of localisation in quantum systems was surprising at the time, given the seeming ubiquity of extended Bloch states. Later, when thermalisation in quantum systems gained interest, localisation phenomena again stood out as counterexamples to the eigenstate thermalisation hypothesis [@abanin_recent_2017; @srednicki_chaos_1994], allowing quantum systems to avoid thermalisation and retain memory of their initial conditions in the face of thermal noise.

The simplest and first discovered kind is Anderson localisation, first studied in 1958 [@anderson_absence_1958-1] in the context of non-interacting fermions subject to a static or quenched disorder potential \(V_j\) drawn uniformly from the interval \([-W,W]\):

\[ H = -t\sum_{\langle jk \rangle} c^\dagger_j c_k + \sum_j V_j c_j^\dagger c_j \]

This model exhibits exponentially localised eigenfunctions \(\psi(x) = f(x) e^{-x/\lambda}\) which cannot contribute to transport processes. Initially it was thought that in one-dimensional disordered models all states would be localised; however, it was later shown that in the presence of correlated disorder, bands of extended states can exist [@izrailev_localization_1999; @croy_anderson_2011; @izrailev_anomalous_2012].

Later, localisation was also found in interacting many-body systems with quenched disorder:

\[ H = -t\sum_{\langle jk \rangle} c^\dagger_j c_k + \sum_j V_j c_j^\dagger c_j + U\sum_{jk} n_j n_k \]

where the number operators \(n_j = c^\dagger_j c_j\). Here, in contrast to the Anderson model, localisation phenomena can be proven robust to weak perturbations of the Hamiltonian. This is called many-body localisation (MBL) [@imbrie_many-body_2016].

Both MBL and Anderson localisation depend crucially on the presence of quenched disorder. This has led to ongoing interest in the possibility of disorder-free localisation, in which the disorder necessary to generate localisation is generated entirely from the dynamics of the model. This contrasts with typical models of disordered systems, in which disorder is explicitly introduced into the Hamiltonian or the initial state.

The concept of disorder-free localisation was first proposed in the context of Helium mixtures [1] and then extended to heavy-light mixtures in which multiple species with large mass ratios interact. The idea is that the heavier particles act as an effective disorder potential for the lighter ones, inducing localisation. Two such models [@yao_quasi-many-body_2016; @schiulaz_dynamics_2015] instead find that the models thermalise exponentially slowly in system size, which Ref. [@yao_quasi-many-body_2016] dubs Quasi-MBL.

True disorder-free localisation does occur in exactly solvable models with extensively many conserved quantities [@smith_disorder-free_2017]. As conserved quantities have no time dynamics, this can be thought of as taking the separation of timescales to the infinite limit.

The Falicov-Kimball Model

In the Falicov-Kimball (FK) model, spinless fermions \(c_{i\uparrow}\) are coupled via a repulsive on-site interaction to a classical degree of freedom \(n_{i\downarrow}\):

\[\begin{aligned} H &= -t \sum_{<ij>} c^\dagger_{i\uparrow}c_{j\uparrow} + U \sum_{i} (n_{i \uparrow} - 1/2)( n_{i\downarrow} - 1/2) \\ & - \mu \sum_i \left( n_{i \uparrow} + n_{i \downarrow} \right) + \sum_{ij} V_{ij} (n_{i\downarrow} - 1/2)(n_{j\downarrow} - 1/2) \end{aligned}\]

This notation emphasises that the FK model can also be thought of as an asymmetric Hubbard model in which the spin-down electrons cannot hop and are subject to an additional long-range potential. However, to avoid the confusion of talking about two distinct species of spinless electrons we will use a different interpretation and refer to the classical degrees of freedom as the “ionic sector” and the quantum degrees of freedom as the “electronic sector”. The final term, which induces interactions between the classical particles, has been added by us to stabilise the formation of an ordered phase in 1D. The classical variables commute with the Hamiltonian, \([H, n_{i\downarrow}] = 0\), so, like the lattice gauge model in Ref. [@smith_disorder-free_2017], the FK model has extensively many conserved quantities which can act as an effective disorder potential for the electronic sector.

Due to Pauli exclusion, the maximum filling occurs when one of each species occupies each lattice site such that \(\sum_i (n_{i\downarrow} + n_{i\uparrow} )/ N = 2\). Here we focus on the half filled case which also displays particle-hole symmetry (see later).

The Falicov-Kimball and Hubbard models

We will first introduce the standard Hubbard and Falicov-Kimball (FK) models and then look at some of their properties. We’ll then cover why the Falicov-Kimball model represents an interesting system in which to study disorder-free localisation.

Hubbard model

The Hubbard model gives a very simple setting in which to study interacting, itinerant electrons. It is a tight binding model of spin half electrons with finite bandwidth \(t\) and a repulsive on-site interaction \(U > 0\).

\[ H = -\sum_{<ij>,\sigma} t_{\sigma} c^\dagger_{i\sigma}c_{j\sigma} + U \sum_{i} (n_{i \uparrow} - 1/2)( n_{i\downarrow} - 1/2) - \mu \sum_i \left( n_{i \uparrow} + n_{i \downarrow} \right) \]

in standard notation. The standard Hubbard model corresponds to the case \(t_{\uparrow} = t_{\downarrow}\). Here we have used the particle-hole symmetric version of the interaction term, which is more often given as \(n_{i \uparrow} n_{i\downarrow}\). The difference just amounts to a redefinition of the chemical potential.

Hubbard originally used the model at half filling (\(\mu = 0\)) to explain the Mott metal-insulator (MI) transition; it has since seen applications to high-temperature superconductivity and become a target for cold-atom optical trap experiments [@noauthor_hubbard_2013; @greiner_quantum_2002; @jordens_mott_2008]. While simple, only a few analytic results exist, namely the Bethe ansatz [@lieb_absence_1968], which proves the absence of even a zero temperature phase transition in the 1D model, and Nagaoka’s theorem [@nagaoka_ferromagnetism_1966], which proves that the three dimensional model has a ferromagnetic ground state in the vicinity of half filling.

Falicov-Kimball model

The Falicov-Kimball model corresponds to the case \(t_{\downarrow} = 0\). It can be interpreted as two coupled spinless electron bands with infinite mass ratio: an itinerant light species with creation operator \(c^\dagger_{i\uparrow}\) coupled to an infinitely heavy, immobile species with density operator \(n_{i\downarrow}\). These are often called c and f electrons, or electrons and ions. The model was first introduced by Hubbard in 1963 as a model of interacting localised and delocalised electron bands and gained its name from Falicov and Kimball’s use of it to study the MI transition in rare-earth materials [@hubbard_j._electron_1963; @falicov_simple_1969].

Here we will refer to the light spinless species as ‘electrons’ with creation operator \(c^\dagger_{i}\) and the heavy species as ‘ions’ with density operator \(n_i\). When the density operator of the electrons is needed I’ll always use \(c^\dagger_{i}c_{i}\). We also set \(t = 1\).

\[ H_{\mathrm{FK}} = -\sum_{<ij>} c^\dagger_{i}c_{j} + U \sum_{i} (c^\dagger_{i}c_{i} - 1/2)( n_i - 1/2) - \mu \sum_i \left(c^\dagger_{i}c_{i} + n_{i}\right) \]

Particle-Hole Symmetry

The Hubbard and FK models on a bipartite lattice have particle-hole (PH) symmetry \(P^\dagger H P = - H\); accordingly they have symmetric energy spectra. The associated symmetry operator \(P\) exchanges creation and annihilation operators along with a sign change between the two sublattices:

\[ d^\dagger_{i\sigma} = (-1)^i c_{i\sigma}\]

The entirely filled state \(\ket{\Omega} = \prod_{j\rho} c^\dagger_{j\rho} \ket{0}\) becomes the new vacuum state:

\[d_{i\sigma} \ket{\Omega} = (-1)^i c^\dagger_{i\sigma} \prod_{j\rho} c^\dagger_{j\rho} \ket{0} = 0\]

The number operator \(n_{i\sigma} = 0,1\) counts holes rather than particles:

\[ d^\dagger_{i\sigma} d_{i \sigma} = c_{i\sigma} c^\dagger_{i\sigma} = 1 - c^\dagger_{i\sigma} c_{i\sigma}\]

with the last equality following from the fermionic anti-commutation relations. In the case of nearest neighbour hopping on a bipartite lattice this transformation also leaves the hopping term unchanged:

\[ d^\dagger_{i\sigma} d_{j \sigma} = (-1)^{i+j} c_{i\sigma} c^\dagger_{j\sigma} = c^\dagger_{j\sigma} c_{i\sigma} \]

since when \(i\) and \(j\) label sites on separate sublattices, \((-1)^{i+j} = -1\), and this sign is absorbed when rearranging the operators via their anti-commutator.

Defining the particle density \(\rho\) as the number of fermions per site:

\[ \rho = \frac{1}{N} \sum_i \left( n_{i \uparrow} + n_{i \downarrow} \right) \]

the PH transformation maps the Hamiltonian to itself with the sign of the chemical potential reversed, while the density is inverted about half filling:

\[ \text{PH} : H(t, U, \mu) \rightarrow H(t, U, -\mu) \]
\[ \rho \rightarrow 2 - \rho \]

The Hamiltonian is symmetric under PH at \(\mu = 0\), and so must be all the observables, hence half filling \(\rho = 1\) occurs there. This symmetry, and the known value of the density at \(\mu = 0\), act as useful tests for the numerical calculations.
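As a quick numerical check of this (a minimal Python sketch, not code from the paper), one can verify that the single-particle hopping matrix of a bipartite chain anticommutes with the sublattice sign operator, so its spectrum is symmetric about zero:

import numpy as np

N = 10
H = np.zeros((N, N))
for i in range(N - 1):
    H[i, i + 1] = H[i + 1, i] = -1.0  # nearest neighbour hopping, t = 1

# P = diag((-1)^i) implements the sublattice sign flip; P H P = -H
P = np.diag([(-1) ** i for i in range(N)])
assert np.allclose(P @ H @ P, -H)

# consequently the single-particle energies come in +/- pairs
E = np.linalg.eigvalsh(H)
assert np.allclose(np.sort(E), np.sort(-E))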

Thermodynamics of the FK model


At half filling and in dimensions greater than one, the FK model exhibits a phase transition at a \(U\)-dependent critical temperature \(T_c(U)\) to a low temperature charge density wave state in which the ions occupy one of the two sublattices A and B [@maska_thermodynamics_2006-1]. The order parameter is the square of the staggered magnetisation: \[ M = \sum_{i \in A} n_i - \sum_{i \in B} n_i \] In the disordered phase, Ref. [@antipov_interaction-tuned_2016-1] identifies an interplay between Anderson localisation at weak interaction and a Mott insulator phase in the strongly interacting regime.

In the one-dimensional FK model, however, Peierls’ argument [@peierls_isings_1936; @kennedy_itinerant_1986] and the Bethe ansatz [@lieb_absence_1968] make it clear that there is no ordered CDW phase. Peierls’ argument is that one should consider the difference in free energy \(\Delta F = \Delta E - T\Delta S\) between an ordered state and a state with a single domain wall in the order parameter. In the Ising model this would mean having the spins pointing up in one part of the system and down in the other; for a CDW phase it means having the ions occupy the A sublattice in one part and the B sublattice in the other.

Short-range interactions produce a constant energy penalty for such a domain wall that does not scale with system size, while in 1D there are \(L\) places to put the wall, so it is associated with an entropy \(S \propto \ln L\) which dominates in the thermodynamic limit. The slow logarithmic scaling suggests we should be wary of finite size effects.

One-dimensional systems are more amenable to numerical and experimental study, so we add long-range staggered interactions to bring back the ordered phase:

\[ H_{\textrm{int}} = 4J \sum_{ij} \frac{(-1)^{|i-j|}}{ |i - j|^{\alpha} } (n_i - 1/2) (n_j - 1/2) = J \sum_{ij} |i - j|^{-\alpha} \tau_i \tau_j\]

At half filling the modified Hamiltonian is then:

\[\begin{aligned}
H_{\mathrm{FK}}^* &= -\sum_{<ij>} c^\dagger_{i}c_{j} + U \sum_{i} (c^\dagger_{i}c_{i} - 1/2)( n_i - 1/2) \\
&+ 4J \sum_{ij} \frac{(-1)^{|i-j|}}{ |i - j|^{\alpha} } (n_i - 1/2) (n_j - 1/2) \\
&= -\sum_{<ij>} c^\dagger_{i}c_{j} + \frac{U}{2} \sum_{i} (-1)^i (c^\dagger_{i}c_{i} - 1/2)\tau_i + J \sum_{ij} |i - j|^{-\alpha} \tau_i \tau_j
\end{aligned}\]

The form of this interaction comes from interpreting the ions as a classical Ising chain using the staggered mapping \(\tau_i = (-1)^i (2n_i - 1)\), so that ferromagnetic order in the \(\tau_i\) variables corresponds to a CDW state in the \(n_i\) variables. It also preserves the particle-hole symmetry because for the ions the PH transformation corresponds to \(n_i \rightarrow 1 - n_i\). When \(U = 0\) the model decouples into a long-ranged Ising model and free fermions.

Our extension to the FK model can thus be thought of as spinless fermions coupled to a long-range Ising (LRI) model. The LRI model has been extensively studied and its behaviour may bear relation to that of our modified FK model.

\[H_{\mathrm{LRI}} = \sum_{ij} J(|i-j|) \tau_i \tau_j = J \sum_{i\neq j} |i - j|^{-\alpha} \tau_i \tau_j\]

Rigorous renormalisation group arguments show that the LRI model has an ordered phase in 1D for \(1 < \alpha < 2\) [@dyson_existence_1969]. Peierls’ argument can be extended [@thouless_long-range_1969] to provide intuition for why this is the case. Again consider the energy difference between the ordered state \(\ket{\ldots\uparrow\uparrow\uparrow\uparrow\ldots}\) and a domain wall state \(\ket{\ldots\uparrow\uparrow\downarrow\downarrow\ldots}\). In the case of the LRI model, careful counting shows that this energy penalty is:

\[\Delta E \propto \sum_{n=1}^{\infty} n J(n)\]

because each interaction between spins separated across the domain wall by a bond length \(n\) can be drawn between \(n\) equivalent pairs of sites. Ruelle proved rigorously, for a very general class of 1D systems, that if \(\Delta E\) or its many-body generalisation converges in the thermodynamic limit then the free energy is analytic [@ruelle_statistical_1968]. This rules out a finite-order phase transition, though not one of the Kosterlitz-Thouless type. Dyson also proves this, though with a slightly different condition on \(J(n)\) [@dyson_existence_1969].

With a power law form for \(J(n)\), there are five cases to consider:

  1. \(\alpha = 0\): For infinite range interactions the Ising model is exactly solvable and mean field theory is exact [@lipkin_validity_1965].
  2. \(0 < \alpha \leq 1\): For slowly decaying interactions \(\sum_n J(n)\) does not converge, so the Hamiltonian is non-extensive, a case which won’t be considered further here.
  3. \(1 < \alpha < 2\): There is a phase transition to an ordered state at a finite temperature.
  4. \(\alpha = 2\): The energy of domain walls diverges logarithmically, and this turns out to be a Kosterlitz-Thouless transition [@thouless_long-range_1969].
  5. \(2 < \alpha\): For quickly decaying interactions, domain walls have a finite energy penalty, hence Peierls’ argument holds and there is no phase transition.

Thermodynamics

On bipartite lattices in dimensions two and above, the FK model exhibits a finite temperature phase transition to an ordered charge density wave (CDW) phase [@maska_thermodynamics_2006-1]. In this phase, the ions are confined to one of the two sublattices, breaking the \(\mathbb{Z}_2\) symmetry.

In 1D, however, Peierls’ argument [@peierls_isings_1936; @kennedy_itinerant_1986] shows that domain walls only introduce a constant energy penalty into the free energy while bringing an entropic contribution logarithmic in system size. Hence the 1D model does not have a finite temperature phase transition. However, 1D systems are much easier to study numerically and admit simpler realisations experimentally. We therefore introduce a long-range coupling between the ions in order to stabilise a CDW phase in 1D. This leads to a disordered system that is gapped by the CDW background but with correlated fluctuations, leading to a disorder-free, correlation-induced mobility edge in one dimension.

Markov Chain Monte Carlo

To evaluate thermodynamic averages we perform a classical Markov Chain Monte Carlo random walk over the space of ionic configurations, at each step diagonalising the effective electronic Hamiltonian [@maska_thermodynamics_2006-1]. Using a Binder cumulant method [@binder_finite_1981; @musial_monte_2002], we demonstrate that the model has a finite temperature phase transition when the interaction is sufficiently long-ranged. We then estimate the density of states and the inverse participation ratio as a function of energy to diagnose localisation properties. We show preliminary results that the in-gap states induced at finite temperature are localised while the states in the unperturbed bands remain extended, evidence for a mobility edge.


Macro definitions used throughout: \[ \newcommand{\expval}[1]{\langle #1 \rangle} \newcommand{\ket}[1]{|#1\rangle} \newcommand{\bra}[1]{\langle#1|} \newcommand{\op}[2]{|#1\rangle \langle#2|} \newcommand{\abs}[1]{|#1|} \newcommand{\Z}{\mathcal{Z}} \newcommand{\T}{\mathcal{T}} \newcommand{\A}{\mathcal{A}} \newcommand{\s}{\mathbf{s}} \newcommand{\Tr}{\mathrm{Tr}} \]


Localisation

Thermalisation

Isolated classical systems generally thermalise if they are large enough. Since classical dynamics is the limit of some underlying quantum dynamics, it seems reasonable to suggest that isolated quantum systems also thermalise in some related sense. However it is not immediately obvious what thermalisation should mean in a quantum setting where the presence of unitary time dynamics implies full information about a system’s initial state is always preserved.

A potential solution lies in the eigenstate thermalisation hypothesis (ETH). It states that if a system thermalises, then we can isolate small patches of a larger system, trace everything else out, and get a thermal density matrix.

Following Ref. [@abanin_recent_2017], consider the time evolution of a local operator \(\hat{O}\): \[ \bra{\psi(t)}\hat{O}\ket{\psi(t)} = \sum_{\alpha \beta} C^*_\alpha C_\beta e^{i(E_\alpha - E_\beta)t} O_{\alpha \beta}\]

where the \(C_\alpha\) are determined by the initial state and \(O_{\alpha \beta} = \bra{\alpha} \hat{O} \ket{\beta}\) are the matrix elements of \(\hat{O}\) with respect to the energy eigenstates. Srednicki [@srednicki_chaos_1994] introduced the ansatz that for local operators:

\[O_{\alpha \beta} = O(E)\delta_{\alpha\beta} + e^{-S(E)/2} f(E,\omega) R_{\alpha\beta}\]

with \(E = (E_\alpha + E_\beta)/2\), \(\omega = E_\alpha - E_\beta\), and \(R_{\alpha\beta}\) sampled from some distribution with zero mean and unit variance. The first term asserts that the diagonal elements are given by the thermal expectation value \(O(E) = \Tr[e^{-\beta \hat{H}} \hat{O}]/\Z\), with \(\beta\) an effective temperature defined by equating the energy to the expectation of the Hamiltonian at that temperature, \(E = \Tr[\hat{H} e^{-\beta \hat{H}}]/\Z\).

The second term deals with thermodynamic fluctuations, scaled by the entropy \(S(E) = -\Tr(\rho \ln \rho)\), where \(\rho = e^{-\beta \hat{H}}/\Z\) and \(\Z = \Tr[e^{-\beta \hat{H}}]\).

With this ansatz, the long time average of the observable becomes equal to the thermal expectation, with fluctuations suppressed by the entropic term \(e^{-S(E)}\) and the rapidly varying phase factors \(e^{i(E_\alpha - E_\beta)t}\). This statement of the ETH has been verified for the quantum hard sphere model [@srednicki_chaos_1994] and numerically for other models [@khatami_fluctuation-dissipation_2013; @dalessio_quantum_2016].

An alternate view of the ETH is the statement that in thermalising systems, individual eigenstates look thermal when viewed locally. Take an eigenstate \(\ket{\alpha}\) with energy \(E_\alpha\) and, as before, define an effective temperature through \(E_\alpha = \Tr[\hat{H} e^{-\beta \hat{H}}]/\Z\). This statement of the ETH says that if we partition the system into subsystems A and B, then in the limit where B becomes very large, B will act as a heat bath for A. Specifically, the reduced density matrix \(\rho_A = \Tr_B \op{\alpha}{\alpha}\) is equal to the thermal density matrix:

\[\rho_A = \Tr_B \op{\alpha}{\alpha} = \Z^{-1} \Tr_B [e^{-\beta \hat{H}}] \]

Intuitively, for thermalisation to happen, the degrees of freedom must be sufficiently well coupled that energy transport occurs. This condition is broken by systems with localised states so a lack of thermalisation is often used as a diagnostic tool for localisation.

Anderson Localisation

Localisation was first studied by Anderson in 1958 [@anderson_absence_1958-1] in the context of non-interacting fermions subject to a static or quenched disorder potential \(V_j\) drawn uniformly from the interval \([-W,W]\):

\[ H = -t\sum_{\expval{jk}} c^\dagger_j c_k + \sum_j V_j c_j^\dagger c_j \]

At sufficiently strong disorder, the Anderson model exhibits exponentially localised eigenfunctions \(\psi(x) = f(x) e^{-x/\lambda}\) which cannot contribute to diffusive transport, while in 1D any disorder strength is sufficient. Intuitively this happens because hopping processes between nearby sites become off-resonant, hindering the hybridisation that would normally lead to extended Bloch states [@kramer_localization_1993].
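As a minimal illustration (a sketch assuming only numpy, not code from the paper), one can diagonalise a single disorder realisation and inspect the exponential decay of a mid-spectrum eigenstate:

import numpy as np

N, t, W = 200, 1.0, 2.0
rng = np.random.default_rng(0)
V = rng.uniform(-W, W, size=N)      # quenched disorder potential

H = np.diag(V)
for i in range(N - 1):
    H[i, i + 1] = H[i + 1, i] = -t  # nearest neighbour hopping

E, psi = np.linalg.eigh(H)
density = np.abs(psi[:, N // 2]) ** 2  # a mid-spectrum eigenstate
# log(density) falls off roughly linearly away from its peak;
# the slope estimates -2/lambda, the inverse localisation length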

In one and two dimensions, all the states of the Anderson model are localised. In three dimensions there are mobility edges: critical energies in the spectrum which separate delocalised states in a band from localised states which form a band tail [@abanin_recent_2017]. An argument due to Lifshitz shows that the density of states of the band tail should decay exponentially, and localised and extended states cannot co-exist at the same energy as they would hybridise into extended states [@kramer_localization_1993].

It was thought that mobility edges could not exist in 1D because all states localise in the presence of any amount of disorder. This is true for uncorrelated potentials [@goldshtein_pure_1977]. However, it was shown that if the disorder potential \(V_j\) contains spatial correlations, mobility edges do exist in 1D [@izrailev_localization_1999; @izrailev_anomalous_2012]. Ref. [@croy_anderson_2011] extends this work to power law decay of the correlations: \[ C(l) = \expval{V_i V_{i+l}} \propto l^{-\alpha} \] Figure \(\ref{fig:anderson_dos}\) shows numerical calculations of the localisation length (see later) and density of states for the power law correlated Anderson model. At the unperturbed band edges \(\abs{E} = 2\), the states transition from extended to localised. The behaviour close to the edge takes a universal scaling form with exponents dependent on \(\alpha\).

Many Body Localisation

A related phenomenon known as many-body localisation (MBL) occurs in interacting systems with quenched disorder. A simple example comes from adding density-density interactions to the Anderson model:

\[ H = -t\sum_{\expval{jk}} c^\dagger_j c_k + \sum_j V_j c_j^\dagger c_j + U\sum_{jk} n_j n_k \]

with \(n_j = c^\dagger_j c_j\). Here, in contrast to the Anderson model, localisation phenomena can be proven robust to weak perturbations of the Hamiltonian [@imbrie_many-body_2016].

MBL is defined by the emergence of an extensive number of quasi-local conserved operators called local integrals of motion (LIOMs) or l-bits. Following Ref. [@abanin_recent_2017], using a spin system with variables \(\sigma^z_i\), any such operator can be written in the general form:

\[ \tau^z_i = \sigma^z_i + \sum_{\alpha\beta kl} f_{kl}^{\alpha\beta} \sigma^\alpha_{i+k} \sigma^\beta_{i+l} + \ldots\]

What defines an MBL system is that there exist extensively many \(\tau^z_i\) for which the coefficients decay exponentially with distance, \(f_{kl}^{\alpha\beta} \propto e^{-\max(\abs{l},\abs{k}) / \xi}\). These LIOMs commute with the Hamiltonian and with each other, \([\hat{H}, \tau^z_i] = [\tau^z_i, \tau^z_j] = 0\). It is this extensive number of conserved local charges that leads to the localisation properties of MBL. It also has implications for the way entanglement grows over time in MBL systems.

Since the Hamiltonian commutes with all the LIOMs and they are a complete operator basis, the Hamiltonian can be written as:

\[\hat{H} = \sum_{i} h_i \tau^z_i + \sum_{ij} J_{ij} \tau^z_i \tau^z_j + \sum_{ijk} J_{ijk} \tau^z_i \tau^z_j \tau^z_k + \ldots\]

where again the couplings decay exponentially, albeit with a different length scale \(\bar{\xi}\). From this form we see that distant l-bits can only become entangled on a timescale of:

\[ t_{\mathrm{ent}}(r) \propto \frac{\hbar}{J_0} e^{r/\bar{\xi}} \]

and hence quantum correlations and entanglement propagate logarithmically in MBL systems [@imbrie_diagonalization_2016].

Disorder Free localisation

Both Anderson localisation and MBL depend on the presence of quenched disorder. Recently the idea of disorder-free localisation has gained traction, asking whether the disorder necessary to generate localisation can be generated entirely from the dynamics of a model itself.

The idea was first proposed in the context of Helium mixtures [1] and then extended to heavy-light mixtures in which multiple species with large mass ratios interact, the idea being that the heavier particles act as an effective disorder potential for the lighter ones, inducing localisation. Two such models [@yao_quasi-many-body_2016; @schiulaz_dynamics_2015] instead find that the models thermalise exponentially slowly in system size, which Ref. [@yao_quasi-many-body_2016] dubs Quasi-MBL. A. Smith, J. Knolle et al. instead looked at models containing an extensive number of conserved quantities and demonstrated true disorder-free localisation [@smith_disorder-free_2017].

Diagnostics of Localisation

Inverse Participation Ratio

The inverse participation ratio (IPR) is defined for a normalised wave function \(\psi_i = \psi(x_i)\), \(\sum_i \abs{\psi_i}^2 = 1\), as its fourth moment [@kramer_localization_1993]:

\[ P^{-1} = \sum_i \abs{\psi_i}^4 \]

It acts as a measure of the portion of space occupied by the wave function. For localised states it is independent of system size, while for plane wave states in \(d\) dimensions \(P = L^d\). States may also be intermediate between localised and extended, described by their fractal dimension \(d > d^* > 0\):

\[ P(L) \sim L^{d^*} \]

For extended states \(d^* = d\) while for localised ones \(d^* = 0\). In this work we use an energy resolved IPR [@antipov_interaction-tuned_2016-1]:

\[ \mathrm{DOS}(\omega) = \sum_n \delta(\omega - \epsilon_n) \]
\[ \mathrm{IPR}(\omega) = \mathrm{DOS}(\omega)^{-1} \sum_{n,i} \delta(\omega - \epsilon_n) \abs{\psi_{n,i}}^4 \]

where \(\psi_{n,i}\) is the wavefunction corresponding to the energy \(\epsilon_n\) at the \(i\)th site. In practice we bin the energies and IPRs into a fine energy grid and use Lorentzian smoothing if necessary.
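In a sketch (the binning choices and names here are ours, assuming the energies and eigenvectors come from a single-particle diagonalisation), the procedure looks like:

import numpy as np

def energy_resolved_ipr(energies, states, n_bins=100):
    # states[i, n] is the amplitude of eigenstate n on site i
    ipr_n = np.sum(np.abs(states) ** 4, axis=0)  # IPR of each eigenstate
    bins = np.linspace(energies.min(), energies.max(), n_bins + 1)
    which = np.clip(np.digitize(energies, bins) - 1, 0, n_bins - 1)
    dos = np.zeros(n_bins)
    ipr = np.zeros(n_bins)
    for b in range(n_bins):
        mask = which == b
        dos[b] = mask.sum()
        ipr[b] = ipr_n[mask].mean() if dos[b] > 0 else 0.0
    return bins, dos, ipr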

Transfer Matrix Approach

The transfer matrix method (TMM) can be used to calculate the localisation length of the eigenstates of a system. Following Refs. [@kramer_localization_1993; @smith_dynamical_2018], for bi-linear 1D Hamiltonians the method represents the action of \(H\) on a state \(\ket{\psi} = \sum_i \psi_i \ket{i}\) with energy \(E\) by a difference equation:

\[\begin{aligned}
H &= - \sum_i (c^\dagger_i c_{i+1} + c^\dagger_{i+1} c_{i}) - \sum_i h_i c^\dagger_i c_i \\
E\ket{\psi} &= H \ket{\psi} \\
E \psi_i &= -(\psi_{i+1} + \psi_{i-1}) - h_i \psi_i
\end{aligned}\]

where the last line can be represented by a matrix equation:

\[ \begin{pmatrix} \psi_{i+1}\\ \psi_{i} \end{pmatrix} = \begin{pmatrix} -(E + h_i) & -1\\ 1 & 0 \end{pmatrix} \begin{pmatrix} \psi_{i}\\ \psi_{i-1} \end{pmatrix} = T_i \begin{pmatrix} \psi_{i}\\ \psi_{i-1} \end{pmatrix} \]

Defining a product along the chain:

\[Q_L = \prod_{i=0}^L T_i\]

Oseledec’s theorem proves that there exists a limiting matrix \(\Gamma\):

\[ \Gamma = \lim_{L \to \infty} (Q_L Q_L^\dagger)^{\frac{1}{2L}} \]

with eigenvalues \(\exp(\gamma_i)\), where the \(\gamma_i\) are the Lyapunov exponents of \(Q_L\). The smallest Lyapunov exponent is the inverse of the localisation length of the state. In practice one takes \(Q_L\) with \(L\) equal to the system size, finds the smallest eigenvalue \(q\) and estimates the localisation length by:

\[ \lambda = \frac{L}{\ln{q}} \]

As noted by [@smith_dynamical_2018], this method can be numerically unstable because the matrix elements diverge or vanish exponentially. To get around this, the authors break the matrix multiplication into chunks and store the logarithms of the eigenvalues of each chunk separately.
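A sketch of the stabilised iteration (a common variant of the chunking described above: renormalise the running vector at every step and accumulate the log-norms):

import numpy as np

def localisation_length(E, h):
    # h: on-site energies along the chain; E: energy to evaluate at
    v = np.array([1.0, 0.0])
    log_norm = 0.0
    for h_i in h:
        T = np.array([[-(E + h_i), -1.0],
                      [1.0, 0.0]])
        v = T @ v
        norm = np.linalg.norm(v)
        log_norm += np.log(norm)  # accumulate logs to avoid overflow
        v /= norm                 # keep the working vector O(1)
    gamma = log_norm / len(h)     # largest Lyapunov exponent
    return 1.0 / gamma            # its inverse is the localisation length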

Numerical Methods

In this section we will define the Markov Chain Monte Carlo (MCMC) method in general then detail its application to the FK model. We will then cover methods applicable to the measurements obtained from MCMC. These include calculation of the density of states and energy resolved inverse participation ratio as well as phase transition diagnostics such as the Binder cumulant.

Markov Chain Monte Carlo

Markov Chain Monte Carlo (MCMC) is a technique for evaluating thermal expectation values \(\expval{O}\) with respect to some physical system defined by a set of states \(\{x : x \in S\}\) and a free energy \(F(x)\) [@krauth_introduction_1996]. The thermal expectation value is defined via a Boltzmann weighted sum over the entire state space:

\[\begin{aligned}
\expval{O} &= \sum_{x \in S} O(x) P(x) \\
P(x) &= \frac{1}{\Z} e^{-\beta F(x)} \\
\Z &= \sum_{x \in S} e^{-\beta F(x)}
\end{aligned}\]

When the state space is too large to evaluate this sum directly, MCMC defines a stochastic algorithm which generates a random walk \(\{x_0\ldots x_i\ldots x_N\}\) whose distribution \(p(x_i)\) approaches a target distribution \(P(x)\) in the large N limit.

\[\lim_{i\to\infty} p(x_i) = P(x)\]

In this case the target distribution will be the thermal one, \(P(x) = \Z^{-1} e^{-\beta F(x)}\). The major benefit of the method is that as long as one can express the desired \(P(x)\) up to a multiplicative constant, MCMC can be applied. Since \(e^{-\beta F(x)}\) is relatively easy to evaluate, MCMC provides a useful method for finite temperature physics.

Once the random walk has been carried out for many steps, the expectation value of \(O\) can be estimated from the MCMC samples:

\[ \expval{O} = \frac{1}{N}\sum_{i = 1}^{N} O(x_i) + \mathcal{O}\left(\frac{1}{\sqrt{N}}\right) \]

The samples in the random walk are correlated, so they effectively contain less information than \(N\) independent samples would. As a consequence the variance is larger than the \(\expval{O^2} - \expval{O}^2\) form it would have if the estimates were uncorrelated. Methods for estimating the true variance of \(\expval{O}\) and deciding how many steps are needed will be considered later.

Markov chains are defined by a transition function \(\T(x_i \rightarrow x_{i+1})\) giving the probability that a chain in state \(x_i\) at time \(i\) will transition to a state \(x_{i+1}\). Since we must transition somewhere at each step, this comes with the normalisation condition \(\sum\limits_x \T(x' \rightarrow x) = 1\).

If we define an ensemble of Markov chains and consider its distribution at each step, we get a sequence of distributions \(\{p_0(x), p_1(x), p_2(x)\ldots\}\) with

\[p_{i+1}(x) = \sum_{x' \in S} p_i(x') \T(x' \rightarrow x)\]

where \(p_0(x)\) might be a delta function on one particular starting state, which is then smoothed out by repeated application of the transition function.

As we’d like to draw samples from the target distribution \(P(x)\), the trick is to choose \(\T(x_i \rightarrow x_{i+1})\) such that:

\[ P(x) = \sum_{x'} P(x') \T(x' \rightarrow x) \]

In other words, the MCMC dynamics defined by \(\T\) must be constructed to have the target distribution as their fixed point. This condition is called the global balance condition. Along with some more technical considerations, such as ergodicity, which won’t be considered here, global balance suffices to ensure that an MCMC method is correct.

A sufficient but not necessary condition for global balance to hold is detailed balance:

\[ P(x) \T(x \rightarrow x') = P(x') \T(x' \rightarrow x) \]

In practice most algorithms are constructed to satisfy detailed balance, though there are arguments that relaxing the condition can lead to faster algorithms [@kapfer_sampling_2013].

The goal of MCMC is then to choose \(\T\) so that it has the desired thermal distribution \(P(x)\) as its fixed point and converges onto it quickly. This boils down to requiring that the matrix representation \(T_{ij} = \T(x_i \to x_j)\) has an eigenvector equal to \(P_i = P(x_i)\) with eigenvalue 1, with all other eigenvalues having magnitude less than one. The convergence time depends on the magnitude of the second largest eigenvalue.

In order to actually generate new states according to \(\T\), one draws states from a proposal distribution \(q(x_i \to x')\) that can be directly sampled from. For instance, this might mean flipping a single random spin in a spin chain, in which case \(q(x_i \to x')\) is the uniform distribution on states reachable by one spin flip from \(x_i\). The proposal \(x'\) is then accepted or rejected with an acceptance probability \(\A(x_i \to x')\); if the proposal is rejected then \(x_{i+1} = x_{i}\). Overall, \(\T(x\to x') = q(x\to x')\A(x \to x')\).

The Metropolis-Hastings algorithm is a slight extension of the original Metropolis algorithm that allows for non-symmetric proposal distributions \(q(x \to x') \neq q(x' \to x)\). It can be derived starting from detailed balance [@krauth_introduction_1996]:

\[\begin{aligned}
P(x)\T(x \to x') &= P(x')\T(x' \to x) \\
P(x)q(x \to x')\A(x \to x') &= P(x')q(x' \to x)\A(x' \to x) \\
\frac{\A(x \to x')}{\A(x' \to x)} &= \frac{P(x')q(x' \to x)}{P(x)q(x \to x')} = f(x, x')
\end{aligned}\]

The Metropolis-Hastings algorithm is the choice:

\[\A(x \to x') = \min\left(1, f(x,x')\right)\]

Noting that \(f(x,x') = 1/f(x',x)\), this choice can be seen to satisfy the detailed balance condition by considering the two cases \(f(x,x') > 1\) and \(f(x,x') < 1\).

By choosing the proposal distribution such that \(f(x,x')\) is as close as possible to one, the rate of rejections can be reduced and the algorithm sped up.


Applying MCMC to the FK model

MCMC can be applied to sample over the classical degrees of freedom of the model. We take the full Hamiltonian and split it into a classical and a quantum part:

\[\begin{aligned}
H_{\mathrm{FK}} &= -\sum_{<ij>} c^\dagger_{i}c_{j} + U \sum_{i} (c^\dagger_{i}c_{i} - 1/2)( n_i - 1/2) \\
&+ \sum_{ij} J_{ij} (n_i - 1/2) (n_j - 1/2) - \mu \sum_i (c^\dagger_{i}c_{i} + n_i)\\
H_q &= -\sum_{<ij>} c^\dagger_{i}c_{j} + \sum_{i} \left(U(n_i - 1/2) - \mu\right) c^\dagger_{i}c_{i}\\
H_c &= -\sum_i \left(\mu n_i + \frac{U}{2}(n_i - 1/2)\right) + \sum_{ij}J_{ij}(n_i - 1/2)(n_j - 1/2)
\end{aligned}\]

There are \(2^N\) possible ion configurations \(\{ n_i \}\); we define \(n^k_i\) to be the occupation of the \(i\)th site of the \(k\)th configuration. The quantum part of the free energy can then be defined through the quantum partition function \(\Z^k\) associated with each ionic state \(n^k_i\):

\[ F^k = -\frac{1}{\beta} \ln{\Z^k} \]

such that the overall partition function is:

\[\begin{aligned}
\Z &= \sum_k e^{- \beta H^k} \Z^k \\
&= \sum_k e^{-\beta (H^k + F^k)}
\end{aligned}\]

where \(H^k\) is the classical energy \(H_c\) of the \(k\)th configuration. Because fermions are limited to occupation numbers of 0 or 1, \(\Z^k\) simplifies nicely. If \(m^j_i = \{0,1\}\) is defined as the occupation of the level with energy \(\epsilon^k_i\), then the partition function is a sum over all the occupation states labelled by \(j\):

\[\begin{aligned}
\Z^k &= \Tr e^{-\beta H_q^k} = \sum_j e^{-\beta \sum_i m^j_i \epsilon^k_i}\\
&= \sum_j \prod_i e^{- \beta m^j_i \epsilon^k_i} = \prod_i \sum_{m = 0,1} e^{- \beta m \epsilon^k_i}\\
&= \prod_i (1 + e^{- \beta \epsilon^k_i})\\
F^k &= -\frac{1}{\beta} \sum_i \ln{(1 + e^{- \beta \epsilon^k_i})}
\end{aligned}\]

Observables can then be calculated from the partition function, for example the occupation numbers:

\[\begin{aligned}
\expval{N} &= \frac{1}{\beta} \frac{1}{\Z} \frac{\partial \Z}{\partial \mu} = - \frac{\partial F}{\partial \mu}\\
&= \frac{1}{\beta} \frac{1}{\Z} \frac{\partial}{\partial \mu} \sum_k e^{-\beta (H^k + F^k)}\\
&= \frac{1}{\Z} \sum_k (N^k_{\mathrm{ion}} + N^k_{\mathrm{electron}}) e^{-\beta (H^k + F^k)}
\end{aligned}\]

with the definitions:

\[\begin{aligned}
N^k_{\mathrm{ion}} &= - \frac{\partial H^k}{\partial \mu} = \sum_i n^k_i\\
N^k_{\mathrm{electron}} &= - \frac{\partial F^k}{\partial \mu} = \sum_i \left(1 + e^{\beta \epsilon^k_i}\right)^{-1}
\end{aligned}\]

The MCMC algorithm consists of performing a random walk over the states \(\{ n^k_i \}\). In the simplest case the proposal distribution corresponds to flipping a random site from occupied to unoccupied or vice versa; since this proposal is symmetric, the acceptance function becomes:

\[\begin{aligned}
P(k) &= \Z^{-1} e^{-\beta(H^k + F^k)} \\
\A(k \to k') &= \min\left(1, \frac{P(k')}{P(k)}\right) = \min\left(1, e^{-\beta\left[(H^{k'} + F^{k'}) - (H^k + F^k)\right]}\right)
\end{aligned}\]

At each step \(F^k\) is calculated by diagonalising the tri-diagonal matrix representation of \(H_q\) with open boundary conditions. Observables are simply averages over their values at each step of the random walk. The full spectrum and eigenbasis is too large to save to disk, so running averages of key observables are taken as the walk progresses.
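A sketch of the free energy evaluation for a single ion configuration (assuming scipy; the helper name and conventions are ours):

import numpy as np
from scipy.linalg import eigh_tridiagonal

def quantum_free_energy(n, U, mu, beta):
    # n: 0/1 array of ion occupations for this configuration
    diag = U * (n - 0.5) - mu   # on-site energies seen by the electrons
    off = -np.ones(len(n) - 1)  # hopping t = 1, open boundaries
    eps = eigh_tridiagonal(diag, off, eigvals_only=True)
    # F^k = -(1/beta) sum_i ln(1 + e^{-beta eps_i}), computed stably
    return -np.sum(np.logaddexp(0.0, -beta * eps)) / beta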

In an MCMC method a key property is the acceptance rate, the proportion of proposals that are accepted. If this rate is too low, the random walk is taking overly large steps in energy space, which is problematic because very few new samples will be generated. If it is too high, the steps are too small, a problem because the walk will take longer to explore the state space and the samples will be highly correlated. Ideal values for the acceptance rate can be calculated under certain assumptions [@roberts_weak_1997]. Here we monitor the acceptance rate and, if it is too high, re-run the MCMC with a modified proposal distribution that has a chance to propose moves that flip multiple sites at a time.

In addition we exploit the particle-hole symmetry of the problem by occasionally proposing a flip of the entire state. This works because near half-filling, flipping the occupations of all the sites will produce a state at or near the energy of the current one.

The matrix diagonalisation is the most computationally expensive step of the process. A speed-up can be obtained by modifying the proposal distribution to depend on the classical part of the energy, a trick gleaned from Ref. [@krauth_introduction_1996]:

\[\begin{aligned}
q(k \to k') &= \min\left(1, e^{-\beta (H^{k'} - H^k)}\right) \\
\A(k \to k') &= \min\left(1, e^{-\beta(F^{k'} - F^k)}\right)
\end{aligned}\]

This allows the method to reject some states without performing the diagonalisation, at no cost to the accuracy of the MCMC method.

An extension of this idea is to define a classical model with a similar free energy dependence on the classical state as the full quantum model; Ref. [@huang_accelerated_2017] does this with restricted Boltzmann machines, whose form is very similar to a classical spin model.

In order to reduce the effects of the boundary conditions and the finite size of the system, we redefine and normalise the coupling matrix to have zero derivative at its furthest extent rather than cutting off abruptly:

\[\begin{aligned}
J'(x) &= \abs{\frac{L}{\pi}\sin \frac{\pi x}{L}}^{-\alpha} \\
J(x) &= \frac{J_0 J'(x)}{\sum_y J'(y)}
\end{aligned}\]

The scaling ensures that, in the ordered phase, the overall potential felt by each site due to the rest of the system is independent of system size.
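A sketch of this construction (function name and conventions are ours):

import numpy as np

def coupling_matrix(L, alpha, J0=1.0):
    x = np.arange(1, L)  # separations 1 .. L-1
    Jp = np.abs(L / np.pi * np.sin(np.pi * x / L)) ** (-alpha)
    return J0 * Jp / Jp.sum()  # J[d-1] couples sites a distance d apart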

The Binder cumulant is defined as:

\[U_B = 1 - \frac{\mu_4}{3\mu_2^2}\]

where

\[\mu_n = \expval{(m - \expval{m})^n}\]

are the central moments of the order parameter:

\[m = \frac{1}{N}\sum_i (-1)^i (2n_i - 1)\]

The Binder cumulant plotted against temperature can be used as a diagnostic for the existence of a phase transition: if such curves are plotted for multiple system sizes, a crossing indicates the location of a critical point [@binder_finite_1981; @musial_monte_2002].
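A sketch of the estimator given a series of MCMC samples of \(m\):

import numpy as np

def binder_cumulant(m_samples):
    dm = m_samples - m_samples.mean()
    mu2 = np.mean(dm ** 2)  # second central moment
    mu4 = np.mean(dm ** 4)  # fourth central moment
    return 1.0 - mu4 / (3.0 * mu2 ** 2)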

Markov Chain Monte Carlo in Practice

Quick Intro to MCMC

The main paper relies extensively on MCMC to evaluate thermal expectation values within the model by walking over states of the classical spin system \(S_i\). For a classical system, the thermal expectation value of some operator \(O\) is defined by a Boltzmann weighted sum over the classical state space:

\[\begin{aligned}
\expval{O} &= \sum_{\s \in S} O(\s) P(\s) \\
P(\s) &= \frac{1}{\Z} e^{-\beta F(\s)} \\
\Z &= \sum_{\s \in S} e^{-\beta F(\s)}
\end{aligned}\]

while for a quantum system these sums are replaced by equivalent traces. The obvious approach to evaluating these sums numerically would be to loop directly over all the classical states and perform the sum, but the state space is far too large for this to be feasible. Even if we could do it, it would be computationally wasteful, since at low temperatures the sums are dominated by low energy excitations about the ground states of the system. Worse, in our case we must fully solve the fermionic system via exact diagonalisation for each classical state in the sum, a very expensive operation.

MCMC sidesteps these issues by defining a random walk that focuses on the states with the greatest Boltzmann weight. At low temperatures this means we need only visit a few low energy states to make good estimates, while at high temperatures the weights become uniform so a small number of samples distributed across the state space suffices. However, we will see that the method is not without difficulties of its own.

In implementation, MCMC boils down to choosing a transition function \(\T(\s_t \to \s_{t+1})\), where the \(\s\) are vectors representing classical spin configurations. We start in some initial state \(\s_0\) and then repeatedly jump to new states according to the probabilities given by \(\T\). This defines a set of random walks \(\{\s_0\ldots \s_i\ldots \s_N\}\). Fig. \(\ref{fig:single}\) shows this in practice: we have a (rather small) ensemble of \(M = 2\) walkers starting at the same point in state space and then spreading outwards by flipping spins along the way.

In pseudo-code one could write the MCMC simulation for a single walker as:
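current_state = initial_state
states = []

for t in range(N_steps):
    # `transition` is a stand-in for drawing the next state with
    # probability T(current_state -> new_state)
    current_state = transition(current_state)
    states.append(current_state)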

where the `transition` function (a stand-in name) produces a new state with probability determined by the current state and the transition function \(\T\).

If we ran many such walkers in parallel we could then approximate the distribution \(p_t(\s; \s_0)\), which tells us where the walkers are likely to be after they’ve evolved for \(t\) steps from an initial state \(\s_0\). We need to choose \(\T\) carefully such that after a large number of steps \(k\) (the convergence time) the probability \(p_t(\s;\s_0)\) approaches the thermal distribution \(P(\s; \beta) = \Z^{-1} e^{-\beta F(\s)}\). This turns out to be quite easy to achieve using the Metropolis-Hastings algorithm.

Convergence Time

Considering \(p(\s)\) as a vector \(\vec{p}\) whose \(j\)th entry is the probability of the \(j\)th state, \(p_j = p(\s_j)\), and writing \(\T\) as the matrix with entries \(T_{ij} = \T(\s_j \rightarrow \s_i)\), we can write the update rule for the ensemble probability as: \[\vec{p}_{t+1} = \T \vec{p}_t \implies \vec{p}_{t} = \T^t \vec{p}_0\] where \(\vec{p}_0\) is the vector which is one on the starting state and zero everywhere else. Since all states must transition to somewhere with probability one, \(\sum_i T_{ij} = 1\).

Matrices that satisfy this are called stochastic matrices, exactly because they model these kinds of Markov processes. For transition functions satisfying detailed balance it can be shown that the eigenvalues are real and, ordering them by magnitude, that \(\lambda_0 = 1\) and \(0 < \abs{\lambda_{i\neq0}} < 1\). Assuming \(\T\) has been chosen correctly, its single eigenvector with eigenvalue 1 will be the thermal distribution, so repeated application of the transition function eventually leads there, while memory of the initial conditions decays exponentially with a convergence time \(k\) determined by \(\lambda_1\). In practice this means that one throws away the data from the beginning of the random walk in order to reduce the dependence on the initial conditions and be close enough to the target distribution.

Auto-correlation Time

At this stage one might think we’re done: we can indeed draw independent samples from \(P(\s; \beta)\) by starting from some arbitrary initial state and doing \(k\) steps to arrive at a sample. However, a key insight is that after the convergence time, every state generated is a sample from \(P(\s; \beta)\). They are not, however, independent samples. In Fig. \(\ref{fig:raw}\) it is already clear that the samples of the order parameter \(m\) have some auto-correlation because only a few spins are flipped each step, and even when the number of spins flipped per step is increased, Fig. \(\ref{fig:m_autocorr}\) shows that it can be an important effect near the phase transition. Let’s define the auto-correlation time \(\tau(O)\) informally as the number of MCMC samples of some observable \(O\) that are statistically equal to one independent sample (see the literature for a more rigorous definition involving a sum over the auto-correlation function). The auto-correlation time is generally shorter than the convergence time, so it makes sense from an efficiency standpoint to run a single walker for many MCMC steps rather than to run a huge ensemble for \(k\) steps each.

Once the random walk has been carried out for many steps, the expectation value of \(O\) can be estimated from the MCMC samples \(\s_i\):

\[ \expval{O} = \frac{1}{N}\sum_{i = 1}^{N} O(\s_i) + \mathcal{O}\left(\frac{1}{\sqrt{N}}\right) \]

The samples are correlated, so \(N\) of them effectively contain less information than \(N\) independent samples would, in fact roughly \(N/\tau\) effective samples. As a consequence the variance is larger than the \(\expval{O^2} - \expval{O}^2\) form it would have if the estimates were uncorrelated. There are many methods in the literature for estimating the true variance of \(\expval{O}\) and deciding how many steps are needed; my approach has been to run a small number of parallel chains, which are independent, in order to estimate the statistical error produced. This is slightly less computationally efficient, because it requires throwing away the \(k\) pre-convergence steps multiple times, but it is a conceptually simple workaround.
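As an aside, one simple way to estimate \(\tau\) from the samples themselves (a sketch of one convention among many, not the paper’s code):

import numpy as np

def autocorrelation_time(samples):
    x = samples - samples.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0]                # normalise so acf[0] = 1
    cutoff = np.argmax(acf < 0)  # truncate at the first negative lag
    if cutoff == 0:
        cutoff = len(acf)
    return 1.0 + 2.0 * np.sum(acf[1:cutoff])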

In summary, to do efficient simulations we want to reduce both the convergence time and the auto-correlation time as much as possible. In order to explain how, we need to introduce the Metropolis-Hastings (MH) algorithm and how it gives an explicit form for the transition function.

The Metropolis-Hastings Algorithm

MH breaks the transition function into a proposal distribution \(q(\s \to \s')\) and an acceptance function \(\A(\s \to \s')\). \(q\) needs to be something we can directly sample from, and in our case generally takes the form of flipping some number of spins in \(\s\); i.e. if we’re flipping a single random spin in the spin chain, \(q(\s \to \s')\) is the uniform distribution on states reachable by one spin flip from \(\s\). This also gives the nice symmetry property that \(q(\s \to \s') = q(\s' \to \s)\).

The proposal \(\s'\) is then accepted or rejected with an acceptance probability \(\A(\s \to \s')\); if the proposal is rejected then \(\s_{i+1} = \s_{i}\). Hence:

\[\T(x\to x') = q(x\to x')\A(x \to x')\]

When the proposal distribution is symmetric, as ours is, it cancels out in the expression for the acceptance function and the Metropolis-Hastings algorithm is simply the choice: \[ \A(x \to x') = \min\left(1, e^{-\beta\;\Delta F}\right)\] where \(F\) is the overall free energy of the system, including both the quantum and classical sectors.

To implement the acceptance function in practice we pick a random number in the unit interval and accept if it is less than \(e^{-\beta\;\Delta F}\); in pseudo-code (names illustrative):
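# dF = F(proposed) - F(current)
if uniform(0, 1) < exp(-beta * dF):
    current_state = proposed_state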

This has the effect of always accepting proposed states that are lower in energy and sometimes accepting those that are higher in energy than the current state.

Choosing the proposal distribution

Now we can discuss how to minimise the auto-correlations. The general principle is that one must balance the proposal distribution between two extremes. Choose overly small steps, like flipping only a single spin, and the acceptance rate will be high because \(\Delta F\) will usually be small, but each state will be very similar to the previous one and the auto-correlations will be high too, making sampling inefficient. On the other hand, overly large steps, like randomising a large portion of the spins each step, will result in very frequent rejections, especially at low temperatures.

I evaluated a few different proposal distributions for use with the FK model.

From Figure \(\ref{fig:comparison}\) we see that even at moderately high temperatures \(T > T_c\), flipping one or two sites is the best choice. However, for some simulations at very high temperature flipping more spins is warranted. Tuning the proposal distribution automatically seems like something that would not yield enough benefit for the additional complexity it would require.

Two Step Trick

Our method already relies heavily on the split between the classical and quantum sectors to derive a sign-problem-free MCMC algorithm, but it turns out that there is a further trick we can play with it. The free energy is the sum of an easy to compute classical energy and a more expensive quantum free energy, so we can split the acceptance function in two in such a way as to avoid having to compute the full exact diagonalisation some of the time:


from numpy import exp
from numpy.random import uniform

current_state = initial_state

for i in range(N_steps):
    new_state = proposal(current_state)

    # first test against the cheap classical energy difference alone
    df_classical = classical_free_energy_change(current_state, new_state, parameters)
    if uniform(0, 1) < exp(-beta * df_classical):
        # only pay for the exact diagonalisation if the classical test passes
        df_quantum = quantum_free_energy_change(current_state, new_state, parameters)

        if uniform(0, 1) < exp(-beta * df_quantum):
            current_state = new_state

    states[i] = current_state  # record a sample whether or not we moved


References

1. Kagan, Y. & Maksimov, L. Localization in a system of interacting particles diffusing in a regular crystal. Zhurnal Eksperimental’noi i Teoreticheskoi Fiziki 87, 348–365 (1984).
2. Trebst, S. & Hickey, C. Kitaev materials. Physics Reports 950, 1–37 (2022).
3. Banerjee, A. et al. Proximate Kitaev Quantum Spin Liquid Behaviour in α-RuCl\(_3\). Nature Mater 15, 733–740 (2016).