diff --git a/_sass/thesis.scss b/_sass/thesis.scss
index 5fc5322..7fcd559 100644
--- a/_sass/thesis.scss
+++ b/_sass/thesis.scss
@@ -88,5 +88,11 @@ nav.overall-table-of-contents > ul {
 
 // Page header
 div#page-header {
+    // Make the header sticky. Disabled for now: I don't really like how it looks, but it's fun to play with.
+    // position: sticky;
+    // top: 0px;
+    // background: white;
+    // z-index: 10;
+    // width: 100%;
     p { margin-block-end: 0px;}
 }
\ No newline at end of file
diff --git a/_thesis/1_Introduction/1_Intro.html b/_thesis/1_Introduction/1_Intro.html
index a6efefb..2f5a923 100644
--- a/_thesis/1_Introduction/1_Intro.html
+++ b/_thesis/1_Introduction/1_Intro.html
@@ -595,8 +595,7 @@ to the Falikov-Kimball Model, the Kitaev Honeycomb Model, disorder and
 localisation. Then Chapter 3 introduces and studies the Long Range
 Falikov-Kimball Model in one dimension while Chapter 4 focusses on the
 Amorphous Kitaev Model.</p>
-<p>Next Chapter: <a
-href="../2_Background/2.1_FK_Model.html#the-falikov-kimball-model">2
+<p>Next Chapter: <a href="../2_Background/2.1_FK_Model.html">2
 Background</a></p>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/2_Background/2.1_FK_Model.html b/_thesis/2_Background/2.1_FK_Model.html
index e2c5928..cf3dc3b 100644
--- a/_thesis/2_Background/2.1_FK_Model.html
+++ b/_thesis/2_Background/2.1_FK_Model.html
@@ -138,7 +138,7 @@ href="#ref-gruberFalicovKimballModel2005"
 role="doc-biblioref">9</a>]</span>. The absence of a hopping term for
 the heavy electrons means they do not need the factor of <span
 class="math inline">\(\epsilon_i\)</span>. See appendix <a
-href="../6_Appendices/A.1_Particle_Hole_Symmetry.html#particle-hole-symmetry">A.1</a>
+href="../6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html#particle-hole-symmetry">A.1</a>
 for a full derivation of the PH symmetry.</p>
 <div id="fig:simple_DOS" class="fignos">
 <figure>
@@ -419,9 +419,8 @@ j|^{-\alpha} S_i S_j\)</span> as the exponent of the interaction <span
 class="math inline">\(\alpha\)</span> is varied.</figcaption>
 </figure>
 </div>
-<p>Next Section: <a
-href="../2_Background/2.2_HKM_Model.html#the-kitaev-honeycomb-model">The
-Kitaev Honeycomb Model</a></p>
+<p>Next Section: <a href="../2_Background/2.2_HKM_Model.html">The Kitaev
+Honeycomb Model</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/2_Background/2.2_HKM_Model.html b/_thesis/2_Background/2.2_HKM_Model.html
index f510c47..8007b18 100644
--- a/_thesis/2_Background/2.2_HKM_Model.html
+++ b/_thesis/2_Background/2.2_HKM_Model.html
@@ -140,8 +140,7 @@ role="doc-biblioref">1</a>]</span></li>
 <h2>Phase Diagram</h2>
 <div class="sourceCode" id="cb1"><pre
 class="sourceCode python"><code class="sourceCode python"></code></pre></div>
-<p>Next Section: <a
-href="../2_Background/2.3_Disorder.html#bg-disorder-and-localisation">Disorder
+<p>Next Section: <a href="../2_Background/2.3_Disorder.html">Disorder
 and Localisation</a></p>
 </section>
 </section>
diff --git a/_thesis/2_Background/2.3_Disorder.html b/_thesis/2_Background/2.3_Disorder.html
index 8b2820e..c29b041 100644
--- a/_thesis/2_Background/2.3_Disorder.html
+++ b/_thesis/2_Background/2.3_Disorder.html
@@ -174,8 +174,8 @@ timescales to the infinite limit.</p>
 <p>-link to the Kitaev Model</p>
 <p>-link to the physics of amorphous systems</p>
 <p>Next Chapter: <a
-href="../3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html#fk-model">3
-The Long Range Falikov-Kimball Model</a></p>
+href="../3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html">3 The Long
+Range Falikov-Kimball Model</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html b/_thesis/3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html
index 46036c8..8cfa62b 100644
--- a/_thesis/3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html
+++ b/_thesis/3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html
@@ -272,7 +272,7 @@ href="#ref-fukuiOrderNClusterMonte2009"
 role="doc-biblioref">30</a>]</span>. We only consider even system sizes
 given that odd system sizes are not commensurate with a CDW state.</p>
 <p>Next Section: <a
-href="../3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html#fk-methods">Methods</a></p>
+href="../3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html">Methods</a></p>
 </section>
 <section id="bibliography" class="level1 unnumbered">
 <h1 class="unnumbered">Bibliography</h1>
diff --git a/_thesis/3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html b/_thesis/3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html
index 56fb8f0..50f3598 100644
--- a/_thesis/3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html
+++ b/_thesis/3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html
@@ -29,58 +29,20 @@ image:
 <ul>
 <li><a href="#fk-methods" id="toc-fk-methods">Methods</a>
 <ul>
-<li><a href="#markov-chain-monte-carlo"
-id="toc-markov-chain-monte-carlo">Markov Chain Monte Carlo</a></li>
-<li><a href="#sampling" id="toc-sampling">Sampling</a></li>
-<li><a href="#markov-chains" id="toc-markov-chains">Markov
-Chains</a></li>
+<li><a href="#thermodynamics-of-the-lrfk-model"
+id="toc-thermodynamics-of-the-lrfk-model">Thermodynamics of the LRFK
+Model</a></li>
+<li><a href="#markov-chain-monte-carlo-and-emergent-disorder"
+id="toc-markov-chain-monte-carlo-and-emergent-disorder">Markov Chain
+Monte Carlo and Emergent Disorder</a></li>
 <li><a href="#application-to-the-fk-model"
-id="toc-application-to-the-fk-model">Application to the FK Model</a>
-<ul>
-<li><a href="#markov-chain-monte-carlo-1"
-id="toc-markov-chain-monte-carlo-1">Markov Chain Monte Carlo</a></li>
-</ul></li>
-<li><a href="#the-metropolis-hasting-algorithm"
-id="toc-the-metropolis-hasting-algorithm">The Metropolis-Hasting
-Algorithm</a></li>
-<li><a href="#metropolis-hastings"
-id="toc-metropolis-hastings">Metropolis-Hastings</a></li>
-<li><a href="#convergence-auto-correlation-and-binning"
-id="toc-convergence-auto-correlation-and-binning">Convergence,
-Auto-correlation and Binning</a></li>
-<li><a href="#proposal-distributions"
-id="toc-proposal-distributions">Proposal Distributions</a></li>
-<li><a href="#perturbation-mcmc" id="toc-perturbation-mcmc">Perturbation
-MCMC</a></li>
+id="toc-application-to-the-fk-model">Application to the FK
+Model</a></li>
+<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
+Trick</a></li>
 <li><a href="#scaling" id="toc-scaling">Scaling</a></li>
 <li><a href="#binder-cumulants" id="toc-binder-cumulants">Binder
 Cumulants</a></li>
-<li><a href="#markov-chain-monte-carlo-in-practice"
-id="toc-markov-chain-monte-carlo-in-practice">Markov Chain Monte-Carlo
-in Practice</a>
-<ul>
-<li><a href="#quick-intro-to-mcmc" id="toc-quick-intro-to-mcmc">Quick
-Intro to MCMC</a></li>
-<li><a href="#convergence-time" id="toc-convergence-time">Convergence
-Time</a></li>
-<li><a href="#auto-correlation-time"
-id="toc-auto-correlation-time">Auto-correlation Time</a></li>
-<li><a href="#the-metropolis-hastings-algorithm"
-id="toc-the-metropolis-hastings-algorithm">The Metropolis-Hastings
-Algorithm</a></li>
-</ul></li>
-<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
-Trick</a></li>
-<li><a href="#detailed-balance-for-the-two-step-method"
-id="toc-detailed-balance-for-the-two-step-method">Detailed Balance for
-the two step method</a>
-<ul>
-<li><a href="#two-step-trick-1" id="toc-two-step-trick-1">Two Step
-Trick</a></li>
-<li><a href="#tuning-the-proposal-distribution"
-id="toc-tuning-the-proposal-distribution">Tuning the proposal
-distribution</a></li>
-</ul></li>
 <li><a href="#diagnostics-of-localisation"
 id="toc-diagnostics-of-localisation">Diagnostics of Localisation</a>
 <ul>
@@ -88,20 +50,10 @@ id="toc-diagnostics-of-localisation">Diagnostics of Localisation</a>
 id="toc-inverse-participation-ratio">Inverse Participation
 Ratio</a></li>
 </ul></li>
-<li><a href="#markov-chain-monte-carlo-2"
-id="toc-markov-chain-monte-carlo-2">Markov Chain Monte-Carlo</a></li>
-<li><a href="#convergence-time-1"
-id="toc-convergence-time-1">Convergence Time</a></li>
-<li><a href="#auto-correlation-time-1"
-id="toc-auto-correlation-time-1">Auto-correlation Time</a></li>
-<li><a href="#the-metropolis-hastings-algorithm-1"
-id="toc-the-metropolis-hastings-algorithm-1">The Metropolis-Hastings
-Algorithm</a></li>
-<li><a href="#choosing-the-proposal-distribution"
-id="toc-choosing-the-proposal-distribution">Choosing the proposal
-distribution</a></li>
-<li><a href="#two-step-trick-2" id="toc-two-step-trick-2">Two Step
-Trick</a></li>
+<li><a href="#convergence-time" id="toc-convergence-time">Convergence
+Time</a></li>
+<li><a href="#auto-correlation-time"
+id="toc-auto-correlation-time">Auto-correlation Time</a></li>
 </ul></li>
 <li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
 </ul>
@@ -118,58 +70,20 @@ Trick</a></li>
 <ul>
 <li><a href="#fk-methods" id="toc-fk-methods">Methods</a>
 <ul>
-<li><a href="#markov-chain-monte-carlo"
-id="toc-markov-chain-monte-carlo">Markov Chain Monte Carlo</a></li>
-<li><a href="#sampling" id="toc-sampling">Sampling</a></li>
-<li><a href="#markov-chains" id="toc-markov-chains">Markov
-Chains</a></li>
+<li><a href="#thermodynamics-of-the-lrfk-model"
+id="toc-thermodynamics-of-the-lrfk-model">Thermodynamics of the LRFK
+Model</a></li>
+<li><a href="#markov-chain-monte-carlo-and-emergent-disorder"
+id="toc-markov-chain-monte-carlo-and-emergent-disorder">Markov Chain
+Monte Carlo and Emergent Disorder</a></li>
 <li><a href="#application-to-the-fk-model"
-id="toc-application-to-the-fk-model">Application to the FK Model</a>
-<ul>
-<li><a href="#markov-chain-monte-carlo-1"
-id="toc-markov-chain-monte-carlo-1">Markov Chain Monte Carlo</a></li>
-</ul></li>
-<li><a href="#the-metropolis-hasting-algorithm"
-id="toc-the-metropolis-hasting-algorithm">The Metropolis-Hasting
-Algorithm</a></li>
-<li><a href="#metropolis-hastings"
-id="toc-metropolis-hastings">Metropolis-Hastings</a></li>
-<li><a href="#convergence-auto-correlation-and-binning"
-id="toc-convergence-auto-correlation-and-binning">Convergence,
-Auto-correlation and Binning</a></li>
-<li><a href="#proposal-distributions"
-id="toc-proposal-distributions">Proposal Distributions</a></li>
-<li><a href="#perturbation-mcmc" id="toc-perturbation-mcmc">Perturbation
-MCMC</a></li>
+id="toc-application-to-the-fk-model">Application to the FK
+Model</a></li>
+<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
+Trick</a></li>
 <li><a href="#scaling" id="toc-scaling">Scaling</a></li>
 <li><a href="#binder-cumulants" id="toc-binder-cumulants">Binder
 Cumulants</a></li>
-<li><a href="#markov-chain-monte-carlo-in-practice"
-id="toc-markov-chain-monte-carlo-in-practice">Markov Chain Monte-Carlo
-in Practice</a>
-<ul>
-<li><a href="#quick-intro-to-mcmc" id="toc-quick-intro-to-mcmc">Quick
-Intro to MCMC</a></li>
-<li><a href="#convergence-time" id="toc-convergence-time">Convergence
-Time</a></li>
-<li><a href="#auto-correlation-time"
-id="toc-auto-correlation-time">Auto-correlation Time</a></li>
-<li><a href="#the-metropolis-hastings-algorithm"
-id="toc-the-metropolis-hastings-algorithm">The Metropolis-Hastings
-Algorithm</a></li>
-</ul></li>
-<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
-Trick</a></li>
-<li><a href="#detailed-balance-for-the-two-step-method"
-id="toc-detailed-balance-for-the-two-step-method">Detailed Balance for
-the two step method</a>
-<ul>
-<li><a href="#two-step-trick-1" id="toc-two-step-trick-1">Two Step
-Trick</a></li>
-<li><a href="#tuning-the-proposal-distribution"
-id="toc-tuning-the-proposal-distribution">Tuning the proposal
-distribution</a></li>
-</ul></li>
 <li><a href="#diagnostics-of-localisation"
 id="toc-diagnostics-of-localisation">Diagnostics of Localisation</a>
 <ul>
@@ -177,20 +91,10 @@ id="toc-diagnostics-of-localisation">Diagnostics of Localisation</a>
 id="toc-inverse-participation-ratio">Inverse Participation
 Ratio</a></li>
 </ul></li>
-<li><a href="#markov-chain-monte-carlo-2"
-id="toc-markov-chain-monte-carlo-2">Markov Chain Monte-Carlo</a></li>
-<li><a href="#convergence-time-1"
-id="toc-convergence-time-1">Convergence Time</a></li>
-<li><a href="#auto-correlation-time-1"
-id="toc-auto-correlation-time-1">Auto-correlation Time</a></li>
-<li><a href="#the-metropolis-hastings-algorithm-1"
-id="toc-the-metropolis-hastings-algorithm-1">The Metropolis-Hastings
-Algorithm</a></li>
-<li><a href="#choosing-the-proposal-distribution"
-id="toc-choosing-the-proposal-distribution">Choosing the proposal
-distribution</a></li>
-<li><a href="#two-step-trick-2" id="toc-two-step-trick-2">Two Step
-Trick</a></li>
+<li><a href="#convergence-time" id="toc-convergence-time">Convergence
+Time</a></li>
+<li><a href="#auto-correlation-time"
+id="toc-auto-correlation-time">Auto-correlation Time</a></li>
 </ul></li>
 <li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
 </ul>
@@ -198,76 +102,171 @@ Trick</a></li>
  -->
 
 <!-- Main Page Body -->
+<div id="page-header">
+<p>3 The Long Range Falikov-Kimball Model</p>
+<hr />
+</div>
 <section id="fk-methods" class="level1">
 <h1>Methods</h1>
-<section id="markov-chain-monte-carlo" class="level2">
-<h2>Markov Chain Monte Carlo</h2>
+<section id="thermodynamics-of-the-lrfk-model" class="level2">
+<h2>Thermodynamics of the LRFK Model</h2>
+<p>The results for the phase diagram were obtained with a classical
+Markov Chain Monte Carlo (MCMC) method which we discuss in the
+following. It allows us to solve our long-range FK model efficiently,
+yielding unbiased estimates of thermal expectation values and linking it
+to disorder physics in a translationally invariant setting.</p>
+<p>Since the spin configurations are classical, the Hamiltonian can be
+split into a classical spin part <span
+class="math inline">\(H_s\)</span> and an operator valued part <span
+class="math inline">\(H_c\)</span>.</p>
+<p><span class="math display">\[\begin{aligned}
+H_s&amp; = - \frac{U}{2} \sum_{i} S_i + \sum_{i, j}^{N} J_{ij} S_i S_j \\
+H_c&amp; = \sum_i U S_i c^\dagger_{i}c_{i} -t(c^\dagger_{i}c_{i+1} +
+c^\dagger_{i+1}c_{i}) \end{aligned}\]</span></p>
+<p>The partition function can then be written as a sum over spin
+configurations, <span class="math inline">\(\vec{S} = (S_0,
+S_1, \ldots, S_{N-1})\)</span>:</p>
+<p><span class="math display">\[\begin{aligned}
+\mathcal{Z} = \mathrm{Tr} e^{-\beta H}= \sum_{\vec{S}} e^{-\beta H_s}
+\mathrm{Tr}_c e^{-\beta H_c} .\end{aligned}\]</span></p>
+<p>The contribution of <span class="math inline">\(H_c\)</span> to the
+grand canonical partition function can be obtained by performing the sum
+over eigenstate occupation numbers giving <span
+class="math inline">\(-\beta F_c[\vec{S}] = \sum_k \ln{(1 + e^{- \beta
+\epsilon_k})}\)</span> where <span
+class="math inline">\({\epsilon_k[\vec{S}]}\)</span> are the eigenvalues
+of the matrix representation of <span class="math inline">\(H_c\)</span>
+determined through exact diagonalisation. This gives a partition
+function containing a classical energy which corresponds to the
+long-range interaction of the spins, and a free energy which corresponds
+to the quantum subsystem. <span class="math display">\[\begin{aligned}
+\mathcal{Z} = \sum_{\vec{S}} e^{-\beta H_s[\vec{S}] - \beta
+F_c[\vec{S}]} = \sum_{\vec{S}} e^{-\beta
+E[\vec{S}]}\end{aligned}\]</span></p>
 </section>
-<section id="sampling" class="level2">
-<h2>Sampling</h2>
-<p>Markov Chain Monte Carlo (MCMC) is a useful method whenever we have a
-probability distribution that we want to sample from but there is not
-direct sampling way to do so.</p>
-<p>In almost any computer simulation the ultimate source of randomness
-is a stream of (close to) uniform, uncorrelated bits generated from a
-pseudo random number generator. A direct sampling method takes such a
-source and outputs uncorrelated samples from the target distribution.
-The fact they’re uncorrelated is key as we’ll see later. Examples of
-direct sampling methods range from the trivial: take n random bits to
-generate integers uniformly between 0 and <span
-class="math inline">\(2^n\)</span> to more complex methods such as
-inverse transform sampling and rejection sampling <span class="citation"
-data-cites="devroyeRandomSampling1986"> [<a
-href="#ref-devroyeRandomSampling1986"
-role="doc-biblioref">1</a>]</span>.</p>
-<p>In physics the distribution we usually want to sample from is the
-Boltzmann probability over states of the system <span
-class="math inline">\(S\)</span>: <span class="math display">\[
-\begin{aligned}
-p(S)  &amp;= \frac{1}{\mathcal{Z}} e^{-\beta H(S)} \\
-\end{aligned}
-\]</span> where <span class="math inline">\(\mathcal{Z} = \sum_S
-e^{-\beta H(S)}\)</span> is the normalisation factor and ubiquitous
-partition function. In principle we could directly sample from this, for
-a discrete system there are finitely many choices. We could calculate
-the probability of each one and assign each a region of the unit
-interval which we could then sample uniformly from.</p>
-<p>However if we actually try to do this we will run into two problems,
-we can’t calculate <span class="math inline">\(\mathcal{Z}\)</span> for
-any reasonably sized systems because the state space grows exponentially
-with system size. Even if we could calculate <span
-class="math inline">\(\mathcal{Z}\)</span>, sampling from an
-exponentially large number of options quickly become tricky. This kind
-of problem happens in many other disciplines too, particularly when
-fitting statistical models using Bayesian inference <span
-class="citation" data-cites="BMCP2021"> [<a href="#ref-BMCP2021"
-role="doc-biblioref">2</a>]</span>.</p>
-</section>
-<section id="markov-chains" class="level2">
-<h2>Markov Chains</h2>
-<p>So what can we do? Well it turns out that if we’re willing to give up
-in the requirement that the samples be uncorrelated then we can use MCMC
-instead.</p>
-<p>MCMC defines a weighted random walk over the states <span
-class="math inline">\((S_0, S_1, S_2, ...)\)</span>, such that in the
-long time limit, states are visited according to their probability <span
-class="math inline">\(p(S)\)</span>. <span class="citation"
+<section id="markov-chain-monte-carlo-and-emergent-disorder"
+class="level2">
+<h2>Markov Chain Monte Carlo and Emergent Disorder</h2>
+<p>Classical MCMC defines a weighted random walk over the spin states
+<span class="math inline">\((\vec{S}_0, \vec{S}_1, \vec{S}_2,
+...)\)</span>, such that the likelihood of visiting a particular state
+converges to its Boltzmann probability <span
+class="math inline">\(p(\vec{S}) = \mathcal{Z}^{-1} e^{-\beta
+E}\)</span> <span class="citation"
 data-cites="binderGuidePracticalWork1988 kerteszAdvancesComputerSimulation1998 wolffMonteCarloErrors2004"> [<a
-href="#ref-binderGuidePracticalWork1988" role="doc-biblioref">3</a>–<a
+href="#ref-binderGuidePracticalWork1988" role="doc-biblioref">1</a>–<a
 href="#ref-wolffMonteCarloErrors2004"
-role="doc-biblioref">5</a>]</span>.</p>
-<p>In a physics context this lets us evaluate any observable with a mean
-over the states visited by the walk. <span
+role="doc-biblioref">3</a>]</span>. Hence, any observable can be
+estimated as a mean over the states visited by the walk, <span
 class="math display">\[\begin{aligned}
-\langle O \rangle &amp; = \sum_{S} p(S) \langle O \rangle_{S} = \sum_{i
-= 0}^{M} \expval{O}_{S_i} + \mathcal{O}(\tfrac{1}{\sqrt{M}})\\
+\label{eq:thermal_expectation}
+\langle O \rangle &amp; = \sum_{\vec{S}} p(\vec{S}) \langle O
+\rangle_{\vec{S}}\\
+                  &amp; = \frac{1}{M} \sum_{i = 1}^{M} \langle O\rangle_{\vec{S}_i}
+\pm \mathcal{O}(\tfrac{1}{\sqrt{M}})
+\end{aligned}\]</span> where the former sum runs over the entire state
+space while the latter runs over the states visited by a particular
+MCMC run.</p>
+<p><span class="math display">\[\begin{aligned}
+\langle O \rangle_{\vec{S}}&amp; = \sum_{\nu} n_F(\epsilon_{\nu})
+\langle O \rangle_{\nu}
 \end{aligned}\]</span></p>
+<p>Here <span class="math inline">\(\nu\)</span> runs over the
+eigenstates of <span class="math inline">\(H_c\)</span> for a particular
+spin configuration and <span class="math inline">\(n_F(\epsilon) =
+\left(e^{\beta\epsilon} + 1\right)^{-1}\)</span> is the Fermi
+function.</p>
 <p>The choice of the transition function for MCMC is under-determined as
 one only needs to satisfy a set of balance conditions for which there
 are many solutions <span class="citation"
 data-cites="kellyReversibilityStochasticNetworks1981"> [<a
 href="#ref-kellyReversibilityStochasticNetworks1981"
-role="doc-biblioref">6</a>]</span>.</p>
+role="doc-biblioref">4</a>]</span>. Here, we incorporate a modification
+to the standard Metropolis-Hastings algorithm <span class="citation"
+data-cites="hastingsMonteCarloSampling1970"> [<a
+href="#ref-hastingsMonteCarloSampling1970"
+role="doc-biblioref">5</a>]</span> gleaned from Krauth <span
+class="citation" data-cites="krauthIntroductionMonteCarlo1998"> [<a
+href="#ref-krauthIntroductionMonteCarlo1998"
+role="doc-biblioref">6</a>]</span>. Let us first recall the standard
+algorithm which decomposes the transition probability into <span
+class="math inline">\(\mathcal{T}(a \to b) = p(a \to b)\mathcal{A}(a \to
+b)\)</span>. Here, <span class="math inline">\(p\)</span> is the
+proposal distribution that we can directly sample from while <span
+class="math inline">\(\mathcal{A}\)</span> is the acceptance
+probability. The standard Metropolis-Hastings choice is <span
+class="math display">\[\mathcal{A}(a \to b) = \min\left(1, \frac{p(b\to
+a)}{p(a\to b)} e^{-\beta \Delta E}\right)\;,\]</span> with <span
+class="math inline">\(\Delta E = E_b - E_a\)</span>. The walk then
+proceeds by sampling a state <span class="math inline">\(b\)</span> from
+<span class="math inline">\(p\)</span> and moving to <span
+class="math inline">\(b\)</span> with probability <span
+class="math inline">\(\mathcal{A}(a \to b)\)</span>. The latter
+operation is typically implemented by performing a transition if a
+uniform random sample from the unit interval is less than <span
+class="math inline">\(\mathcal{A}(a \to b)\)</span> and otherwise
+repeating the current state as the next step in the random walk. The
+proposal distribution is often symmetric, in which case it cancels out of <span
+class="math inline">\(\mathcal{A}\)</span>. Here, we flip a small number
+of sites in <span class="math inline">\(a\)</span> at random to generate
+the proposal <span class="math inline">\(b\)</span>, which is indeed a symmetric choice.</p>
+<p>In our computations <span class="citation"
+data-cites="hodsonMCMCFKModel2021"> [<a
+href="#ref-hodsonMCMCFKModel2021" role="doc-biblioref">7</a>]</span> we
+employ a modification of the algorithm which is based on the observation
+that the free energy of the <span data-acronym-label="FK"
+data-acronym-form="singular+short">FK</span> system is composed of a
+classical part which is much quicker to compute than the quantum part.
+Hence, we can obtain a computational speedup by first considering the
+value of the classical energy difference <span
+class="math inline">\(\Delta H_s\)</span> and rejecting the transition
+if this difference is too high. We only compute the quantum energy difference
+<span class="math inline">\(\Delta F_c\)</span> if the transition passes
+this first test, and then perform a second rejection sampling step based
+upon it. This corresponds to two nested comparisons, with the majority of
+the work only occurring if the first test passes, and has the acceptance
+function <span class="math display">\[\mathcal{A}(a \to b) =
+\min\left(1, e^{-\beta \Delta H_s}\right)\min\left(1, e^{-\beta \Delta
+F_c}\right)\;.\]</span></p>
+<p>See Appendix <a href="#app:balance" data-reference-type="ref"
+data-reference="app:balance">[app:balance]</a> for a proof that this
+satisfies the detailed balance condition.</p>
+<p>For the model parameters used in Fig. <a href="#fig:indiv_IPR"
+data-reference-type="ref" data-reference="fig:indiv_IPR">1</a>, we find
+that with our new scheme the matrix diagonalisation is skipped around
+30% of the time at <span class="math inline">\(T = 2.5\)</span> and up
+to 80% at <span class="math inline">\(T = 1.5\)</span>. We observe that
+for <span class="math inline">\(N = 50\)</span>, the matrix
+diagonalisation, if it occurs, occupies around 60% of the total
+computation time for a single step. This rises to 90% at <span class="math inline">\(N = 300\)</span> and
+further increases for larger <span class="math inline">\(N\)</span>. We therefore get the greatest speedup
+for large system sizes at low temperature where many prospective
+transitions are rejected at the classical stage and the matrix
+computation takes up the greatest fraction of the total computation
+time. The upshot is that we find a speedup of up to a factor of 10 at
+the cost of very little extra algorithmic complexity.</p>
+<p>Our two-step method should be distinguished from the more common
+method for speeding up <span data-acronym-label="MCMC"
+data-acronym-form="singular+short">MCMC</span> which is to add asymmetry
+to the proposal distribution to make it as similar as possible to <span
+class="math inline">\(\min\left(1, e^{-\beta \Delta E}\right)\)</span>.
+This reduces the number of rejected states, which brings the algorithm
+closer in efficiency to a direct sampling method. However it comes at
+the expense of requiring a way to directly sample from this complex
+distribution, a problem which <span data-acronym-label="MCMC"
+data-acronym-form="singular+short">MCMC</span> was employed to solve in
+the first place. For example, recent work trains restricted Boltzmann
+machines (RBMs) to generate samples for the proposal distribution of the
+<span data-acronym-label="FK"
+data-acronym-form="singular+short">FK</span> model <span
+class="citation" data-cites="huangAcceleratedMonteCarlo2017"> [<a
+href="#ref-huangAcceleratedMonteCarlo2017"
+role="doc-biblioref">8</a>]</span>. The RBMs are chosen as a
+parametrisation of the proposal distribution that can be efficiently
+sampled from while offering sufficient flexibility that they can be
+adjusted to match the target distribution. Our proposed method is
+considerably simpler and does not require training while still reaping
+some of the benefits of reduced computation.</p>
 </section>
 <section id="application-to-the-fk-model" class="level2">
 <h2>Application to the FK Model</h2>
@@ -302,206 +301,67 @@ to the quantum subsystem. <span class="math display">\[\begin{aligned}
 \mathcal{Z} = \sum_{\vec{S}} e^{-\beta H_S[\vec{S}] - \beta
 F_c[\vec{S}]} = \sum_{\vec{S}} e^{-\beta E[\vec{S}]}
 \end{aligned}\]</span></p>
-<section id="markov-chain-monte-carlo-1" class="level3">
-<h3>Markov Chain Monte Carlo</h3>
-<p>Markov Chain Monte Carlo (MCMC) is a technique for evaluating thermal
-expectation values <span class="math inline">\(\expval{O}\)</span> with
-respect to some physical system defined by a set of states <span
-class="math inline">\(\{x: x \in S\}\)</span> and a free energy <span
-class="math inline">\(F(x)\)</span> <span class="citation"
-data-cites="krauthIntroductionMonteCarlo1998"> [<a
+</section>
+<section id="two-step-trick" class="level2">
+<h2>Two Step Trick</h2>
+<p>Here, we incorporate a modification to the standard
+Metropolis-Hastings algorithm <span class="citation"
+data-cites="hastingsMonteCarloSampling1970"> [<a
+href="#ref-hastingsMonteCarloSampling1970"
+role="doc-biblioref">5</a>]</span> gleaned from Krauth <span
+class="citation" data-cites="krauthIntroductionMonteCarlo1998"> [<a
 href="#ref-krauthIntroductionMonteCarlo1998"
-role="doc-biblioref">7</a>]</span>. The thermal expectation value is
-defined via a Boltzmann weighted sum over the entire states: <span
-class="math display">\[
-\begin{aligned}
-    \expval{O} &amp;= \frac{1}{\mathcal{Z}} \sum_{x \in S} O(x) P(x) \\
-    P(x) &amp;= \frac{1}{\mathcal{Z}} e^{-\beta F(x)} \\
-    \mathcal{Z} &amp;= \sum_{x \in S} e^{-\beta F(x)}
-\end{aligned}
-\]</span></p>
-<p>When the state space is too large to evaluate this sum directly, MCMC
-defines a stochastic algorithm which generates a random walk <span
-class="math inline">\(\{x_0\ldots x_i\ldots x_N\}\)</span> whose
-distribution <span class="math inline">\(p(x_i)\)</span> approaches a
-target distribution <span class="math inline">\(P(x)\)</span> in the
-large N limit.</p>
-<p><span class="math display">\[\lim_{i\to\infty} p(x_i) =
-P(x)\]</span></p>
-<p>In this case the target distribution will be the thermal one <span
-class="math inline">\(P(x) \rightarrow \mathcal{Z}^{-1} e^{-\beta
-F(x)}\)</span>. The major benefit of the method being that as long as
-one can express the desired <span class="math inline">\(P(x)\)</span> up
-to a multiplicative constant, MCMC can be applied. Since <span
-class="math inline">\(e^{-\beta F(x)}\)</span> is relatively easy to
-evaluate, MCMC provides a useful method for finite temperature
-physics.</p>
-<p>Once the random walk has been carried out for many steps, the
-expectation values of <span class="math inline">\(O\)</span> can be
-estimated from the MCMC samples: <span class="math display">\[
-    \expval{O} = \sum_{i = 0}^{N} O(x_i) +
-\mathcal{O}(\frac{1}{\sqrt{N}})
-\]</span> The the samples in the random walk are correlated so the
-samples effectively contain less information than <span
-class="math inline">\(N\)</span> independent samples would. As a
-consequence the variance is larger than the <span
-class="math inline">\(\expval{O^2} - \expval{O}^2\)</span> form it would
-have if the estimates were uncorrelated. Methods of estimating the true
-variance of <span class="math inline">\(\expval{O}\)</span> and decided
-how many steps are needed will be considered later.</p>
-</section>
-</section>
-<section id="the-metropolis-hasting-algorithm" class="level2">
-<h2>The Metropolis-Hasting Algorithm</h2>
-<p>Markov chains are defined by a transition function $(x_{i+1} x_i) $
-giving the probability that a chain in state <span
-class="math inline">\(x_i\)</span> at time <span
-class="math inline">\(i\)</span> will transition to a state <span
-class="math inline">\(x_{i+1}\)</span>. Since we must transition
-somewhere at each step, this comes with the normalisation condition that
-<span class="math inline">\(\sum\limits_x \mathcal{T}(x&#39; \rightarrow
-x) = 1\)</span>.</p>
-<p>If we define an ensemble of Markov chains and consider the
-distributions we get a sequence of distributions <span
-class="math inline">\(\{p_0(x), p_1(x), p_2(x)\ldots\}\)</span> with
-<span class="math display">\[\begin{aligned}
-p_{i+1}(x) &amp;= \sum_{x&#39; \in S} p_i(x&#39;) \mathcal{T}(x&#39;
-\rightarrow x)
-\end{aligned}\]</span> <span class="math inline">\(p_o(x)\)</span> might
-be a delta function on one particular starting state which would then be
-smoothed out by the transition function repeatedly.</p>
-<p>As we’d like to draw samples from the target distribution <span
-class="math inline">\(P(x)\)</span> the trick is to choose $(x_{i+1}
-x_i) $ such that :</p>
-<p><span class="math display">\[\begin{aligned}
-P(x) &amp;= \sum_{x&#39;} P(x&#39;) \mathcal{T}(x&#39; \rightarrow x)
-\end{aligned}
-\]</span> In other words the MCMC dynamics defined by <span
-class="math inline">\(\mathcal{T}\)</span> must be constructed to have
-the target distribution as their only fixed point. This condition is
-called the global balance condition. Along with some more technical
-considerations such as ergodicity which won’t be considered here, global
-balance suffices to ensure that a MCMC method is correct.</p>
-<p>A sufficient but not necessary condition for global balance to hold
-is detailed balance:</p>
-<p><span class="math display">\[
-P(x) \mathcal{T}(x \rightarrow x&#39;) = P(x&#39;) \mathcal{T}(x&#39;
-\rightarrow x)
-\]</span> % In practice most algorithms are constructed to satisfy
-detailed balance though there are arguments that relaxing the condition
-can lead to faster algorithms <span class="citation"
-data-cites="kapferSamplingPolytopeHarddisk2013"> [<a
-href="#ref-kapferSamplingPolytopeHarddisk2013"
-role="doc-biblioref">8</a>]</span>.</p>
-<p>The goal of MCMC is then to choose <span
-class="math inline">\(\mathcal{T}\)</span> so that it has the desired
-thermal distribution <span class="math inline">\(P(x)\)</span> as its
-fixed point and that it converges quickly onto it. This boils down to
-requiring that the matrix representation of <span
-class="math inline">\(T_{ij} = \mathcal{T}(x_i \to x_j)\)</span> has an
-eigenvector equal to <span class="math inline">\(P_i = P(x_i)\)</span>
-with eigenvalue 1 and all other eigenvalues with magnitude less than
-one. The convergence time depends on the magnitude of the second largest
-eigenvalue.</p>
-</section>
-<section id="metropolis-hastings" class="level2">
-<h2>Metropolis-Hastings</h2>
-<p>In order to actually choose new states according to <span
-class="math inline">\(\mathcal{T}\)</span> one chooses states from a
-proposal distribution <span class="math inline">\(q(x_i \to
-x&#39;)\)</span> that can be directly sampled from. For instance, this
-might mean flipping a single random spin in a spin chain, in which case
-<span class="math inline">\(q(x_i \to x&#39;)\)</span> is the uniform
-distribution on states reachable by one spin flip from <span
-class="math inline">\(x_i\)</span>. The proposal <span
-class="math inline">\(x&#39;\)</span> is then accepted or rejected with
-an acceptance probability <span
-class="math inline">\(\mathcal{A}(x_i \to x&#39;)\)</span>. If the
-proposal is rejected then <span class="math inline">\(x_{i+1} =
-x_{i}\)</span>. For <span class="math inline">\(x&#39; \neq x\)</span> we now have <span class="math inline">\(\mathcal{T}(x\to x&#39;)
-= q(x\to x&#39;)\mathcal{A}(x \to x&#39;)\)</span>.</p>
-<p>The Metropolis-Hastings algorithm is a slight extension of the
-original Metropolis algorithm that allows for non-symmetric proposal
-distributions <span class="math inline">\(q(x \to x&#39;) \neq q(x&#39; \to x)\)</span>. It can be derived starting from detailed
-balance <span class="citation"
-data-cites="krauthIntroductionMonteCarlo1998"> [<a
-href="#ref-krauthIntroductionMonteCarlo1998"
-role="doc-biblioref">7</a>]</span>: <span
-class="math display">\[\begin{aligned}
-P(x)\mathcal{T}(x \to x&#39;) &amp;= P(x&#39;)\mathcal{T}(x&#39; \to x)
-\\
-P(x)q(x \to x&#39;)\mathcal{A}(x \to x&#39;) &amp;= P(x&#39;)q(x&#39;
-\to x)\mathcal{A}(x&#39; \to x) \\
-\label{eq:db2} \frac{\mathcal{A}(x \to x&#39;)}{\mathcal{A}(x&#39; \to
-x)} &amp;= \frac{P(x&#39;)q(x&#39; \to x)}{P(x)q(x \to x&#39;)} = f(x,
-x&#39;)\\
-\end{aligned}
-\]</span> The Metropolis-Hastings algorithm is the choice: <span
-class="math display">\[
-\begin{aligned}
-\label{eq:mh}
-\mathcal{A}(x \to x&#39;) = \min\left(1, f(x,x&#39;)\right)
-\end{aligned}
-\]</span> Noting that <span class="math inline">\(f(x,x&#39;) =
-1/f(x&#39;,x)\)</span>, Eq. <span
-class="math inline">\(\ref{eq:mh}\)</span> can be seen to satisfy Eq.
-<span class="math inline">\(\ref{eq:db2}\)</span> by considering the two
-cases <span class="math inline">\(f(x,x&#39;) &gt; 1\)</span> and <span
-class="math inline">\(f(x,x&#39;) &lt; 1\)</span>.</p>
-<p>By choosing the proposal distribution such that <span
-class="math inline">\(f(x,x&#39;)\)</span> is as close as possible to
-one, the rate of rejections can be reduced and the algorithm sped
-up.</p>
-</section>
-<section id="convergence-auto-correlation-and-binning" class="level2">
-<h2>Convergence, Auto-correlation and Binning</h2>
-<p>%Thinning, burn in, multiple runs</p>
-</section>
-<section id="proposal-distributions" class="level2">
-<h2>Proposal Distributions</h2>
-<p>In an MCMC method a key property is the proportion of the time that
-proposals are accepted, the acceptance rate. If this rate is too low, the
-random walk is trying to take overly large steps in energy space, which is
-problematic because it means very few new samples will be generated. If
-it is too high it implies the steps are too small, a problem because
-then the walk will take longer to explore the state space and the
-samples will be highly correlated. Ideal values for the acceptance rate
-can be calculated under certain assumptions <span class="citation"
-data-cites="robertsWeakConvergenceOptimal1997"> [<a
-href="#ref-robertsWeakConvergenceOptimal1997"
-role="doc-biblioref">9</a>]</span>. Here we monitor the acceptance rate
-and if it is too high we re-run the MCMC with a modified proposal
-distribution that has a chance to propose moves that flip multiple sites
-at a time.</p>
-<p>In addition we exploit the particle-hole symmetry of the problem by
-occasionally proposing a flip of the entire state. This works because
-near half-filling, flipping the occupations of all the sites will
-produce a state at or near the energy of the current one.</p>
-</section>
-<section id="perturbation-mcmc" class="level2">
-<h2>Perturbation MCMC</h2>
-<p>The matrix diagonalisation is the most computationally expensive step
-of the process. A speed-up can be obtained by modifying the proposal
-distribution to depend on the classical part of the energy, a trick
-gleaned from Ref. <span class="citation"
-data-cites="krauthIntroductionMonteCarlo1998"> [<a
-href="#ref-krauthIntroductionMonteCarlo1998"
-role="doc-biblioref">7</a>]</span>: <span class="math display">\[
-\begin{aligned}
-q(k \to k&#39;) &amp;= \min\left(1, e^{-\beta (H^{k&#39;} - H^k)}\right)
-\\
-\mathcal{A}(k \to k&#39;) &amp;= \min\left(1, e^{-\beta(F^{k&#39;}-
-F^k)}\right)
-\end{aligned}\]</span> This allows the method to reject some states
-without performing the diagonalisation, at no cost to the accuracy of the
-MCMC method.</p>
-<p>An extension of this idea is to try to define a classical model with
-a similar free energy dependence on the classical state as the full
-quantum, Ref. <span class="citation"
+role="doc-biblioref">6</a>]</span>.</p>
+<p>In our computations <span class="citation"
+data-cites="hodsonMCMCFKModel2021"> [<a
+href="#ref-hodsonMCMCFKModel2021" role="doc-biblioref">7</a>]</span> we
+employ a modification of the algorithm which is based on the observation
+that the free energy of the FK system is composed of a classical part
+which is much quicker to compute than the quantum part. Hence, we can
+obtain a computational speedup by first evaluating the
+classical energy difference <span class="math inline">\(\Delta
+H_s\)</span> and rejecting the transition if it fails a first Metropolis test. We
+only compute the quantum free energy difference <span
+class="math inline">\(\Delta F_c\)</span> if the proposal passes this first test.
+We then perform a second rejection sampling step based upon it. This
+corresponds to two nested comparisons, with the majority of the work only
+occurring if the first test passes, and has the acceptance function <span
+class="math display">\[\mathcal{A}(a \to b) = \min\left(1, e^{-\beta
+\Delta H_s}\right)\min\left(1, e^{-\beta \Delta
+F_c}\right)\;.\]</span></p>
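+<p>As a rough illustration of the two nested comparisons, a minimal Python
+sketch is given here. The name <code>two_step_accept</code>, its arguments and
+the use of a callable for the expensive part are hypothetical choices made for
+this example and are not taken from the simulation code. The sketch assumes a
+symmetric proposal distribution.</p>

```python
import math
import random


def two_step_accept(d_h_classical, quantum_free_energy_change, beta, rng=random):
    """Return True if a proposed move passes both Metropolis tests.

    d_h_classical: classical energy change of the proposed move (cheap).
    quantum_free_energy_change: zero-argument callable returning the quantum
        free energy change. It is only called if the classical test passes.
    beta: inverse temperature.
    """
    # First test, min(1, exp(-beta * dH_s)), uses only the cheap classical part.
    if d_h_classical > 0 and rng.random() >= math.exp(-beta * d_h_classical):
        return False  # rejected early, the expensive part is never evaluated

    # Second test, min(1, exp(-beta * dF_c)), needs the expensive calculation.
    d_f_quantum = quantum_free_energy_change()
    if d_f_quantum > 0 and rng.random() >= math.exp(-beta * d_f_quantum):
        return False
    return True
```

+<p>If the function returns <code>False</code> the walker stays in its current
+state for that step, exactly as for a rejection in the single-step
+algorithm.</p>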
+<p>For the model parameters <span class="math inline">\(U=2/5, T = 1.5 /
+2.5, J = 5,\;\alpha = 1.25\)</span>, we find that with our new scheme
+the matrix diagonalisation is skipped around 30% of the time at <span
+class="math inline">\(T = 2.5\)</span> and up to 80% at <span
+class="math inline">\(T = 1.5\)</span>. We observe that for <span
+class="math inline">\(N = 50\)</span>, the matrix diagonalisation, if it
+occurs, occupies around 60% of the total computation time for a single
+step. This rises to 90% at <span class="math inline">\(N = 300\)</span> and further increases for larger <span class="math inline">\(N\)</span>.
+We therefore get the greatest speedup for large system sizes at low
+temperature where many prospective transitions are rejected at the
+classical stage and the matrix computation takes up the greatest
+fraction of the total computation time. The upshot is that we find a
+speedup of up to a factor of 10 at the cost of very little extra
+algorithmic complexity.</p>
+<p>Our two-step method should be distinguished from the more common
+method for speeding up MCMC, which is to add asymmetry to the proposal
+distribution to make it as similar as possible to <span
+class="math inline">\(\min\left(1, e^{-\beta \Delta E}\right)\)</span>.
+This reduces the number of rejected states, which brings the algorithm
+closer in efficiency to a direct sampling method. However, it comes at
+the expense of requiring a way to directly sample from this complex
+distribution, a problem which MCMC was employed to solve in the first
+place. For example, recent work trains restricted Boltzmann machines
+(RBMs) to generate samples for the proposal distribution of the FK
+model <span class="citation"
 data-cites="huangAcceleratedMonteCarlo2017"> [<a
 href="#ref-huangAcceleratedMonteCarlo2017"
-role="doc-biblioref">10</a>]</span> does this with restricted Boltzmann
-machines whose form is very similar to a classical spin model.</p>
+role="doc-biblioref">8</a>]</span>. The RBMs are chosen as a
+parametrisation of the proposal distribution that can be efficiently
+sampled from while offering sufficient flexibility that they can be
+adjusted to match the target distribution. Our proposed method is
+considerably simpler and does not require training while still reaping
+some of the benefits of reduced computation.</p>
 </section>
 <section id="scaling" class="level2">
 <h2>Scaling</h2>
@@ -529,429 +389,10 @@ phase transition. If multiple such curves are plotted for different
 system sizes, a crossing indicates the location of a critical
 point <span class="citation"
 data-cites="binderFiniteSizeScaling1981 musialMonteCarloSimulations2002"> [<a
-href="#ref-binderFiniteSizeScaling1981" role="doc-biblioref">11</a>,<a
+href="#ref-binderFiniteSizeScaling1981" role="doc-biblioref">9</a>,<a
 href="#ref-musialMonteCarloSimulations2002"
 role="doc-biblioref"><strong>musialMonteCarloSimulations2002?</strong></a>]</span>.</p>
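+<p>As a minimal sketch, and assuming that the curves in question are Binder
+cumulants <span class="math inline">\(U = 1 - \langle m^4 \rangle / (3 \langle
+m^2 \rangle^2)\)</span> of the order parameter <span
+class="math inline">\(m\)</span> (an assumption, since the definition is not
+restated here), the quantity could be estimated from MCMC samples as follows.
+The function name <code>binder_cumulant</code> is hypothetical.</p>

```python
def binder_cumulant(m_samples):
    """Estimate U = 1 - <m^4> / (3 <m^2>^2) from samples of the order parameter."""
    n = len(m_samples)
    if n == 0:
        raise ValueError("need at least one sample")
    # Sample averages of the second and fourth moments of m.
    m2 = sum(m * m for m in m_samples) / n
    m4 = sum(m ** 4 for m in m_samples) / n
    if m2 == 0:
        raise ValueError("all samples are zero, cumulant undefined")
    return 1.0 - m4 / (3.0 * m2 * m2)
```

+<p>Evaluating this for several system sizes over a range of temperatures and
+plotting the results together would give the family of curves whose crossing
+is referred to above.</p>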
 </section>
-<section id="markov-chain-monte-carlo-in-practice" class="level2">
-<h2>Markov Chain Monte-Carlo in Practice</h2>
-<section id="quick-intro-to-mcmc" class="level3">
-<h3>Quick Intro to MCMC</h3>
-<p>The main paper relies extensively on MCMC to evaluate thermal expectation
-values within the model by walking over states of the classical spin
-system <span class="math inline">\(S_i\)</span>. For a classical system,
-the thermal expectation value of some operator <span
-class="math inline">\(O\)</span> is defined by a Boltzmann weighted sum
-over the classical state space: <span
-class="math display">\[\begin{aligned}
-    \langle O \rangle &amp;= \sum_{\s \in S} O(\s) P(\s) \\
-    P(\s) &amp;= \frac{1}{\mathcal{Z}} e^{-\beta F(\s)} \\
-    \mathcal{Z} &amp;= \sum_{\s \in S} e^{-\beta F(\s)}
-\end{aligned}\]</span> While for a quantum system these sums are
-replaced by equivalent traces. The obvious approach to evaluate these
-sums numerically would be to directly loop over all the classical states
-in the system and perform the sum. But we all know why this isn’t
-feasible: the state space is too large! Indeed even if we could do it,
-it would still be computationally wasteful since at low temperatures the
-sums are dominated by low energy excitations about the ground states of
-the system. Even worse, in our case we must fully solve the fermionic
-system via exact diagonalisation for each classical state in the sum, a
-very expensive operation! (The effort involved in exact
-diagonalisation scales like <span class="math inline">\(N^2\)</span> for
-systems with a tri-diagonal matrix representation (open boundary
-conditions and nearest neighbour hopping) and like <span
-class="math inline">\(N^3\)</span> for a generic matrix <span
-class="citation"
-data-cites="bolchQueueingNetworksMarkov2006 usmaniInversionTridiagonalJacobi1994"> [<a
-href="#ref-bolchQueueingNetworksMarkov2006"
-role="doc-biblioref">12</a>,<a
-href="#ref-usmaniInversionTridiagonalJacobi1994"
-role="doc-biblioref">13</a>]</span>.)</p>
-<p>MCMC sidesteps these issues by defining a random walk that focuses on
-the states with the greatest Boltzmann weight. At low temperatures this
-means we need only visit a few low energy states to make good estimates
-while at high temperatures the weights become uniform so a small number
-of samples distributed across the state space suffice. However we will
-see that the method is not without difficulties of its own.</p>
-<p>In implementation, MCMC can be boiled
-down to choosing a transition function <span class="math inline">\(\mathcal{T}(\s_{t} \rightarrow \s_{t+1})\)</span> where <span
-class="math inline">\(\s\)</span> are vectors representing classical
-spin configurations. We start in some initial state <span
-class="math inline">\(\s_0\)</span> and then repeatedly jump to new
-states according to the probabilities given by <span
-class="math inline">\(\mathcal{T}\)</span>. This defines a set of random
-walks <span class="math inline">\(\{\s_0\ldots \s_i\ldots
-\s_N\}\)</span>. Fig. <span
-class="math inline">\(\ref{fig:single}\)</span> shows this in practice:
-we have a (rather small) ensemble of <span class="math inline">\(M =
-2\)</span> walkers starting at the same point in state space and then
-spreading outwards by flipping spins along the way.</p>
-<p>In pseudo-code one could write the MCMC simulation for a single
-walker as:</p>
-<div class="sourceCode" id="cb1"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>    current_state <span class="op">=</span> sample_T(current_state) </span>
-<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
-<p>Where the <code>sample_T</code> function here produces a state with
-probability determined by the <code>current_state</code> and the
-transition function <span
-class="math inline">\(\mathcal{T}\)</span>.</p>
-<p>If we ran many such walkers in parallel we could then approximate the
-distribution <span class="math inline">\(p_t(\s; \s_0)\)</span> which
-tells us where the walkers are likely to be after they’ve evolved for
-<span class="math inline">\(t\)</span> steps from an initial state <span
-class="math inline">\(\s_0\)</span>. We need to carefully choose <span
-class="math inline">\(\mathcal{T}\)</span> such that after a large
-number of steps <span class="math inline">\(k\)</span> (the convergence
-time) the probability <span class="math inline">\(p_t(\s;\s_0)\)</span>
-approaches the thermal distribution <span class="math inline">\(P(\s;
-\beta) = \mathcal{Z}^{-1} e^{-\beta F(\s)}\)</span>. This turns out to
-be quite easy to achieve using the Metropolis-Hastings algorithm.</p>
-</section>
-<section id="convergence-time" class="level3">
-<h3>Convergence Time</h3>
-<p>Considering <span class="math inline">\(p(\s)\)</span> as a vector
-<span class="math inline">\(\vec{p}\)</span> whose jth entry is the
-probability of the jth state <span class="math inline">\(p_j =
-p(\s_j)\)</span>, and writing <span
-class="math inline">\(\mathcal{T}\)</span> as the matrix with entries
-<span class="math inline">\(T_{ij} = \mathcal{T}(\s_j \rightarrow
-\s_i)\)</span> we can write the update rule for the ensemble probability
-as: <span class="math display">\[\vec{p}_{t+1} = \mathcal{T} \vec{p}_t
-\implies \vec{p}_{t} = \mathcal{T}^t \vec{p}_0\]</span> where <span
-class="math inline">\(\vec{p}_0\)</span> is the vector which is one on the
-starting state and zero everywhere else. Since all states must
-transition to somewhere with probability one: <span
-class="math inline">\(\sum_i T_{ij} = 1\)</span>.</p>
-<p>Matrices that satisfy this are called stochastic matrices exactly
-because they model these kinds of Markov processes. For an ergodic chain
-satisfying detailed balance it can be shown that they have real
-eigenvalues and, ordering them by magnitude, that <span
-class="math inline">\(\lambda_0 = 1\)</span> and <span
-class="math inline">\(|\lambda_{i\neq0}| &lt; 1\)</span>.</p>
-<p>Assuming <span class="math inline">\(\mathcal{T}\)</span> has been
-chosen correctly, its single eigenvector with eigenvalue 1 will be the
-thermal distribution so repeated application of the transition function
-eventually leads there, while memory of the initial conditions decays
-exponentially with a convergence time <span
-class="math inline">\(k\)</span> determined by <span
-class="math inline">\(\lambda_1\)</span>. In practice this means that
-one throws away the data from the beginning of the random walk in order
-to reduce the dependence on the initial conditions and be close enough to
-the target distribution.</p>
-</section>
-<section id="auto-correlation-time" class="level3">
-<h3>Auto-correlation Time</h3>
-<div id="fig:m_autocorr" class="fignos">
-<figure>
-<img src="../figure_code/fk_chapter/lsr/figs/m_autocorr.png"
-data-short-caption="no title" style="width:100.0%"
-alt="Figure 1: (Upper) 10 MCMC chains starting from the same initial state for a system with N = 150 sites and 3000 MCMC steps. At each MCMC step, n spins are flipped where n is drawn from Uniform(1,N) and this is repeated N^2/100 times. The simulations therefore have the potential to necessitate 10*N^2 matrix diagonalisations for each 100 MCMC steps. (Lower) The normalised auto-correlation (\expval{m_i m_{i-j}} - \expval{m_i}\expval{m_i}) / Var(m_i)) averaged over i. It can be seen that even with each MCMC step already being composed of many individual flip attempts, the auto-correlation is still non negligible and must be taken into account in the statistics. t = 1, \alpha = 1.25, T = 2.2, J = U = 5" />
-<figcaption aria-hidden="true"><span>Figure 1:</span> (Upper) 10 MCMC
-chains starting from the same initial state for a system with <span
-class="math inline">\(N = 150\)</span> sites and 3000 MCMC steps. At
-each MCMC step, n spins are flipped where n is drawn from Uniform(1,N)
-and this is repeated <span class="math inline">\(N^2/100\)</span> times.
-The simulations therefore have the potential to necessitate <span
-class="math inline">\(10*N^2\)</span> matrix diagonalisations for each
-100 MCMC steps. (Lower) The normalised auto-correlation <span
-class="math inline">\((\expval{m_i m_{i-j}} - \expval{m_i}\expval{m_i})
-/ Var(m_i))\)</span> averaged over <span
-class="math inline">\(i\)</span>. It can be seen that even with each
-MCMC step already being composed of many individual flip attempts, the
-auto-correlation is still non negligible and must be taken into account
-in the statistics. <span class="math inline">\(t = 1, \alpha = 1.25, T =
-2.2, J = U = 5\)</span></figcaption>
-</figure>
-</div>
-<p>At this stage one might think we’re done. We can indeed draw
-independent samples from <span class="math inline">\(P(\s;
-\beta)\)</span> by starting from some arbitrary initial state and doing
-<span class="math inline">\(k\)</span> steps to arrive at a sample.
-However a key insight is that after the convergence time, every state
-generated is a sample from <span class="math inline">\(P(\s;
-\beta)\)</span>! They are not, however, independent samples. In
-Fig. <span class="math inline">\(\ref{fig:raw}\)</span> it is already
-clear that the samples of the order parameter <span class="math inline">\(m\)</span> have some
-auto-correlation because only a few spins are flipped each step, but even
-when the number of spins flipped per step is increased, Fig. <span
-class="math inline">\(\ref{fig:m_autocorr}\)</span> shows that it can be
-an important effect near the phase transition. Let’s define the
-auto-correlation time <span class="math inline">\(\tau(O)\)</span>
-informally as the number of MCMC samples of some observable O that are
-statistically equal to one independent sample or equivalently as the
-number of MCMC steps after which the samples are correlated below some
-cutoff, see <span class="citation"
-data-cites="krauthIntroductionMonteCarlo1996"> [<a
-href="#ref-krauthIntroductionMonteCarlo1996"
-role="doc-biblioref">14</a>]</span> for a more rigorous definition
-involving a sum over the auto-correlation function. The auto-correlation
-time is generally shorter than the convergence time so it therefore
-makes sense from an efficiency standpoint to run a single walker for
-many MCMC steps rather than to run a huge ensemble for <span
-class="math inline">\(k\)</span> steps each.</p>
-<p>Once the random walk has been carried out for many steps, the
-expectation values of <span class="math inline">\(O\)</span> can be
-estimated from the MCMC samples <span
-class="math inline">\(\s_i\)</span>: <span class="math display">\[
-    \langle O \rangle = \frac{1}{N}\sum_{i = 1}^{N} O(\s_i) + \mathcal{O}\left(\frac{1}{\sqrt{N}}\right)
-\]</span> The samples are correlated so the <span class="math inline">\(N\)</span> of them effectively
-contain less information than <span class="math inline">\(N\)</span>
-independent samples would, in fact roughly <span
-class="math inline">\(N/\tau\)</span> effective samples. As a
-consequence the variance is larger than the <span
-class="math inline">\(\langle O^2 \rangle - \langle O \rangle^2\)</span> form it would have
-if the estimates were uncorrelated. There are many methods in the
-literature for estimating the true variance of <span
-class="math inline">\(\langle O \rangle\)</span> and deciding how many steps are
-needed, but my approach has been to run a small number of parallel
-chains, which are independent, in order to estimate the statistical
-error produced. This is slightly less computationally efficient
-because it requires throwing away those <span
-class="math inline">\(k\)</span> steps generated before convergence
-multiple times but it is a conceptually simple workaround.</p>
-<p>In summary, to do efficient simulations we want to reduce both the
-convergence time and the auto-correlation time as much as possible. In
-order to explain how, we need to introduce the Metropolis-Hastings (MH)
-algorithm and how it gives an explicit form for the transition
-function.</p>
-</section>
-<section id="the-metropolis-hastings-algorithm" class="level3">
-<h3>The Metropolis-Hastings Algorithm</h3>
-<p>MH breaks up the transition function into a proposal distribution
-<span class="math inline">\(q(\s \to \s&#39;)\)</span> and an acceptance
-function <span class="math inline">\(\mathcal{A}(\s \to
-\s&#39;)\)</span>. <span class="math inline">\(q\)</span> needs to be
-something that we can directly sample from, and in our case generally
-takes the form of flipping some number of spins in <span
-class="math inline">\(\s\)</span>, i.e if we’re flipping a single random
-spin in the spin chain, <span class="math inline">\(q(\s \to
-\s&#39;)\)</span> is the uniform distribution on states reachable by one
-spin flip from <span class="math inline">\(\s\)</span>. This also gives
-the nice symmetry property that <span class="math inline">\(q(\s \to
-\s&#39;) = q(\s&#39; \to \s)\)</span>.</p>
-<p>The proposal <span class="math inline">\(\s&#39;\)</span> is then
-accepted or rejected with an acceptance probability <span
-class="math inline">\(\mathcal{A}(\s \to \s&#39;)\)</span>, if the
-proposal is rejected then <span class="math inline">\(\s_{i+1} =
-\s_{i}\)</span>. Hence:</p>
-<p><span class="math display">\[\mathcal{T}(x\to x&#39;) = q(x\to
-x&#39;)\mathcal{A}(x \to x&#39;)\]</span></p>
-<p>When the proposal distribution is symmetric as ours is, it cancels
-out in the expression for the acceptance function and the
-Metropolis-Hastings algorithm is simply the choice: <span
-class="math display">\[ \mathcal{A}(x \to x&#39;) = \min\left(1,
-e^{-\beta\;\Delta F}\right)\]</span> Where <span
-class="math inline">\(F\)</span> is the overall free energy of the
-system, including both the quantum and classical sector.</p>
-<p>To implement the acceptance function in practice we pick a random
-number in the unit interval and accept if it is less than <span
-class="math inline">\(e^{-\beta\;\Delta F}\)</span>:</p>
-<div class="sourceCode" id="cb2"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
-<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>    df <span class="op">=</span> free_energy_change(current_state, new_state, parameters)</span>
-<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df):</span>
-<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>        current_state <span class="op">=</span> new_state</span>
-<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>        </span>
-<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
-<p>This has the effect of always accepting proposed states that are
-lower in energy and sometimes accepting those that are higher in energy
-than the current state.</p>
-</section>
-</section>
-<section id="two-step-trick" class="level2">
-<h2>Two Step Trick</h2>
-<p>Here, we incorporate a modification to the standard
-Metropolis-Hastings algorithm <span class="citation"
-data-cites="hastingsMonteCarloSampling1970"> [<a
-href="#ref-hastingsMonteCarloSampling1970"
-role="doc-biblioref">15</a>]</span> gleaned from Krauth <span
-class="citation" data-cites="krauthIntroductionMonteCarlo1998"> [<a
-href="#ref-krauthIntroductionMonteCarlo1998"
-role="doc-biblioref">7</a>]</span>.</p>
-<p>In our computations <span class="citation"
-data-cites="hodsonMCMCFKModel2021"> [<a
-href="#ref-hodsonMCMCFKModel2021" role="doc-biblioref">16</a>]</span> we
-employ a modification of the algorithm which is based on the observation
-that the free energy of the FK system is composed of a classical part
-which is much quicker to compute than the quantum part. Hence, we can
-obtain a computational speedup by first considering the value of the
-classical energy difference <span class="math inline">\(\Delta
-H_s\)</span> and rejecting the transition if the former is too high. We
-only compute the quantum energy difference <span
-class="math inline">\(\Delta F_c\)</span> if the transition is accepted.
-We then perform a second rejection sampling step based upon it. This
-corresponds to two nested comparisons with the majority of the work only
-occurring if the first test passes and has the acceptance function <span
-class="math display">\[\mathcal{A}(a \to b) = \min\left(1, e^{-\beta
-\Delta H_s}\right)\min\left(1, e^{-\beta \Delta
-F_c}\right)\;.\]</span></p>
-<p>For the model parameters used in Fig. <a href="#fig:indiv_IPR"
-data-reference-type="ref" data-reference="fig:indiv_IPR">2</a>, we find
-that with our new scheme the matrix diagonalisation is skipped around
-30% of the time at <span class="math inline">\(T = 2.5\)</span> and up
-to 80% at <span class="math inline">\(T = 1.5\)</span>. We observe that
-for <span class="math inline">\(N = 50\)</span>, the matrix
-diagonalisation, if it occurs, occupies around 60% of the total
-computation time for a single step. This rises to 90% at N = 300 and
-further increases for larger N. We therefore get the greatest speedup
-for large system sizes at low temperature where many prospective
-transitions are rejected at the classical stage and the matrix
-computation takes up the greatest fraction of the total computation
-time. The upshot is that we find a speedup of up to a factor of 10 at
-the cost of very little extra algorithmic complexity.</p>
-<p>Our two-step method should be distinguished from the more common
-method for speeding up MCMC which is to add asymmetry to the proposal
-distribution to make it as similar as possible to <span
-class="math inline">\(\min\left(1, e^{-\beta \Delta E}\right)\)</span>.
-This reduces the number of rejected states, which brings the algorithm
-closer in efficiency to a direct sampling method. However it comes at
-the expense of requiring a way to directly sample from this complex
-distribution, a problem which MCMC was employed to solve in the first
-place. For example, recent work trains restricted Boltzmann machines
-(RBMs) to generate samples for the proposal distribution of the FK
-model <span class="citation"
-data-cites="huangAcceleratedMonteCarlo2017"> [<a
-href="#ref-huangAcceleratedMonteCarlo2017"
-role="doc-biblioref">10</a>]</span>. The RBMs are chosen as a
-parametrisation of the proposal distribution that can be efficiently
-sampled from while offering sufficient flexibility that they can be
-adjusted to match the target distribution. Our proposed method is
-considerably simpler and does not require training while still reaping
-some of the benefits of reduced computation.</p>
-</section>
-<section id="detailed-balance-for-the-two-step-method" class="level2">
-<h2>Detailed Balance for the two step method</h2>
-<p>Given a MCMC algorithm with target distribution <span
-class="math inline">\(\pi(a)\)</span> and transition function <span
-class="math inline">\(\mathcal{T}\)</span> the detailed balance
-condition is sufficient (along with some technical constraints <span
-class="citation" data-cites="wolffMonteCarloErrors2004"> [<a
-href="#ref-wolffMonteCarloErrors2004"
-role="doc-biblioref">5</a>]</span>) to guarantee that in the long time
-limit the algorithm produces samples from <span
-class="math inline">\(\pi\)</span>. <span
-class="math display">\[\pi(a)\mathcal{T}(a \to b) = \pi(b)\mathcal{T}(b
-\to a)\]</span></p>
-<p>In pseudo-code, our two step method corresponds to two nested
-comparisons with the majority of the work only occurring if the first
-test passes:</p>
-<div class="sourceCode" id="cb3"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>  new_state <span class="op">=</span> proposal(current_state)</span>
-<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>  c_dE <span class="op">=</span> classical_energy_change(</span>
-<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>                               current_state,</span>
-<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>                               new_state)</span>
-<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>  <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> c_dE):</span>
-<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a>    q_dF <span class="op">=</span> quantum_free_energy_change(</span>
-<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a>                                current_state,</span>
-<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a>                                new_state)</span>
-<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span> beta <span class="op">*</span> q_dF):</span>
-<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a>      current_state <span class="op">=</span> new_state</span>
-<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
-<p>Defining <span class="math inline">\(r_c = e^{-\beta H_c}\)</span>
-and <span class="math inline">\(r_q = e^{-\beta F_q}\)</span> our target
-distribution is <span class="math inline">\(\pi(a) = r_c r_q\)</span>.
-This method has <span class="math inline">\(\mathcal{T}(a\to b) = q(a\to
-b)\mathcal{A}(a \to b)\)</span> with symmetric <span
-class="math inline">\(p(a \to b) = \pi(b \to a)\)</span> and <span
-class="math inline">\(\mathcal{A} = \min\left(1, r_c\right) \min\left(1,
-r_q\right)\)</span></p>
-<p>Substituting this into the detailed balance equation gives: <span
-class="math display">\[\mathcal{T}(a \to b)/\mathcal{T}(b \to a) =
-\pi(b)/\pi(a) = r_c r_q\]</span></p>
-<p>Taking the LHS and substituting in our transition function: <span
-class="math display">\[\begin{aligned}
-\mathcal{T}(a \to b)/\mathcal{T}(b \to a) = \frac{\min\left(1,
-r_c\right) \min\left(1, r_q\right)}{ \min\left(1, 1/r_c\right)
-\min\left(1, 1/r_q\right)}\end{aligned}\]</span></p>
-<p>which simplifies to <span class="math inline">\(r_c r_q\)</span> as
-<span class="math inline">\(\min(1,r)/\min(1,1/r) = r\)</span> for <span
-class="math inline">\(r &gt; 0\)</span>.</p>
-<section id="two-step-trick-1" class="level3">
-<h3>Two Step Trick</h3>
-<p>Our method already relies heavily on the split between the classical
-and quantum sector to derive a sign problem free MCMC algorithm but it
-turns out that there is a further trick we can play with it. The free
-energy term is the sum of an easy to compute classical energy and a more
-expensive quantum free energy, we can split the acceptance function into
-two in such as way as to avoid having to compute the full exact
-diagonalisation some of the time:</p>
-<div class="sourceCode" id="cb4"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
-<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    df_classical <span class="op">=</span> classical_free_energy_change(current_state, new_state, parameters)</span>
-<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_classical) <span class="op">&lt;</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>):</span>
-<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>        f_quantum <span class="op">=</span> quantum_free_energy(current_state, new_state, parameters)</span>
-<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>    </span>
-<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> exp(<span class="op">-</span> beta <span class="op">*</span> df_quantum) <span class="op">&lt;</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>):</span>
-<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a>          current_state <span class="op">=</span> new_state</span>
-<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a>    </span>
-<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a>        states[i] <span class="op">=</span> current_state</span>
-<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a>    </span></code></pre></div>
-</section>
-<section id="tuning-the-proposal-distribution" class="level3">
-<h3>Tuning the proposal distribution</h3>
-<div id="fig:autocorr_multiple_proposals" class="fignos">
-<figure>
-<img
-src="../figure_code/fk_chapter/lsr/figs/autocorr_multiple_proposals.png"
-data-short-caption="no title" style="width:100.0%"
-alt="Figure 2: Simulations showing how the autocorrelation of the order parameter depends on the proposal distribution used at different temperatures, we see that at T = 1.5 &lt; T_c a single spin flip is likely the best choice, while at the high temperature T = 2.5 &gt; T_c flipping two sites or a mixture of flipping two and 1 sites is likely a better choice. $t = 1, = 1.25, J = U = 5 $" />
-<figcaption aria-hidden="true"><span>Figure 2:</span> Simulations
-showing how the autocorrelation of the order parameter depends on the
-proposal distribution used at different temperatures, we see that at
-<span class="math inline">\(T = 1.5 &lt; T_c\)</span> a single spin flip
-is likely the best choice, while at the high temperature <span
-class="math inline">\(T = 2.5 &gt; T_c\)</span> flipping two sites or a
-mixture of flipping two and 1 sites is likely a better choice. $t = 1, =
-1.25, J = U = 5 $</figcaption>
-</figure>
-</div>
-<p>Now we can discuss how to minimise the auto-correlations. The general
-principle is that one must balance the proposal distribution between two
-extremes. Choose overlay small steps, like flipping only a single spin
-and the acceptance rate will be high because <span
-class="math inline">\(\Delta F\)</span> will usually be small, but each
-state will be very similar to the previous and the auto-correlations
-will be high too, making sampling inefficient. On the other hand,
-overlay large steps, like randomising a large portion of the spins each
-step, will result in very frequent rejections, especially at low
-temperatures.</p>
-<p>I evaluated a few different proposal distributions for use with the
-FK model.</p>
-<ol type="1">
-<li>Flipping a single random site</li>
-<li>Flipping N random sites for some N</li>
-<li>Choosing n from Uniform(1, N) and then flipping n sites for some
-fixed N.</li>
-<li>Attempting to tune the proposal distribution for each parameter
-regime.</li>
-</ol>
-<p>Fro Figure~<span class="math inline">\(\ref{fig:comparison}\)</span>
-we see that even at moderately high temperatures <span
-class="math inline">\(T &gt; T_c\)</span> flipping one or two sites is
-the best choice. However for some simulations at very high temperature
-flipping more spins is warranted. Tuning the proposal distribution
-automatically seems like something that would not yield enough benefit
-for the additional complexity it would require.</p>
-</section>
-</section>
 <section id="diagnostics-of-localisation" class="level2">
 <h2>Diagnostics of Localisation</h2>
 <section id="inverse-participation-ratio" class="level3">
@@ -961,7 +402,7 @@ function <span class="math inline">\(\psi_i = \psi(x_i), \sum_i
 \abs{\psi_i}^2 = 1\)</span> as its fourth moment <span class="citation"
 data-cites="kramerLocalizationTheoryExperiment1993"> [<a
 href="#ref-kramerLocalizationTheoryExperiment1993"
-role="doc-biblioref">17</a>]</span>: <span class="math display">\[
+role="doc-biblioref">10</a>]</span>: <span class="math display">\[
 P^{-1} = \sum_i \abs{\psi_i}^4
 \]</span> % It acts as a measure of the portion of space occupied by the
 wave function. For localised states it will be independent of system
@@ -975,7 +416,7 @@ P(L) \goeslike L^{d*}
 0\)</span>. In this work we take use an energy resolved IPR <span
 class="citation" data-cites="andersonAbsenceDiffusionCertain1958"> [<a
 href="#ref-andersonAbsenceDiffusionCertain1958"
-role="doc-biblioref">18</a>]</span>: <span class="math display">\[
+role="doc-biblioref">11</a>]</span>: <span class="math display">\[
 DOS(\omega) = \sum_n \delta(\omega - \epsilon_n)
 IPR(\omega) = DOS(\omega)^{-1} \sum_{n,i} \delta(\omega - \epsilon_n)
 \abs{\psi_{n,i}}^4
@@ -984,73 +425,10 @@ wavefunction corresponding to the energy <span
 class="math inline">\(\epsilon_n\)</span> at the ith site. In practice
 we bin the energies and IPRs into a fine energy grid and use Lorentzian
 smoothing if necessary.</p>
-<p>Dimensionality can be both a blessing and a curse. In section ?? I’ll
-discuss the fact that statistical physics can be somewhat boring in one
-dimension where most simple models have no phase transitions. This
-chapter is motivated by the the converse problem, high dimensional
-spaces can sometimes be just too much.</p>
-<p>While there are many problems with high dimensions, my favourite
-being that there are no stable gravitational orbits in 4D and above, the
-specific issue we’ll focus on here is that it’s very hard to compute
-integrals over high dimensional spaces.</p>
-<p>The standard methods for numerical integration in 1,2 and 3
-dimensions mostly work in the same way REFERENCE. You evaluate the
-integrand at a grid of points, define an interpolating function over the
-points that’s easy to integrate and then integrate the function. For a
-fixed grid spacing <span class="math inline">\(d\)</span> on a finite
-domain of integration we’ll find that we need to evaluate <span
-class="math inline">\(\propto (1/d)^D\)</span> points, which scales
-exponentially with dimension!</p>
-<p>In statistical physics the main integral that one would love to be
-able to evaluate is of course the partition function.</p>
-<p><span class="math display">\[Z = \int ds e^{-\beta F}\]</span></p>
-<p>And as this is condensed matter theory, we will mainly be looking at
-quantum models in which our states are discrete occupation numbers of
-single particle energy states. For a spin model with just two states per
-site and N sites we therefore end up with <span
-class="math inline">\(2^N\)</span> possible states of the system.</p>
-<p>domain of integration is bounded we can cif we take a discrete space
-with <span class="math inline">\(M\)</span> dimensions each taking <span
-class="math inline">\(N\)</span> distinct values.</p>
-<p>Detailed and Global balance equation Mixing times Cluster updates and
-Critical slowing down Effective Sample Size</p>
-</section>
-</section>
-<section id="markov-chain-monte-carlo-2" class="level2">
-<h2>Markov Chain Monte-Carlo</h2>
-<p>Dimensionality can be both a blessing and a curse. In I’ll discuss
-the fact that statistical physics can be somewhat boring in one
-dimension where most simple models have no phase transitions. This
-chapter is motivated by the the converse problem, high dimensional
-spaces can sometimes be just too much.</p>
-<p>While there are many problems with high dimensions <a href="#fn1"
-class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>
-it’s very hard to compute integrals over high dimensional spaces. If we
-take a discrete space with <span class="math inline">\(M\)</span>
-dimensions each taking <span class="math inline">\(N\)</span> distinct
-values.</p>
-<p>For a classical system, the thermal expectation value of some
-operator <span class="math inline">\(O\)</span> is defined by a
-Boltzmann weighted sum over the classical state space: <span
-class="math display">\[\begin{aligned}
-    \tex{O} &amp;= \frac{1}{\Z} \sum_{\s \in S} O(x) P(x) \\
-    P(x) &amp;= \frac{1}{\Z} e^{-\beta F(x)} \\
-    \Z &amp;= \sum_{\s \in S} e^{-\beta F(x)}\end{aligned}\]</span>
-While for a quantum system these sums are replaced by equivalent traces.
-The obvious approach to evaluate these sums numerically would be to
-directly loop over all the classical states in the system and perform
-the sum. But we all know know why this isn’t feasible: the state space
-is too large! Indeed even if we could do it, it would still be
-computationally wasteful since at low temperatures the sums are
-dominated by low energy excitations about the ground states of the
-system. Even worse, in our case we must fully solve the fermionic system
-via exact diagonalisation for each classical state in the sum, a very
-expensive operation! <a href="#fn2" class="footnote-ref" id="fnref2"
-role="doc-noteref"><sup>2</sup></a></p>
 <div id="fig:raw" class="fignos">
 <figure>
 <embed src="figs/lsr/raw_steps_single_flip.pdf" />
-<figcaption aria-hidden="true"><span>Figure 3:</span> An MCMC walk
+<figcaption aria-hidden="true"><span>Figure 1:</span> An MCMC walk
 starting from the staggered charge density wave ground state for a
 system with <span class="math inline">\(N = 100\)</span> sites and
 10,000 MCMC steps. In this simulation only a single spin can be flipped
@@ -1076,7 +454,7 @@ not without difficulties of its own.</p>
 <div id="fig:single" class="fignos">
 <figure>
 <embed src="figs/lsr/single.pdf" />
-<figcaption aria-hidden="true"><span>Figure 4:</span> Two MCMC chains
+<figcaption aria-hidden="true"><span>Figure 2:</span> Two MCMC chains
 starting from the same initial state for a system with <span
 class="math inline">\(N = 90\)</span> sites and 1000 MCMC steps. In this
 simulation the MCMC step is defined differently: an attempt is made to
@@ -1094,15 +472,15 @@ label="fig:single">[fig:single]</span></figcaption>
 </div>
 <p>In implementation <span data-acronym-label="MCMC"
 data-acronym-form="singular+short">MCMC</span> can be boiled down to
-choosing a transition function <span class="math inline">\(\T(\s_{t}
-\rightarrow \s_t+1)\)</span> where <span
-class="math inline">\(\s\)</span> are vectors representing classical
-spin configurations. We start in some initial state <span
+choosing a transition function <span
+class="math inline">\(\mathcal{T}(\s_{t} \rightarrow \s_{t+1})\)</span>
+where <span class="math inline">\(\s\)</span> are vectors representing
+classical spin configurations. We start in some initial state <span
 class="math inline">\(\s_0\)</span> and then repeatedly jump to new
 states according to the probabilities given by <span
-class="math inline">\(\T\)</span>. This defines a set of random walks
-<span class="math inline">\(\{\s_0\ldots \s_i\ldots \s_N\}\)</span>.
-Fig. <a href="#fig:single" data-reference-type="ref"
+class="math inline">\(\mathcal{T}\)</span>. This defines a set of random
+walks <span class="math inline">\(\{\s_0\ldots \s_i\ldots
+\s_N\}\)</span>. Fig. <a href="#fig:single" data-reference-type="ref"
 data-reference="fig:single">2</a> shows this in practice: we have a
 (rather small) ensemble of <span class="math inline">\(M = 2\)</span>
 walkers starting at the same point in state space and then spreading
@@ -1116,29 +494,32 @@ states[i] = current_state “’</p>
 </div>
 <p>Where the <code>sample_T</code> function here produces a state with
 probability determined by the <code>current_state</code> and the
-transition function <span class="math inline">\(\T\)</span>.</p>
+transition function <span
+class="math inline">\(\mathcal{T}\)</span>.</p>
 <p>If we ran many such walkers in parallel we could then approximate the
 distribution <span class="math inline">\(p_t(\s; \s_0)\)</span> which
 tells us where the walkers are likely to be after they’ve evolved for
 <span class="math inline">\(t\)</span> steps from an initial state <span
 class="math inline">\(\s_0\)</span>. We need to carefully choose <span
-class="math inline">\(\T\)</span> such that after a large number of
-steps <span class="math inline">\(k\)</span> (the convergence time) the
-probability <span class="math inline">\(p_t(\s;\s_0)\)</span> approaches
-the thermal distribution <span class="math inline">\(P(\s; \beta) =
-\Z^{-1} e^{-\beta F(\s)}\)</span>. This turns out to be quite easy to
-achieve using the Metropolis-Hasting algorithm.</p>
+class="math inline">\(\mathcal{T}\)</span> such that after a large
+number of steps <span class="math inline">\(k\)</span> (the convergence
+time) the probability <span class="math inline">\(p_t(\s;\s_0)\)</span>
+approaches the thermal distribution <span class="math inline">\(P(\s;
+\beta) = \mathcal{Z}^{-1} e^{-\beta F(\s)}\)</span>. This turns out to
+be quite easy to achieve using the Metropolis-Hastings algorithm.</p>
 </section>
-<section id="convergence-time-1" class="level2">
+</section>
+<section id="convergence-time" class="level2">
 <h2>Convergence Time</h2>
 <p>Considering <span class="math inline">\(p(\s)\)</span> as a vector
 <span class="math inline">\(\vec{p}\)</span> whose jth entry is the
 probability of the jth state <span class="math inline">\(p_j =
-p(\s_j)\)</span>, and writing <span class="math inline">\(\T\)</span> as
-the matrix with entries <span class="math inline">\(T_{ij} = \T(\s_j
-\rightarrow \s_i)\)</span> we can write the update rule for the ensemble
-probability as: <span class="math display">\[\vec{p}_{t+1} = \T
-\vec{p}_t \implies \vec{p}_{t} = \T^t \vec{p}_0\]</span> where <span
+p(\s_j)\)</span>, and writing <span
+class="math inline">\(\mathcal{T}\)</span> as the matrix with entries
+<span class="math inline">\(T_{ij} = \mathcal{T}(\s_j \rightarrow
+\s_i)\)</span> we can write the update rule for the ensemble probability
+as: <span class="math display">\[\vec{p}_{t+1} = \mathcal{T} \vec{p}_t
+\implies \vec{p}_{t} = \mathcal{T}^t \vec{p}_0\]</span> where <span
 class="math inline">\(\vec{p}_0\)</span> is vector which is one on the
 starting state and zero everywhere else. Since all states must
 transition to somewhere with probability one: <span
@@ -1148,25 +529,24 @@ because they model these kinds of Markov processes. It can be shown that
 they have real eigenvalues, and ordering them by magnitude, that <span
 class="math inline">\(\lambda_0 = 1\)</span> and <span
 class="math inline">\(0 &lt; \lambda_{i\neq0} &lt; 1\)</span>. Assuming
-<span class="math inline">\(\T\)</span> has been chosen correctly, its
-single eigenvector with eigenvalue 1 will be the thermal distribution <a
-href="#fn3" class="footnote-ref" id="fnref3"
-role="doc-noteref"><sup>3</sup></a> so repeated application of the
-transition function eventually leads there, while memory of the initial
-conditions decays exponentially with a convergence time <span
+<span class="math inline">\(\mathcal{T}\)</span> has been chosen
+correctly, its single eigenvector with eigenvalue 1 will be the thermal
+distribution [^3] so repeated application of the transition function
+eventually leads there, while memory of the initial conditions decays
+exponentially with a convergence time <span
 class="math inline">\(k\)</span> determined by <span
 class="math inline">\(\lambda_1\)</span>. In practice this means that
 one throws away the data from the beginning of the random walk in order
 reduce the dependence on the initial conditions and be close enough to
 the target distribution.</p>
 </section>
-<section id="auto-correlation-time-1" class="level2">
+<section id="auto-correlation-time" class="level2">
 <h2>Auto-correlation Time</h2>
 <div id="fig:m_autocorr" class="fignos">
 <figure>
 <img src="figs/lsr/m_autocorr.png"
-alt="Figure 5: (Upper) 10 MCMC chains starting from the same initial state for a system with N = 150 sites and 3000 MCMC steps. At each MCMC step, n spins are flipped where n is drawn from Uniform(1,N) and this is repeated N^2/100 times. The simulations therefore have the potential to necessitate 10*N^2 matrix diagonalisations for each 100 MCMC steps. (Lower) The normalised auto-correlation (\expval{m_i m_{i-j}} - \expval{m_i}\expval{m_i}) / Var(m_i)) averaged over i. It can be seen that even with each MCMC step already being composed of many individual flip attempts, the auto-correlation is still non negligible and must be taken into account in the statistics. t = 1, \alpha = 1.25, T = 2.2, J = U = 5 [fig:m_autocorr]" />
-<figcaption aria-hidden="true"><span>Figure 5:</span> (Upper) 10 MCMC
+alt="Figure 3: (Upper) 10 MCMC chains starting from the same initial state for a system with N = 150 sites and 3000 MCMC steps. At each MCMC step, n spins are flipped where n is drawn from Uniform(1,N) and this is repeated N^2/100 times. The simulations therefore have the potential to necessitate 10*N^2 matrix diagonalisations for each 100 MCMC steps. (Lower) The normalised auto-correlation (\expval{m_i m_{i-j}} - \expval{m_i}\expval{m_i}) / Var(m_i)) averaged over i. It can be seen that even with each MCMC step already being composed of many individual flip attempts, the auto-correlation is still non negligible and must be taken into account in the statistics. t = 1, \alpha = 1.25, T = 2.2, J = U = 5 [fig:m_autocorr]" />
+<figcaption aria-hidden="true"><span>Figure 3:</span> (Upper) 10 MCMC
 chains starting from the same initial state for a system with <span
 class="math inline">\(N = 150\)</span> sites and 3000 MCMC steps. At
 each MCMC step, n spins are flipped where n is drawn from Uniform(1,N)
@@ -1200,12 +580,11 @@ data-reference="fig:m_autocorr">3</a> shows that it can be an important
 effect near the phase transition. Let’s define the auto-correlation time
 <span class="math inline">\(\tau(O)\)</span> informally as the number of
 MCMC samples of some observable O that are statistically equal to one
-independent sample. <a href="#fn4" class="footnote-ref" id="fnref4"
-role="doc-noteref"><sup>4</sup></a> The auto-correlation time is
-generally shorter than the convergence time so it therefore makes sense
-from an efficiency standpoint to run a single walker for many MCMC steps
-rather than to run a huge ensemble for <span
-class="math inline">\(k\)</span> steps each.</p>
+independent sample. [^4] The auto-correlation time is generally shorter
+than the convergence time so it therefore makes sense from an efficiency
+standpoint to run a single walker for many MCMC steps rather than to run
+a huge ensemble for <span class="math inline">\(k\)</span> steps
+each.</p>
 <p>Once the random walk has been carried out for many steps, the
 expectation values of <span class="math inline">\(O\)</span> can be
 estimated from the MCMC samples <span
@@ -1231,138 +610,16 @@ convergence time and the auto-correlation time as much as possible. In
 order to explain how, we need to introduce the Metropolis-Hasting (MH)
 algorithm and how it gives an explicit form for the transition
 function.</p>
-</section>
-<section id="the-metropolis-hastings-algorithm-1" class="level2">
-<h2>The Metropolis-Hastings Algorithm</h2>
-<p>MH breaks up the transition function into a proposal distribution
-<span class="math inline">\(q(\s \to \s&#39;)\)</span> and an acceptance
-function <span class="math inline">\(\A(\s \to \s&#39;)\)</span>. <span
-class="math inline">\(q\)</span> needs to be something that we can
-directly sample from, and in our case generally takes the form of
-flipping some number of spins in <span
-class="math inline">\(\s\)</span>, i.e if we’re flipping a single random
-spin in the spin chain, <span class="math inline">\(q(\s \to
-\s&#39;)\)</span> is the uniform distribution on states reachable by one
-spin flip from <span class="math inline">\(\s\)</span>. This also gives
-the nice symmetry property that <span class="math inline">\(q(\s \to
-\s&#39;) = q(\s&#39; \to \s)\)</span>.</p>
-<p>The proposal <span class="math inline">\(\s&#39;\)</span> is then
-accepted or rejected with an acceptance probability <span
-class="math inline">\(\A(\s \to \s&#39;)\)</span>, if the proposal is
-rejected then <span class="math inline">\(\s_{i+1} = \s_{i}\)</span>.
-Hence:</p>
-<p><span class="math display">\[\T(x\to x&#39;) = q(x\to x&#39;)\A(x \to
-x&#39;)\]</span></p>
-<p>When the proposal distribution is symmetric as ours is, it cancels
-out in the expression for the acceptance function and the
-Metropolis-Hastings algorithm is simply the choice: <span
-class="math display">\[\A(x \to x&#39;) = \min\left(1, e^{-\beta\;\Delta
-F}\right)\]</span> Where <span class="math inline">\(F\)</span> is the
-overall free energy of the system, including both the quantum and
-classical sector.</p>
-<p>To implement the acceptance function in practice we pick a random
-number in the unit interval and accept if it is less than <span
-class="math inline">\(e^{-\beta\;\Delta F}\)</span>:</p>
-<div class="sourceCode" id="cb5" data-language="Python"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
-<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>    df <span class="op">=</span> free_energy_change(current_state, new_state, parameters)</span>
-<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df):</span>
-<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>        current_state <span class="op">=</span> new_state</span>
-<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a>        </span>
-<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
-<p>This has the effect of always accepting proposed states that are
-lower in energy and sometimes accepting those that are higher in energy
-than the current state.</p>
-</section>
-<section id="choosing-the-proposal-distribution" class="level2">
-<h2>Choosing the proposal distribution</h2>
-<p><img src="figs/lsr/autocorr_multiple_proposals.png" title="fig:"
-id="fig:comparison"
-alt="t = 1, \alpha = 1.25, J = U = 5 [fig:comparison]" /> Simulations
-showing how the autocorrelation of the order parameter depends on the
-proposal distribution used at different temperatures, we see that at
-<span class="math inline">\(T = 1.5 &lt; T_c\)</span> a single spin flip
-is likely the best choice, while at the high temperature <span
-class="math inline">\(T = 2.5 &gt; T_c\)</span> flipping two sites or a
-mixture of flipping two and 1 sites is likely a better choice.</p>
-<p>Now we can discuss how to minimise the auto-correlations. The general
-principle is that one must balance the proposal distribution between two
-extremes. Choose overlay small steps, like flipping only a single spin
-and the acceptance rate will be high because <span
-class="math inline">\(\Delta F\)</span> will usually be small, but each
-state will be very similar to the previous and the auto-correlations
-will be high too, making sampling inefficient. On the other hand,
-overlay large steps, like randomising a large portion of the spins each
-step, will result in very frequent rejections, especially at low
-temperatures.</p>
-<p>I evaluated a few different proposal distributions for use with the
-FK model.</p>
-<ol type="1">
-<li><p>Flipping a single random site</p></li>
-<li><p>Flipping N random sites for some N</p></li>
-<li><p>Choosing n from Uniform(1, N) and then flipping n sites for some
-fixed N.</p></li>
-<li><p>Attempting to tune the proposal distribution for each parameter
-regime.</p></li>
-</ol>
-<p>Fro Figure <a href="#fig:comparison" data-reference-type="ref"
-data-reference="fig:comparison">4</a> we see that even at moderately
-high temperatures <span class="math inline">\(T &gt; T_c\)</span>
-flipping one or two sites is the best choice. However for some
-simulations at very high temperature flipping more spins is warranted.
-Tuning the proposal distribution automatically seems like something that
-would not yield enough benefit for the additional complexity it would
-require.</p>
-</section>
-<section id="two-step-trick-2" class="level2">
-<h2>Two Step Trick</h2>
-<p>Our method already relies heavily on the split between the classical
-and quantum sector to derive a sign problem free MCMC algorithm but it
-turns out that there is a further trick we can play with it. The free
-energy term is the sum of an easy to compute classical energy and a more
-expensive quantum free energy, we can split the acceptance function into
-two in such as way as to avoid having to compute the full exact
-diagonalisation some of the time:</p>
-<div class="sourceCode" id="cb6" data-language="Python"><pre
-class="sourceCode python"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
-<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
-<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
-<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>    df_classical <span class="op">=</span> classical_free_energy_change(current_state, new_state, parameters)</span>
-<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_classical) <span class="op">&lt;</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>):</span>
-<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>        f_quantum <span class="op">=</span> quantum_free_energy(current_state, new_state, parameters)</span>
-<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a>    </span>
-<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> exp(<span class="op">-</span> beta <span class="op">*</span> df_quantum) <span class="op">&lt;</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>):</span>
-<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a>          current_state <span class="op">=</span> new_state</span>
-<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a>    </span>
-<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a>        states[i] <span class="op">=</span> current_state</span>
-<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a>    </span></code></pre></div>
+<p>Next Section: <a
+href="../3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html">Results</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
 <h1 class="unnumbered">Bibliography</h1>
 <div id="refs" class="references csl-bib-body" role="doc-bibliography">
-<div id="ref-devroyeRandomSampling1986" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[1] </div><div class="csl-right-inline">L.
-Devroye, <em><a
-href="https://doi.org/10.1007/978-1-4613-8643-8_12">Random
-Sampling</a></em>, in <em>Non-Uniform Random Variate Generation</em>,
-edited by L. Devroye (Springer, New York, NY, 1986), pp. 611–641.</div>
-</div>
-<div id="ref-BMCP2021" class="csl-entry" role="doc-biblioentry">
-<div class="csl-left-margin">[2] </div><div class="csl-right-inline">O.
-A. Martin, R. Kumar, and J. Lao, <em>Bayesian Modeling and Computation
-in Python</em> (Boca Raton, 2021).</div>
-</div>
 <div id="ref-binderGuidePracticalWork1988" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[3] </div><div class="csl-right-inline">K.
+<div class="csl-left-margin">[1] </div><div class="csl-right-inline">K.
 Binder and D. W. Heermann, <em><a
 href="https://doi.org/10.1007/978-3-662-08854-8_3">Guide to Practical
 Work with the Monte Carlo Method</a></em>, in <em>Monte Carlo Simulation
@@ -1372,7 +629,7 @@ W. Heermann (Springer Berlin Heidelberg, Berlin, Heidelberg, 1988), pp.
 </div>
 <div id="ref-kerteszAdvancesComputerSimulation1998" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[4] </div><div class="csl-right-inline">J.
+<div class="csl-left-margin">[2] </div><div class="csl-right-inline">J.
 Kertesz and I. Kondor, editors, <em><a
 href="https://doi.org/10.1007/BFb0105456">Advances in Computer
 Simulation: Lectures Held at the Eötvös Summer School in Budapest,
@@ -1381,45 +638,42 @@ Hungary, 16–20 July 1996</a></em> (Springer-Verlag, Berlin Heidelberg,
 </div>
 <div id="ref-wolffMonteCarloErrors2004" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[5] </div><div class="csl-right-inline">U.
+<div class="csl-left-margin">[3] </div><div class="csl-right-inline">U.
 Wolff, <em><a href="https://doi.org/10.1016/S0010-4655(03)00467-3">Monte
 Carlo Errors with Less Errors</a></em>, Computer Physics Communications
 <strong>156</strong>, 143 (2004).</div>
 </div>
 <div id="ref-kellyReversibilityStochasticNetworks1981" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[6] </div><div class="csl-right-inline">F.
+<div class="csl-left-margin">[4] </div><div class="csl-right-inline">F.
 P. Kelly, <em><a href="https://doi.org/10.2307/2287860">Reversibility
 and Stochastic Networks / F.P. Kelly</a></em>, SERBIULA (Sistema Librum
 2.0) <strong>76</strong>, (1981).</div>
 </div>
+<div id="ref-hastingsMonteCarloSampling1970" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[5] </div><div class="csl-right-inline">W.
+K. Hastings, <em><a href="https://doi.org/10.1093/biomet/57.1.97">Monte
+Carlo Sampling Methods Using Markov Chains and Their
+Applications</a></em>, Biometrika <strong>57</strong>, 97 (1970).</div>
+</div>
 <div id="ref-krauthIntroductionMonteCarlo1998" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[7] </div><div class="csl-right-inline">W.
+<div class="csl-left-margin">[6] </div><div class="csl-right-inline">W.
 Krauth, <em><a href="https://doi.org/10.1007/BFb0105456">Introduction To
 Monte Carlo Algorithms</a></em>, in <em>Advances in Computer Simulation:
 Lectures Held at the Eötvös Summer School in Budapest, Hungary, 16–20
 July 1996</em> (Springer-Verlag, Berlin Heidelberg, 1998).</div>
 </div>
-<div id="ref-kapferSamplingPolytopeHarddisk2013" class="csl-entry"
+<div id="ref-hodsonMCMCFKModel2021" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[8] </div><div class="csl-right-inline">S.
-C. Kapfer and W. Krauth, <em><a
-href="https://doi.org/10.1088/1742-6596/454/1/012031">Sampling from a
-Polytope and Hard-Disk Monte Carlo</a></em>, J. Phys.: Conf. Ser.
-<strong>454</strong>, 012031 (2013).</div>
-</div>
-<div id="ref-robertsWeakConvergenceOptimal1997" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[9] </div><div class="csl-right-inline">G.
-O. Roberts, A. Gelman, and W. R. Gilks, <em><a
-href="https://doi.org/10.1214/aoap/1034625254">Weak Convergence and
-Optimal Scaling of Random Walk Metropolis Algorithms</a></em>, Ann.
-Appl. Probab. <strong>7</strong>, 110 (1997).</div>
+<div class="csl-left-margin">[7] </div><div class="csl-right-inline">T.
+Hodson, <em><a href="https://doi.org/10.5281/zenodo.4593904">Markov
+Chain Monte Carlo for the Kitaev Model</a></em>, (2021).</div>
 </div>
 <div id="ref-huangAcceleratedMonteCarlo2017" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[10] </div><div class="csl-right-inline">L.
+<div class="csl-left-margin">[8] </div><div class="csl-right-inline">L.
 Huang and L. Wang, <em><a
 href="https://doi.org/10.1103/PhysRevB.95.035105">Accelerated Monte
 Carlo Simulations with Restricted Boltzmann Machines</a></em>, Phys.
@@ -1427,48 +681,14 @@ Rev. B <strong>95</strong>, 035105 (2017).</div>
 </div>
 <div id="ref-binderFiniteSizeScaling1981" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[11] </div><div class="csl-right-inline">K.
+<div class="csl-left-margin">[9] </div><div class="csl-right-inline">K.
 Binder, <em><a href="https://doi.org/10.1007/BF01293604">Finite Size
 Scaling Analysis of Ising Model Block Distribution Functions</a></em>,
 Z. Physik B - Condensed Matter <strong>43</strong>, 119 (1981).</div>
 </div>
-<div id="ref-bolchQueueingNetworksMarkov2006" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[12] </div><div class="csl-right-inline">G.
-Bolch, S. Greiner, H. de Meer, and K. S. Trivedi, <em>Queueing Networks
-and Markov Chains: Modeling and Performance Evaluation with Computer
-Science Applications</em> (John Wiley &amp; Sons, 2006).</div>
-</div>
-<div id="ref-usmaniInversionTridiagonalJacobi1994" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[13] </div><div class="csl-right-inline">R.
-A. Usmani, <em><a
-href="https://doi.org/10.1016/0024-3795(94)90414-6">Inversion of a
-Tridiagonal Jacobi Matrix</a></em>, Linear Algebra and Its Applications
-<strong>212-213</strong>, 413 (1994).</div>
-</div>
-<div id="ref-krauthIntroductionMonteCarlo1996" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[14] </div><div class="csl-right-inline">W.
-Krauth, <em><a href="http://arxiv.org/abs/cond-mat/9612186">Introduction
-To Monte Carlo Algorithms</a></em>, arXiv:cond-Mat/9612186 (1996).</div>
-</div>
-<div id="ref-hastingsMonteCarloSampling1970" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[15] </div><div class="csl-right-inline">W.
-K. Hastings, <em><a href="https://doi.org/10.1093/biomet/57.1.97">Monte
-Carlo Sampling Methods Using Markov Chains and Their
-Applications</a></em>, Biometrika <strong>57</strong>, 97 (1970).</div>
-</div>
-<div id="ref-hodsonMCMCFKModel2021" class="csl-entry"
-role="doc-biblioentry">
-<div class="csl-left-margin">[16] </div><div class="csl-right-inline">T.
-Hodson, <em><a href="https://doi.org/10.5281/zenodo.4593904">Markov
-Chain Monte Carlo for the Kitaev Model</a></em>, (2021).</div>
-</div>
 <div id="ref-kramerLocalizationTheoryExperiment1993" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[17] </div><div class="csl-right-inline">B.
+<div class="csl-left-margin">[10] </div><div class="csl-right-inline">B.
 Kramer and A. MacKinnon, <em><a
 href="https://doi.org/10.1088/0034-4885/56/12/001">Localization: Theory
 and Experiment</a></em>, Rep. Prog. Phys. <strong>56</strong>, 1469
@@ -1476,7 +696,7 @@ and Experiment</a></em>, Rep. Prog. Phys. <strong>56</strong>, 1469
 </div>
 <div id="ref-andersonAbsenceDiffusionCertain1958" class="csl-entry"
 role="doc-biblioentry">
-<div class="csl-left-margin">[18] </div><div class="csl-right-inline">P.
+<div class="csl-left-margin">[11] </div><div class="csl-right-inline">P.
 W. Anderson, <em><a
 href="https://doi.org/10.1103/PhysRev.109.1492">Absence of Diffusion in
 Certain Random Lattices</a></em>, Phys. Rev. <strong>109</strong>, 1492
@@ -1484,40 +704,6 @@ Certain Random Lattices</a></em>, Phys. Rev. <strong>109</strong>, 1492
 </div>
 </div>
 </section>
-<section class="footnotes footnotes-end-of-document"
-role="doc-endnotes">
-<hr />
-<ol>
-<li id="fn1" role="doc-endnote"><p>my favourite being that there are no
-stable gravitational orbits in 4D and above<a href="#fnref1"
-class="footnote-back" role="doc-backlink">↩︎</a></p></li>
-<li id="fn2" role="doc-endnote"><p>The effort involved in exact
-diagonalisation scales like <span class="math inline">\(N^2\)</span> for
-systems with a tri-diagonal matrix representation (open boundary
-conditions and nearest neighbour hopping) and like <span
-class="math inline">\(N^3\)</span> for a generic matrix <span
-class="citation"
-data-cites="bolchQueueingNetworksMarkov2006 usmaniInversionTridiagonalJacobi1994"> [<a
-href="#ref-bolchQueueingNetworksMarkov2006"
-role="doc-biblioref">12</a>,<a
-href="#ref-usmaniInversionTridiagonalJacobi1994"
-role="doc-biblioref">13</a>]</span>.<a href="#fnref2"
-class="footnote-back" role="doc-backlink">↩︎</a></p></li>
-<li id="fn3" role="doc-endnote"><p>or, in the general case, any desired
-distribution. MCMC has found a lot of use in sampling from the
-complicated distributions that arise when taking a Bayesian approach to
-statistics.<a href="#fnref3" class="footnote-back"
-role="doc-backlink">↩︎</a></p></li>
-<li id="fn4" role="doc-endnote"><p>or equivalently as the number of MCMC
-steps after which the samples are correlated below some cutoff,
-see <span class="citation"
-data-cites="krauthIntroductionMonteCarlo1996"> [<a
-href="#ref-krauthIntroductionMonteCarlo1996"
-role="doc-biblioref">14</a>]</span> for a more rigorous definition
-involving a sum over the auto-correlation function.<a href="#fnref4"
-class="footnote-back" role="doc-backlink">↩︎</a></p></li>
-</ol>
-</section>
 
 
 </main>
diff --git a/_thesis/3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html b/_thesis/3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html
index 438c43f..fd3872e 100644
--- a/_thesis/3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html
+++ b/_thesis/3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html
@@ -470,8 +470,8 @@ H_{\mathrm{DM}} = &amp; \;U \sum_{i} (-1)^i \; d_i \;(c^\dag_{i}c_{i} -
 <div class="sourceCode" id="cb1"><pre
 class="sourceCode python"><code class="sourceCode python"></code></pre></div>
 <p>Next Chapter: <a
-href="../4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html#vortices-and-their-movements">4
-The Amorphous Kitaev Model</a></p>
+href="../4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html">4 The Amorphous
+Kitaev Model</a></p>
 </section>
 <section id="bibliography" class="level1 unnumbered">
 <h1 class="unnumbered">Bibliography</h1>
diff --git a/_thesis/4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html b/_thesis/4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html
index 4887d94..5525928 100644
--- a/_thesis/4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html
+++ b/_thesis/4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html
@@ -1070,8 +1070,7 @@ role="doc-biblioref">16</a>,<a
 href="#ref-kitaevFaulttolerantQuantumComputation2003"
 role="doc-biblioref"><strong>kitaevFaulttolerantQuantumComputation2003?</strong></a>]</span>.</p>
 <p>Next Section: <a
-href="../4_Amorphous_Kitaev_Model/4.1_AMK_Model.html#amk-Model">The
-Model</a></p>
+href="../4_Amorphous_Kitaev_Model/4.1_AMK_Model.html">The Model</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/4_Amorphous_Kitaev_Model/4.1_AMK_Model.html b/_thesis/4_Amorphous_Kitaev_Model/4.1_AMK_Model.html
index f886cf0..d4d72f8 100644
--- a/_thesis/4_Amorphous_Kitaev_Model/4.1_AMK_Model.html
+++ b/_thesis/4_Amorphous_Kitaev_Model/4.1_AMK_Model.html
@@ -769,7 +769,7 @@ anyway, an arbitrary pairing of the unpaired <span
 class="math inline">\(b^\alpha\)</span> operators could be performed.
 &lt;/i,j&gt;&lt;/i,j&gt;</p>
 <p>Next Section: <a
-href="../4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html#amk-methods">Methods</a></p>
+href="../4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html">Methods</a></p>
 </section>
 </section>
 </section>
diff --git a/_thesis/4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html b/_thesis/4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html
index e903f18..90b6a37 100644
--- a/_thesis/4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html
+++ b/_thesis/4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html
@@ -541,7 +541,7 @@ system.</p>
 <p><strong>Discuss link between Chern number and Anyonic
 Statistics</strong></p>
 <p>Next Section: <a
-href="../4_Amorphous_Kitaev_Model/4.3_AMK_Results.html#amk-results">Results</a></p>
+href="../4_Amorphous_Kitaev_Model/4.3_AMK_Results.html">Results</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/4_Amorphous_Kitaev_Model/4.3_AMK_Results.html b/_thesis/4_Amorphous_Kitaev_Model/4.3_AMK_Results.html
index 9e7afb0..dde0675 100644
--- a/_thesis/4_Amorphous_Kitaev_Model/4.3_AMK_Results.html
+++ b/_thesis/4_Amorphous_Kitaev_Model/4.3_AMK_Results.html
@@ -665,8 +665,8 @@ href="#ref-Wu2009" role="doc-biblioref">47</a>]</span></p>
 quantum many body phases albeit material candidates aplenty. We expect
 our exact chiral amorphous spin liquid to find many generalisation to
 realistic amorphous quantum magnets and beyond.</p>
-<p>Next Chapter: <a
-href="../5_Conclusion/5_Conclusion.html#discussion">5 Conclusion</a></p>
+<p>Next Chapter: <a href="../5_Conclusion/5_Conclusion.html">5
+Conclusion</a></p>
 </section>
 </section>
 <section id="bibliography" class="level1 unnumbered">
diff --git a/_thesis/5_Conclusion/5_Conclusion.html b/_thesis/5_Conclusion/5_Conclusion.html
index d806d0a..2ee1c5f 100644
--- a/_thesis/5_Conclusion/5_Conclusion.html
+++ b/_thesis/5_Conclusion/5_Conclusion.html
@@ -27,6 +27,14 @@ image:
 <br>
 <nav aria-label="Table of Contents" class="page-table-of-contents">
 <ul>
+<li><a href="#material-realisations"
+id="toc-material-realisations">Material Realisations</a>
+<ul>
+<li><a href="#amorphous-materials"
+id="toc-amorphous-materials">Amorphous Materials</a></li>
+<li><a href="#metal-organic-frameworks"
+id="toc-metal-organic-frameworks">Metal Organic Frameworks</a></li>
+</ul></li>
 <li><a href="#discussion" id="toc-discussion">Discussion</a></li>
 <li><a href="#outlook" id="toc-outlook">Outlook</a></li>
 </ul>
@@ -41,6 +49,14 @@ image:
 <!-- Table of Contents -->
 <!-- <nav id="TOC" role="doc-toc">
 <ul>
+<li><a href="#material-realisations"
+id="toc-material-realisations">Material Realisations</a>
+<ul>
+<li><a href="#amorphous-materials"
+id="toc-amorphous-materials">Amorphous Materials</a></li>
+<li><a href="#metal-organic-frameworks"
+id="toc-metal-organic-frameworks">Metal Organic Frameworks</a></li>
+</ul></li>
 <li><a href="#discussion" id="toc-discussion">Discussion</a></li>
 <li><a href="#outlook" id="toc-outlook">Outlook</a></li>
 </ul>
@@ -52,13 +68,22 @@ image:
 <p>5 Conclusion</p>
 <hr />
 </div>
-<section id="discussion" class="level2">
-<h2>Discussion</h2>
+<section id="material-realisations" class="level1">
+<h1>Material Realisations</h1>
+<section id="amorphous-materials" class="level2">
+<h2>Amorphous Materials</h2>
 </section>
-<section id="outlook" class="level2">
-<h2>Outlook</h2>
+<section id="metal-organic-frameworks" class="level2">
+<h2>Metal Organic Frameworks</h2>
+</section>
+</section>
+<section id="discussion" class="level1">
+<h1>Discussion</h1>
+</section>
+<section id="outlook" class="level1">
+<h1>Outlook</h1>
 <p>Next Chapter: <a
-href="../6_Appendices/A.1_Particle_Hole_Symmetry.html#particle-hole-symmetry">Appendices</a></p>
+href="../6_Appendices/A.1.2_Fermion_Free_Energy.html">Appendices</a></p>
 </section>
 
 
diff --git a/_thesis/6_Appendices/A.1.2_Fermion_Free_Energy.html b/_thesis/6_Appendices/A.1.2_Fermion_Free_Energy.html
new file mode 100644
index 0000000..385426e
--- /dev/null
+++ b/_thesis/6_Appendices/A.1.2_Fermion_Free_Energy.html
@@ -0,0 +1,110 @@
+---
+title: Evaluation of the Fermion Free Energy
+excerpt: 
+layout: none
+image: 
+
+---
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
+<head>
+  <meta charset="utf-8" />
+  <meta name="generator" content="pandoc" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <title>Evaluation of the Fermion Free Energy</title>
+
+
+<script src="/assets/mathjax/tex-mml-svg.js" id="MathJax-script" async></script>
+<script src="/assets/js/thesis_scrollspy.js"></script>
+
+<link rel="stylesheet" href="/assets/css/styles.css">
+<script src="/assets/js/index.js"></script>
+</head>
+<body>
+
+<!--Capture the table of contents from pandoc as a jekyll variable  -->
+{% capture tableOfContents %}
+<br>
+<nav aria-label="Table of Contents" class="page-table-of-contents">
+<ul>
+<li><a href="#evaluation-of-the-fermion-free-energy"
+id="toc-evaluation-of-the-fermion-free-energy">Evaluation of the Fermion
+Free Energy</a></li>
+</ul>
+</nav>
+{% endcapture %}
+
+<!-- Give the table of contents to header as a variable so it can be put into the sidebar-->
+{% include header.html extra=tableOfContents %}
+
+<main>
+
+<!-- Table of Contents -->
+<!-- <nav id="TOC" role="doc-toc">
+<ul>
+<li><a href="#evaluation-of-the-fermion-free-energy"
+id="toc-evaluation-of-the-fermion-free-energy">Evaluation of the Fermion
+Free Energy</a></li>
+</ul>
+</nav>
+ -->
+
+<!-- Main Page Body -->
+<div id="page-header">
+<p>Appendices</p>
+<hr />
+</div>
+<section id="evaluation-of-the-fermion-free-energy" class="level1">
+<h1>Evaluation of the Fermion Free Energy</h1>
+<p>There are <span class="math inline">\(2^N\)</span> possible ion
+configurations <span class="math inline">\(\{ n_i \}\)</span>; we define
+<span class="math inline">\(n^k_i\)</span> to be the occupation of the
+ith site of the kth configuration. The quantum part of the free energy
+can then be defined through the quantum partition function <span
+class="math inline">\(Z^k\)</span> associated with each ionic
+state <span class="math inline">\(n^k_i\)</span>: <span
+class="math display">\[\begin{aligned}
+F^k &amp;= -1/\beta \ln{Z^k} \\
+\end{aligned}\]</span> such that the overall partition function is:
+<span class="math display">\[\begin{aligned}
+\mathcal{Z} &amp;= \sum_k e^{- \beta H^k} Z^k \\
+&amp;= \sum_k e^{-\beta (H^k + F^k)} \\
+\end{aligned}\]</span></p>
+<p>Because fermions are limited to occupation numbers of 0 or 1, <span
+class="math inline">\(Z^k\)</span> simplifies nicely. If <span
+class="math inline">\(m^j_i \in \{0,1\}\)</span> is defined as the
+occupation of the level with energy <span
+class="math inline">\(\epsilon^k_i\)</span>, then the partition function
+is a sum over all the occupation states labelled by j: <span
+class="math display">\[\begin{aligned}
+Z^k    &amp;= \mathrm{Tr} e^{-\beta F^k} = \sum_j e^{-\beta \sum_i m^j_i
+\epsilon^k_i}\\
+       &amp;= \sum_j \prod_i e^{- \beta m^j_i \epsilon^k_i}= \prod_i
+\sum_j e^{- \beta m^j_i \epsilon^k_i}\\
+       &amp;= \prod_i (1 + e^{- \beta \epsilon^k_i})\\
+F^k    &amp;= -1/\beta \sum_i \ln{(1 + e^{- \beta \epsilon^k_i})}
+\end{aligned}\]</span> Observables can then be calculated from the
+partition function, for example the occupation numbers:</p>
+<p><span class="math display">\[\begin{aligned}
+\langle N \rangle &amp;= \frac{1}{\beta} \frac{1}{\mathcal{Z}} \frac{\partial
+\mathcal{Z}}{\partial \mu} = - \frac{\partial F}{\partial \mu}\\
+    &amp;= \frac{1}{\beta} \frac{1}{\mathcal{Z}} \frac{\partial}{\partial \mu}
+\sum_k e^{-\beta (H^k + F^k)}\\
+    &amp;= \frac{1}{\mathcal{Z}} \sum_k (N^k_{\mathrm{ion}} + N^k_{\mathrm{electron}})
+e^{-\beta (H^k + F^k)}\\
+\end{aligned}\]</span> with the definitions:</p>
+<p><span class="math display">\[\begin{aligned}
+N^k_{\mathrm{ion}} &amp;= - \frac{\partial H^k}{\partial \mu} = \sum_i
+n^k_i\\
+N^k_{\mathrm{electron}} &amp;= - \frac{\partial F^k}{\partial \mu} =
+\sum_i \left(1 + e^{\beta \epsilon^k_i}\right)^{-1}\\
+\end{aligned}\]</span></p>
+<p>Next Section: <a
+href="../6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html">Particle-Hole
+Symmetry</a></p>
+</section>
+
+
+</main>
+</body>
+</html>
diff --git a/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html b/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html
new file mode 100644
index 0000000..9773f61
--- /dev/null
+++ b/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html
@@ -0,0 +1,135 @@
+---
+title: Particle-Hole Symmetry
+excerpt: 
+layout: none
+image: 
+
+---
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
+<head>
+  <meta charset="utf-8" />
+  <meta name="generator" content="pandoc" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <title>Particle-Hole Symmetry</title>
+
+
+<script src="/assets/mathjax/tex-mml-svg.js" id="MathJax-script" async></script>
+<script src="/assets/js/thesis_scrollspy.js"></script>
+
+<link rel="stylesheet" href="/assets/css/styles.css">
+<script src="/assets/js/index.js"></script>
+</head>
+<body>
+
+<!--Capture the table of contents from pandoc as a jekyll variable  -->
+{% capture tableOfContents %}
+<br>
+<nav aria-label="Table of Contents" class="page-table-of-contents">
+<ul>
+<li><a href="#particle-hole-symmetry"
+id="toc-particle-hole-symmetry">Particle-Hole Symmetry</a></li>
+<li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
+</ul>
+</nav>
+{% endcapture %}
+
+<!-- Give the table of contents to header as a variable so it can be put into the sidebar-->
+{% include header.html extra=tableOfContents %}
+
+<main>
+
+<!-- Table of Contents -->
+<!-- <nav id="TOC" role="doc-toc">
+<ul>
+<li><a href="#particle-hole-symmetry"
+id="toc-particle-hole-symmetry">Particle-Hole Symmetry</a></li>
+<li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
+</ul>
+</nav>
+ -->
+
+<!-- Main Page Body -->
+<div id="page-header">
+<p>Appendices</p>
+<hr />
+</div>
+<section id="particle-hole-symmetry" class="level1">
+<h1>Particle-Hole Symmetry</h1>
+<p>The Hubbard and FK models on a bipartite lattice have particle-hole
+(PH) symmetry <span class="math inline">\(\mathcal{P}^\dagger H
+\mathcal{P} = - H\)</span>; accordingly, they have symmetric energy
+spectra. The associated symmetry operator <span
+class="math inline">\(\mathcal{P}\)</span> exchanges creation and
+annihilation operators along with a sign change between the two
+sublattices. In the language of the Hubbard model of electrons <span
+class="math inline">\(c_{\alpha,i}\)</span> with spin <span
+class="math inline">\(\alpha\)</span> at site <span
+class="math inline">\(i\)</span>, the particle-hole operator corresponds
+to the substitution of new fermion operators <span
+class="math inline">\(d^\dagger_{\alpha,i}\)</span> and number operators
+<span class="math inline">\(m_{\alpha,i}\)</span> where</p>
+<p><span class="math display">\[d^\dagger_{\alpha,i} = \epsilon_i
+c_{\alpha,i}\]</span> <span class="math display">\[m_{\alpha,i} =
+d^\dagger_{\alpha,i}d_{\alpha,i}\]</span></p>
+<p>The lattice must be bipartite because, to make this work, we set <span
+class="math inline">\(\epsilon_i = +1\)</span> for the A sublattice and
+<span class="math inline">\(-1\)</span> for the B sublattice <span
+class="citation" data-cites="gruberFalicovKimballModel2005"> [<a
+href="#ref-gruberFalicovKimballModel2005"
+role="doc-biblioref">1</a>]</span>.</p>
+<p>The entirely filled state <span class="math inline">\(\ket{\Omega} =
+\prod_{\alpha,i} c^\dagger_{\alpha,i} \ket{0}\)</span> becomes the new
+vacuum state <span class="math display">\[d_{\alpha,i} \ket{\Omega} =
+\epsilon_i c^\dagger_{\alpha,i} \prod_{\rho,j} c^\dagger_{\rho,j} \ket{0} =
+0.\]</span></p>
+<p>The number operator <span class="math inline">\(m_{\alpha,i} =
+0,1\)</span> counts holes rather than electrons <span
+class="math display">\[ m_{\alpha,i} = c_{\alpha,i} c^\dagger_{\alpha,i}
+= 1 - c^\dagger_{\alpha,i} c_{\alpha,i}.\]</span></p>
+<p>The last equality follows from the fermionic anticommutation
+relations. In the case of nearest neighbour hopping on a bipartite
+lattice this transformation also leaves the hopping term unchanged, since
+each hop maps to its reverse when <span class="math inline">\(\epsilon_i \epsilon_j = -1\)</span>,
+that is, when <span class="math inline">\(i\)</span> and <span
+class="math inline">\(j\)</span> are on different sublattices: <span
+class="math display">\[ d^\dagger_{\alpha,i} d_{\alpha,j} = \epsilon_i
+\epsilon_j c_{\alpha,i} c^\dagger_{\alpha,j} = c^\dagger_{\alpha,j}
+c_{\alpha,i} \]</span></p>
+<p>Defining the particle density <span
+class="math inline">\(\rho\)</span> as the number of fermions per site:
+<span class="math display">\[
+    \rho = \frac{1}{N} \sum_i \left( n_{i \uparrow} + n_{i \downarrow}
+\right)
+\]</span></p>
+<p>The PH symmetry maps the Hamiltonian to itself with the sign of the
+chemical potential reversed and the density inverted about half filling:
+<span class="math display">\[ \text{PH} : H(t, U, \mu) \rightarrow H(t,
+U, -\mu) \]</span> <span class="math display">\[ \rho \rightarrow 2 -
+\rho \]</span></p>
+<p>The Hamiltonian is symmetric under PH at <span
+class="math inline">\(\mu = 0\)</span> and so are all the observables,
+hence half filling <span class="math inline">\(\rho = 1\)</span> occurs
+here. This symmetry and known observable act as a useful test for the
+numerical calculations.</p>
+<p>Next Section: <a
+href="../6_Appendices/A.2_Markov_Chain_Monte_Carlo.html">Markov Chain
+Monte Carlo</a></p>
+</section>
+<section id="bibliography" class="level1 unnumbered">
+<h1 class="unnumbered">Bibliography</h1>
+<div id="refs" class="references csl-bib-body" role="doc-bibliography">
+<div id="ref-gruberFalicovKimballModel2005" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[1] </div><div class="csl-right-inline">C.
+Gruber and D. Ueltschi, <em><a
+href="http://arxiv.org/abs/math-ph/0502041">The Falicov-Kimball
+Model</a></em>, arXiv:math-Ph/0502041 (2005).</div>
+</div>
+</div>
+</section>
+
+
+</main>
+</body>
+</html>
diff --git a/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry.html b/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry.html
index 48ed5ee..cf22f49 100644
--- a/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry.html
+++ b/_thesis/6_Appendices/A.1_Particle_Hole_Symmetry.html
@@ -113,8 +113,8 @@ hence half filling <span class="math inline">\(\rho = 1\)</span> occurs
 here. This symmetry and known observable acts as a useful test for the
 numerical calculations.</p>
 <p>Next Section: <a
-href="../6_Appendices/A.2_Markov_Chain_Monte_Carlo.html#applying-mcmc-to-the-fk-model">Applying
-MCMC to the FK model</a></p>
+href="../6_Appendices/A.2_Markov_Chain_Monte_Carlo.html">Applying MCMC
+to the FK model</a></p>
 </section>
 <section id="bibliography" class="level1 unnumbered">
 <h1 class="unnumbered">Bibliography</h1>
diff --git a/_thesis/6_Appendices/A.2_Markov_Chain_Monte_Carlo.html b/_thesis/6_Appendices/A.2_Markov_Chain_Monte_Carlo.html
index a3c9d29..f56f227 100644
--- a/_thesis/6_Appendices/A.2_Markov_Chain_Monte_Carlo.html
+++ b/_thesis/6_Appendices/A.2_Markov_Chain_Monte_Carlo.html
@@ -30,10 +30,61 @@ image:
 <li><a href="#markov-chain-monte-carlo"
 id="toc-markov-chain-monte-carlo">Markov Chain Monte Carlo</a>
 <ul>
-<li><a href="#applying-mcmc-to-the-fk-model"
-id="toc-applying-mcmc-to-the-fk-model">Applying MCMC to the FK
-model</a></li>
+<li><a href="#direct-random-sampling"
+id="toc-direct-random-sampling">Direct Random Sampling</a></li>
+<li><a href="#mcmc-sampling" id="toc-mcmc-sampling">MCMC
+Sampling</a></li>
+<li><a href="#implementation-of-mcmc"
+id="toc-implementation-of-mcmc">Implementation of MCMC</a></li>
+<li><a href="#global-and-detailed-balance-equations"
+id="toc-global-and-detailed-balance-equations">Global and Detailed
+balance equations</a></li>
+<li><a href="#the-metropolis-hastings-algorithm"
+id="toc-the-metropolis-hastings-algorithm">The Metropolis-Hastings
+Algorithm</a></li>
+<li><a href="#implementation-of-the-mh-algorithm"
+id="toc-implementation-of-the-mh-algorithm">Implementation of the MH
+Algorithm</a></li>
+<li><a href="#the-metropolis-hasting-algorithm"
+id="toc-the-metropolis-hasting-algorithm">The Metropolis-Hasting
+Algorithm</a></li>
+<li><a href="#metropolis-hastings"
+id="toc-metropolis-hastings">Metropolis-Hastings</a>
+<ul>
+<li><a href="#the-metropolis-hastings-algorithm-1"
+id="toc-the-metropolis-hastings-algorithm-1">The Metropolis-Hastings
+Algorithm</a></li>
 </ul></li>
+<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
+Trick</a></li>
+<li><a href="#detailed-balance-for-the-two-step-method"
+id="toc-detailed-balance-for-the-two-step-method">Detailed Balance for
+the two step method</a>
+<ul>
+<li><a href="#two-step-trick-1" id="toc-two-step-trick-1">Two Step
+Trick</a></li>
+<li><a href="#auto-correlation-time"
+id="toc-auto-correlation-time">Auto-correlation Time</a></li>
+<li><a href="#tuning-the-proposal-distribution"
+id="toc-tuning-the-proposal-distribution">Tuning the proposal
+distribution</a></li>
+</ul></li>
+<li><a href="#proposal-distributions"
+id="toc-proposal-distributions">Proposal Distributions</a></li>
+<li><a href="#choosing-the-proposal-distribution"
+id="toc-choosing-the-proposal-distribution">Choosing the proposal
+distribution</a></li>
+<li><a href="#perturbation-mcmc" id="toc-perturbation-mcmc">Perturbation
+MCMC</a>
+<ul>
+<li><a href="#convergence-time" id="toc-convergence-time">Convergence
+Time</a></li>
+</ul></li>
+</ul></li>
+<li><a href="#appbalance-detailed-balance"
+id="toc-appbalance-detailed-balance"><span id="app:balance"
+label="app:balance">[app:balance]</span> DETAILED BALANCE</a></li>
+<li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
 </ul>
 </nav>
 {% endcapture %}
@@ -49,102 +100,883 @@ model</a></li>
 <li><a href="#markov-chain-monte-carlo"
 id="toc-markov-chain-monte-carlo">Markov Chain Monte Carlo</a>
 <ul>
-<li><a href="#applying-mcmc-to-the-fk-model"
-id="toc-applying-mcmc-to-the-fk-model">Applying MCMC to the FK
-model</a></li>
+<li><a href="#direct-random-sampling"
+id="toc-direct-random-sampling">Direct Random Sampling</a></li>
+<li><a href="#mcmc-sampling" id="toc-mcmc-sampling">MCMC
+Sampling</a></li>
+<li><a href="#implementation-of-mcmc"
+id="toc-implementation-of-mcmc">Implementation of MCMC</a></li>
+<li><a href="#global-and-detailed-balance-equations"
+id="toc-global-and-detailed-balance-equations">Global and Detailed
+balance equations</a></li>
+<li><a href="#the-metropolis-hastings-algorithm"
+id="toc-the-metropolis-hastings-algorithm">The Metropolis-Hastings
+Algorithm</a></li>
+<li><a href="#implementation-of-the-mh-algorithm"
+id="toc-implementation-of-the-mh-algorithm">Implementation of the MH
+Algorithm</a></li>
+<li><a href="#the-metropolis-hasting-algorithm"
+id="toc-the-metropolis-hasting-algorithm">The Metropolis-Hasting
+Algorithm</a></li>
+<li><a href="#metropolis-hastings"
+id="toc-metropolis-hastings">Metropolis-Hastings</a>
+<ul>
+<li><a href="#the-metropolis-hastings-algorithm-1"
+id="toc-the-metropolis-hastings-algorithm-1">The Metropolis-Hastings
+Algorithm</a></li>
 </ul></li>
+<li><a href="#two-step-trick" id="toc-two-step-trick">Two Step
+Trick</a></li>
+<li><a href="#detailed-balance-for-the-two-step-method"
+id="toc-detailed-balance-for-the-two-step-method">Detailed Balance for
+the two step method</a>
+<ul>
+<li><a href="#two-step-trick-1" id="toc-two-step-trick-1">Two Step
+Trick</a></li>
+<li><a href="#auto-correlation-time"
+id="toc-auto-correlation-time">Auto-correlation Time</a></li>
+<li><a href="#tuning-the-proposal-distribution"
+id="toc-tuning-the-proposal-distribution">Tuning the proposal
+distribution</a></li>
+</ul></li>
+<li><a href="#proposal-distributions"
+id="toc-proposal-distributions">Proposal Distributions</a></li>
+<li><a href="#choosing-the-proposal-distribution"
+id="toc-choosing-the-proposal-distribution">Choosing the proposal
+distribution</a></li>
+<li><a href="#perturbation-mcmc" id="toc-perturbation-mcmc">Perturbation
+MCMC</a>
+<ul>
+<li><a href="#convergence-time" id="toc-convergence-time">Convergence
+Time</a></li>
+</ul></li>
+</ul></li>
+<li><a href="#appbalance-detailed-balance"
+id="toc-appbalance-detailed-balance"><span id="app:balance"
+label="app:balance">[app:balance]</span> DETAILED BALANCE</a></li>
+<li><a href="#bibliography" id="toc-bibliography">Bibliography</a></li>
 </ul>
 </nav>
  -->
 
 <!-- Main Page Body -->
-<div id="page-header">
-<p>Appendices</p>
-<hr />
-</div>
 <section id="markov-chain-monte-carlo" class="level1">
 <h1>Markov Chain Monte Carlo</h1>
-<section id="applying-mcmc-to-the-fk-model" class="level2">
-<h2>Applying MCMC to the FK model</h2>
-<p>MCMC can be applied to sample over the classical degrees of freedom
-of the model. We take the full Hamiltonian and split it into a classical
-and a quantum part: <span class="math display">\[\begin{aligned}
-    H_{\mathrm{FK}} &amp;= -\sum_{&lt;ij&gt;} c^\dagger_{i}c_{j} + U
-\sum_{i} (c^\dagger_{i}c_{i} - 1/2)( n_i - 1/2) \\
-    &amp;+ \sum_{ij} J_{ij} (n_i - 1/2) (n_j - 1/2)  - \mu \sum_i
-(c^\dagger_{i}c_{i} + n_i)\\
-    H_q &amp;= -\sum_{&lt;ij&gt;} c^\dagger_{i}c_{j} + \sum_{i}
-\left(U(n_i - 1/2) - \mu\right) c^\dagger_{i}c_{i}\\
-    H_c &amp;= \sum_i \mu n_i - \frac{U}{2}(n_i - 1/2) +
-\sum_{ij}J_{ij}(n_i - 1/2)(n_j - 1/2)
+<p>Markov Chain Monte Carlo (MCMC) is a useful method whenever we have a
+probability distribution that we want to sample from but there is no
+way to sample from it directly.</p>
+<section id="direct-random-sampling" class="level2">
+<h2>Direct Random Sampling</h2>
+<p>In almost any computer simulation the ultimate source of randomness
+is a stream of (close to) uniform, uncorrelated bits generated by a
+pseudo-random number generator. A direct sampling method takes such a
+source and outputs uncorrelated samples from the target distribution.
+The fact that they are uncorrelated is key, as we will see later. Examples of
+direct sampling methods range from the trivial, such as taking n random bits to
+generate an integer uniformly between 0 and <span
+class="math inline">\(2^n - 1\)</span>, to more complex methods such as
+inverse transform sampling and rejection sampling <span class="citation"
+data-cites="devroyeRandomSampling1986"> [<a
+href="#ref-devroyeRandomSampling1986"
+role="doc-biblioref">1</a>]</span>.</p>
+<p>In physics the distribution we usually want to sample from is the
+Boltzmann probability over states of the system <span
+class="math inline">\(S\)</span>: <span class="math display">\[
+\begin{aligned}
+p(S)  &amp;= \frac{1}{\mathcal{Z}} e^{-\beta H(S)} \\
 \end{aligned}
-\]</span></p>
-<p>There are <span class="math inline">\(2^N\)</span> possible ion
-configurations <span class="math inline">\(\{ n_i \}\)</span>, we define
-<span class="math inline">\(n^k_i\)</span> to be the occupation of the
-ith site of the kth configuration. The quantum part of the free energy
-can then be defined through the quantum partition function <span
-class="math inline">\(\mathcal{Z}^k\)</span> associated with each ionic
-state <span class="math inline">\(n^k_i\)</span>: <span
-class="math display">\[\begin{aligned}
-F^k &amp;= -1/\beta \ln{\mathcal{Z}^k} \\
-\end{aligned}\]</span> % Such that the overall partition function is:
-<span class="math display">\[\begin{aligned}
-\mathcal{Z} &amp;= \sum_k e^{- \beta H^k} Z^k \\
-&amp;= \sum_k e^{-\beta (H^k + F^k)} \\
-\end{aligned}\]</span> % Because fermions are limited to occupation
-numbers of 0 or 1 <span class="math inline">\(Z^k\)</span> simplifies
-nicely. If <span class="math inline">\(m^j_i = \{0,1\}\)</span> is
-defined as the occupation of the level with energy <span
-class="math inline">\(\epsilon^k_i\)</span> then the partition function
-is a sum over all the occupation states labelled by j: <span
-class="math display">\[\begin{aligned}
-Z^k    &amp;= \Tr e^{-\beta F^k} = \sum_j e^{-\beta \sum_i m^j_i
-\epsilon^k_i}\\
-       &amp;= \sum_j \prod_i e^{- \beta m^j_i \epsilon^k_i}= \prod_i
-\sum_j e^{- \beta m^j_i \epsilon^k_i}\\
-       &amp;= \prod_i (1 + e^{- \beta \epsilon^k_i})\\
-F^k    &amp;= -1/\beta \sum_k \ln{(1 + e^{- \beta \epsilon^k_i})}
-\end{aligned}\]</span> % Observables can then be calculated from the
-partition function, for examples the occupation numbers:</p>
-<p><span class="math display">\[\begin{aligned}
-\tex{N} &amp;= \frac{1}{\beta} \frac{1}{Z} \frac{\partial Z}{\partial
-\mu} = - \frac{\partial F}{\partial \mu}\\
-    &amp;= \frac{1}{\beta} \frac{1}{Z} \frac{\partial}{\partial \mu}
-\sum_k e^{-\beta (H^k + F^k)}\\
-    &amp;= 1/Z \sum_k (N^k_{\mathrm{ion}} + N^k_{\mathrm{electron}})
-e^{-\beta (H^k + F^k)}\\
-\end{aligned}\]</span> % with the definitions:</p>
-<p><span class="math display">\[\begin{aligned}
-N^k_{\mathrm{ion}} &amp;= - \frac{\partial H^k}{\partial \mu} = \sum_i
-n^k_i\\
-N^k_{\mathrm{electron}} &amp;= - \frac{\partial F^k}{\partial \mu} =
-\sum_i \left(1 + e^{\beta \epsilon^k_i}\right)^{-1}\\
-\end{aligned}\]</span> % The MCMC algorithm consists of performing a
-random walk over the states <span class="math inline">\(\{ n^k_i
-\}\)</span>. In the simplest case the proposal distribution corresponds
-to flipping a random site from occupied to unoccupied or vice versa,
-since this proposal is symmetric the acceptance function becomes: <span
-class="math display">\[\begin{aligned}
-P(k) &amp;= \mathcal{Z}^{-1} e^{-\beta(H^k + F^k)} \\
-\mathcal{A}(k \to k&#39;) &amp;= \min\left(1,
-\frac{P(k&#39;)}{P(k)}\right) = \min\left(1, e^{\beta(H^{k&#39;} +
-F^{k&#39;})-\beta(H^k + F^k)}\right)
-\end{aligned}\]</span> % At each step <span
-class="math inline">\(F^k\)</span> is calculated by diagonalising the
-tri-diagonal matrix representation of <span
-class="math inline">\(H_q\)</span> with open boundary conditions.
-Observables are simply averages over the their value at each step of the
-random walk. The full spectrum and eigenbasis is too large to save to
-disk so usually running averages of key observables are taken as the
-walk progresses.</p>
-<div class="sourceCode" id="cb1"><pre
-class="sourceCode python"><code class="sourceCode python"></code></pre></div>
-<p></ij></ij></p>
-<p>Next Section: <a
-href="../6_Appendices/A.3_Lattice_Generation.html#lattice-generation">Lattice
-Generation</a></p>
+\]</span> where <span class="math inline">\(\mathcal{Z} = \sum_S
+e^{-\beta H(S)}\)</span> is the normalisation factor, the ubiquitous
+partition function. In principle we could sample from this directly: for
+a discrete system there are finitely many states, so we could calculate
+the probability of each one, assign each a region of the unit
+interval of matching width and then sample uniformly from that interval.</p>
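+<p>For a system small enough to enumerate, this procedure can be sketched as below. This is only an illustration: the open Ising-chain energy and the names <code>energy</code>, <code>boltzmann_table</code> and <code>direct_sample</code> are assumptions made for the example, not part of the model studied here.</p>

```python
import bisect
import itertools
import math
import random

def energy(state, J=1.0):
    # Hypothetical energy function: open nearest-neighbour Ising chain, spins +/-1.
    return -J * sum(a * b for a, b in zip(state, state[1:]))

def boltzmann_table(n_spins, beta):
    # Enumerate all 2^n states; only feasible for very small n.
    states = list(itertools.product([-1, 1], repeat=n_spins))
    weights = [math.exp(-beta * energy(s)) for s in states]
    Z = sum(weights)  # partition function
    return states, [w / Z for w in weights]

def direct_sample(states, probs, rng):
    # Give each state a region of the unit interval (cumulative sums),
    # then find which region a uniform random number falls in.
    cdf = list(itertools.accumulate(probs))
    cdf[-1] = 1.0  # guard against floating point round-off
    return states[bisect.bisect_right(cdf, rng.random())]

rng = random.Random(0)
states, probs = boltzmann_table(n_spins=4, beta=1.0)
samples = [direct_sample(states, probs, rng) for _ in range(1000)]
```

+<p>The table has <span class="math inline">\(2^n\)</span> entries, so this approach stops being viable after a few tens of spins.</p>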
+<p>However, if we actually try to do this we run into two problems.
+First, we cannot calculate <span class="math inline">\(\mathcal{Z}\)</span> for
+any reasonably sized system because the state space grows exponentially
+with system size. Second, even if we could calculate <span
+class="math inline">\(\mathcal{Z}\)</span>, sampling from an
+exponentially large number of options quickly becomes impractical. This kind
+of problem arises in many other disciplines too, particularly when
+fitting statistical models using Bayesian inference <span
+class="citation" data-cites="BMCP2021"> [<a href="#ref-BMCP2021"
+role="doc-biblioref">2</a>]</span>.</p>
 </section>
+<section id="mcmc-sampling" class="level2">
+<h2>MCMC Sampling</h2>
+<p>So what can we do? It turns out that if we are willing to give up
+the requirement that the samples be uncorrelated then we can use MCMC
+instead.</p>
+<p>MCMC defines a weighted random walk over the states <span
+class="math inline">\((S_0, S_1, S_2, ...)\)</span>, such that in the
+long time limit, states are visited according to their probability <span
+class="math inline">\(p(S)\)</span> <span class="citation"
+data-cites="binderGuidePracticalWork1988 kerteszAdvancesComputerSimulation1998 wolffMonteCarloErrors2004"> [<a
+href="#ref-binderGuidePracticalWork1988" role="doc-biblioref">3</a>–<a
+href="#ref-wolffMonteCarloErrors2004"
+role="doc-biblioref">5</a>]</span>, <span class="citation"
+data-cites="krauthIntroductionMonteCarlo1998"> [<a
+href="#ref-krauthIntroductionMonteCarlo1998"
+role="doc-biblioref">6</a>]</span>:</p>
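+<p><span class="math display">\[\lim_{i\to\infty} p_i(S) =
+p(S),\]</span> where <span class="math inline">\(p_i(S)\)</span> is the
+probability that the walk is in state <span
+class="math inline">\(S\)</span> at step <span
+class="math inline">\(i\)</span>.</p>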
+<p>In a physics context this lets us estimate any observable by its mean
+over the states visited by the walk: <span
+class="math display">\[\begin{aligned}
+\langle O \rangle &amp; = \sum_{S} p(S) \langle O \rangle_{S} = \frac{1}{M} \sum_{i
+= 1}^{M} \langle O\rangle_{S_i} + \mathcal{O}(\tfrac{1}{\sqrt{M}})\\
+\end{aligned}\]</span></p>
+<p>The samples in the random walk are correlated, so they
+effectively contain less information than <span
+class="math inline">\(M\)</span> independent samples would. As a
+consequence the variance of the estimate is larger than the <span
+class="math inline">\((\langle O^2 \rangle - \langle O\rangle^2)/M\)</span>
+form it would have if the samples were uncorrelated. Methods for
+estimating the true variance of <span class="math inline">\(\langle O
+\rangle\)</span> and deciding how many steps are needed will be
+considered later.</p>
+</section>
+<section id="implementation-of-mcmc" class="level2">
+<h2>Implementation of MCMC</h2>
+<p>In implementation MCMC can be boiled down to choosing a transition
+function <span class="math inline">\(\mathcal{T}(S_{t} \to S_{t+1})\)</span> where <span class="math inline">\(S\)</span>
+are vectors representing classical spin configurations. We start in some
+initial state <span class="math inline">\(S_0\)</span> and then
+repeatedly jump to new states according to the probabilities given by
+<span class="math inline">\(\mathcal{T}\)</span>. This defines a
+random walk <span class="math inline">\(\{S_0\ldots S_i\ldots
+S_N\}\)</span>.</p>
+<p>In pseudo-code one could write the MCMC simulation for a single
+walker as:</p>
+<div class="sourceCode" id="cb1"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>    current_state <span class="op">=</span> sample_T(current_state)</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
+<p>Where the <code>sample_T</code> function samples directly from the
+transition function <span
+class="math inline">\(\mathcal{T}\)</span>.</p>
+<p>If we run many such walkers in parallel we can then approximate the
+distribution <span class="math inline">\(p_t(S; S_0)\)</span> which tells
+us where the walkers are likely to be after they’ve evolved for <span
+class="math inline">\(t\)</span> steps from an initial state <span
+class="math inline">\(S_0\)</span>. We need to carefully choose <span
+class="math inline">\(\mathcal{T}\)</span> such that the probability
+<span class="math inline">\(p_t(S; S_0)\)</span> approaches the
+distribution of interest, in this case the thermal distribution <span
+class="math inline">\(P(S; \beta) = \mathcal{Z}^{-1} e^{-\beta
+F(S)}\)</span>.</p>
+</section>
+<section id="global-and-detailed-balance-equations" class="level2">
+<h2>Global and Detailed balance equations</h2>
+<p>We can quite easily write down the properties that <span
+class="math inline">\(\mathcal{T}\)</span> must have in order to yield
+the correct target distribution. Since we must transition somewhere at
+each step, we first have the normalisation condition that <span
+class="math display">\[\sum\limits_S \mathcal{T}(S&#39; \rightarrow S) =
+1.\]</span></p>
+<p>Second, let us move to an ensemble view, where rather than individual
+walkers and states, we think about the probability distribution of many
+walkers at each step. If we start all the walkers in the same place the
+initial distribution will be a delta function and as we step the walkers
+will wander around, giving us a sequence of probability distributions
+<span class="math inline">\(\{p_0(S), p_1(S), p_2(S)\ldots\}\)</span>.
+For discrete spaces we can write the action of the transition function
+on <span class="math inline">\(p_i\)</span> as a matrix equation</p>
+<p><span class="math display">\[\begin{aligned}
+p_{i+1}(S) &amp;= \sum_{S&#39; \in \{S\}} p_i(S&#39;) \mathcal{T}(S&#39;
+\rightarrow S)
+\end{aligned}\]</span></p>
+<p>This equation is essentially just stating that total probability mass
+is conserved as our walkers flow around the state space.</p>
+<p>In order that <span class="math inline">\(p_i\)</span> converges to
+our target distribution <span class="math inline">\(P\)</span> in the
+long time limit, we need the target distribution to be a fixed point of
+the transition function, a condition known as global balance:</p>
+<p><span class="math display">\[\begin{aligned}
+P(S) &amp;= \sum_{S&#39;} P(S&#39;) \mathcal{T}(S&#39; \rightarrow S)
+\end{aligned}
+\]</span> Along with some more technical considerations such as
+ergodicity, which won’t be considered here, global balance suffices to
+ensure that an MCMC method is correct <span class="citation"
+data-cites="kellyReversibilityStochasticNetworks1981"> [<a
+href="#ref-kellyReversibilityStochasticNetworks1981"
+role="doc-biblioref">7</a>]</span>.</p>
+<p>A sufficient but not necessary condition for global balance to hold
+is called detailed balance:</p>
+<p><span class="math display">\[
+P(S) \mathcal{T}(S \rightarrow S&#39;) = P(S&#39;) \mathcal{T}(S&#39;
+\rightarrow S)
+\]</span></p>
+<p>In practice most algorithms are constructed to satisfy detailed
+rather than global balance, though there are arguments that the relaxed
+requirements of global balance can lead to faster algorithms <span
+class="citation" data-cites="kapferSamplingPolytopeHarddisk2013"> [<a
+href="#ref-kapferSamplingPolytopeHarddisk2013"
+role="doc-biblioref">8</a>]</span>.</p>
+<p>The goal of MCMC is then to choose <span
+class="math inline">\(\mathcal{T}\)</span> so that it has the desired
+thermal distribution <span class="math inline">\(P(S)\)</span> as its
+fixed point and converges quickly onto it. This boils down to requiring
+that the matrix <span class="math inline">\(T_{ij} =
+\mathcal{T}(S_i \to S_j)\)</span> has a left eigenvector with entries <span
+class="math inline">\(P_i = P(S_i)\)</span> and eigenvalue 1, and that all
+other eigenvalues have magnitude less than one. The convergence time
+is set by the second largest eigenvalue magnitude: the closer it is to one, the slower the convergence.</p>
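+<p>These conditions can be checked numerically on a toy system. The sketch below is an illustration under stated assumptions: a three-spin open Ising chain, single-spin-flip proposals and the Metropolis acceptance rule introduced in the next section, with all function and variable names hypothetical. It builds <span class="math inline">\(T_{ij}\)</span> explicitly and applies it repeatedly to a delta-function initial distribution.</p>

```python
import itertools
import math

def energy(state, J=1.0):
    # Hypothetical energy: open nearest-neighbour Ising chain, spins +/-1.
    return -J * sum(a * b for a, b in zip(state, state[1:]))

def flip(state, k):
    # Return a copy of the state tuple with spin k reversed.
    return state[:k] + (-state[k],) + state[k + 1:]

n, beta = 3, 0.7
states = list(itertools.product([-1, 1], repeat=n))
index = {s: i for i, s in enumerate(states)}
weights = [math.exp(-beta * energy(s)) for s in states]
Z = sum(weights)
P = [w / Z for w in weights]  # target distribution

# T[i][j] = probability of stepping from state i to state j under
# single-spin-flip proposals with Metropolis acceptance.
T = [[0.0] * len(states) for _ in states]
for i, s in enumerate(states):
    for k in range(n):
        j = index[flip(s, k)]
        T[i][j] = (1.0 / n) * min(1.0, P[j] / P[i])
    T[i][i] = 1.0 - sum(T[i])  # rejected proposals leave the walker in place

def step(p):
    # One application of p_{i+1}(S) = sum_{S'} p_i(S') T(S' -> S).
    return [sum(p[i] * T[i][j] for i in range(len(p))) for j in range(len(p))]

p = [1.0] + [0.0] * (len(states) - 1)  # all walkers start in the same state
for _ in range(2000):
    p = step(p)
max_error = max(abs(a - b) for a, b in zip(p, P))
```

+<p>Each row of <code>T</code> sums to one, and <code>max_error</code> measures how far the evolved distribution remains from the target after 2000 steps.</p>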
+<p>The choice of the transition function for MCMC is under-determined as
+one only needs to satisfy a set of balance conditions for which there
+are many solutions <span class="citation"
+data-cites="kellyReversibilityStochasticNetworks1981"> [<a
+href="#ref-kellyReversibilityStochasticNetworks1981"
+role="doc-biblioref">7</a>]</span>. The standard choice that satisfies
+these requirements is called the Metropolis-Hastings algorithm.</p>
+</section>
+<section id="the-metropolis-hastings-algorithm" class="level2">
+<h2>The Metropolis-Hastings Algorithm</h2>
+<p>The Metropolis-Hastings algorithm breaks the transition function into
+a proposal distribution <span class="math inline">\(q(S \to
+S&#39;)\)</span> and an acceptance function <span
+class="math inline">\(\mathcal{A}(S \to S&#39;)\)</span>. <span
+class="math inline">\(q\)</span> must be a distribution we can directly
+sample from, and in many cases takes the form of flipping some number of
+spins in <span class="math inline">\(S\)</span>. For example, if we flip a
+single random spin in the spin chain, <span class="math inline">\(q(S
+\to S&#39;)\)</span> is the uniform distribution on states reachable by
+one spin flip from <span class="math inline">\(S\)</span>. This choice also
+has the symmetry property that <span class="math inline">\(q(S \to
+S&#39;) = q(S&#39; \to S)\)</span>.</p>
+<p>The proposal <span class="math inline">\(S&#39;\)</span> is then
+accepted or rejected with an acceptance probability <span
+class="math inline">\(\mathcal{A}(S \to S&#39;)\)</span>. If the
+proposal is rejected then <span class="math inline">\(S_{i+1} =
+S_{i}\)</span>. Hence, for <span class="math inline">\(S&#39; \neq S\)</span>:</p>
+<p><span class="math display">\[\mathcal{T}(S\to S&#39;) = q(S\to
+S&#39;)\mathcal{A}(S \to S&#39;)\]</span></p>
+<p>The Metropolis-Hastings algorithm is a slight extension of the
+original Metropolis algorithm which allows for non-symmetric proposal
+distributions <span class="math inline">\(q(S \to S&#39;) \neq q(S&#39; \to S)\)</span>. It can be derived starting from detailed
+balance <span class="citation"
+data-cites="krauthIntroductionMonteCarlo1998"> [<a
+href="#ref-krauthIntroductionMonteCarlo1998"
+role="doc-biblioref">6</a>]</span>:</p>
+<p><span class="math display">\[
+P(S)\mathcal{T}(S \to S&#39;) = P(S&#39;)\mathcal{T}(S&#39; \to S)
+\]</span></p>
+<p>inserting the proposal and acceptance function</p>
+<p><span class="math display">\[
+P(S)q(S \to S&#39;)\mathcal{A}(S \to S&#39;) = P(S&#39;)q(S&#39; \to
+S)\mathcal{A}(S&#39; \to S)
+\]</span></p>
+<p>rearranging gives us a condition on the acceptance function in terms
+of the target distribution and the proposal distribution which can be
+thought of as inputs to the algorithm</p>
+<p><span class="math display">\[
+\frac{\mathcal{A}(S \to S&#39;)}{\mathcal{A}(S&#39; \to S)} =
+\frac{P(S&#39;)q(S&#39; \to S)}{P(S)q(S \to S&#39;)} = f(S, S&#39;)
+\]</span></p>
+<p>The Metropolis-Hastings algorithm is the choice</p>
+<p><span class="math display">\[
+\begin{aligned}
+\label{eq:mh}
+\mathcal{A}(S \to S&#39;) = \min\left(1, f(S,S&#39;)\right)
+\end{aligned}
+\]</span> for the acceptance function. The proposal distribution is left
+as a free choice.</p>
+<p>Noting that <span class="math inline">\(f(S,S&#39;) =
+1/f(S&#39;,S)\)</span>, we can see that the MH algorithm satisfies
+detailed balance by considering the two cases <span
+class="math inline">\(f(S,S&#39;) &gt; 1\)</span> and <span
+class="math inline">\(f(S,S&#39;) &lt; 1\)</span>.</p>
+<p>By choosing the proposal distribution such that <span
+class="math inline">\(f(S,S&#39;)\)</span> is as close as possible to
+one, the rate of rejections can be reduced and the algorithm sped up.
+This can be challenging though, as getting <span
+class="math inline">\(f(S,S&#39;)\)</span> close to 1 would imply that
+we can directly sample from a distribution very close to the target
+distribution. As MCMC is usually applied to problems for which there is
+virtually no hope of sampling directly from the target distribution,
+it’s rare that one can do so approximately.</p>
+<p>When the proposal distribution is symmetric as ours is, it cancels
+out in the expression for the acceptance function and the
+Metropolis-Hastings algorithm is simply the choice</p>
+<p><span class="math display">\[\mathcal{A}(S \to S&#39;) = \min\left(1,
+e^{-\beta\;\Delta F}\right)\]</span></p>
+<p>where <span class="math inline">\(\Delta F = F(S&#39;) - F(S)\)</span> is the change in the overall free
+energy of the system, including both the quantum and classical
+sector.</p>
+</section>
+<section id="implementation-of-the-mh-algorithm" class="level2">
+<h2>Implementation of the MH Algorithm</h2>
+<p>To implement the acceptance function in practice we pick a random
+number in the unit interval and accept if it is less than <span
+class="math inline">\(e^{-\beta\;\Delta F}\)</span>:</p>
+<div class="sourceCode" id="cb2"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>    df <span class="op">=</span> free_energy_change(current_state, new_state, parameters)</span>
+<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df):</span>
+<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>        current_state <span class="op">=</span> new_state</span>
+<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>        </span>
+<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
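+<p>A runnable version of this loop might look as follows. It is a minimal sketch, assuming a toy open Ising chain whose energy stands in for the free energy <span class="math inline">\(F\)</span>; the helper names <code>energy</code>, <code>proposal</code> and <code>metropolis</code> are hypothetical.</p>

```python
import math
import random

def energy(state, J=1.0):
    # Stand-in for the free energy: open nearest-neighbour Ising chain.
    return -J * sum(a * b for a, b in zip(state, state[1:]))

def proposal(state, rng):
    # Symmetric proposal: flip one spin chosen uniformly at random.
    k = rng.randrange(len(state))
    return state[:k] + [-state[k]] + state[k + 1:]

def metropolis(initial_state, beta, n_steps, rng):
    current_state = list(initial_state)
    states = []
    for _ in range(n_steps):
        new_state = proposal(current_state, rng)
        df = energy(new_state) - energy(current_state)
        # Accept with probability min(1, exp(-beta * df)).
        if df <= 0 or rng.random() < math.exp(-beta * df):
            current_state = new_state
        # Record the state whether or not the proposal was accepted.
        states.append(current_state)
    return states

rng = random.Random(1)
chain = metropolis([1, 1, 1, 1], beta=0.5, n_steps=50000, rng=rng)
mean_energy = sum(energy(s) for s in chain) / len(chain)
```

+<p>For this toy chain of four spins the exact mean energy is <span class="math inline">\(-3\tanh(\beta)\)</span>, which the running average should approach as the walk gets longer.</p>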
+</section>
+<section id="the-metropolis-hasting-algorithm" class="level2">
+<h2>The Metropolis-Hastings Algorithm</h2>
+</section>
+<section id="metropolis-hastings" class="level2">
+<h2>Metropolis-Hastings</h2>
+<p>In order to actually choose new states according to <span
+class="math inline">\(\mathcal{T}\)</span> one chooses states from a
+proposal distribution <span class="math inline">\(q(S_i \to
+S&#39;)\)</span> that can be directly sampled from. For instance, this
+might mean flipping a single random spin in a spin chain, in which case
+<span class="math inline">\(q(S_i \to S&#39;)\)</span> is the uniform
+distribution on states reachable by one spin flip from <span
+class="math inline">\(S_i\)</span>. The proposal <span
+class="math inline">\(S&#39;\)</span> is then accepted or rejected with
+an acceptance probability <span class="math inline">\(\mathcal{A}(S_i \to
+S&#39;)\)</span>. If the proposal is rejected then <span
+class="math inline">\(S_{i+1} = S_{i}\)</span>. Now <span
+class="math inline">\(\mathcal{T}(S\to S&#39;) = q(S\to
+S&#39;)\mathcal{A}(S \to S&#39;)\)</span>.</p>
+<section id="the-metropolis-hastings-algorithm-1" class="level3">
+<h3>The Metropolis-Hastings Algorithm</h3>
+<p>MH breaks up the transition function into a proposal distribution
+<span class="math inline">\(q(S \to S&#39;)\)</span> and an acceptance
+function <span class="math inline">\(\mathcal{A}(S \to S&#39;)\)</span>.
+<span class="math inline">\(q\)</span> needs to be something that we can
+directly sample from, and in our case generally takes the form of
+flipping some number of spins in <span class="math inline">\(S\)</span>.
+For example, if we flip a single random spin in the spin chain, <span
+class="math inline">\(q(S \to S&#39;)\)</span> is the uniform
+distribution on states reachable by one spin flip from <span
+class="math inline">\(S\)</span>. This also gives the nice symmetry
+property that <span class="math inline">\(q(S \to S&#39;) = q(S&#39; \to
+S)\)</span>.</p>
+<p>The proposal <span class="math inline">\(S&#39;\)</span> is then
+accepted or rejected with an acceptance probability <span
+class="math inline">\(\mathcal{A}(S \to S&#39;)\)</span>, if the
+proposal is rejected then <span class="math inline">\(S_{i+1} =
+S_{i}\)</span>. Hence:</p>
+<p><span class="math display">\[\mathcal{T}(S\to S&#39;) = q(S\to
+S&#39;)\mathcal{A}(S \to S&#39;)\]</span></p>
+<p>When the proposal distribution is symmetric as ours is, it cancels
+out in the expression for the acceptance function and the
+Metropolis-Hastings algorithm is simply the choice: <span
+class="math display">\[ \mathcal{A}(S \to S&#39;) = \min\left(1,
+e^{-\beta\;\Delta F}\right)\]</span> where <span
+class="math inline">\(F\)</span> is the overall free energy of the
+system, including both the quantum and classical sector.</p>
+<p>To implement the acceptance function in practice we pick a random
+number in the unit interval and accept if it is less than <span
+class="math inline">\(e^{-\beta\;\Delta F}\)</span>:</p>
+<div class="sourceCode" id="cb3"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>    df <span class="op">=</span> free_energy_change(current_state, new_state, parameters)</span>
+<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df):</span>
+<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>        current_state <span class="op">=</span> new_state</span>
+<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>        </span>
+<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span></code></pre></div>
+<p>This has the effect of always accepting proposed states that are
+lower in free energy and sometimes accepting those that are higher in
+free energy than the current state.</p>
+</section>
+</section>
+<section id="two-step-trick" class="level2">
+<h2>Two Step Trick</h2>
+<p>Our method already relies heavily on the split between the classical
+and quantum sector to derive a sign problem free MCMC algorithm but it
+turns out that there is a further trick we can play with it. The free
+energy term is the sum of an easy to compute classical energy and a more
+expensive quantum free energy. We can split the acceptance function into
+two in such a way as to avoid having to compute the full exact
+diagonalisation some of the time:</p>
+<div class="sourceCode" id="cb4" data-language="Python"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    df_classical <span class="op">=</span> classical_free_energy_change(current_state, new_state, parameters)</span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_classical):</span>
+<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>        df_quantum <span class="op">=</span> quantum_free_energy_change(current_state, new_state, parameters)</span>
+<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    </span>
+<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_quantum):</span>
+<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>            current_state <span class="op">=</span> new_state</span>
+<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a>    </span>
+<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span>
+<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a>    </span></code></pre></div>
+</section>
+<section id="detailed-balance-for-the-two-step-method" class="level2">
+<h2>Detailed Balance for the two step method</h2>
+<p>Given an MCMC algorithm with target distribution <span
+class="math inline">\(\pi(a)\)</span> and transition function <span
+class="math inline">\(\mathcal{T}\)</span> the detailed balance
+condition is sufficient (along with some technical constraints <span
+class="citation" data-cites="wolffMonteCarloErrors2004"> [<a
+href="#ref-wolffMonteCarloErrors2004"
+role="doc-biblioref">5</a>]</span>) to guarantee that in the long time
+limit the algorithm produces samples from <span
+class="math inline">\(\pi\)</span>. <span
+class="math display">\[\pi(a)\mathcal{T}(a \to b) = \pi(b)\mathcal{T}(b
+\to a)\]</span></p>
+<p>In pseudo-code, our two step method corresponds to two nested
+comparisons with the majority of the work only occurring if the first
+test passes:</p>
+<div class="sourceCode" id="cb5"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a>  new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>  c_dE <span class="op">=</span> classical_energy_change(</span>
+<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>                               current_state,</span>
+<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>                               new_state)</span>
+<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a>  <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> c_dE):</span>
+<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>    q_dF <span class="op">=</span> quantum_free_energy_change(</span>
+<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>                                current_state,</span>
+<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>                                new_state)</span>
+<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span> beta <span class="op">*</span> q_dF):</span>
+<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a>      current_state <span class="op">=</span> new_state</span>
+<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a>  states[i] <span class="op">=</span> current_state</span></code></pre></div>
+<p>Defining <span class="math inline">\(r_c = e^{-\beta \Delta H_c}\)</span>
+and <span class="math inline">\(r_q = e^{-\beta \Delta F_q}\)</span>, where the
+differences are taken for the move <span class="math inline">\(a \to b\)</span>, the ratio of target
+probabilities is <span class="math inline">\(\pi(b)/\pi(a) = r_c r_q\)</span>.
+This method has <span class="math inline">\(\mathcal{T}(a\to b) = q(a\to
+b)\mathcal{A}(a \to b)\)</span> with symmetric <span
+class="math inline">\(q(a \to b) = q(b \to a)\)</span> and <span
+class="math inline">\(\mathcal{A}(a \to b) = \min\left(1, r_c\right) \min\left(1,
+r_q\right)\)</span>.</p>
+<p>Substituting this into the detailed balance equation gives: <span
+class="math display">\[\mathcal{T}(a \to b)/\mathcal{T}(b \to a) =
+\pi(b)/\pi(a) = r_c r_q\]</span></p>
+<p>Taking the LHS and substituting in our transition function: <span
+class="math display">\[\begin{aligned}
+\mathcal{T}(a \to b)/\mathcal{T}(b \to a) = \frac{\min\left(1,
+r_c\right) \min\left(1, r_q\right)}{ \min\left(1, 1/r_c\right)
+\min\left(1, 1/r_q\right)}\end{aligned}\]</span></p>
+<p>which simplifies to <span class="math inline">\(r_c r_q\)</span> as
+<span class="math inline">\(\min(1,r)/\min(1,1/r) = r\)</span> for <span
+class="math inline">\(r &gt; 0\)</span>.</p>
+<section id="two-step-trick-1" class="level3">
+<h3>Two Step Trick</h3>
+<p>Our method already relies heavily on the split between the classical
+and quantum sectors to derive a sign-problem-free MCMC algorithm, but
+there is a further trick we can play with it. The free
+energy is the sum of an easy-to-compute classical energy and a more
+expensive quantum free energy, so we can split the acceptance function into
+two in such a way as to avoid having to compute the full exact
+diagonalisation some of the time:</p>
+<div class="sourceCode" id="cb6"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a>    new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>    df_classical <span class="op">=</span> classical_free_energy_change(current_state, new_state, parameters)</span>
+<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_classical):</span>
+<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a>        df_quantum <span class="op">=</span> quantum_free_energy(current_state, new_state, parameters)</span>
+<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a>    </span>
+<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> df_quantum):</span>
+<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a>            current_state <span class="op">=</span> new_state</span>
+<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a>    </span>
+<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a>    states[i] <span class="op">=</span> current_state</span>
+<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a>    </span></code></pre></div>
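+<p>To make the pseudo-code above concrete, here is a minimal runnable
+sketch of the two step method on a toy problem. The specific classical
+energy (a nearest-neighbour coupling on a ring), the function names and the
+default parameters are illustrative assumptions rather than the code used
+for the results in this thesis; only the nested structure of the two
+acceptance tests is taken from the text.</p>

```python
import numpy as np


def classical_energy(state, J=1.0):
    """Hypothetical classical term: nearest-neighbour Ising coupling on a ring."""
    spins = 2 * state - 1
    return J * np.sum(spins * np.roll(spins, 1))


def quantum_free_energy(state, beta, t=1.0, U=5.0):
    """Free energy of free fermions hopping on a ring in the potential set by `state`."""
    n = len(state)
    hamiltonian = np.diag(U * (state - 0.5))
    sites = np.arange(n)
    hamiltonian[sites, (sites + 1) % n] = -t
    hamiltonian[(sites + 1) % n, sites] = -t
    energies = np.linalg.eigvalsh(hamiltonian)  # the expensive step
    # F = -T * sum_k log(1 + exp(-beta * e_k)), written stably with logaddexp
    return -np.sum(np.logaddexp(0.0, -beta * energies)) / beta


def two_step_mcmc(n_sites, n_steps, beta, rng):
    state = rng.integers(0, 2, size=n_sites)
    e_c = classical_energy(state)
    f_q = quantum_free_energy(state, beta)
    states = np.empty((n_steps, n_sites), dtype=int)
    n_diagonalisations = 0

    for i in range(n_steps):
        new_state = state.copy()
        new_state[rng.integers(n_sites)] ^= 1  # symmetric single-site flip

        new_e_c = classical_energy(new_state)
        # First test uses only the cheap classical energy change.
        if rng.uniform() < np.exp(min(0.0, -beta * (new_e_c - e_c))):
            new_f_q = quantum_free_energy(new_state, beta)
            n_diagonalisations += 1
            # Second test uses the quantum free energy change.
            if rng.uniform() < np.exp(min(0.0, -beta * (new_f_q - f_q))):
                state, e_c, f_q = new_state, new_e_c, new_f_q

        states[i] = state  # record the current state whether or not we moved
    return states, n_diagonalisations
```

+<p>Calling <code>two_step_mcmc(8, 200, 1.0, np.random.default_rng(0))</code>
+returns the visited states together with the number of diagonalisations
+performed, which is at most the number of steps because proposals rejected
+by the classical test never reach the diagonalisation.</p>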
+</section>
+<section id="auto-correlation-time" class="level3">
+<h3>Auto-correlation Time</h3>
+<div id="fig:m_autocorr" class="fignos">
+<figure>
+<img src="/assets/thesis/fk_chapter/lsr/figs/m_autocorr.png"
+data-short-caption="no title" style="width:100.0%"
+alt="Figure 1: (Upper) 10 MCMC chains starting from the same initial state for a system with N = 150 sites and 3000 MCMC steps. At each MCMC step, n spins are flipped where n is drawn from Uniform(1,N) and this is repeated N^2/100 times. The simulations therefore have the potential to necessitate 10*N^2 matrix diagonalisations for each 100 MCMC steps. (Lower) The normalised auto-correlation (\langle m_i m_{i-j}\rangle - \langle m_i\rangle \langle m_i \rangle) / Var(m_i)) averaged over i. It can be seen that even with each MCMC step already being composed of many individual flip attempts, the auto-correlation is still non negligible and must be taken into account in the statistics. t = 1, \alpha = 1.25, T = 2.2, J = U = 5" />
+<figcaption aria-hidden="true"><span>Figure 1:</span> (Upper) 10 MCMC
+chains starting from the same initial state for a system with <span
+class="math inline">\(N = 150\)</span> sites and 3000 MCMC steps. At
+each MCMC step, n spins are flipped where n is drawn from Uniform(1,N)
+and this is repeated <span class="math inline">\(N^2/100\)</span> times.
+The simulations therefore have the potential to necessitate <span
+class="math inline">\(10*N^2\)</span> matrix diagonalisations for each
+100 MCMC steps. (Lower) The normalised auto-correlation <span
+class="math inline">\((\langle m_i m_{i-j}\rangle - \langle m_i\rangle
+\langle m_i \rangle) / \mathrm{Var}(m_i)\)</span> averaged over <span
+class="math inline">\(i\)</span>. It can be seen that even with each
+MCMC step already being composed of many individual flip attempts, the
+auto-correlation is still non negligible and must be taken into account
+in the statistics. <span class="math inline">\(t = 1, \alpha = 1.25, T =
+2.2, J = U = 5\)</span></figcaption>
+</figure>
+</div>
+<p>At this stage one might think we’re done. We can indeed draw
+independent samples from <span class="math inline">\(P(S;
+\beta)\)</span> by starting from some arbitrary initial state and doing
+<span class="math inline">\(k\)</span> steps to arrive at a sample.
+However a key insight is that after the convergence time, every state
+generated is a sample from <span class="math inline">\(P(S;
+\beta)\)</span>! They are not, however, independent samples. In Fig. ??
+it is already clear that the samples of the order parameter <span
+class="math inline">\(m\)</span> have some
+auto-correlation because only a few spins are flipped each step, but even
+when the number of spins flipped per step is increased, <a
+href="#fig:m_autocorr">Fig. 1</a> shows that auto-correlation can be an
+important effect near the phase
+transition. Let’s define the auto-correlation time <span
+class="math inline">\(\tau(O)\)</span> informally as the number of MCMC
+samples of some observable <span class="math inline">\(O\)</span> that are
+statistically equivalent to one
+independent sample or equivalently as the number of MCMC steps after
+which the samples are correlated below some cut-off; see <span
+class="citation" data-cites="krauthIntroductionMonteCarlo1996"> [<a
+href="#ref-krauthIntroductionMonteCarlo1996"
+role="doc-biblioref">9</a>]</span> for a more rigorous definition
+involving a sum over the auto-correlation function. The auto-correlation
+time is generally shorter than the convergence time, so it
+makes sense from an efficiency standpoint to run a single walker for
+many MCMC steps rather than to run a huge ensemble for <span
+class="math inline">\(k\)</span> steps each.</p>
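+<p>As a rough illustration of the informal definition above, the sketch
+below estimates the normalised auto-correlation function of a time series
+and sums it to get an integrated auto-correlation time. The convention
+<span class="math inline">\(\tau = 1 + 2\sum_{j \geq 1} \rho(j)\)</span> and
+the rule of truncating the sum at the first non-positive value are common
+choices that I am assuming here, not the only ones; the function names are
+hypothetical.</p>

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Normalised auto-correlation rho(j) for lags j = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    dx = x - x.mean()
    variance = np.mean(dx * dx)
    return np.array(
        [np.mean(dx[: len(x) - j] * dx[j:]) / variance for j in range(max_lag + 1)]
    )


def integrated_autocorr_time(x, max_lag):
    """tau = 1 + 2 * sum of rho(j), truncated at the first non-positive rho(j)."""
    rho = autocorrelation(x, max_lag)
    tau = 1.0
    for r in rho[1:]:
        if r <= 0.0:
            break
        tau += 2.0 * r
    return tau
```

+<p>With this convention uncorrelated samples give <span
+class="math inline">\(\tau \approx 1\)</span>, and a chain of <span
+class="math inline">\(N\)</span> samples carries roughly <span
+class="math inline">\(N/\tau\)</span> independent samples' worth of
+information.</p>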
+<p>Once the random walk has been carried out for many steps, the
+expectation value of <span class="math inline">\(O\)</span> can be
+estimated from the MCMC samples <span
+class="math inline">\(S_i\)</span>: <span class="math display">\[
+    \langle O \rangle = \frac{1}{N}\sum_{i = 1}^{N} O(S_i) + \mathcal{O}\left(\frac{1}{\sqrt{N}}\right)
+\]</span></p>
+<p>The samples are correlated, so the <span
+class="math inline">\(N\)</span> of them effectively contain
+less information than <span class="math inline">\(N\)</span> independent
+samples would: roughly <span
+class="math inline">\(N/\tau\)</span> effective samples. As a
+consequence the variance is larger than the <span
+class="math inline">\(\langle O^2 \rangle - \langle O \rangle^2\)</span> form it would have
+if the samples were uncorrelated. There are many methods in the
+literature for estimating the true variance of <span
+class="math inline">\(\langle O \rangle\)</span> and deciding how many steps are
+needed, but my approach has been to run a small number of parallel
+chains, which are independent, in order to estimate the statistical
+error produced. This is slightly less computationally efficient
+because it requires throwing away those <span
+class="math inline">\(k\)</span> steps generated before convergence
+multiple times, but it is a conceptually simple workaround.</p>
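+<p>The parallel-chain error estimate described above can be sketched as
+follows. I assume the measurements are stored as an array with one row per
+chain, and that the same number of burn-in steps is discarded from each; the
+function name and arguments are hypothetical.</p>

```python
import numpy as np


def estimate_with_error(chains, n_burn_in):
    """Mean and standard error of an observable from independent MCMC chains.

    chains: array of shape (n_chains, n_steps) holding O(S_i) for each chain.
    n_burn_in: number of initial steps discarded from every chain.
    """
    chains = np.asarray(chains, dtype=float)
    kept = chains[:, n_burn_in:]
    chain_means = kept.mean(axis=1)
    mean = chain_means.mean()
    # The chain means are independent of each other, so the usual
    # standard error formula applies to them even though the samples
    # within a single chain are correlated.
    std_err = chain_means.std(ddof=1) / np.sqrt(len(chain_means))
    return mean, std_err
```

+<p>Because only the per-chain means enter the error estimate, the
+auto-correlation time never has to be computed explicitly, at the price of
+repeating the burn-in once per chain.</p>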
+<p>In summary, to do efficient simulations we want to reduce both the
+convergence time and the auto-correlation time as much as possible. In
+order to explain how, we need to introduce the Metropolis-Hastings (MH)
+algorithm and how it gives an explicit form for the transition
+function.</p>
+</section>
+<section id="tuning-the-proposal-distribution" class="level3">
+<h3>Tuning the proposal distribution</h3>
+<div id="fig:autocorr_multiple_proposals" class="fignos">
+<figure>
+<img
+src="../figure_code/fk_chapter/lsr/figs/autocorr_multiple_proposals.png"
+data-short-caption="no title" style="width:100.0%"
+alt="Figure 2: Simulations showing how the autocorrelation of the order parameter depends on the proposal distribution used at different temperatures. We see that at T = 1.5 &lt; T_c a single spin flip is likely the best choice, while at the high temperature T = 2.5 &gt; T_c flipping two sites or a mixture of flipping one and two sites is likely a better choice. t = 1, \alpha = 1.25, J = U = 5" />
+<figcaption aria-hidden="true"><span>Figure 2:</span> Simulations
+showing how the autocorrelation of the order parameter depends on the
+proposal distribution used at different temperatures. We see that at
+<span class="math inline">\(T = 1.5 &lt; T_c\)</span> a single spin flip
+is likely the best choice, while at the high temperature <span
+class="math inline">\(T = 2.5 &gt; T_c\)</span> flipping two sites or a
+mixture of flipping one and two sites is likely a better choice. <span
+class="math inline">\(t = 1, \alpha = 1.25, J = U = 5\)</span></figcaption>
+</figure>
+</div>
+<p>Now we can discuss how to minimise the auto-correlations. The general
+principle is that one must balance the proposal distribution between two
+extremes. Choose overly small steps, like flipping only a single spin,
+and the acceptance rate will be high because <span
+class="math inline">\(\Delta F\)</span> will usually be small, but each
+state will be very similar to the previous one and the auto-correlations
+will be high too, making sampling inefficient. On the other hand,
+overly large steps, like randomising a large portion of the spins each
+step, will result in very frequent rejections, especially at low
+temperatures.</p>
+<p>I evaluated a few different proposal distributions for use with the
+FK model.</p>
+<ol type="1">
+<li>Flipping a single random site</li>
+<li>Flipping N random sites for some N</li>
+<li>Choosing n from Uniform(1, N) and then flipping n sites for some
+fixed N.</li>
+<li>Attempting to tune the proposal distribution for each parameter
+regime.</li>
+</ol>
+<p>From <a href="#fig:autocorr_multiple_proposals">Figure 2</a>
+we see that even at moderately high temperatures <span
+class="math inline">\(T &gt; T_c\)</span> flipping one or two sites is
+the best choice. However for some simulations at very high temperature
+flipping more spins is warranted. Tuning the proposal distribution
+automatically seems like something that would not yield enough benefit
+for the additional complexity it would require.</p>
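+<p>The first three proposal distributions in the list above can be
+sketched in a few lines. This is an illustrative version assuming the state
+is a NumPy integer array of 0s and 1s; the function names and the default
+values of <code>n_flip</code> and <code>n_max</code> are hypothetical. All
+three proposals are symmetric, so they can be used with the acceptance
+rules discussed earlier without a correction factor.</p>

```python
import numpy as np


def flip_sites(state, n_flip, rng):
    """Return a copy of `state` with `n_flip` distinct random sites flipped."""
    new_state = state.copy()
    sites = rng.choice(len(state), size=n_flip, replace=False)
    new_state[sites] ^= 1
    return new_state


def propose_single(state, rng):
    """Proposal 1: flip a single random site."""
    return flip_sites(state, 1, rng)


def propose_fixed(state, rng, n_flip=2):
    """Proposal 2: flip a fixed number of random sites."""
    return flip_sites(state, n_flip, rng)


def propose_uniform(state, rng, n_max=4):
    """Proposal 3: draw n from Uniform(1, n_max), then flip n random sites."""
    n_flip = rng.integers(1, n_max + 1)  # upper bound is exclusive
    return flip_sites(state, n_flip, rng)
```

+<p>Each function leaves the input state untouched and returns a new array,
+so a rejected proposal can simply be discarded.</p>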
+</section>
+</section>
+<section id="proposal-distributions" class="level2">
+<h2>Proposal Distributions</h2>
+<p>In an MCMC method a key property is the proportion of the time that
+proposals are accepted, the acceptance rate. If this rate is too low the
+random walk is trying to take overly large steps in energy space, which is
+problematic because it means very few new samples will be generated. If
+it is too high it implies the steps are too small, a problem because
+then the walk will take longer to explore the state space and the
+samples will be highly correlated. Ideal values for the acceptance rate
+can be calculated under certain assumptions <span class="citation"
+data-cites="robertsWeakConvergenceOptimal1997"> [<a
+href="#ref-robertsWeakConvergenceOptimal1997"
+role="doc-biblioref">10</a>]</span>. Here we monitor the acceptance rate
+and if it is too high we re-run the MCMC with a modified proposal
+distribution that has a chance to propose moves that flip multiple sites
+at a time.</p>
+<p>In addition we exploit the particle-hole symmetry of the problem by
+occasionally proposing a flip of the entire state. This works because
+near half-filling, flipping the occupations of all the sites will
+produce a state at or near the energy of the current one.</p>
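+<p>A proposal that occasionally flips the entire state could look like the
+sketch below. The probability <code>p_global</code> of attempting the global
+flip is a hypothetical tuning parameter, not a value taken from the
+simulations. Since the global flip is its own inverse and is chosen with a
+state-independent probability, the combined proposal remains symmetric.</p>

```python
import numpy as np


def propose_with_global_flip(state, rng, p_global=0.05):
    """Flip every site with probability p_global, otherwise flip one random site."""
    if rng.uniform() < p_global:
        return 1 - state  # particle-hole conjugate of the whole configuration
    new_state = state.copy()
    new_state[rng.integers(len(state))] ^= 1
    return new_state
```
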
+</section>
+<section id="choosing-the-proposal-distribution" class="level2">
+<h2>Choosing the proposal distribution</h2>
+<p><img src="figs/lsr/autocorr_multiple_proposals.png" title="fig:"
+id="fig:comparison"
+alt="t = 1, \alpha = 1.25, J = U = 5" /> Simulations
+showing how the autocorrelation of the order parameter depends on the
+proposal distribution used at different temperatures. We see that at
+<span class="math inline">\(T = 1.5 &lt; T_c\)</span> a single spin flip
+is likely the best choice, while at the high temperature <span
+class="math inline">\(T = 2.5 &gt; T_c\)</span> flipping two sites or a
+mixture of flipping one and two sites is likely a better choice.</p>
+<p>Now we can discuss how to minimise the auto-correlations. The general
+principle is that one must balance the proposal distribution between two
+extremes. Choose overly small steps, like flipping only a single spin,
+and the acceptance rate will be high because <span
+class="math inline">\(\Delta F\)</span> will usually be small, but each
+state will be very similar to the previous one and the auto-correlations
+will be high too, making sampling inefficient. On the other hand,
+overly large steps, like randomising a large portion of the spins each
+step, will result in very frequent rejections, especially at low
+temperatures.</p>
+<p>I evaluated a few different proposal distributions for use with the
+FK model.</p>
+<ol type="1">
+<li><p>Flipping a single random site</p></li>
+<li><p>Flipping N random sites for some N</p></li>
+<li><p>Choosing n from Uniform(1, N) and then flipping n sites for some
+fixed N.</p></li>
+<li><p>Attempting to tune the proposal distribution for each parameter
+regime.</p></li>
+</ol>
+<p>From Figure <a href="#fig:comparison" data-reference-type="ref"
+data-reference="fig:comparison">4</a> we see that even at moderately
+high temperatures <span class="math inline">\(T &gt; T_c\)</span>
+flipping one or two sites is the best choice. However for some
+simulations at very high temperature flipping more spins is warranted.
+Tuning the proposal distribution automatically seems like something that
+would not yield enough benefit for the additional complexity it would
+require.</p>
+</section>
+<section id="perturbation-mcmc" class="level2">
+<h2>Perturbation MCMC</h2>
+<p>The matrix diagonalisation is the most computationally expensive step
+of the process. A speed-up can be obtained by modifying the proposal
+distribution to depend on the classical part of the energy, a trick
+gleaned from Ref. <span class="citation"
+data-cites="krauthIntroductionMonteCarlo1998"> [<a
+href="#ref-krauthIntroductionMonteCarlo1998"
+role="doc-biblioref">6</a>]</span>: <span class="math display">\[
+\begin{aligned}
+q(k \to k&#39;) &amp;= \min\left(1, e^{-\beta (H^{k&#39;} - H^k)}\right)
+\\
+\mathcal{A}(k \to k&#39;) &amp;= \min\left(1, e^{-\beta(F^{k&#39;}-
+F^k)}\right)
+\end{aligned}\]</span> This allows the method to reject some states
+without performing the diagonalisation at no cost to the accuracy of the
+MCMC method.</p>
+<p>An extension of this idea is to try to define a classical model with
+a similar free energy dependence on the classical state as the full
+quantum model. Ref. <span class="citation"
+data-cites="huangAcceleratedMonteCarlo2017"> [<a
+href="#ref-huangAcceleratedMonteCarlo2017"
+role="doc-biblioref">11</a>]</span> does this with restricted Boltzmann
+machines whose form is very similar to a classical spin model.</p>
+<section id="convergence-time" class="level3">
+<h3>Convergence Time</h3>
+<p>Considering <span class="math inline">\(p(S)\)</span> as a vector
+<span class="math inline">\(\vec{p}\)</span> whose jth entry is the
+probability of the jth state <span class="math inline">\(p_j =
+p(S_j)\)</span>, and writing <span
+class="math inline">\(\mathcal{T}\)</span> as the matrix with entries
+<span class="math inline">\(T_{ij} = \mathcal{T}(S_j \rightarrow
+S_i)\)</span> we can write the update rule for the ensemble probability
+as: <span class="math display">\[\vec{p}_{t+1} = \mathcal{T} \vec{p}_t
+\implies \vec{p}_{t} = \mathcal{T}^t \vec{p}_0\]</span> where <span
+class="math inline">\(\vec{p}_0\)</span> is a vector which is one on the
+starting state and zero everywhere else. Since all states must
+transition to somewhere with probability one: <span
+class="math inline">\(\sum_i T_{ij} = 1\)</span>.</p>
+<p>Matrices that satisfy this are called stochastic matrices exactly
+because they model these kinds of Markov processes. For an ergodic chain
+that satisfies detailed balance it can be shown that
+they have real eigenvalues and, ordering them by magnitude, that <span
+class="math inline">\(\lambda_0 = 1\)</span> and <span
+class="math inline">\(|\lambda_{i\neq0}| &lt; 1\)</span>.</p>
+<p>Assuming <span class="math inline">\(\mathcal{T}\)</span> has been
+chosen correctly, its single eigenvector with eigenvalue 1 will be the
+thermal distribution so repeated application of the transition function
+eventually leads there, while memory of the initial conditions decays
+exponentially with a convergence time <span
+class="math inline">\(k\)</span> determined by <span
+class="math inline">\(\lambda_1\)</span>. In practice this means that
+one throws away the data from the beginning of the random walk in order
+to reduce the dependence on the initial conditions and be close enough to
+the target distribution.</p>
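+<p>These statements can be checked numerically on a system small enough to
+write down the full transition matrix. The sketch below builds the
+Metropolis transition matrix for three sites with single-site-flip
+proposals; the energy function and the value of <span
+class="math inline">\(\beta\)</span> are arbitrary assumptions chosen only
+for illustration. It then compares the leading eigenvector with the
+Boltzmann distribution and reads off a convergence time from the
+second-largest eigenvalue magnitude.</p>

```python
import itertools
import numpy as np


def metropolis_matrix(energies, neighbours, beta):
    """Column-stochastic matrix with T[i, j] = T(S_j -> S_i) for Metropolis updates."""
    n_states = len(energies)
    T = np.zeros((n_states, n_states))
    for j in range(n_states):
        for i in neighbours[j]:
            accept = min(1.0, np.exp(-beta * (energies[i] - energies[j])))
            T[i, j] = accept / len(neighbours[j])
        T[j, j] = 1.0 - T[:, j].sum()  # rejected proposals stay put
    return T


n_sites, beta = 3, 0.7
configs = list(itertools.product([0, 1], repeat=n_sites))
index = {c: k for k, c in enumerate(configs)}
# Hypothetical energy: one unit per occupied site plus 0.5 per occupied neighbouring pair.
energies = np.array(
    [sum(c) + 0.5 * sum(a * b for a, b in zip(c, c[1:])) for c in configs]
)
# Single-site flips connect each configuration to n_sites neighbours.
neighbours = [
    [index[c[:s] + (1 - c[s],) + c[s + 1:]] for s in range(n_sites)] for c in configs
]

T = metropolis_matrix(energies, neighbours, beta)
eigenvalues, eigenvectors = np.linalg.eig(T)
order = np.argsort(-np.abs(eigenvalues))
eigenvalues = eigenvalues[order].real
stationary = eigenvectors[:, order[0]].real
stationary /= stationary.sum()

boltzmann = np.exp(-beta * energies)
boltzmann /= boltzmann.sum()

p0 = np.zeros(len(configs))
p0[0] = 1.0  # start with certainty in one state
p_late = np.linalg.matrix_power(T, 2000) @ p0

# Memory of the initial state decays like |lambda_1|^t.
convergence_time = -1.0 / np.log(abs(eigenvalues[1]))
```

+<p>Here <code>stationary</code> and <code>p_late</code> should both agree
+with <code>boltzmann</code> up to numerical precision, and
+<code>convergence_time</code> gives the number of steps over which the
+slowest-decaying mode shrinks by a factor of <span
+class="math inline">\(e\)</span>.</p>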
+</section>
+</section>
+</section>
+<section id="appbalance-detailed-balance" class="level1">
+<h1><span id="app:balance" label="app:balance"></span>Detailed
+Balance</h1>
+<p>Given an <span data-acronym-label="MCMC"
+data-acronym-form="singular+short">MCMC</span> algorithm with target
+distribution <span class="math inline">\(\pi(a)\)</span> and transition
+function <span class="math inline">\(\mathcal{T}\)</span> the detailed balance
+condition is sufficient (along with some technical constraints <span
+class="citation" data-cites="wolffMonteCarloErrors2004"> [<a
+href="#ref-wolffMonteCarloErrors2004"
+role="doc-biblioref">5</a>]</span>) to guarantee that in the long time
+limit the algorithm produces samples from <span
+class="math inline">\(\pi\)</span>. <span
+class="math display">\[\pi(a)\mathcal{T}(a \to b) = \pi(b)\mathcal{T}(b \to
+a)\]</span></p>
+<p>In pseudo-code, our two step method corresponds to two nested
+comparisons with the majority of the work only occurring if the first
+test passes:</p>
+<div class="sourceCode" id="cb7"><pre
+class="sourceCode python"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>current_state <span class="op">=</span> initial_state</span>
+<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(N_steps):</span>
+<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a>  new_state <span class="op">=</span> proposal(current_state)</span>
+<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a>  c_dE <span class="op">=</span> classical_energy_change(</span>
+<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a>                               current_state,</span>
+<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a>                               new_state)</span>
+<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a>  <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span>beta <span class="op">*</span> c_dE):</span>
+<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a>    q_dF <span class="op">=</span> quantum_free_energy_change(</span>
+<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a>                                current_state,</span>
+<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a>                                new_state)</span>
+<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> uniform(<span class="dv">0</span>,<span class="dv">1</span>) <span class="op">&lt;</span> exp(<span class="op">-</span> beta <span class="op">*</span> q_dF):</span>
+<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a>      current_state <span class="op">=</span> new_state</span>
+<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a>  states[i] <span class="op">=</span> current_state</span></code></pre></div>
+<p>Defining <span class="math inline">\(r_c = e^{-\beta \Delta H_c}\)</span>
+and <span class="math inline">\(r_q = e^{-\beta \Delta F_q}\)</span>, where
+the changes are taken in going from <span class="math inline">\(a\)</span>
+to <span class="math inline">\(b\)</span>, our target
+distribution satisfies <span class="math inline">\(\pi(b)/\pi(a) = r_c r_q\)</span>.
+This method has <span class="math inline">\(\mathcal{T}(a\to b) = q(a\to b)\mathcal{A}(a
+\to b)\)</span> with symmetric proposal <span class="math inline">\(q(a \to b) =
+q(b \to a)\)</span> and <span class="math inline">\(\mathcal{A}(a \to b) = \min\left(1,
+r_c\right) \min\left(1, r_q\right)\)</span>.</p>
+<p>Substituting this into the detailed balance equation gives: <span
+class="math display">\[\mathcal{T}(a \to b)/\mathcal{T}(b \to a) = \pi(b)/\pi(a) = r_c
+r_q\]</span></p>
+<p>Taking the LHS and substituting in our transition function: <span
+class="math display">\[\begin{aligned}
+\mathcal{T}(a \to b)/\mathcal{T}(b \to a) = \frac{\min\left(1, r_c\right) \min\left(1,
+r_q\right)}{ \min\left(1, 1/r_c\right) \min\left(1,
+1/r_q\right)}\end{aligned}\]</span></p>
+<p>which simplifies to <span class="math inline">\(r_c r_q\)</span> as
+<span class="math inline">\(\min(1,r)/\min(1,1/r) = r\)</span> for <span
+class="math inline">\(r &gt; 0\)</span>.</p>
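+<p>The identity used in the last step is easy to check numerically. The
+sketch below draws random positive ratios and confirms that the two step
+acceptance function reproduces <span class="math inline">\(r_c
+r_q\)</span> when the forward acceptance is divided by the backward one.
+The function names and the choice of log-normal test values are
+assumptions made for illustration.</p>

```python
import numpy as np


def two_step_acceptance(r_c, r_q):
    """Acceptance probability of the two step method for ratios r_c and r_q."""
    return min(1.0, r_c) * min(1.0, r_q)


def max_detailed_balance_violation(n_trials=1000, seed=0):
    """Largest relative deviation of A(a->b)/A(b->a) from r_c * r_q."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_trials):
        r_c, r_q = np.exp(rng.normal(size=2))  # random positive ratios
        forward = two_step_acceptance(r_c, r_q)
        # The reverse move sees the inverse ratios.
        backward = two_step_acceptance(1.0 / r_c, 1.0 / r_q)
        worst = max(worst, abs(forward / backward / (r_c * r_q) - 1.0))
    return worst
```

+<p>The returned deviation should be at the level of floating point
+round-off, consistent with the algebraic argument above.</p>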
+</section>
+<section id="bibliography" class="level1 unnumbered">
+<h1 class="unnumbered">Bibliography</h1>
+<div id="refs" class="references csl-bib-body" role="doc-bibliography">
+<div id="ref-devroyeRandomSampling1986" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[1] </div><div class="csl-right-inline">L.
+Devroye, <em><a
+href="https://doi.org/10.1007/978-1-4613-8643-8_12">Random
+Sampling</a></em>, in <em>Non-Uniform Random Variate Generation</em>,
+edited by L. Devroye (Springer, New York, NY, 1986), pp. 611–641.</div>
+</div>
+<div id="ref-BMCP2021" class="csl-entry" role="doc-biblioentry">
+<div class="csl-left-margin">[2] </div><div class="csl-right-inline">O.
+A. Martin, R. Kumar, and J. Lao, <em>Bayesian Modeling and Computation
+in Python</em> (Boca Raton, 2021).</div>
+</div>
+<div id="ref-binderGuidePracticalWork1988" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[3] </div><div class="csl-right-inline">K.
+Binder and D. W. Heermann, <em><a
+href="https://doi.org/10.1007/978-3-662-08854-8_3">Guide to Practical
+Work with the Monte Carlo Method</a></em>, in <em>Monte Carlo Simulation
+in Statistical Physics: An Introduction</em>, edited by K. Binder and D.
+W. Heermann (Springer Berlin Heidelberg, Berlin, Heidelberg, 1988), pp.
+68–112.</div>
+</div>
+<div id="ref-kerteszAdvancesComputerSimulation1998" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[4] </div><div class="csl-right-inline">J.
+Kertesz and I. Kondor, editors, <em><a
+href="https://doi.org/10.1007/BFb0105456">Advances in Computer
+Simulation: Lectures Held at the Eötvös Summer School in Budapest,
+Hungary, 16–20 July 1996</a></em> (Springer-Verlag, Berlin Heidelberg,
+1998).</div>
+</div>
+<div id="ref-wolffMonteCarloErrors2004" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[5] </div><div class="csl-right-inline">U.
+Wolff, <em><a href="https://doi.org/10.1016/S0010-4655(03)00467-3">Monte
+Carlo Errors with Less Errors</a></em>, Computer Physics Communications
+<strong>156</strong>, 143 (2004).</div>
+</div>
+<div id="ref-krauthIntroductionMonteCarlo1998" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[6] </div><div class="csl-right-inline">W.
+Krauth, <em><a href="https://doi.org/10.1007/BFb0105456">Introduction To
+Monte Carlo Algorithms</a></em>, in <em>Advances in Computer Simulation:
+Lectures Held at the Eötvös Summer School in Budapest, Hungary, 16–20
+July 1996</em> (Springer-Verlag, Berlin Heidelberg, 1998).</div>
+</div>
+<div id="ref-kellyReversibilityStochasticNetworks1981" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[7] </div><div class="csl-right-inline">F.
+P. Kelly, <em><a href="https://doi.org/10.2307/2287860">Reversibility
+and Stochastic Networks / F.P. Kelly</a></em>, SERBIULA (Sistema Librum
+2.0) <strong>76</strong>, (1981).</div>
+</div>
+<div id="ref-kapferSamplingPolytopeHarddisk2013" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[8] </div><div class="csl-right-inline">S.
+C. Kapfer and W. Krauth, <em><a
+href="https://doi.org/10.1088/1742-6596/454/1/012031">Sampling from a
+Polytope and Hard-Disk Monte Carlo</a></em>, J. Phys.: Conf. Ser.
+<strong>454</strong>, 012031 (2013).</div>
+</div>
+<div id="ref-krauthIntroductionMonteCarlo1996" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[9] </div><div class="csl-right-inline">W.
+Krauth, <em><a href="http://arxiv.org/abs/cond-mat/9612186">Introduction
+To Monte Carlo Algorithms</a></em>, arXiv:cond-Mat/9612186 (1996).</div>
+</div>
+<div id="ref-robertsWeakConvergenceOptimal1997" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[10] </div><div class="csl-right-inline">G.
+O. Roberts, A. Gelman, and W. R. Gilks, <em><a
+href="https://doi.org/10.1214/aoap/1034625254">Weak Convergence and
+Optimal Scaling of Random Walk Metropolis Algorithms</a></em>, Ann.
+Appl. Probab. <strong>7</strong>, 110 (1997).</div>
+</div>
+<div id="ref-huangAcceleratedMonteCarlo2017" class="csl-entry"
+role="doc-biblioentry">
+<div class="csl-left-margin">[11] </div><div class="csl-right-inline">L.
+Huang and L. Wang, <em><a
+href="https://doi.org/10.1103/PhysRevB.95.035105">Accelerated Monte
+Carlo Simulations with Restricted Boltzmann Machines</a></em>, Phys.
+Rev. B <strong>95</strong>, 035105 (2017).</div>
+</div>
+</div>
 </section>
 
 
diff --git a/_thesis/6_Appendices/A.3_Lattice_Generation.html b/_thesis/6_Appendices/A.3_Lattice_Generation.html
index 704c3aa..e89d427 100644
--- a/_thesis/6_Appendices/A.3_Lattice_Generation.html
+++ b/_thesis/6_Appendices/A.3_Lattice_Generation.html
@@ -57,7 +57,7 @@ Generation</a></li>
 <div class="sourceCode" id="cb1"><pre
 class="sourceCode python"><code class="sourceCode python"></code></pre></div>
 <p>Next Section: <a
-href="../6_Appendices/A.4_Lattice_Colouring.html#lattice-colouring">Lattice
+href="../6_Appendices/A.4_Lattice_Colouring.html">Lattice
 Colouring</a></p>
 </section>
 
diff --git a/_thesis/toc.html b/_thesis/toc.html
index f6e68a8..238d775 100644
--- a/_thesis/toc.html
+++ b/_thesis/toc.html
@@ -1,35 +1,43 @@
   <ul>
-  <li><a href="./1_Introduction/1_Intro.html#interacting-quantum-many-body-systems">1 Introduction</a></li>
+  <li><a href="./1_Introduction/1_Intro.html">1 Introduction</a></li>
     <ul>
-    <li><a href="./1_Introduction/1_Intro.html#interacting-quantum-many-body-systems">Interacting Quantum Many Body Systems</a></li>
+    <li><a href="./1_Introduction/1_Intro.html">Interacting Quantum Many Body Systems</a></li>
     <li><a href="./1_Introduction/1_Intro.html#mott-insulators">Mott Insulators</a></li>
     <li><a href="./1_Introduction/1_Intro.html#quantum-spin-liquids">Quantum Spin Liquids</a></li>
   </ul>
-  <li><a href="./2_Background/2.1_FK_Model.html#the-falikov-kimball-model">2 Background</a></li>
+  <li><a href="./2_Background/2.1_FK_Model.html">2 Background</a></li>
     <ul>
-    <li><a href="./2_Background/2.1_FK_Model.html#the-falikov-kimball-model">The Falikov Kimball Model</a></li>
+    <li><a href="./2_Background/2.1_FK_Model.html">The Falikov Kimball Model</a></li>
     <li><a href="./2_Background/2.2_HKM_Model.html#the-kitaev-honeycomb-model">The Kitaev Honeycomb Model</a></li>
     <li><a href="./2_Background/2.3_Disorder.html#disorder-and-localisation">Disorder and Localisation</a></li>
   </ul>
-  <li><a href="./3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html#the-model">3 The Long Range Falikov-Kimball Model</a></li>
+  <li><a href="./3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html">3 The Long Range Falikov-Kimball Model</a></li>
     <ul>
-    <li><a href="./3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html#the-model">The Model</a></li>
+    <li><a href="./3_Long_Range_Falikov_Kimball/3.1_LRFK_Model.html">The Model</a></li>
     <li><a href="./3_Long_Range_Falikov_Kimball/3.2_LRFK_Methods.html#methods">Methods</a></li>
     <li><a href="./3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html#results">Results</a></li>
     <li><a href="./3_Long_Range_Falikov_Kimball/3.3_LRFK_Results.html#discussion-and-conclusion">Discussion and Conclusion</a></li>
   </ul>
-  <li><a href="./4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html#gauge-fields">4 The Amorphous Kitaev Model</a></li>
+  <li><a href="./4_Amorphous_Kitaev_Model/4.1.2_AMK_Model.html">4 The Amorphous Kitaev Model</a></li>
     <ul>
     <li><a href="./4_Amorphous_Kitaev_Model/4.1_AMK_Model.html#the-model">The Model</a></li>
     <li><a href="./4_Amorphous_Kitaev_Model/4.2_AMK_Methods.html#methods">Methods</a></li>
     <li><a href="./4_Amorphous_Kitaev_Model/4.3_AMK_Results.html#results">Results</a></li>
     <li><a href="./4_Amorphous_Kitaev_Model/4.3_AMK_Results.html#discussion-and-conclusion">Discussion and Conclusion</a></li>
   </ul>
-  <li><a href="./5_Conclusion/5_Conclusion.html#discussion">5 Conclusion</a></li>
-  <li><a href="./6_Appendices/A.1_Particle_Hole_Symmetry.html#particle-hole-symmetry">Appendices</a></li>
+  <li><a href="./5_Conclusion/5_Conclusion.html">5 Conclusion</a></li>
     <ul>
+    <li><a href="./5_Conclusion/5_Conclusion.html">Material Realisations</a></li>
+    <li><a href="./5_Conclusion/5_Conclusion.html#discussion">Discussion</a></li>
+    <li><a href="./5_Conclusion/5_Conclusion.html#outlook">Outlook</a></li>
+  </ul>
+  <li><a href="./6_Appendices/A.1.2_Fermion_Free_Energy.html">Appendices</a></li>
+    <ul>
+    <li><a href="./6_Appendices/A.1.2_Fermion_Free_Energy.html">Evaluation of the Fermion Free Energy</a></li>
+    <li><a href="./6_Appendices/A.1_Particle_Hole_Symmetry-Copy1.html#particle-hole-symmetry">Particle-Hole Symmetry</a></li>
     <li><a href="./6_Appendices/A.1_Particle_Hole_Symmetry.html#particle-hole-symmetry">Particle-Hole Symmetry</a></li>
     <li><a href="./6_Appendices/A.2_Markov_Chain_Monte_Carlo.html#markov-chain-monte-carlo">Markov Chain Monte Carlo</a></li>
+    <li><a href="./6_Appendices/A.2_Markov_Chain_Monte_Carlo.html#app:balance">Detailed Balance</a></li>
     <li><a href="./6_Appendices/A.3_Lattice_Generation.html#lattice-generation">Lattice Generation</a></li>
     <li><a href="./6_Appendices/A.4_Lattice_Colouring.html#lattice-colouring">Lattice Colouring</a></li>
     <li><a href="./6_Appendices/A.5_The_Projector.html#the-projector">The Projector</a></li>