From 7e5f20aa2df7c6855ce4e27f08fcb563a2aea595 Mon Sep 17 00:00:00 2001 From: gnikit Date: Mon, 18 Jul 2022 17:05:46 +0100 Subject: [PATCH 01/12] fix: name conflict with other exemplars The venv cannot be named recode, since a student might be working through multiple exemplars which would lead to name conflicts for the venvs. I named the conda venv mcmc for Monte Carlo Markov Chain. --- environment.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/environment.yml b/environment.yml index 6a460b2..3f453d5 100644 --- a/environment.yml +++ b/environment.yml @@ -1,4 +1,4 @@ -name: recode +name: mcmc channels: - defaults From 3443455cf92228a653ff8c70dc1066df23f327a0 Mon Sep 17 00:00:00 2001 From: gnikit Date: Mon, 18 Jul 2022 23:56:42 +0100 Subject: [PATCH 02/12] fix: multiple changes in the Jupyter notebooks - Replaces references to `code` to `src` - Adds language attributes to fenced code blocks - Updates output of `setup.cfg` in notebook 2. - Fixes spelling and grammar errors --- docs/learning/01 Introduction.ipynb | 32 ++++---- docs/learning/02 Packaging It Up.ipynb | 80 +++++++++---------- ...g a Markov Chain Monte Carlo Sampler.ipynb | 25 +++--- docs/learning/04 Testing.ipynb | 46 +++++------ docs/learning/05 Adding Functionality.ipynb | 16 ++-- docs/learning/06 Speeding It Up.ipynb | 8 +- .../07 Producing Research Outputs.ipynb | 12 +-- .../08 Doing Reproducible Science.ipynb | 61 ++++++++------ docs/learning/09 Adding Documentation.ipynb | 10 +-- 9 files changed, 155 insertions(+), 135 deletions(-) diff --git a/docs/learning/01 Introduction.ipynb b/docs/learning/01 Introduction.ipynb index a5e6e1b..47a2cc5 100644 --- a/docs/learning/01 Introduction.ipynb +++ b/docs/learning/01 Introduction.ipynb @@ -12,20 +12,20 @@ "\n", "# Introduction\n", "\n", - "Hello and welcome to the documentation for MCMCFF! 
These notebooks will guide you through the process of writing a medium sized scientific software project, discussing the decision and tradeoffs made along the way.\n", + "Hello and welcome to the documentation for MCMCFF! These notebooks will guide you through the process of writing a medium-sized scientific software project, discussing the decisions and trade-offs made along the way.\n", "\n", "## Setting up your environment\n", "\n", "It's strongly encouraged that you follow along this notebook in an enviroment where you can run the cells yourself and change them. You can either clone this git repository and run the cells in a python environment on your local machine, or if you for some reason can't do that (because you're an a phone or tablet for instance) you can instead open this notebook in [binder](https://mybinder.org/v2/gh/TomHodson/ReCoDE_MCMCFF/HEAD)\n", + "It's strongly encouraged that you follow along with this notebook in an environment where you can run the cells yourself and change them. You can either clone this git repository and run the cells in a python environment on your local machine, or if you for some reason can't do that (because you are on a phone or tablet for instance) you can instead open this notebook in [binder](https://mybinder.org/v2/gh/TomHodson/ReCoDE_MCMCFF/HEAD)\n", "\n", - "I would also suggest you setup a python environment just for this. You can use your preferred method to do this, but I will recomend `conda` because it's both what I currently use and what is recommeded by Imperial.\n", + "I would also suggest you set up a python environment just for this. 
You can use your preferred method to do this, but I recommend `conda` because it's both what I currently use and what is recommended by Imperial.\n", "\n", "```bash\n", "#make a new conda environment from the specification in environment.yml\n", "conda env create --file environment.yml\n", "\n", "#activate the environment\n", - "conda activate recode\n", + "conda activate mcmc\n", "```\n", "\n", "If you'd prefer to keep this environment nicely stored away in this repository, you can save in a folder called env by doing\n", @@ -44,7 +44,7 @@ "\n", "## The Problem\n", "\n", - "So without further ado lets talk about the problem we'll be working on, you don't necessaryily need to understand the full details of this to learn the important lessons but I will give a quick summary here. We want to simulate a physical model called the **Ising model**, which is famous in physics because it's about the simplest thing you can come up with that displays a phase transition, a special kind of shift between two different behaviours." + "So without further ado let's talk about the problem we'll be working on, you don't necessarily need to understand the full details of this to learn the important lessons, but I will give a quick summary here. We want to simulate a physical model called the **Ising model**, which is famous in physics because it's about the simplest thing you can come up with that displays a phase transition, a special kind of shift between two different behaviours." 
] }, { @@ -73,7 +73,7 @@ "\n", "np.random.seed(\n", " 42\n", - ") # This makes our random numbers reproducable when the notebook is rerun in order" + ") # This makes our random numbers reproducible when the notebook is rerun in order" ] }, { @@ -81,7 +81,7 @@ "id": "e52245f1-8ecc-45f1-8d52-337916b0ce7c", "metadata": {}, "source": [ - "We're going to be working with arrays of numbers so it will make sense to work with `Numpy` and we'll also want to plot things, the standard choice for this is `matplotlib`, though there are other options, `pandas` and `plotly` being notable ones.\n", + "We're going to be working with arrays of numbers, so it will make sense to work with `Numpy`, and we'll also want to plot things, the standard choice for this is `matplotlib`, though there are other options, `pandas` and `plotly` being notable ones.\n", "\n", "Let me quickly plot something to aid the imagination:" ] @@ -122,15 +122,15 @@ "id": "9a919be9-2737-4d79-9607-4daf3b457364", "metadata": {}, "source": [ - "In my head, the Ising model is basically all about peer pressure. You're a tiny creature and you live in a little world where you can only be one of two things, up/down, left/right, in/out doesn't matter. \n", + "In my head, the Ising model is basically all about peer pressure. You're a tiny creature, and you live in a little world where you can only be one of two things, up/down, left/right, in/out doesn't matter. \n", "\n", "But what *does matter* is that you're doing the same thing as you're neighbours. We're going to visualise this with images like the above, representing the two different camps, though at the moment what I've plotted is random, there's no peer pressure going on yet.\n", "\n", "The way that a physicist would quantify this peer pressure is to assign a number to each state, lower numbers meaning more of the little creatures are doing the same thing as their neighbours. 
We'll call this the Energy, because physicists always call things Energy, that's just what we do.\n", "\n", - "To calculate the energy what we're gonna do is look at all the pixels/creatures, and for each one, we look at the four neighbours to the N/E/S/W, everytime we find a neighbour that agrees, we'll subtract 1 from our total and every time we find neighbours that disagree we'll add 1 to our total. Creatures at the edges will simply have fewer neighbours to worry about. \n", + "To calculate the energy what we're going to do is look at all the pixels/creatures, and for each one, we look at the four neighbours to the N/E/S/W, every time we find a neighbour that agrees, we'll subtract 1 from our total and every time we find neighbours that disagree we'll add 1 to our total. Creatures at the edges will simply have fewer neighbours to worry about. \n", "\n", - "I'll show you what the equation for this looks like, but don't worry to much about it, the word description should be enough to write some code. If we assign the ith creature the label $s_i = \\pm1$ then the energy is \n", + "I'll show you what the equation for this looks like, but don't worry too much about it, the word description should be enough to write some code. If we assign the ith creature the label $s_i = \\pm1$ then the energy is \n", "$$E = \\sum_{(i,j)} s_i s_j$$\n", "\n", "Ok let's do some little tests, let's make the all up, all down and random state and see if we can compute their energies." 
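The word description above translates almost directly into code. Here is a rough sketch of my own (not the implementation the notebooks build up — the function name and the fact that each N/E/S/W pair ends up visited twice are assumptions of this illustration; the double counting just scales the energy and doesn't change the physics):

```python
import numpy as np

def energy(state):
    """Subtract 1 for every agreeing N/E/S/W neighbour pair, add 1 for every
    disagreeing one. Each pair is visited twice, once from each side."""
    N, M = state.shape
    E = 0
    for i in range(N):
        for j in range(M):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < N and 0 <= nj < M:  # creatures at the edges have fewer neighbours
                    E -= state[i, j] * state[ni, nj]
    return E

all_up = np.ones((10, 10), dtype=int)
print(energy(all_up))  # every pair agrees, so this is as low as it gets: -360
```

The all-up and all-down states give the same (minimal) energy, since flipping every creature leaves all the products unchanged — a handy sanity check for the little tests below.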
@@ -389,25 +389,25 @@ "source": [ "### Making it a little faster\n", "\n", - "This project is not intended to focus on optimising for performance but it is worth putting a little effort into making this function faster so that we can run experiments more quickly later.\n", + "This project is not intended to focus on optimising for performance, but it is worth putting a little effort into making this function faster so that we can run experiments more quickly later.\n", "\n", "The main thing that slows us down here is that we've written a 'tight loop' in pure python, the energy function is just a loop over the fundamental operation:\n", "```python\n", "E -= state[i,j] * state[i+di, j]\n", "```\n", - "which in theoy only requires a few memory load operations, a multiply, an add and a store back to memory (give or take). However because Python is such a dynamic language, it will have to do extra things like check the type and methods of `state` and `E`, invoke their array access methods `object.__get__`, etc etc. We call this extra work overhead.\n", + "which in theory only requires a few memory load operations, a multiply, an add and a store back to memory (give or take). However, because Python is such a dynamic language, it will have to do extra things like check the type and methods of `state` and `E`, invoke their array access methods `object.__get__`, etc. 
We call this extra work overhead.\n", "\n", - "In most cases the ratio of overhead to actual computation is not too bad, but here because the fundamental computation is so simple it's likely the overhead accounts for much more of the overal time.\n", + "In most cases the ratio of overhead to actual computation is not too bad, but here because the fundamental computation is so simple it's likely the overhead accounts for much more of the overall time.\n", "\n", "In scientific python like this there are usually two main options for reducing the overhead:\n", "\n", "#### Using Arrays\n", - "One way is we work with arrays of numbers and operations defined over those arrays such as `sum`, `product` etc. `Numpy` is the canonical example of this in Python but many machine learning libraries are essentually doing a similar thing. We rely on the library to implement the operations efficiently and try to chain those operations together to achieve what we want. This imposes some limitations on the way we can write our code.\n", + "One way is to work with arrays of numbers and operations defined over those arrays such as `sum`, `product` etc. `Numpy` is the canonical example of this in Python, but many machine learning libraries are essentially doing a similar thing. We rely on the library to implement the operations efficiently and try to chain those operations together to achieve what we want. This imposes some limitations on the way we can write our code.\n", "\n", "#### Using Compilation\n", "The alternative is that we convert our Python code into a more efficient form that incurs less overhead. This requires a compilation or transpilation step and imposes a different set of constraints on the code.\n", "\n", - "It's a little tricky to decide which of the two approaches will work best for a given problem. 
My advice would be to have some familiarity with both but ultimatly to use what makes your development experience the best, since you'll likely spend more time writing the code than you will waiting for it to run!" + "It's a little tricky to decide which of the two approaches will work best for a given problem. My advice would be to have some familiarity with both but ultimately to use what makes your development experience the best, since you'll likely spend more time writing the code than you will waiting for it to run!" ] }, { @@ -613,7 +613,7 @@ "## Conclusion\n", "So far we've discussed the problem we want to solve, written a little code, tested it a bit and made some speed improvements.\n", "\n", - "In the next notebook we will package the code up into a little python package, this is has two big benefits to use: \n", + "In the next notebook we will package the code up into a little python package, this has two big benefits for us: \n", "1. I won't have to redefine the energy function we just wrote in the next notebook \n", "1. It will help with testing and documenting our code later" ] diff --git a/docs/learning/02 Packaging It Up.ipynb b/docs/learning/02 Packaging It Up.ipynb index a53e1aa..14d392c 100644 --- a/docs/learning/02 Packaging It Up.ipynb +++ b/docs/learning/02 Packaging It Up.ipynb @@ -30,9 +30,9 @@ "- [Packaging for pytest](https://docs.pytest.org/en/6.2.x/goodpractices.html)\n", "\n", "\n", - "Before we can do any testing, it is best practice to structure and then package your code up as a python project up. You don't have to do it like this but but it carrys with it the benefit that many only tutorial _expect_ you to do it like this and generally you want to reduce friction for yourself later. \n", + "Before we can do any testing, it is best practice to structure and then package your code up as a python project. 
You don't have to do it like this, but it carries with it the benefit that many other tutorials _expect_ you to do it like this, and generally you want to reduce friction for yourself later. \n", "\n", - "Like all things progamming, there are many opinions about how python projects should be structured, as I write this the structure of this repository is this: (This is the lightly edited output of the `tree` command if you're interested) \n", + "Like all things programming, there are many opinions about how python projects should be structured, as I write this the structure of this repository is this: (This is the lightly edited output of the `tree` command if you're interested) \n", "```bash\n", ".\n", "├── CITATION.cff # This file describes how to cite the work contained in this repository.\n", @@ -41,7 +41,7 @@ "├── docs\n", "│ ├── ... #Files to do with making the documentation\n", "│ └── learning\n", - "│ └── #The Jupyer notebooks that form the main body of this project\n", + "│ └── #The Jupyter notebooks that form the main body of this project\n", "│\n", "├── pyproject.toml # Machine readable information about the MCFF package\n", "├── readthedocs.yml # Tells readthedocs.com how to build the documentation\n", @@ -53,15 +53,15 @@ "└── tests # automated tests for the code\n", "```\n", "\n", - "It's looks pretty intimidating! But let's quickly go through it, at the top level of most projects you'll find on Github and elsewhere you'll find files to do with the project as a whole:\n", + "It looks pretty intimidating! 
But let's quickly go through it: at the top level of most projects, on GitHub and elsewhere, you'll find files to do with the project as a whole:\n", "- `README.md` - An intro to the project\n", "- `LICENSE` - The software license that governs this project, there are a few standard ones people use.\n", - "- `environment.yaml` (or both) this list what python packages the project needs in a standard format\n", + "- `environment.yml` (or both) this lists what python packages the project needs in a standard format\n", "- `CITATION.cff` This is the new standard way to describe how a work should be cited, v useful for academic software.\n", "\n", - "Then below that you will usually have directories breaking the project up into main categories, here I have `code/` and `learning/` but it would be more typical to have what is in `code` at the top level.\n", + "Then below that you will usually have directories breaking the project up into main categories, here I have `src/` and `docs/learning/`.\n", "\n", - "Inside `code/` we have a standard python package directory structure.\n", + "Inside `src/` we have a standard python package directory structure.\n", "\n", "## Packaging\n", "There are a few things going on here, our actual code lives in `MCFF/` which is wrapped up inside a `src` folder, the `src` thing is a convention related to pytests, check [Packaging for pytest](https://docs.pytest.org/en/6.2.x/goodpractices.html) if you want the gory details.\n", @@ -76,24 +76,16 @@ "`pyproject.toml` and `setup.cfg` are the current way to describe the metadata about a python package like how it should be installed and who the author is etc, but typically you just copy the standard layouts and build from there. 
The empty `__init__.py` file flags that this folder is a python module.\n", "\n", "pyproject.toml:\n", - "```\n", + "```toml\n", "[build-system]\n", "requires = [\"setuptools>=4.2\"]\n", "build-backend = \"setuptools.build_meta\"\n", "```\n", "\n", - "requirements.txt\n", - "```\n", - "ipykernel\n", - "numpy\n", - "scipy\n", - "matplotlib\n", - "numba\n", - "```\n", - "`ipykernel` is there because it lets you run the envronment in a jupyter notebook easily. \n", - "\n", "setup.cfg\n", - "```\n", + "```ini\n", + "[metadata]\n", + "name = MCFF\n", "version = 0.0.1\n", "author = Tom Hodson\n", "author_email = tch14@ic.ac.uk\n", @@ -112,26 +104,29 @@ "packages = find:\n", "python_requires = >=3.6\n", "install_requires =\n", - " numpy == 1.21 \n", - " scipy == 1.7\n", - " matplotlib == 3.5\n", - " numba == 0.55\n", - " ipykernel == 6.9 # Allows this conda environment to show up automatically in Jupyter Lab\n", - " watermark == 2.3 # Generates a summary of package version for use inside Jupyter Notebooks\n", + " numpy == 1.21\n", + " scipy == 1.7\n", + " matplotlib == 3.5\n", + " numba == 0.55\n", + "\n", + "[options.extras_require]\n", + "dev =\n", + " pytest == 7.1 # Testing\n", + " pytest-cov == 3.0 # For Coverage testing\n", + " hypothesis == 6.29 # Property based testing\n", + " pre-commit == 2.20\n", + " jupyterlab == 3.4.3\n", + " ipykernel == 6.9 # Allows this conda environment to show up automatically in Jupyter Lab\n", + " watermark == 2.3 # Generates a summary of package version for use inside Jupyter Notebooks\n", + "\n", + "docs =\n", + " sphinx == 5.0.0\n", + " myst-nb == 0.16.0\n", "\n", "[options.packages.find]\n", "where = src\n", - "dev = \n", - " pytest == 7.1 # Testing\n", - " pytest-cov == 3.0 # For Coverage testing\n", - " hypothesis == 6.29 # Property based testing\n", - " pre-commit == 2.20\n", - " \n", - "docs = \n", - " sphinx == 5.0 # For building the documentation\n", - " myst-nb == 0.16 \n", "```\n", - "Phew, that was a lot. 
Python packaging has been evolving a lot over the years and the consequence is there is a lot of out of date advice and there are many other ways to do this. You're best bet to figure out what the current best practice is is to consult offical sources like python.org" + "Phew, that was a lot. Python packaging has been evolving a lot over the years and the consequence is there is a lot of out of date advice and there are many other ways to do this. Your best bet for figuring out the current best practice is to consult official sources like python.org." ] }, { @@ -139,11 +134,11 @@ "id": "cef1ba97-db03-45ce-b428-a027133eabc9", "metadata": {}, "source": [ - "Once all that is setup, cd to the `code/` folder and install the module using:\n", + "Once all that is set up, from the top level of the project you can run:\n", "```bash\n", "pip install --editable \".[dev,docs]\"\n", "```\n", - "The dot means we should install MCFF from the current directory and `--editable` means to do it as an editable package so that we can edit the files in MCFF and not have to reinstall. This is really useful for development. `[dev,docs]` means we also want to install the packages that are needed to do development of this repository and to build the documentation, boths those things will become relevant later!" + "The dot means we should install MCFF from the current directory and `--editable` means to do it as an editable package so that we can edit the files in MCFF and not have to reinstall. This is really useful for development. `[dev,docs]` means we also want to install the packages that are needed to do development of this repository and to build the documentation, both of those things will become relevant later!" 
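As a small aside on how extras like `[dev,docs]` are resolved: `setup.cfg` is plain INI, so you can peek at the `[options.extras_require]` section from Python with the stdlib `configparser`. A minimal sketch — the inline text below is a trimmed stand-in for the real file, not the full `setup.cfg`:

```python
import configparser

# Trimmed stand-in for the [options.extras_require] section of setup.cfg
cfg_text = """
[options.extras_require]
dev =
    pytest == 7.1
docs =
    sphinx == 5.0.0
"""

cfg = configparser.ConfigParser()
cfg.read_string(cfg_text)

# Each extra is a newline-separated list of requirement specifiers
extras = {
    name: [line for line in value.splitlines() if line]
    for name, value in cfg["options.extras_require"].items()
}
print(extras["dev"])  # ['pytest == 7.1']
```

This is roughly what pip does when you ask for `".[dev,docs]"`: it installs the base `install_requires` plus the union of the requested extras.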
] }, { @@ -195,9 +190,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python [conda env:recode]", + "display_name": "Python 3.8.10 ('venv': venv)", "language": "python", - "name": "conda-env-recode-py" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -209,7 +204,12 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.12" + "version": "3.8.10" + }, + "vscode": { + "interpreter": { + "hash": "f5403acae4671aac0ae5a29dd5903d33d0105a9e9d4148f755d3321f5023d387" + } } }, "nbformat": 4, diff --git a/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb b/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb index 1f5d333..17f0ccf 100644 --- a/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb +++ b/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb @@ -30,7 +30,7 @@ "\n", "np.random.seed(\n", " 42\n", - ") # This makes our random numbers reproducable when the notebook is rerun in order" + ") # This makes our random numbers reproducible when the notebook is rerun in order" ] }, { @@ -48,14 +48,14 @@ "1. We've also got some nice tests running that give us some confidence the code is right.\n", "\n", "\n", - "There isn't that much more work to do Markov Chain Monte Carlo. I won't go into the details of how MCMC works but put very simply MCMC lets us calculate thermal averages of a physical system at some temperature. For example, the physical system might be \"[$10^{23}$][wa] H20 molecules in a box\" and the thermal average we want is \"Are they organised like a solid or a liquid?\". We can ask that question at different temperatures and we will get different answers.\n", + "There isn't that much more work to do Markov Chain Monte Carlo. I won't go into the details of how MCMC works but put very simply MCMC lets us calculate thermal averages of a physical system at some temperature. 
For example, the physical system might be \"[$10^{23}$][wa] H2O molecules in a box\" and the thermal average we want is \"Are they organised like a solid or a liquid?\". We can ask that question at different temperatures, and we will get different answers.\n", "\n", "\n", "For our Ising model the equivalent question would be what's the average color of this system? At high temperatures we expect the pixels to be random and average out to grey, while at low temperatures they will all be either black or white.\n", "\n", "What happens in between? This question is pretty hard to answer using maths, it can be done for the 2D Ising model but for anything more complicated it's pretty much impossible. This is where MCMC comes in.\n", "\n", "MCMC is a numerical method that lets us calculate these thermal averages. MCMC is essentially a description of how to probabilistically step from one state of the system to another. \n", "\n", "If we perform these steps many times we get a (Markov) chain of states. The great thing about this chain is that if we average a measurement over it, such as looking at the average proportion of white pixels, the answer we get will be close to the real answer for this system and will converge closer and closer to the true answer as we extend the chain. \n", "\n", @@ -139,9 +139,9 @@ "id": "5d1874d4-4585-49ed-bc6f-b11c22231669", "metadata": {}, "source": [ - "These images give a flavour of why physicists find this model useful, it gives window into how thermal noise and spontaneous order interact. 
At low temperatures the energy cost of being different from your neighbours is the most important thing, while at high temperatures, it doesn't matter and you really just do your own thing.\n", "\n", "There's a special point somewhere in the middle called the critical point $T_c$ where all sorts of cool things happen, but my favourite is that for large system sizes you get a kind of fractal behaviour which I will demonstrate more once we've sped this code up and can simulate larger systems in a reasonable time. You can kinda see it for 50x50 systesm at T = 5 but not really clearly." + "These images give a flavour of why physicists find this model useful, it gives a window into how thermal noise and spontaneous order interact. At low temperatures the energy cost of being different from your neighbours is the most important thing, while at high temperatures, it doesn't matter, and you really just do your own thing.\n", "\n", "There's a special point somewhere in the middle called the critical point $T_c$ where all sorts of cool things happen, but my favourite is that for large system sizes you get a kind of fractal behaviour which I will demonstrate more once we've sped this code up and can simulate larger systems in a reasonable time. You can kinda see it for a 50x50 system at T = 5 but not really clearly." ] }, { @@ -160,7 +160,7 @@ "I've already missed at least one devastating bug in this code, and there are almost certainly more! Before we start adding too much new code we should think about how to increase our confidence that the individual components are working correctly. It's very easy to build a huge project out of hundreds of functions, realise there's a bug and then struggle to find the source of that bug. If we test our components individually and thoroughly, we can avoid some of that pain.\n", "\n", "**Performance**\n", - "Performance only matters in so far as it limits what we can do. 
And there is a real danger that trying to optimise for performance too early or in the wrong places will just lead to complexity that makes the code harder to read, harder to write and more likely to contain bugs. However I do want to show you the fractal states at the critical point, and I can't currently generate those images in a reasonable time, so some optimisation will happen!" + "Performance only matters in so far as it limits what we can do. And there is a real danger that trying to optimise for performance too early or in the wrong places will just lead to complexity that makes the code harder to read, harder to write and more likely to contain bugs. However, I do want to show you the fractal states at the critical point, and I can't currently generate those images in a reasonable time, so some optimisation will happen!" ] }, { @@ -206,9 +206,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python [conda env:recode]", + "display_name": "Python 3.8.10 ('venv': venv)", "language": "python", - "name": "conda-env-recode-py" + "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.12" + "version": "3.8.10" + }, + "vscode": { + "interpreter": { + "hash": "f5403acae4671aac0ae5a29dd5903d33d0105a9e9d4148f755d3321f5023d387" + } } }, "nbformat": 4, diff --git a/docs/learning/04 Testing.ipynb b/docs/learning/04 Testing.ipynb index 67b514b..b4a7fe6 100644 --- a/docs/learning/04 Testing.ipynb +++ b/docs/learning/04 Testing.ipynb @@ -25,7 +25,7 @@ "\n", "Ok we can finally start writing and running some tests!\n", "\n", - "I copied some of the initial tests that we did in chapter 1 into `test_energy.py` installed pytest into my development environment with `pip install pytest`. If you're using conda you need to use `conda install pytest` and now I can run the `pytest` command in the ReCoDE_MCFF directory. Pytest will automatically discover our tests and run them, to do this it relies on their being python files with functions named `test_\*` which it will run.\n", + "I copied some of the initial tests that we did in chapter 1 into `test_energy.py` and installed pytest into my development environment with `pip install pytest`. If you're using conda you need to use `conda install pytest`, and now I can run the `pytest` command in the `mcmc` directory. 
Pytest will automatically discover our tests and run them, to do this it relies on there being python files with functions named `test_\*` which it will run.\n", "\n", "If that doesn't work and complains it can't find MCFF, try `python -m pytest`, this asks python to find a module and run it, which can be useful to ensure you're running pytest inside the correct environment. I ran into this problem because I used `pip install pytest` into a conda environment when I should have done `conda install pytest`.\n", @@ -92,11 +92,11 @@ " assert energy(state) == E_prediction_all_the_same(L)\n", "```\n", "\n", - "I will defer to external resources for a full discussion of the philosphy of testing but I generally think of tests as an aid to my future debugging. If I make a change that breaks something then I want my tests to catch that and to make it clear what has broken. As such I generally put tests that check very basic properties of my code early on in the file and then follow them with tests that probe more subtle things or more obscure edges cases.\n", + "I will defer to external resources for a full discussion of the philosophy of testing, but I generally think of tests as an aid to my future debugging. If I make a change that breaks something then I want my tests to catch that and to make it clear what has broken. 
As such I generally put tests that check very basic properties of my code early on in the file and then follow them with tests that probe more subtle things or more obscure edge cases.\n", "\n", "`test_exact_energies` checks that the energies of our exact states come out as we calculated they should in chapter 1. This is testing a very limited space of the possible inputs to `energy` so we'd like to find some way to be more confident that our implementation is correct.\n", "\n", - "One was is to test multiple independant implementations against one another: `test_energy_implementations` checks our numpy implementation against our numba one. This should catch implementation bugs because it's unlikely we will make the same such error in both implementations. \n", + "One way is to test multiple independent implementations against one another: `test_energy_implementations` checks our numpy implementation against our numba one. This should catch implementation bugs because it's unlikely we will make the same such error in both implementations. \n", "\n", "```python\n", "def test_energy_implementations():\n", " assert np.allclose(energy(state), energy_numpy(state))\n", "```\n", "\n", - "However if we have made some logical errors in how we've defined the energy, that error will likely appear in both implememtations and thus won't be caught by this. \n", + "However, if we have made some logical errors in how we've defined the energy, that error will likely appear in both implementations and thus won't be caught by this. \n", "\n", "Generally what we will do now, is that as we write more code or add new functionality we will add tests to check that functionality." ] }, { @@ -116,7 +116,7 @@ "source": [ "## Coverage Testing\n", "\n", - "A useful little trick for testing, are tools like pytest-cov that can measure *coverage*, that is, the amount of your code base that is activate by your tests. 
Unfortunatley Numba does not play super well with pytest-cov so we have to turn off numba to generate the test report using an environment variable.\n", + "A useful little trick for testing is tools like pytest-cov that can measure *coverage*, that is, the amount of your code base that is activated by your tests. Unfortunately Numba does not play super well with pytest-cov, so we have to turn off numba to generate the test report using an environment variable.\n", "\n", "```bash\n", "(recode) tom@TomsLaptop ReCoDE_MCMCFF % pip install pytest-cov # install the coverage testing plugin\n", "\n", "=================================================== 3 passed in 1.89s ===================================================\n", "```\n", "\n", - "Ok so this is telling us that we currently test 86% of the lines in ising_model.py. We can also change `--cov-report=html` to get a really nice html output which shows which parts of your code aren't being run.\n", + "Ok so this is telling us that we currently test 86% of the lines in ising_model.py. We can also change `--cov-report=html` to get a really nice `html` output which shows which parts of your code aren't being run.\n", "\n", "A warning though, testing 100% of your lines of code doesn't mean it's correct, you need to think carefully about the data you test on, try to pick the hardest examples you can think of! What edge cases might there be that would break your code? Zero, empty strings and empty arrays are classic examples." ] }, { @@ -156,7 +156,7 @@ "source": [ "## Advanced Testing Methods: Property Based Testing\n", "\n", - "I won't do into huge detail here but I thought it would be nice to make you aware of a nice library called `Hypothesis` that helps with this problem of finding edge cases. 
`Hypothesis` gives you tools to generate randomised inputs to functions, so as long as you can come up with some way to verify the output is correct or has the correct _properties_ (or just that the code doens't throw and error!) then this can be a powerful method of testing. \n", + "I won't go into huge detail here, but I thought it would be nice to make you aware of a nice library called `Hypothesis` that helps with this problem of finding edge cases. `Hypothesis` gives you tools to generate randomised inputs to functions, so as long as you can come up with some way to verify the output is correct or has the correct _properties_ (or just that the code doesn't throw an error!) then this can be a powerful method of testing. \n", "\n", "\n", "Take a look in `test_energy_using_hypothesis.py`\n", @@ -169,7 +169,7 @@ "def test_generated_states(state):\n", " assert np.allclose(energy(state), energy_numpy(state))\n", "```\n", - "You tell Hypothesis how to generate the test data, in this case we use some numpy specifc code to generate 2 dimensional arrays with `dtype = int` and entries randomly sampled from `[1, -1]`. We use the same trick as before of checking two implementations against one another." + "You tell Hypothesis how to generate the test data, in this case we use some numpy-specific code to generate 2-dimensional arrays with `dtype = int` and entries randomly sampled from `[1, -1]`. We use the same trick as before of checking two implementations against one another." ] }, { @@ -181,14 +181,14 @@ "source": [ "## Testing Stochastic Code\n", "\n", - "We have a interesting problem here, most testing assumes that for the same inputs we will always get the same outputs but our MCMC sampler is a stochastic algorithm. So how can we test it? I can see three mains routes we can take:\n", + "We have an interesting problem here: most testing assumes that for the same inputs we will always get the same outputs, but our MCMC sampler is a stochastic algorithm. 
So how can we test it? I can see three main routes we can take:\n", "\n", "- Fix the seed of the random number generator to make it deterministic\n", "- Do statistical tests on the output \n", "- Use property based testing (see above)\n", "\n", "### Fixed Seeds\n", - "The random number generators we typically use are really pseudo-random number generators: given a value called a seed they generate a deterministic pattern that looks for most purposes like a random sequence. Typically the seed is determined by something that is _more random_ such as a physical random number generator. However if we fix the seed we can create reproducabile plots and test our code more easily!" + "The random number generators we typically use are really pseudo-random number generators: given a value called a seed they generate a deterministic pattern that looks for most purposes like a random sequence. Typically, the seed is determined by something that is _more random_ such as a physical random number generator. However, if we fix the seed we can create reproducible plots and test our code more easily!" ] }, { @@ -242,7 +242,7 @@ "id": "fb281250-0f08-43a8-bcb2-4b9e2c262cd9", "metadata": {}, "source": [ - "However this has a major drawback, if we want this to work we must always generate the same random numbers in the same order and use them in the same way if we want the output to be the same. 
This is a problem because we might want to make a change to our MCMC sampler in a way that changes the way it calls the rng but still want to compare it to the previous version. In this case we have to use statistical tests instead.\n", "\n", "### Statistical Tests\n", "If we want to verify that two different implementations of our algorithm agree or that the output matches our expectations, we can use something like a t-test to check our samples. Now this gets complicated very fast but bear with me for this simple example:" @@ -284,7 +284,7 @@ }, { "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD4CAYAAAAXUaZHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAb7ElEQVR4nO3df5BU5b3n8fd3fjHIb5gBBoYwGEACQoBCoqvRITGKxohbFRdM4mJwi0qpiana1Ab3lonXLSrsVpLKJq5VoXIN3CSrktKUJN7rDXGhEg0XAhGJyE8FYWTkxwjyS2Bm+O4fc5gMMNDnYbrn6W4/r6quPuf0c7o/3QzfeeY5p59j7o6IiBSXktgBREQk+1TcRUSKkIq7iEgRUnEXESlCKu4iIkWoLHYAgKqqKq+rq4sdQyJoamoCYNCgQZGTpLN161YArrrqqshJ0im0z1fCrF+//qC7V3f2WF4U97q6OtatWxc7hkSwZMkSAO67776oOdK67rrrAFi9enXkJOkU2ucrYczsnYs9lhfFXaRQbN68OXYEkVRU3EUCjBw5MnYEkVRU3EUCDBgwIHYEkVRU3EUCnDx5MnaEgtXc3ExDQ4M+w8tQWVlJbW0t5eXlqfdRcRcJsGXLltgRClZDQwN9+vShrq4OM4sdp2C4O01NTTQ0NDBq1KjU+6m4iwTQKbuX7+TJkyrsl8HMGDRoEAcOHAjaT8VdJED//v1jRyhoKuyX53I+NxV3kQAnTpyIHUEkFU0/IBJg27ZtbNu2LXYM6YKXXnqJq666itGjR7No0aLYcXJGPXcpCHULXoz22rsWfb59+corr4yWQ7qutbWVBx98kBUrVlBbW8s111zDnXfeyfjx42NHyzr13EUC9O3bl759+8aOIZdp7dq1jB49miuvvJKKigrmzJnDCy+8EDtWTqi4iwQ4fvw4x48fjx2jKNTX17fPfdPc3Ex9fT2//OUvgbZjG/X19Tz77LMAfPDBB9TX1/P8888DcPDgQerr6/ntb38LwHvvvZfqNd99911GjBjRvl5bW8u7776brbeUVzQsIxJg+/btsSNIF3R2zehiPYNHxV0kwMc//vHYEYrGqlWr2pfLy8vPWb/iiivOWe/Xr98561VVVeesDx06NNVr1tbWsmfPnvb1hoYGhg0bFhq9IGhYRiRAnz596NOnT+wYcpmuueYatm/fzs6dOzl9+jTPPPMMd955Z+xYOaGeu0iAY8eOxY4gXVBWVsYTTzzBrbfeSmtrK/PmzWPChAmxY+WEirtIgB07dsSOIF10++23c/vtt8eOkXMq7iIBRo8eHTuCSCoq7iIBevfuHTuCSCoq7iIBjh49GjuCSCoq7iIB3nrrrdgRRFJRcRcJMGbMmNgRRFJRcRcJ0KtXr9gRRFLRl5hEAhw5coQjR47Eji
GXycy4995729dbWlqorq7mjjvuiJgqN9RzFwnw9ttvx44gXdCrVy/eeOMNPvzwQ3r27MmKFSsYPnx47Fg5oZ67SICxY8cyduzY2DGkC2677TZefLHt+gBPP/0099xzT/tjx48fZ968eVxzzTVMmTKlfTrgXbt28elPf5qpU6cydepU/vznPwNt8+PU19fzxS9+kXHjxvHlL3+508nJYkjVczezXcBRoBVocfdpZjYQeBaoA3YB/8ndDyXtHwHuT9p/w93/LevJRSK44oorYkcoCt/85jfZsGFDVp9z8uTJ/OhHP8rYbs6cOTz++OPccccdbNy4kXnz5vGnP/0JgIULF/KZz3yGp556isOHDzN9+nRuvvlmBg8ezIoVK6isrGT79u3cc889rFu3DoDXXnuNTZs2MWzYMK6//npeffVVbrjhhqy+t8sRMiwzw90PdlhfALzs7ovMbEGy/m0zGw/MASYAw4A/mNlYd2/NWmqRSA4fPhw7gnTRpEmT2LVrF08//fQF0xD8/ve/Z/ny5Xz/+98H4OTJk+zevZthw4bx0EMPsWHDBkpLS8+51OL06dOpra0F2n7B7Nq1q+CK+/lmAfXJ8lJgFfDtZPsz7n4K2GlmO4DpwOouvJZIXti1a1fsCEUhTQ87l+68806+9a1vsWrVKpqamtq3uzvPPfccV1111TntH3vsMYYMGcLrr7/OmTNnqKysbH+sR48e7culpaW0tLTk/g2kkHbM3YHfm9l6M5ufbBvi7o0Ayf3gZPtwYE+HfRuSbSIFb9y4cYwbNy52DOmiefPm8Z3vfIeJEyees/3WW2/lJz/5Sfu4+WuvvQa0XQmqpqaGkpISfvGLX9Damv8DEWmL+/XuPhW4DXjQzG68RNvOLmtywREGM5tvZuvMbN2BAwdSxhCJq7Ky8pxemxSm2tpaHn744Qu2P/roozQ3NzNp0iSuvvpqHn30UQAeeOABli5dyrXXXsu2bdsK4vsOqYZl3H1vcr/fzH5D2zDLPjOrcfdGM6sB9ifNG4ARHXavBfZ28pyLgcUA06ZNy4/DyyIZHDp0KHYE6YLO5uOvr6+nvr4egJ49e/LTn/70gjZjxoxh48aN7evf+973LtgX4Iknnshu4C7I2HM3s15m1ufsMnAL8AawHJibNJsLnL2E+HJgjpn1MLNRwBhgbbaDi8Twzjvv8M4778SOIZJRmp77EOA3yUVky4D/6+4vmdlfgGVmdj+wG7gbwN03mdky4E2gBXhQZ8pIsfjEJz4RO4JIKhmLu7u/DXyyk+1NwGcvss9CYGGX04nkmY5nRojkM00/IBLg/fffjx1BJBUVd5EAu3fvjh1BJBUVd5EA48ePjx1BJBVNHCYSoKKigoqKitgx5DItXLiQCRMmMGnSJCZPnsyaNWuAtm/MnjhxotN9lixZwkMPPdTp9urqaiZPntx+e/PNN3OaP4R67iIBOn5VXQrL6tWr+d3vfsdf//pXevTowcGDBzl9+jTQVty/8pWvBE8MN3v27Eue297a2kppaelF1y+mpaWFsrKulWf13EUC7Nmzhz179mRuKHmnsbGRqqqq9jOeqqqqGDZsGD/+8Y/Zu3cvM2bMYMaMGQD8/Oc/Z+zYsdx00028+uqrQa+zatUqZsyYwZe+9CUmTpx4wfrJkyf56le/ysSJE5kyZQorV64E2v4SuPvuu/nCF77ALbfc0uX3q567SIAJEybEjlAUXnrpJd57772sPufQoUOZOXPmRR+/5ZZbePzxxxk7diw333wzs2fP5qabbuIb3/gGP/zhD1m5ciVVVVU0Njby3e9+l/Xr19OvXz9mzJjBlClTOn3OZ599lldeeaV9ffXqtvkR165dyxtvvMGoUaNYtWrVOes/+MEPAPjb3/7Gli1buOWWW9pnmVy9ejUbN25k4MCBXf481HMXCVBeXk55eXnsGHIZevfuzfr161m8eDHV1dXMnj2bJUuWXNBuzZo11NfXU11dTUVFBbNnz77oc8
6ePZsNGza033r27Am0TQM8atSo9nYd11955ZX2S/2NGzeOkSNHthf3z33uc1kp7KCeu0iQgwcPZm4kGV2qh51LpaWl7fPBTJw4kaVLl3Lfffdd0C75Rv5lO39isY7rl7pSUzYnJFPPXSRAQ0MDDQ0NsWPIZdi6dSvbt29vX9+wYQMjR44EoE+fPhw9ehSAT33qU+3zvDc3N/PrX/86qzluvPFGfvWrXwGwbds2du/efcH88dmgnrtIgKuvvjp2BLlMx44d4+tf/zqHDx+mrKyM0aNHs3jxYgDmz5/PbbfdRk1NDStXruSxxx7juuuuo6amhqlTp150/vbzx9yffPLJjDkeeOABvva1rzFx4kTKyspYsmRJTqa1sHy4mOu0adP87PUI5aPl7JhnZ38ad1S34MXch7mIXYs+3758dnrXVatW5fx1s/GeZ1ZsAeCl02EXGOn4nrNl8+bNmnitCzr7/MxsvbtP66y9eu4iAfbv35+5kUgeUHEXCbB37wXXnRHJSyruIgEmTZoUO0JBc/cun4nyUXQ5w+c6W0YkQElJCSUl+m9zOSorK2lqarqsQvVR5u40NTUFX7tXPXeRAPv27YsdoWDV1tbS0NDAgQMHYkcpOJWVldTW1gbto+IuEqCxsTF2hIJVXl5+zrc2JbdU3EUCfPKTF1xxUiQvqbiLZNDxfPP3dr5/wTaRfKTiLhKg9fih2BFEUlFxFwmg4i6FQsVdJEDF4CtjRxBJRSfsiogUIfXcRQK0Hns/dgSRVFTcRQK0nvggdgSRVFTcRQJUDNaXcKQwaMxdRKQIqecuEkBj7lIoUvfczazUzF4zs98l6wPNbIWZbU/uB3Ro+4iZ7TCzrWZ2ay6Ci8TQ+uERWj88EjuGSEYhwzIPA5s7rC8AXnb3McDLyTpmNh6YA0wAZgJPmllpduKKxFVRXUdFdV3sGCIZpSruZlYLfB74WYfNs4ClyfJS4K4O259x91PuvhPYAUzPSloREUklbc/9R8B/A8502DbE3RsBkvvByfbhwJ4O7RqSbecws/lmts7M1ml+ZykUrUebaD3aFDuGSEYZi7uZ3QHsd/f1KZ+zs2toXXDpFXdf7O7T3H1adXV1yqcWievMyWOcOXksdgyRjNKcLXM9cKeZ3Q5UAn3N7JfAPjOrcfdGM6sBzl4WvgEY0WH/WkBXFZaiUF49MnYEkVQy9tzd/RF3r3X3OtoOlP4/d/8KsByYmzSbC7yQLC8H5phZDzMbBYwB1mY9uYiIXFRXznNfBCwzs/uB3cDdAO6+ycyWAW8CLcCD7t7a5aQieaDlyMHYEURSCSru7r4KWJUsNwGfvUi7hcDCLmYTyTt++kTsCCKp6BuqIgHKqz4WO4JIKppbRkSkCKnnLhKg5Yi+kyGFQcVdJICfPhk7gkgqKu4iAcqrRmRuJJIHNOYuIlKE1HMXCdDywf7MjUTygIq7SABvORU7gkgqKu4iAcoHacxdCoPG3EVEipB67iIBWj7YFzuCSCoq7iIBvKU5dgSRVFTcRQKUD6qNHUEkFY25i4gUIfXcRQK0HH4vdgSRVFTcRQL4GV13RgqDirtIgPKBw2NHEElFY+4iIkVIPXeRABpzl0Kh4i4SwP1M7Agiqai4iwQoHzAsdgSRVDTmLiJShNRzFwnQcqgxdgSRVNRzFxEpQuq5iwQoG1ATO4JIKuq5i4gUIfXcRQI0H9obO4JIKiruIgHM9MeuFIaMP6lmVmlma83sdTPbZGb/mGwfaGYrzGx7cj+gwz6PmNkOM9tqZrfm8g2IdKey/kMp6z80dgyRjNJ0Q04Bn3H3TwKTgZlmdi2wAHjZ3ccALyfrmNl4YA4wAZgJPGlmpTnILiIiF5GxuHubY8lqeXJzYBawNNm+FLgrWZ4FPOPup9x9J7ADmJ7N0CKxNL//Ls3vvxs7hkhGqQYQzazUzDYA+4EV7r4GGOLujQDJ/eCk+XBgT4fdG5Jt5z/nfDNbZ2brDhw40IW3INJ9rKQUK9
EfopL/UhV3d29198lALTDdzK6+RHPr7Ck6ec7F7j7N3adVV1enCisSm8bcpVAEHfp398PAKtrG0veZWQ1Acr8/adYAjOiwWy2g88dERLpRmrNlqs2sf7LcE7gZ2AIsB+YmzeYCLyTLy4E5ZtbDzEYBY4C1Wc4tEkVzUwPNTQ2xY4hklOY89xpgaXLGSwmwzN1/Z2argWVmdj+wG7gbwN03mdky4E2gBXjQ3XXhSSkKVlYeO4JIKhmLu7tvBKZ0sr0J+OxF9lkILOxyOpE8U9ZvSOwIIqno63YiIkVI0w+IBGhu2pO5kUgeUHEXCWBlPWJHEElFxV0kQFm/wZkbieQBjbmLiBQh9dxFAjQf1Ji7FAYVd5EAVlEZO4JIKiruIgHK+moeJCkMGnMXESlC6rmLBGg+uDt2BJFUVNxFAljFFbEjiKSi4i4SoKxvVewIIqlozF1EpAip5y4SoPnAO7EjiKSi4i5B6ha8mNXnm1nRBMBjWX7eXCmp7B07gkgqKu4iAUr7DIodQSQVjbmLiBQh9dxFApw+sCt2BJFUVNxFApT27Bs7gkgqKu4iAUp7D4wdQSQVjbmLiBQh9dxFApzevzN2BJFUVNxFApRe0S92BJFUVNxFAmjMXQqFxtxFRIqQeu4iAU7vfzt2BJFUVNxFApT2GhA7gkgqKu4iAVTcpVBkHHM3sxFmttLMNpvZJjN7ONk+0MxWmNn25H5Ah30eMbMdZrbVzG7N5RsQ6VbubTeRPJfmgGoL8F/d/RPAtcCDZjYeWAC87O5jgJeTdZLH5gATgJnAk2ZWmovwIt3t9IGdnD6gc90l/2Us7u7e6O5/TZaPApuB4cAsYGnSbClwV7I8C3jG3U+5+05gBzA9y7lFoijtNZDSXjodUvJf0KmQZlYHTAHWAEPcvRHafgEAg5Nmw4E9HXZrSLad/1zzzWydma07cODAZUQX6X6lvfpT2qt/7BgiGaUu7mbWG3gO+Ka7H7lU0062XTBI6e6L3X2au0+rrq5OG0MkLj/TdhPJc6nOljGzctoK+6/c/flk8z4zq3H3RjOrAfYn2xuAER12rwX2ZiuwSEyaz10KRZqzZQz4J2Czu/+ww0PLgbnJ8lzghQ7b55hZDzMbBYwB1mYvskg8pb0HUdpbl9qT/Jem5349cC/wNzPbkGz778AiYJmZ3Q/sBu4GcPdNZrYMeJO2M20edPfWbAcXiUETh0mhyFjc3f0VOh9HB/jsRfZZCCzsQi6R/HRG4+1SGPQNVZEApw/uih1BJBUVd5EAZb2rYkcQSUXFXSRAyRW6QLYUBhV3kQB+RucGSGFQcRcJ0HzwndgRRFJRcRcJUNZH36aWwqDiLhKgpGef2BFEUlFxFwngrS2xI4ikouIuEqC5aXfsCCKpqLiLBCjrOzhzI5E8oOIuEqCksnfsCCKpqLiLBPDW5tgRRFJRcRcJ0Ny0J3MjkTyg4i4SQGPuUihU3EUCaMxdCoWKu0gAbzkdO4JIKiruIgGa32+IHUEkFRV3kQBl/YbEjiCSioq7SICSHr1iRxBJRcVdJIA3n4odQSQVFXeRAM2H3o0dQSQVFXeRAGX9hsaOIJKKirtIgJIeV8SOIJKKirtIAG8+GTuCSCoq7iIBmg/tjR2hW9QteDHK6+5a9Pkor1uMVNxFApT1r4kdQSQVFXeRACUVPWNHEElFxV0kgJ/WmLsUhpJMDczsKTPbb2ZvdNg20MxWmNn25H5Ah8ceMbMdZrbVzG7NVXCRGJoP76X58Edj3F0KW5qe+xLgCeCfO2xbALzs7ovMbEGy/m0zGw/MASYAw4A/mNlYd2/NbuyPtlgHuwTK+w+LHUEklYw9d3f/I/D+eZtnAUuT5aXAXR22P+Pup9x9J7ADmJ6dqCLxWUUlVlEZO4ZIRhmL+0UMcfdGgOT+7OVphgMdr0PWkGwTKQpnTn/ImdMfxo4hklG2D6haJ9u804Zm84H5AB/72MeyHEMkN1oON8aOIJLK5fbc95lZDU
Byvz/Z3gCM6NCuFuj06JO7L3b3ae4+rbq6+jJjiHSv8gHDKB+gcXfJf5db3JcDc5PlucALHbbPMbMeZjYKGAOs7VpEkfxh5ZVYucbcJf9lHJYxs6eBeqDKzBqA7wKLgGVmdj+wG7gbwN03mdky4E2gBXhQZ8pIMTlz6kTsCCKpZCzu7n7PRR767EXaLwQWdiWUSL5q+eC92BFEUtE3VEUClA/QyV9SGFTcRQJYeY/YEURSUXEXCXDm1PHYEURSUXEXCdDywb7YEURSUXEXCVA+sDZ2BJFUVNxFAlhZRewIIqmouIsEOHPyWOwIIqmouIsEaDmyP3MjkTyg4i4SoHzQiMyNRPKAirtIACstjx1BJBUVd5EAGnOXQqHiLhJAY+5SKFTcRQKUD9KFZaQwqLiLBLBS/ZfJpVgXf9+16PNRXjeX9JMqEuDMh0djRxBJRcVdJEDL0QOxI4ikouIuEqC8amTsCCKpqLiLBLCS0tgRRFJRcRcJcObEkdgRRFJRcRcJ0HLsYOwIIqmouIsEqKiqix1BJBUVd5EQJSWxE4ikouIuEqD1xAexI4ikouIuEqD1WFPsCCKpqLiLBKiorosdQSQVFfcuiDUPhkRkGnOXwqDiLhKg9fjh2BFEUlFxFwnQevz92BFEUlFxFwlQUT0qdgSRVHJW3M1sJvC/gVLgZ+6+KFevJdJtzGInkByIefwsV3PJ56S4m1kp8H+AzwENwF/MbLm7v5mL19OBTekurccPxY4gkkquDv1PB3a4+9vufhp4BpiVo9cS6Tatxw+pwEtBMHfP/pOafRGY6e7/JVm/F/iUuz/Uoc18YH6yehWwNetBsq8KKJSZowopKxRW3kLKCoWVt5CyQvy8I929urMHcjXm3tnA5Dm/Rdx9MbA4R6+fE2a2zt2nxc6RRiFlhcLKW0hZobDyFlJWyO+8uRqWaQBGdFivBfbm6LVEROQ8uSrufwHGmNkoM6sA5gDLc/RaIiJynpwMy7h7i5k9BPwbbadCPuXum3LxWt2skIaRCikrFFbeQsoKhZW3kLJCHufNyQFVERGJS7MgiYgUIRV3EZEipOJ+CWY20MxWmNn25H5AJ20qzWytmb1uZpvM7B/zOOsIM1tpZpuTrA/HyJpkyZg3afeUme03szciZJxpZlvNbIeZLejkcTOzHyePbzSzqd2dsUOWTFnHmdlqMztlZt+KkfG8PJnyfjn5TDea2Z/N7JMxciZZMmWdleTcYGbrzOyGGDkv4O66XeQG/C9gQbK8APifnbQxoHeyXA6sAa7N06w1wNRkuQ+wDRifr59t8tiNwFTgjW7OVwq8BVwJVACvn/9ZAbcD/5r8DFwLrIn0WabJOhi4BlgIfCtGzsC8/wEYkCzfluefbW/+fvxyErAl5ud79qae+6XNApYmy0uBu85v4G2OJavlyS3GUeo0WRvd/a/J8lFgMzC8uwKeJ2NeAHf/IxBjnt00U2jMAv45+Rn4d6C/mdV0d1BSZHX3/e7+F6A5Qr7zpcn7Z3c/O8/Dv9P2XZkY0mQ95kllB3oR5///BVTcL22IuzdCW2GkrfdzATMrNbMNwH5ghbuv6b6I7VJlPcvM6oAptP2lEUNQ3giGA3s6rDdw4S/CNG26Q77kSCs07/20/YUUQ6qsZvYfzWwL8CIwr5uyXdJHfj53M/sDMLSTh/4h7XO4eysw2cz6A78xs6vdPetjxNnImjxPb+A54JvufiQb2S7yOlnJG0nGKTRStukO+ZIjrdR5zWwGbcU91jh2qqzu/hva/u/fCPwP4OZcB8vkI1/c3f2i/whmts/Maty9Mflze3+G5zpsZquAmUDWi3s2sppZOW2F/Vfu/ny2M3aUzc82gjRTaOTLNBv5kiOtVHnNbBLwM+A2d2/qpmznC/ps3f2PZvZxM6ty96gToGlY5tKWA3OT5bnAC+c3MLPqpMeOmfWk7Tf2lu4K2EGarAb8E7DZ3X/Yjdk6kzFvZGmm0FgO/OfkrJlrgQ/ODj
V1s0Kb7iNjXjP7GPA8cK+7b4uQ8aw0WUcn/7dIzpiqAGL9Mvq72Ed08/kGDAJeBrYn9wOT7cOAf/G/Hx1/DdhIW2/9O3mc9Qba/qTcCGxIbrfna95k/WmgkbYDgQ3A/d2Y8Xbazih6C/iHZNvXgK8ly0bbRWneAv4GTIv4s5op69Dk8zsCHE6W++Zx3p8Bhzr8nK7L46zfBjYlOVcDN8TK2vGm6QdERIqQhmVERIqQiruISBFScRcRKUIq7iIiRUjFXUSkCKm4i4gUIRV3EZEi9P8BiBgjshCt/QsAAAAASUVORK5CYII=\n", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD4CAYAAAAXUaZHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAb7ElEQVR4nO3df5BU5b3n8fd3fjHIb5gBBoYwGEACQoBCoqvRITGKxohbFRdM4mJwi0qpiana1Ab3lonXLSrsVpLKJq5VoXIN3CSrktKUJN7rDXGhEg0XAhGJyE8FYWTkxwjyS2Bm+O4fc5gMMNDnYbrn6W4/r6quPuf0c7o/3QzfeeY5p59j7o6IiBSXktgBREQk+1TcRUSKkIq7iEgRUnEXESlCKu4iIkWoLHYAgKqqKq+rq4sdQyJoamoCYNCgQZGTpLN161YArrrqqshJ0im0z1fCrF+//qC7V3f2WF4U97q6OtatWxc7hkSwZMkSAO67776oOdK67rrrAFi9enXkJOkU2ucrYczsnYs9lhfFXaRQbN68OXYEkVRU3EUCjBw5MnYEkVRU3EUCDBgwIHYEkVRU3EUCnDx5MnaEgtXc3ExDQ4M+w8tQWVlJbW0t5eXlqfdRcRcJsGXLltgRClZDQwN9+vShrq4OM4sdp2C4O01NTTQ0NDBq1KjU+6m4iwTQKbuX7+TJkyrsl8HMGDRoEAcOHAjaT8VdJED//v1jRyhoKuyX53I+NxV3kQAnTpyIHUEkFU0/IBJg27ZtbNu2LXYM6YKXXnqJq666itGjR7No0aLYcXJGPXcpCHULXoz22rsWfb59+corr4yWQ7qutbWVBx98kBUrVlBbW8s111zDnXfeyfjx42NHyzr13EUC9O3bl759+8aOIZdp7dq1jB49miuvvJKKigrmzJnDCy+8EDtWTqi4iwQ4fvw4x48fjx2jKNTX17fPfdPc3Ex9fT2//OUvgbZjG/X19Tz77LMAfPDBB9TX1/P8888DcPDgQerr6/ntb38LwHvvvZfqNd99911GjBjRvl5bW8u7776brbeUVzQsIxJg+/btsSNIF3R2zehiPYNHxV0kwMc//vHYEYrGqlWr2pfLy8vPWb/iiivOWe/Xr98561VVVeesDx06NNVr1tbWsmfPnvb1hoYGhg0bFhq9IGhYRiRAnz596NOnT+wYcpmuueYatm/fzs6dOzl9+jTPPPMMd955Z+xYOaGeu0iAY8eOxY4gXVBWVsYTTzzBrbfeSmtrK/PmzWPChAmxY+WEirtIgB07dsSOIF10++23c/vtt8eOkXMq7iIBRo8eHTuCSCoq7iIBevfuHTuCSCoq7iIBjh49GjuCSCoq7iIB3nrrrdgRRFJRcRcJMGbMmNgRRFJRcRcJ0KtXr9gRRFLRl5hEAhw5coQjR47EjiGXycy4995729dbWlqorq7mjjvuiJgqN9RzFwnw9ttvx44gXdCrVy/eeOMNPvzwQ3r27MmKFSsYPnx47Fg5oZ67SICxY8cyduzY2DGkC2677TZefLHt+gBPP/0099xzT/tjx48fZ968eVxzzTVMmTKlfTrgXbt28elPf5qpU6cydepU/vznPwNt8+PU19fzxS9+kXHjxvHlL3+508nJYkjVczezXcBRoBVocfdpZjYQeBaoA3YB/8ndDyXtH
wHuT9p/w93/LevJRSK44oorYkcoCt/85jfZsGFDVp9z8uTJ/OhHP8rYbs6cOTz++OPccccdbNy4kXnz5vGnP/0JgIULF/KZz3yGp556isOHDzN9+nRuvvlmBg8ezIoVK6isrGT79u3cc889rFu3DoDXXnuNTZs2MWzYMK6//npeffVVbrjhhqy+t8sRMiwzw90PdlhfALzs7ovMbEGy/m0zGw/MASYAw4A/mNlYd2/NWmqRSA4fPhw7gnTRpEmT2LVrF08//fQF0xD8/ve/Z/ny5Xz/+98H4OTJk+zevZthw4bx0EMPsWHDBkpLS8+51OL06dOpra0F2n7B7Nq1q+CK+/lmAfXJ8lJgFfDtZPsz7n4K2GlmO4DpwOouvJZIXti1a1fsCEUhTQ87l+68806+9a1vsWrVKpqamtq3uzvPPfccV1111TntH3vsMYYMGcLrr7/OmTNnqKysbH+sR48e7culpaW0tLTk/g2kkHbM3YHfm9l6M5ufbBvi7o0Ayf3gZPtwYE+HfRuSbSIFb9y4cYwbNy52DOmiefPm8Z3vfIeJEyees/3WW2/lJz/5Sfu4+WuvvQa0XQmqpqaGkpISfvGLX9Damv8DEWmL+/XuPhW4DXjQzG68RNvOLmtywREGM5tvZuvMbN2BAwdSxhCJq7Ky8pxemxSm2tpaHn744Qu2P/roozQ3NzNp0iSuvvpqHn30UQAeeOABli5dyrXXXsu2bdsK4vsOqYZl3H1vcr/fzH5D2zDLPjOrcfdGM6sB9ifNG4ARHXavBfZ28pyLgcUA06ZNy4/DyyIZHDp0KHYE6YLO5uOvr6+nvr4egJ49e/LTn/70gjZjxoxh48aN7evf+973LtgX4Iknnshu4C7I2HM3s15m1ufsMnAL8AawHJibNJsLnL2E+HJgjpn1MLNRwBhgbbaDi8Twzjvv8M4778SOIZJRmp77EOA3yUVky4D/6+4vmdlfgGVmdj+wG7gbwN03mdky4E2gBXhQZ8pIsfjEJz4RO4JIKhmLu7u/DXyyk+1NwGcvss9CYGGX04nkmY5nRojkM00/IBLg/fffjx1BJBUVd5EAu3fvjh1BJBUVd5EA48ePjx1BJBVNHCYSoKKigoqKitgx5DItXLiQCRMmMGnSJCZPnsyaNWuAtm/MnjhxotN9lixZwkMPPdTp9urqaiZPntx+e/PNN3OaP4R67iIBOn5VXQrL6tWr+d3vfsdf//pXevTowcGDBzl9+jTQVty/8pWvBE8MN3v27Eue297a2kppaelF1y+mpaWFsrKulWf13EUC7Nmzhz179mRuKHmnsbGRqqqq9jOeqqqqGDZsGD/+8Y/Zu3cvM2bMYMaMGQD8/Oc/Z+zYsdx00028+uqrQa+zatUqZsyYwZe+9CUmTpx4wfrJkyf56le/ysSJE5kyZQorV64E2v4SuPvuu/nCF77ALbfc0uX3q567SIAJEybEjlAUXnrpJd57772sPufQoUOZOXPmRR+/5ZZbePzxxxk7diw333wzs2fP5qabbuIb3/gGP/zhD1m5ciVVVVU0Njby3e9+l/Xr19OvXz9mzJjBlClTOn3OZ599lldeeaV9ffXqtvkR165dyxtvvMGoUaNYtWrVOes/+MEPAPjb3/7Gli1buOWWW9pnmVy9ejUbN25k4MCBXf481HMXCVBeXk55eXnsGHIZevfuzfr161m8eDHV1dXMnj2bJUuWXNBuzZo11NfXU11dTUVFBbNnz77oc86ePZsNGza033r27Am0TQM8atSo9nYd11955ZX2S/2NGzeOkSNHthf3z33uc1kp7KCeu0iQgwcPZm4kGV2qh51LpaWl7fPBTJw4kaVLl3Lfffdd0C75Rv5lO39isY7rl7pSUzYnJFPPXSRAQ0MDDQ0NsWPIZdi6dSvbt29vX9+wYQMjR44EoE+fPhw9ehSAT33qU+3zvDc3N/PrX/86qzluvPFGfvWrXwGwbds2du/efcH88dmgnrtIgKuvv
jp2BLlMx44d4+tf/zqHDx+mrKyM0aNHs3jxYgDmz5/PbbfdRk1NDStXruSxxx7juuuuo6amhqlTp150/vbzx9yffPLJjDkeeOABvva1rzFx4kTKyspYsmRJTqa1sHy4mOu0adP87PUI5aPl7JhnZ38ad1S34MXch7mIXYs+3758dnrXVatW5fx1s/GeZ1ZsAeCl02EXGOn4nrNl8+bNmnitCzr7/MxsvbtP66y9eu4iAfbv35+5kUgeUHEXCbB37wXXnRHJSyruIgEmTZoUO0JBc/cun4nyUXQ5w+c6W0YkQElJCSUl+m9zOSorK2lqarqsQvVR5u40NTUFX7tXPXeRAPv27YsdoWDV1tbS0NDAgQMHYkcpOJWVldTW1gbto+IuEqCxsTF2hIJVXl5+zrc2JbdU3EUCfPKTF1xxUiQvqbiLZNDxfPP3dr5/wTaRfKTiLhKg9fih2BFEUlFxFwmg4i6FQsVdJEDF4CtjRxBJRSfsiogUIfXcRQK0Hns/dgSRVFTcRQK0nvggdgSRVFTcRQJUDNaXcKQwaMxdRKQIqecuEkBj7lIoUvfczazUzF4zs98l6wPNbIWZbU/uB3Ro+4iZ7TCzrWZ2ay6Ci8TQ+uERWj88EjuGSEYhwzIPA5s7rC8AXnb3McDLyTpmNh6YA0wAZgJPmllpduKKxFVRXUdFdV3sGCIZpSruZlYLfB74WYfNs4ClyfJS4K4O259x91PuvhPYAUzPSloREUklbc/9R8B/A8502DbE3RsBkvvByfbhwJ4O7RqSbecws/lmts7M1ml+ZykUrUebaD3aFDuGSEYZi7uZ3QHsd/f1KZ+zs2toXXDpFXdf7O7T3H1adXV1yqcWievMyWOcOXksdgyRjNKcLXM9cKeZ3Q5UAn3N7JfAPjOrcfdGM6sBzl4WvgEY0WH/WkBXFZaiUF49MnYEkVQy9tzd/RF3r3X3OtoOlP4/d/8KsByYmzSbC7yQLC8H5phZDzMbBYwB1mY9uYiIXFRXznNfBCwzs/uB3cDdAO6+ycyWAW8CLcCD7t7a5aQieaDlyMHYEURSCSru7r4KWJUsNwGfvUi7hcDCLmYTyTt++kTsCCKp6BuqIgHKqz4WO4JIKppbRkSkCKnnLhKg5Yi+kyGFQcVdJICfPhk7gkgqKu4iAcqrRmRuJJIHNOYuIlKE1HMXCdDywf7MjUTygIq7SABvORU7gkgqKu4iAcoHacxdCoPG3EVEipB67iIBWj7YFzuCSCoq7iIBvKU5dgSRVFTcRQKUD6qNHUEkFY25i4gUIfXcRQK0HH4vdgSRVFTcRQL4GV13RgqDirtIgPKBw2NHEElFY+4iIkVIPXeRABpzl0Kh4i4SwP1M7Agiqai4iwQoHzAsdgSRVDTmLiJShNRzFwnQcqgxdgSRVNRzFxEpQuq5iwQoG1ATO4JIKuq5i4gUIfXcRQI0H9obO4JIKiruIgHM9MeuFIaMP6lmVmlma83sdTPbZGb/mGwfaGYrzGx7cj+gwz6PmNkOM9tqZrfm8g2IdKey/kMp6z80dgyRjNJ0Q04Bn3H3TwKTgZlmdi2wAHjZ3ccALyfrmNl4YA4wAZgJPGlmpTnILiIiF5GxuHubY8lqeXJzYBawNNm+FLgrWZ4FPOPup9x9J7ADmJ7N0CKxNL//Ls3vvxs7hkhGqQYQzazUzDYA+4EV7r4GGOLujQDJ/eCk+XBgT4fdG5Jt5z/nfDNbZ2brDhw40IW3INJ9rKQUK9EfopL/UhV3d29198lALTDdzK6+RHPr7Ck6ec7F7j7N3adVV1enCisSm8bcpVAEHfp398PAKtrG0veZWQ1Acr8/adYAjOiwWy2g88dERLpRmrNlqs2sf7LcE7gZ2AIsB+YmzeYCLyTLy4E5ZtbDzEYBY4C1Wc4tEkVzUwPNTQ2xY4hklOY89xpgaXLGSwmwzN1/Z2argWVmdj+wG7gbwN03mdky4E2gBXjQ3XXhSSkKVlYeO4JIKhmLu7tvB
KZ0sr0J+OxF9lkILOxyOpE8U9ZvSOwIIqno63YiIkVI0w+IBGhu2pO5kUgeUHEXCWBlPWJHEElFxV0kQFm/wZkbieQBjbmLiBQh9dxFAjQf1Ji7FAYVd5EAVlEZO4JIKiruIgHK+moeJCkMGnMXESlC6rmLBGg+uDt2BJFUVNxFAljFFbEjiKSi4i4SoKxvVewIIqlozF1EpAip5y4SoPnAO7EjiKSi4i5B6ha8mNXnm1nRBMBjWX7eXCmp7B07gkgqKu4iAUr7DIodQSQVjbmLiBQh9dxFApw+sCt2BJFUVNxFApT27Bs7gkgqKu4iAUp7D4wdQSQVjbmLiBQh9dxFApzevzN2BJFUVNxFApRe0S92BJFUVNxFAmjMXQqFxtxFRIqQeu4iAU7vfzt2BJFUVNxFApT2GhA7gkgqKu4iAVTcpVBkHHM3sxFmttLMNpvZJjN7ONk+0MxWmNn25H5Ah30eMbMdZrbVzG7N5RsQ6VbubTeRPJfmgGoL8F/d/RPAtcCDZjYeWAC87O5jgJeTdZLH5gATgJnAk2ZWmovwIt3t9IGdnD6gc90l/2Us7u7e6O5/TZaPApuB4cAsYGnSbClwV7I8C3jG3U+5+05gBzA9y7lFoijtNZDSXjodUvJf0KmQZlYHTAHWAEPcvRHafgEAg5Nmw4E9HXZrSLad/1zzzWydma07cODAZUQX6X6lvfpT2qt/7BgiGaUu7mbWG3gO+Ka7H7lU0062XTBI6e6L3X2au0+rrq5OG0MkLj/TdhPJc6nOljGzctoK+6/c/flk8z4zq3H3RjOrAfYn2xuAER12rwX2ZiuwSEyaz10KRZqzZQz4J2Czu/+ww0PLgbnJ8lzghQ7b55hZDzMbBYwB1mYvskg8pb0HUdpbl9qT/Jem5349cC/wNzPbkGz778AiYJmZ3Q/sBu4GcPdNZrYMeJO2M20edPfWbAcXiUETh0mhyFjc3f0VOh9HB/jsRfZZCCzsQi6R/HRG4+1SGPQNVZEApw/uih1BJBUVd5EAZb2rYkcQSUXFXSRAyRW6QLYUBhV3kQB+RucGSGFQcRcJ0HzwndgRRFJRcRcJUNZH36aWwqDiLhKgpGef2BFEUlFxFwngrS2xI4ikouIuEqC5aXfsCCKpqLiLBCjrOzhzI5E8oOIuEqCksnfsCCKpqLiLBPDW5tgRRFJRcRcJ0Ny0J3MjkTyg4i4SQGPuUihU3EUCaMxdCoWKu0gAbzkdO4JIKiruIgGa32+IHUEkFRV3kQBl/YbEjiCSioq7SICSHr1iRxBJRcVdJIA3n4odQSQVFXeRAM2H3o0dQSQVFXeRAGX9hsaOIJKKirtIgJIeV8SOIJKKirtIAG8+GTuCSCoq7iIBmg/tjR2hW9QteDHK6+5a9Pkor1uMVNxFApT1r4kdQSQVFXeRACUVPWNHEElFxV0kgJ/WmLsUhpJMDczsKTPbb2ZvdNg20MxWmNn25H5Ah8ceMbMdZrbVzG7NVXCRGJoP76X58Edj3F0KW5qe+xLgCeCfO2xbALzs7ovMbEGy/m0zGw/MASYAw4A/mNlYd2/NbuyPtlgHuwTK+w+LHUEklYw9d3f/I/D+eZtnAUuT5aXAXR22P+Pup9x9J7ADmJ6dqCLxWUUlVlEZO4ZIRhmL+0UMcfdGgOT+7OVphgMdr0PWkGwTKQpnTn/ImdMfxo4hklG2D6haJ9u804Zm84H5AB/72MeyHEMkN1oON8aOIJLK5fbc95lZDUByvz/Z3gCM6NCuFuj06JO7L3b3ae4+rbq6+jJjiHSv8gHDKB+gcXfJf5db3JcDc5PlucALHbbPMbMeZjYKGAOs7VpEkfxh5ZVYucbcJf9lHJYxs6eBeqDKzBqA7wKLgGVmdj+wG7gbwN03mdky4E2gBXhQZ8pIMTlz6kTsCCKpZCzu7n7PRR767EXaLwQWdiWUSL5q+eC92BFEUtE3VEUClA/QyV9SGFTcRQJYeY/YEURSUXEXCXDm1PHYE
URSUXEXCdDywb7YEURSUXEXCVA+sDZ2BJFUVNxFAlhZRewIIqmouIsEOHPyWOwIIqmouIsEaDmyP3MjkTyg4i4SoHzQiMyNRPKAirtIACstjx1BJBUVd5EAGnOXQqHiLhJAY+5SKFTcRQKUD9KFZaQwqLiLBLBS/ZfJpVgXf9+16PNRXjeX9JMqEuDMh0djRxBJRcVdJEDL0QOxI4ikouIuEqC8amTsCCKpqLiLBLCS0tgRRFJRcRcJcObEkdgRRFJRcRcJ0HLsYOwIIqmouIsEqKiqix1BJBUVd5EQJSWxE4ikouIuEqD1xAexI4ikouIuEqD1WFPsCCKpqLiLBKiorosdQSQVFfcuiDUPhkRkGnOXwqDiLhKg9fjh2BFEUlFxFwnQevz92BFEUlFxFwlQUT0qdgSRVHJW3M1sJvC/gVLgZ+6+KFevJdJtzGInkByIefwsV3PJ56S4m1kp8H+AzwENwF/MbLm7v5mL19OBTekurccPxY4gkkquDv1PB3a4+9vufhp4BpiVo9cS6Tatxw+pwEtBMHfP/pOafRGY6e7/JVm/F/iUuz/Uoc18YH6yehWwNetBsq8KKJSZowopKxRW3kLKCoWVt5CyQvy8I929urMHcjXm3tnA5Dm/Rdx9MbA4R6+fE2a2zt2nxc6RRiFlhcLKW0hZobDyFlJWyO+8uRqWaQBGdFivBfbm6LVEROQ8uSrufwHGmNkoM6sA5gDLc/RaIiJynpwMy7h7i5k9BPwbbadCPuXum3LxWt2skIaRCikrFFbeQsoKhZW3kLJCHufNyQFVERGJS7MgiYgUIRV3EZEipOJ+CWY20MxWmNn25H5AJ20qzWytmb1uZpvM7B/zOOsIM1tpZpuTrA/HyJpkyZg3afeUme03szciZJxpZlvNbIeZLejkcTOzHyePbzSzqd2dsUOWTFnHmdlqMztlZt+KkfG8PJnyfjn5TDea2Z/N7JMxciZZMmWdleTcYGbrzOyGGDkv4O66XeQG/C9gQbK8APifnbQxoHeyXA6sAa7N06w1wNRkuQ+wDRifr59t8tiNwFTgjW7OVwq8BVwJVACvn/9ZAbcD/5r8DFwLrIn0WabJOhi4BlgIfCtGzsC8/wEYkCzfluefbW/+fvxyErAl5ud79qae+6XNApYmy0uBu85v4G2OJavlyS3GUeo0WRvd/a/J8lFgMzC8uwKeJ2NeAHf/IxBjnt00U2jMAv45+Rn4d6C/mdV0d1BSZHX3/e7+F6A5Qr7zpcn7Z3c/O8/Dv9P2XZkY0mQ95kllB3oR5///BVTcL22IuzdCW2GkrfdzATMrNbMNwH5ghbuv6b6I7VJlPcvM6oAptP2lEUNQ3giGA3s6rDdw4S/CNG26Q77kSCs07/20/YUUQ6qsZvYfzWwL8CIwr5uyXdJHfj53M/sDMLSTh/4h7XO4eysw2cz6A78xs6vdPetjxNnImjxPb+A54JvufiQb2S7yOlnJG0nGKTRStukO+ZIjrdR5zWwGbcU91jh2qqzu/hva/u/fCPwP4OZcB8vkI1/c3f2i/whmts/Maty9Mflze3+G5zpsZquAmUDWi3s2sppZOW2F/Vfu/ny2M3aUzc82gjRTaOTLNBv5kiOtVHnNbBLwM+A2d2/qpmznC/ps3f2PZvZxM6ty96gToGlY5tKWA3OT5bnAC+c3MLPqpMeOmfWk7Tf2lu4K2EGarAb8E7DZ3X/Yjdk6kzFvZGmm0FgO/OfkrJlrgQ/ODjV1s0Kb7iNjXjP7GPA8cK+7b4uQ8aw0WUcn/7dIzpiqAGL9Mvq72Ed08/kGDAJeBrYn9wOT7cOAf/G/Hx1/DdhIW2/9O3mc9Qba/qTcCGxIbrfna95k/WmgkbYDgQ3A/d2Y8Xbazih6C/iHZNvXgK8ly0bbRWneAv4GTIv4s5op69Dk8zsCHE6W++Zx3p8Bhzr8nK7L46zfBjYlOVcDN8TK2vGm6QdERIqQhmVERIqQiruISBFScRcRKUIq7
iIiRUjFXUSkCKm4i4gUIRV3EZEi9P8BiBgjshCt/QsAAAAASUVORK5CYII=", "text/plain": [ "
" ] }, { @@ -357,11 +357,11 @@ "\n", "1. We are taking 2000 samples of a random variable X, those samples have some mean $m$ and standard deviation $\\sigma_X$, the mean is the center of mass of the above histogram and the standard deviation is a measure of how wide it is.\n", "\n", - "2. However what we actually want to do is ask \"How close is the mean to 0?\", to answer that we need to know how much we expect the mean to vary by when we rerun the calculation. Turns the mean of N samples of a variable X then the mean varies by \n", + "2. However, what we actually want to do is ask \"How close is the mean to 0?\", to answer that we need to know how much we expect the mean to vary by when we rerun the calculation. It turns out that if we take the mean of N samples of a variable X, then the mean varies by \n", " $$\\sigma_m = \\sigma_X / \\sqrt{N}$$\n", " this is usually called the standard error of the mean.\n", "\n", - "3. Each time we run this code, we estimate the mean and the standard error of the mean, when it comes out to be a lot more than 100% then our t-test is very confident that the data is consistent with the true mean being 0. However when it's less than 100% we get a smaller p_value and this is saying that we should suspect that maybe the mean is not 0 after all.\n", + "3. Each time we run this code, we estimate the mean and the standard error of the mean, when it comes out to be a lot more than 100% then our t-test is very confident that the data is consistent with the true mean being 0. However, when it's less than 100% we get a smaller p_value and this is saying that we should suspect that maybe the mean is not 0 after all.\n", "\n", "\"An" ] }, { @@ -371,7 +371,7 @@ "id": "0d6622b6-6598-4b61-84a8-2a3907315599", "metadata": {}, "source": [ - "So to do our test, we check that the p value is less than some arbitrary cutoff such as 0.1 or 0.01. This test should usually pass if the mean is in fact close to zero and it should fail if the mean is not zero. 
However due to random variation it can also fail randomly.\n", + "So to do our test, we check that the p value is less than some arbitrary cut-off such as 0.1 or 0.01. This test should usually pass if the mean is in fact close to zero, and it should fail if the mean is not zero. However, due to random variation it can also fail randomly.\n", "\n", "This is just one of those things that you can't really do anything about. Incidentally this can be used in reverse to generate fake \"highly significant\" scientific results in a practice called p-hacking. As usual XKCD has [explained this](https://xkcd.com/882/) better than I ever could." ] }, { @@ -383,9 +383,9 @@ "source": [ "## Test Driven Development\n", "\n", - "I won't talk about TDD much here but it's likely a term you will hear at some point. It essentially referes to the practice of writing tests as part of your process of writing code. Rather than writing all your code and then writing tests for them. You could instead write some or all of your tests upfront and then write code that passes them. \n", + "I won't talk about TDD much here, but it's likely a term you will hear at some point. It essentially refers to the practice of writing tests as part of your process of writing code. Rather than writing all your code and then writing tests for it, you could instead write some or all of your tests upfront and then write code that passes them. \n", "\n", - "This can be an incredibly prodctive way to work, it forces you think about the structure and interface of your software before you start writing it. It aslo gives you nice incremental goals that you can tick off once each test starts to pass, gamification maybe?" + "This can be an incredibly productive way to work: it forces you to think about the structure and interface of your software before you start writing it. It also gives you nice incremental goals that you can tick off once each test starts to pass, gamification maybe?" 
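As a concrete sketch of that test-first workflow (note: `average_color` and the lattice shapes here are illustrative stand-ins, not the package's actual API):

```python
import numpy as np

# Step 1: write the test first, as an executable description of what we want.
def test_average_color():
    all_up = np.ones((5, 5), dtype=np.int8)     # every spin +1
    all_down = -np.ones((5, 5), dtype=np.int8)  # every spin -1
    assert average_color(all_up) == 1.0
    assert average_color(all_down) == -1.0

# Step 2: write the simplest implementation that makes the test pass.
def average_color(state):
    return np.mean(state)

test_average_color()  # passes, so this goal can be ticked off
```

Running the test before `average_color` exists fails with a `NameError`, which is exactly the point: the failing test defines the goal you are working towards.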
] }, { @@ -403,13 +403,13 @@ "- It makes git diffs as small as possible because formatting changes never show up\n", "- It means you never have to discuss with your collaborators about code formatting, something which can waste a lot of time!\n", "\n", - "Here I will show you how to setup `black` as a pre-commit hook, this means it runs before you commit anything to git, which is probably the best time to do it. We'll use a helper tool called [pre-commit](https://pre-commit.com/).\n", + "Here I will show you how to set up `black` as a pre-commit hook, this means it runs before you commit anything to git, which is probably the best time to do it. We'll use a helper tool called [pre-commit](https://pre-commit.com/).\n", "\n", "```bash\n", "pip install pre-commit\n", "pre-commit sample-config >> .pre-commit-config.yaml # Generate an initial config\n", "```\n", - "Now we add some additional lines to the `.pre-commit-config.yaml` config file to setup black:\n", + "Now we add some additional lines to the `.pre-commit-config.yaml` config file to set up black:\n", "```yaml\n", "- repo: https://github.com/psf/black\n", " rev: 21.12b0\n", @@ -417,7 +417,7 @@ " - id: black\n", " - id: black-jupyter\n", "```\n", - "And finally `pre-commit install` will make this run every time you commit to git. It's worth running it manually once the first time to check it works: `pre-commit run --all-files`. Running this I immediatly got a cryptic error that, on googling, turned out to be that something broke in version 21.12b0 of `21.12b0`. Running `precommit autoupdate` fixed this for me by updated `black` to a later version. Running `pre-commit run --all-files` a second time now gives me:\n", + "And finally `pre-commit install` will make this run every time you commit to git. It's worth running it manually once the first time to check it works: `pre-commit run --all-files`. 
Running this I immediately got a cryptic error that, on googling, turned out to be that something broke in version 21.12b0 of `black`. Running `pre-commit autoupdate` fixed this for me by updating `black` to a later version. Running `pre-commit run --all-files` a second time now gives me:\n", "```bash\n", "(recode) tom@TomsLaptop ReCoDE_MCMCFF % pre-commit run --all-files\n", "trim trailing whitespace.................................................Passed\n", "(recode) tom@TomsLaptop ReCoDE_MCMCFF % \n", "```\n", "\n", - "Now whenever you commit anything, `black` will autoformat it before it actually gets commited. I'll test this for myself by putting\n", + "Now whenever you commit anything, `black` will autoformat it before it actually gets committed. I'll test this for myself by putting\n", "```python\n", "def ugly_litte_one_liner(a,b,c): return \" \".join([a,b,c/2. +3])\n", "```\n", - "in a code cell below and we'll see how `black` formats it. The only gotcha here is that you have to reload jupyter notebooks from disk in order to see the changes that `black` makes." + "in a code cell below and we'll see how `black` formats it. The only gotcha here is that you have to reload Jupyter notebooks from disk in order to see the changes that `black` makes." ] }, { @@ -451,7 +451,7 @@ "id": "68cccdcc-b82e-4dec-bfc1-54072db8d762", "metadata": {}, "source": [ - "Finally, be aware that if you try to commit code with incorrect syntax then black will just error and prevent it, this is probably a good thing but there may be the occasional time where that's a problem." + "Finally, be aware that if you try to commit code with incorrect syntax then black will just error and prevent it; this is probably a good thing, but there may be the occasional time when that's a problem." 
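For reference, `black` reformats a one-liner like the example above roughly as follows (this is illustrative of black's style conventions, not a guaranteed byte-for-byte output):

```python
# Before: def ugly_litte_one_liner(a,b,c): return " ".join([a,b,c/2. +3])
# After black: the body moves to its own line and operators get spaced out.
def ugly_litte_one_liner(a, b, c):
    return " ".join([a, b, c / 2.0 + 3])
```

Note that black only changes formatting, never behaviour: the latent bug in this function (joining a number into a list of strings) survives reformatting untouched.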
] }, { diff --git a/docs/learning/05 Adding Functionality.ipynb b/docs/learning/05 Adding Functionality.ipynb index f11e8bb..5930cc3 100644 --- a/docs/learning/05 Adding Functionality.ipynb +++ b/docs/learning/05 Adding Functionality.ipynb @@ -30,7 +30,7 @@ "\n", "np.random.seed(\n", " 42\n", - ") # This makes our random numbers reproducable when the notebook is rerun in order" + ") # This makes our random numbers reproducible when the notebook is rerun in order" ] }, { @@ -40,7 +40,7 @@ "source": [ "# Adding Functionality\n", "\n", - "The main thing we want to be able to do is to take measurements, the code as I have writting it doesn't really allow that because it only returns the final state in the chain. Let's say we have a measurement called `average_color(state)` that we want to average over the whole chain. We could just stick that inside our definition of `mcmc` but we know that we will likely make other measurements too and we don't want to keep writing new versions of our core functionality!\n", + "The main thing we want to be able to do is to take measurements; the code as I have written it doesn't really allow that because it only returns the final state in the chain. Let's say we have a measurement called `average_color(state)` that we want to average over the whole chain. We could just stick that inside our definition of `mcmc`, but we know that we will likely make other measurements too, and we don't want to keep writing new versions of our core functionality!\n", "\n", "## Exercise 1\n", "Have a think about how you would implement this and what options you have." 
@@ -52,11 +52,11 @@ "metadata": {}, "source": [ "## Solution 1\n", - "So I chatted with my mentors on this project on how to best do this and we came up with a few ideas:\n", + "So I chatted with my mentors on this project on how to best do this, and we came up with a few ideas:\n", "\n", "### Option 1: Just save all the states and return them\n", "\n", - "The problem with this is the states are very big and we don't want to waste all that memory. For an NxN state that uses 8 bit integers (the smallest we can use in numpy) 1000 samples would already use 2.5Gb of memory! We will see later that we'd really like to be able to go a bit bigger than 50x50 and 1000 samples!\n", + "The problem with this is the states are very big, and we don't want to waste all that memory. For an `NxN` state that uses 8-bit integers (the smallest we can use in numpy) `1000` samples would already use `2.5GB` of memory! We will see later that we'd really like to be able to go a bit bigger than `50x50` and `1000` samples!\n", "\n", "### Option 2: Pass in a function to make measurements\n", "```python\n", @@ -73,7 +73,7 @@ " return measurements\n", "```\n", "\n", - "This could work but it limits how we can store measurements and what shape and type they can be. What if we want to store our measurements in a numpy array? Or what if your measurement itself is a vector or and object that can't easily be stored in a numpy array? We would have to think carefully about what functionality we want." + "This could work, but it limits how we can store measurements and what shape and type they can be. What if we want to store our measurements in a numpy array? Or what if your measurement itself is a vector or an object that can't easily be stored in a numpy array? We would have to think carefully about what functionality we want."
] }, { @@ -106,7 +106,7 @@ "measurements = color_sampler.run(...)\n", "```\n", "\n", - "This would definitely work but I personally am not a huge fan of object oriented programming so I'm gonna skip this option!" + "This would definitely work, but I personally am not a huge fan of object-oriented programming, so I'm going to skip this option!" ] }, { @@ -153,7 +153,7 @@ "id": "b74fadbe-80c2-4a20-b651-0e47188b005a", "metadata": {}, "source": [ - "This requires only a very small change to our mcmc function and suddenly we can do whatever we like with the states! While we're at it I'm going to add an aditional argument `stepsize` that allows us to only sample the state every `stepsize` MCMC steps. You'll see why we would want to set this to value greater than 1 in a moment." + "This requires only a very small change to our `mcmc` function, and suddenly we can do whatever we like with the states! While we're at it, I'm going to add an argument `stepsize` that allows us to only sample the state every `stepsize` MCMC steps. You'll see why we would want to set this to a value greater than 1 in a moment." ] }, { @@ -215,7 +215,7 @@ "source": [ "Fun fact: if you replace `yield current_state.copy()` with `yield current_state` your python kernel will crash when you run the code. I believe this is a bug in Numba related to how pointers to numpy arrays work, but let's not worry too much about it. \n", "\n", - "We take a factor of two slowdown but that doesn't seem so much to pay for the fact we can now sample the state at every single step rather than just the last." + "We take a factor of two slowdown, but that doesn't seem so much to pay for the fact we can now sample the state at every single step rather than just the last."
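The generator option the notebook adopts can be sketched as follows — a simplified stand-in (plain Python with a dummy always-accept flip instead of the real Metropolis acceptance rule, and no numba), not the actual MCFF implementation:

```python
import random

def mcmc_generator(initial_state, steps, stepsize=1):
    """Yield a copy of the state every `stepsize` flip attempts, `steps` times."""
    state = list(initial_state)
    for i in range(steps * stepsize):
        j = random.randrange(len(state))
        state[j] = -state[j]  # dummy update: every proposed flip is accepted
        if (i + 1) % stepsize == 0:
            yield list(state)  # yield a copy, mirroring `current_state.copy()`

# The caller now decides what to do with each sample:
samples = list(mcmc_generator([1] * 16, steps=5, stepsize=4))
```

Because the generator yields copies, the caller can store, measure, or discard each state without the next flip mutating it — which is exactly the flexibility the text is after.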
] }, { diff --git a/docs/learning/06 Speeding It Up.ipynb b/docs/learning/06 Speeding It Up.ipynb index ba4a83a..03d7180 100644 --- a/docs/learning/06 Speeding It Up.ipynb +++ b/docs/learning/06 Speeding It Up.ipynb @@ -30,7 +30,7 @@ "\n", "np.random.seed(\n", " 42\n", - ") # This makes our random numbers reproducable but only when the notebook is run once in order" + ") # This makes our random numbers reproducible but only when the notebook is run once in order" ] }, { @@ -40,7 +40,7 @@ "source": [ "# Speeding It Up\n", "\n", - "In order to show you a really big system will still need to make the code a bit faster. Right now we calculate the energy of each state, flip a pixel and then calculate the energy again. It turns out that you can actually directly calculate the energy change instead of doing this subtraction. + "In order to show you a really big system, we will still need to make the code a bit faster. Right now we calculate the energy of each state, flip a pixel and then calculate the energy again. It turns out that you can actually directly calculate the energy change instead of doing this subtraction. 
Let's do this in a sort of test-driven development fashion: we want to write a function that, when given a state and a pixel to flip, returns how much the energy goes up by (negative if down) upon performing the flip.\n", "\n", "I'll first write a slow version of this using the code we already have, and then use that to validate our faster version:" ] @@ -69,7 +69,7 @@ "id": "7b16f42a-0178-4753-9e9d-2f78aed40509", "metadata": {}, "source": [ - "Now if you stare at the definition of energy long enough, you can convince yourself that the energy change when you flip one pixel only depends on the four surounding pixels in a simple way:" + "Now if you stare at the definition of energy long enough, you can convince yourself that the energy change when you flip one pixel only depends on the four surrounding pixels in a simple way:" ] }, { @@ -160,7 +160,7 @@ "id": "e6ecbc7c-530f-494b-aa31-0a118a104328", "metadata": {}, "source": [ - "Ok great! And this function is much much faster because it only has to look at four pixels rather than all $N^2$ of them!" + "Ok great! And this function is much, much faster because it only has to look at four pixels rather than all $N^2$ of them!" ] }, { diff --git a/docs/learning/07 Producing Research Outputs.ipynb b/docs/learning/07 Producing Research Outputs.ipynb index d0dbbb7..85d5b47 100644 --- a/docs/learning/07 Producing Research Outputs.ipynb +++ b/docs/learning/07 Producing Research Outputs.ipynb @@ -30,7 +30,7 @@ "\n", "np.random.seed(\n", " 42\n", - ") # This makes our random numbers reproducable when the notebook is rerun in order" + ") # This makes our random numbers reproducible when the notebook is rerun in order" ] }, { @@ -40,7 +40,7 @@ "source": [ "# Producing Research Outputs\n", "\n", - "So now that we have the ability to simulate our system lets do a little exploration. First let's take three temperatures at each we'll do 10 runs and see how the systems evolve. 
I'll also tack on a little histogram at the right hand side of where the systens spent their time." + "So now that we have the ability to simulate our system, let's do a little exploration. First let's take three temperatures; at each we'll do `10` runs and see how the systems evolve. I'll also tack on a little histogram at the right-hand side of where the systems spent their time." ] }, { @@ -63,7 +63,7 @@ "steps = 200 # How many times to sample the state\n", "stepsize = N**2 # How many individual monte carlo flips to do in between each sample\n", "N_repeats = 10 # How many times to repeat each run at fixed temperature\n", - "initial_state = np.ones(shape=(N, N)) # the intial state to use\n", + "initial_state = np.ones(shape=(N, N)) # the initial state to use\n", "flips = (\n", " np.arange(steps) * stepsize\n", ") # Use this to plot the data in terms of individual flip attempts\n", @@ -138,9 +138,9 @@ "source": [ "There are a few key takeaways about MCMC in this plot:\n", "\n", - "- It takes a while for MCMC to 'settle in', you can see that for T = 10 the natural state is somewhere around c = 0, which takes about 2000 steps to reach from the initial state with c = 1. In general when doing MCMC we want to throw away some of the values at the beginging because they're too affected by the initial state.\n", - "- At High and Low temperatures the we basically just get small fluctuations about an average value\n", - "- At intermediate temperature the fluctuations occur on much longer time scales! Because the systems can only move a little bit each timestep, it means that the measurements we are making are *correlated* with themselves at previous times. The result of this is that if we use MCMC to draw N samples, we don't get as much information as if we had drawn samples from an uncorrelated variable (like a die roll for instance)."
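In code, the usual response to the settling-in and correlation effects listed in these takeaways is to discard a burn-in period and thin the remaining samples before averaging; a minimal sketch (the helper name and the plain-list interface are my own, not MCFF's API):

```python
def thinned_mean(samples, burn_in, thin):
    """Average a measurement series after dropping the first `burn_in`
    samples (too affected by the initial state) and keeping only every
    `thin`-th of the rest (to reduce correlation between kept samples)."""
    kept = samples[burn_in::thin]
    return sum(kept) / len(kept)
```

For instance, `thinned_mean(list(range(100)), burn_in=20, thin=4)` averages samples 20, 24, …, 96.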
+ "- It takes a while for MCMC to 'settle in'; you can see that for T = 10 the natural state is somewhere around c = 0, which takes about 2000 steps to reach from the initial state with c = 1. In general when doing MCMC we want to throw away some values at the beginning because they're too affected by the initial state.\n", + "- At high and low temperatures we basically just get small fluctuations about an average value\n", + "- At intermediate temperature the fluctuations occur on much longer time scales! Because the systems can only move a little each timestep, it means that the measurements we are making are *correlated* with themselves at previous times. The result of this is that if we use MCMC to draw N samples, we don't get as much information as if we had drawn samples from an uncorrelated variable (like a die roll for instance)." ] }, { diff --git a/docs/learning/08 Doing Reproducible Science.ipynb b/docs/learning/08 Doing Reproducible Science.ipynb index 3f764e6..d677a6d 100644 --- a/docs/learning/08 Doing Reproducible Science.ipynb +++ b/docs/learning/08 Doing Reproducible Science.ipynb @@ -15,15 +15,15 @@ "metadata": {}, "source": [ "# Doing Reproducible Science\n", - "Further Reading on this software reproducability: [The Turing Way: Guide to producing reproducable research](https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html)\n", + "Further reading on software reproducibility: [The Turing Way: Guide to producing reproducible research](https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html)\n", "\n", - "In the last chapter we made a nice littel graph, let's imagine we wanted to include that in a paper, and we want other researchers to be able to understand and reproduce how it was 
generated.\n", + "In the last chapter we made a nice little graph; let's imagine we wanted to include that in a paper, and we want other researchers to be able to understand and reproduce how it was generated.\n", "\n", - "There are many aspects to this, but I'll list what I think is relevant here:\n", "1. We have some code that generates the data and some code that uses it to plot the output, let's split that into two python files.\n", - "2. Our code has external dependencies on numpy and matplotlib; in the future those packages could change their behaviour in a way that breaks the code, so let's record what version our code is compatible with.\n", "3. We also have an internal dependency on other code in this MCFF repository, that could also change so let's record the git hash of the commit where the code works for posterity.\n", - "4. The data generating process is random, so we'll fix the seed as discussed in the testing section to make it reproducible.\n", "\n" ] }, @@ -36,27 +36,41 @@ "\n", "There are many ways to specify the versions of your python packages but the two most common are with a `requirements.txt` or with an `environment.yml`.\n", "\n", - "`requirements.txt` is quite simple and just lists packages and versions, e.g. `numpy==1.21`, that can be installed with `pip install -r requirements.txt`, more details of the format [here][requirements]. 
The problem with requirements.txt is that it can only tell you about software installable with pip, which is often not enough.\n", "\n", "Consequently, many people now use `environment.yml` files, which work with conda. There's a great intro to all this [here][conda-intro] which I will not reproduce here. The gist of it is that we end up with a file that looks like this:\n", - "```\n", + "```yaml\n", "#contents of environment.yml\n", - "name: recode\n", + "name: mcmc\n", "\n", - "channels: # tells us what conda channels we need to look in\n", + "channels:\n", " - defaults\n", " - conda-forge\n", "\n", "dependencies:\n", " - python=3.9\n", - " - pytest=7.1\n", - " - pytest-cov=3.0\n", - " - ipykernel=6.9\n", + "\n", + " # Core packages\n", " - numpy=1.21\n", " - scipy=1.7\n", " - matplotlib=3.5\n", " - numba=0.55\n", - " - pre-commit\n", + " - ipykernel=6.9 # Allows this conda environment to show up automatically in Jupyter Lab\n", + " - watermark=2.3 # Generates a summary of package versions for use inside Jupyter Notebooks\n", + "\n", + " # Testing\n", + " - pytest=7.1 # Testing\n", + " - pytest-cov=3.0 # For coverage testing\n", + " - hypothesis=6.29 # Property-based testing\n", + "\n", + " # Development\n", + " - pre-commit=2.20 # For running black and other tools before commits\n", + "\n", + " # Documentation\n", + " - sphinx=5.0 # For building the documentation\n", + " - myst-nb=0.16 # Allows sphinx to include Jupyter Notebooks\n", + "\n", + " # Installing MCFF itself\n", " - pip=21.2\n", " - pip:\n", " - --editable . #install MCFF from the local repository using pip and do it in editable mode\n", @@ -112,8 +126,9 @@ "id": "07c8092c-dd23-470c-adc6-002ccd8e44d0", "metadata": {}, "source": [ - "So I also output `conda env export`, this is annoying because it also gives you dependencies. While the versions of dependencies coudl potentially be important we usually draw the line at just listing the version of directly required packages. 
So what I usually do is to take the above output and then use the the output of `conda env export` to set the version numbers, leaving out the number because that indicates non-breaking changes\n", - "```\n", + "So I also output `conda env export`; this is annoying because it also gives you dependencies. While the versions of dependencies could potentially be important, we usually draw the line at just listing the version of directly required packages. So what I usually do is take the above output and then use the output of `conda env export` to set the version numbers, leaving out the patch number because that indicates non-breaking changes.\n", + "\n", + "```yaml\n", "#output of conda env export\n", "name: recode\n", "channels:\n", @@ -137,13 +152,13 @@ "id": "fc858ba3-49db-47e3-89c8-a9945b61a8fb", "metadata": {}, "source": [ - "## Spliting the code into files\n", + "## Splitting the code into files\n", "\n", "To avoid you having to go away and find the files, I'll just put them here. Let's start with the file that generates the data. I'll give it what I hope is an informative name and a shebang so that we can run it with `./generate_montecarlo_walkers.py` (after doing `chmod +x generate_montecarlo_walkers.py` just once).\n", "\n", - "I'll set the seed using a large pregenerated seed, you've likely seen me use 42 in some places but that's not really best practive because it might not be entropy to reliably seed the generator.\n", + "I'll set the seed using a large pregenerated seed; you've likely seen me use `42` in some places, but that's not really best practice because it might not contain enough entropy to reliably seed the generator.\n", "\n", - "I've also added some code that get's the commit hash of MCFF and saves it into the data file along with the date. " + "I've also added some code that gets the commit hash of MCFF and saves it into the data file along with the date. 
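The commit-hash-and-date recording described here might look something like the following sketch (a hypothetical helper, not the actual `generate_montecarlo_walkers.py` code; it falls back to "unknown" when git is unavailable):

```python
import datetime
import subprocess

def code_version_metadata():
    """Return the current git commit hash and today's date, to be saved
    alongside generated data so results can be traced back to the code."""
    try:
        commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        commit = "unknown"  # git missing, or not run inside a repository
    return {"commit": commit, "date": datetime.date.today().isoformat()}
```

Saving this dictionary into the data file alongside the samples makes every output traceable to the exact code that produced it.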
This helps us keep track of the generated data too." ] }, { @@ -240,7 +255,7 @@ "\n", "np.random.seed(\n", " seed\n", - ") # This makes our random numbers reproducable when the notebook is rerun in order\n", + ") # This makes our random numbers reproducible when the notebook is rerun in order\n", "\n", "### The measurement we will make ###\n", "def average_color(state):\n", @@ -253,7 +268,7 @@ "steps = 200 # How many times to sample the state\n", "stepsize = N**2 # How many individual monte carlo flips to do in between each sample\n", "N_repeats = 10 # How many times to repeat each run at fixed temperature\n", - "initial_state = np.ones(shape=(N, N)) # the intial state to use\n", + "initial_state = np.ones(shape=(N, N)) # the initial state to use\n", "flips = (\n", " np.arange(steps) * stepsize\n", ") # Use this to plot the data in terms of individual flip attempts\n", @@ -372,11 +387,11 @@ "source": [ "## Citations and DOIs\n", "\n", - "Now that we have a nicely reproducable plot, let's share it with the world. The easiest way is probably to put your code in a hosted git repository like Github or Gitlab. \n", + "Now that we have a nicely reproducible plot, let's share it with the world. The easiest way is probably to put your code in a hosted git repository like GitHub or GitLab. \n", "\n", - "Next, let's mint a shiny Digital Object Identifier (DOI) for the repository, using something like [Zenodo](https://zenodo.org/). These services archive a snapshot of the repository and assign a DOI to that snapshot, this is realy useful for citing a particular version of the software. \n", + "Next, let's mint a shiny Digital Object Identifier (DOI) for the repository, using something like [Zenodo](https://zenodo.org/). These services archive a snapshot of the repository and assign a DOI to that snapshot; this is really useful for citing a particular version of the software. 
\n", "\n", - "Finally let's add a `citation.cff` file to the root of the repository, this makes it easier for people who might cite this software to generate a good citation for it. We can add the zenodo DOI to it too. You can read more about `citation.cff` files [here](https://citation-file-format.github.io/) and there is a convenient generator tool [here](https://citation-file-format.github.io/cff-initializer-javascript/)." + "Finally, let's add a `citation.cff` file to the root of the repository; this makes it easier for people who might cite this software to generate a good citation for it. We can add the Zenodo DOI to it too. You can read more about `citation.cff` files [here](https://citation-file-format.github.io/) and there is a convenient generator tool [here](https://citation-file-format.github.io/cff-initializer-javascript/)." ] }, { diff --git a/docs/learning/09 Adding Documentation.ipynb b/docs/learning/09 Adding Documentation.ipynb index 344e389..9ee476e 100644 --- a/docs/learning/09 Adding Documentation.ipynb +++ b/docs/learning/09 Adding Documentation.ipynb @@ -30,7 +30,7 @@ "source": [ "We'll use sphinx along with a couple plugins: [autodoc][autodoc] allows us to generate documentation automatically from the docstrings in our source code, while [napoleon][napoleon] allows us to use [NUMPYDOC][numpydoc] and Google formats for the docstrings in addition to [reStructuredText][rst]\n", "\n", - "What this means is that we'll be able to write documentation directly into the source code, and it will get rendered into a nice website. 
This helps keep the documentation up to date because it's right there next to the code!\n", "\n", "[autodoc]: https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html\n", "[napoleon]: https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html\n", @@ -131,7 +131,7 @@ " return True\n", "```\n", "\n", - "I normally just copy paste this and go from there but there's a full description [here](https://numpydoc.readthedocs.io/en/latest/format.html). You can also check out the docstrings in MCFF" + "I normally just copy and paste this and go from there, but there's a full description [here](https://numpydoc.readthedocs.io/en/latest/format.html). You can also check out the docstrings in MCFF." ] }, { @@ -142,7 +142,7 @@ "## Making the function declarations a bit nicer\n", "Longer function names in the generated documentation currently generate with no line break, I found a fix for that buried [inside a bug report on sphinx](https://github.com/sphinx-doc/sphinx/issues/1514#issuecomment-742703082) \n", "\n", - "It involves adding some custom css and an extra line to `conf.py`:\n", + "It involves adding some custom CSS and an extra line to `conf.py`:\n", "```python\n", "html_css_files = [\n", " 'custom.css',\n", @@ -155,7 +155,7 @@ "id": "c771a57d-c802-429a-b051-1bd0364b9317", "metadata": {}, "source": [ - "Finally we add a `readthedocs.yaml` file (which you can copy from the root of this repo) to tell readthedocs how to build our documentation. https://docs.readthedocs.io/en/stable/config-file/v2.html#packages" + "Finally, we add a `readthedocs.yaml` file (which you can copy from the root of this repo) to tell readthedocs how to build our documentation. https://docs.readthedocs.io/en/stable/config-file/v2.html#packages" ] }, { @@ -173,7 +173,7 @@ "source": [ "### Documentation Ideas\n", "\n", - "Readthedocs can be a bit tricky to setup, it is also possible to use Github pages to acomplish something similar. 
Another idea is to include some simple copyable code snippets in a quickstart guide. This lets people get up and running your code more quickly than is they need to read the API docs to understand how to interact with your module." + "Readthedocs can be a bit tricky to set up; it is also possible to use GitHub Pages to accomplish something similar. Another idea is to include some simple copyable code snippets in a quickstart guide. This lets people get up and running with your code more quickly than if they need to read the API docs to understand how to interact with your module." ] }, { From 197a53ce3c222aee090933350d5fa5771c56ccd0 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 01:07:11 +0100 Subject: [PATCH 03/12] docs: fix math rendering in sphinx website --- docs/conf.py | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/docs/conf.py b/docs/conf.py index 0173c0d..8a2d21e 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -34,9 +34,22 @@ release = "1.0" extensions = [ "sphinx.ext.autodoc", "sphinx.ext.napoleon", + 'sphinx.ext.mathjax', "myst_nb", ] +# Setup the myst_nb extension for LaTeX equations rendering +myst_enable_extensions = [ + "amsmath", + "colon_fence", + "deflist", + "dollarmath", + "html_image", +] +myst_dmath_allow_labels = True +myst_dmath_double_inline = True +myst_update_mathjax = True + # Tell myst_nb not to execute the notebooks nb_execution_mode = "off" From 64e2dc119cb8ab45f4db92b96e646977aca10fba Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 10:33:43 +0100 Subject: [PATCH 04/12] docs: add descriptions to sphinx exts + mathajax --- docs/learning/09 Adding Documentation.ipynb | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/learning/09 Adding Documentation.ipynb b/docs/learning/09 Adding Documentation.ipynb index 9ee476e..8de63c1 100644 --- a/docs/learning/09 Adding Documentation.ipynb +++ b/docs/learning/09 Adding Documentation.ipynb @@ -95,8 +95,10 @@ "We add the extensions by 
adding this to `conf.py` too:\n", "```python\n", "extensions = [\n", - " 'sphinx.ext.autodoc',\n", - " 'sphinx.ext.napoleon',\n", + " 'sphinx.ext.autodoc', # for generating documentation from the docstrings in our code\n", + " 'sphinx.ext.napoleon', # for parsing Numpy and Google style docstrings\n", + " 'sphinx.ext.mathjax', # for equation rendering\n", + "\n", "]\n", "```" ] } From d54a11f270e00adeb5e979846419d62afcbcebf6 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 10:34:01 +0100 Subject: [PATCH 05/12] chore: fix string formatting --- docs/conf.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/conf.py b/docs/conf.py index 8a2d21e..73dedee 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -34,7 +34,7 @@ release = "1.0" extensions = [ "sphinx.ext.autodoc", "sphinx.ext.napoleon", - 'sphinx.ext.mathjax', + "sphinx.ext.mathjax", "myst_nb", ] From 6d656cc43642551fb0a3a6672adcbcb28144df98 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 10:34:36 +0100 Subject: [PATCH 06/12] docs: move API docs at the end of the exemplar --- docs/index.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/index.rst b/docs/index.rst index aa7ccc5..b7c2b52 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -13,9 +13,8 @@ There is a `Jupyter notebook `_ deta :caption: Table of Contents: quickstart - api_docs learning/* - + api_docs Indices and tables From b545ffbc20d312832f2206913226eec1042604b4 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 11:28:09 +0100 Subject: [PATCH 07/12] docs: Add GitHub clone instructions at Intro --- docs/learning/01 Introduction.ipynb | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/docs/learning/01 Introduction.ipynb b/docs/learning/01 Introduction.ipynb index 47a2cc5..b418b1f 100644 --- a/docs/learning/01 Introduction.ipynb +++ b/docs/learning/01 Introduction.ipynb @@ -16,9 +16,16 @@ "\n", "## Setting up your environment\n", "\n", - "It's strongly 
encouraged that you follow along this notebook in an environment where you can run the cells yourself and change them. You can either clone this git repository and run the cells in a python environment on your local machine, or if you for some reason can't do that (because you are on a phone or tablet for instance) you can instead open this notebook in [binder](https://mybinder.org/v2/gh/TomHodson/ReCoDE_MCMCFF/HEAD)\n", + "It's **strongly encouraged** that you follow along with this series of notebooks in an environment where you can run the cells yourself and change them. You can either clone this git repository and run the cells in a Python environment on your local machine,\n", "\n", "```bash\n", "git clone https://github.com/ImperialCollegeLondon/ReCoDE_MCMCFF mcmc\n", "cd mcmc\n", "```\n", "\n", "or if for some reason you can't do that (because you are on a phone or tablet for instance) you can instead open this notebook in [binder](https://mybinder.org/v2/gh/TomHodson/ReCoDE_MCMCFF/HEAD).\n", "\n", "I would also suggest you set up a Python environment just for this. 
You can use your preferred method to do this, but I will recommend `Anaconda` because it's both what I currently use and what is recommended by Imperial.\n", "\n", "```bash\n", "#make a new conda environment from the specification in environment.yml\n", From 13190e66ccf7bd13e9c86ab365c6242490de46f0 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 11:29:35 +0100 Subject: [PATCH 08/12] chore: updated link to ImperialCollegeLondon Also removed duplicate link to Jupyter notebooks --- docs/index.rst | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/docs/index.rst b/docs/index.rst index b7c2b52..83079f8 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -3,9 +3,7 @@ Imperial College London ReCoDE : Monte Carlo for Fun This is an exemplar project designed to showcase best practices in developing scientific software as part of the ReCoDE Project at Imperial College London. These docs have been generated automatically with sphinx. -You can find the source code and main landing page for this project on `GitHub `_ - -There is a `Jupyter notebook `_ detailing how this page was generated in there. +You can find the source code and main landing page for this project on `GitHub `_ .. toctree:: :maxdepth: 1 From 27453fcf95db010b8de8c9d292bf7e5db241fa07 Mon Sep 17 00:00:00 2001 From: gnikit Date: Tue, 19 Jul 2022 12:26:51 +0100 Subject: [PATCH 09/12] docs: edited README to adhere to template Template can be found in: https://raw.githubusercontent.com/ImperialCollegeLondon/ReCoDE-README-template/main/README-template.md?token=GHSAT0AAAAAABPRNXTZCOF7UC74ZT5T5P5YYWWUL3A --- README.md | 80 +++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 54 insertions(+), 26 deletions(-) diff --git a/README.md b/README.md index 2c3dd48..670accd 100644 --- a/README.md +++ b/README.md @@ -16,35 +16,64 @@

+## Description + This is an exemplar project designed to showcase best practices in developing scientific software as part of the ReCoDE Project at Imperial College London. **You do not need to know or care about Markov Chain Monte Carlo for this to be useful to you.** Rather this project is primarily designed to showcase the tools and practices available to you when developing scientific software projects. Maybe you are a PhD student just starting, or a researcher just about to embark on a larger scale software project - there should be something interesting here for you. -## Table of contents +## Learning Outcomes -1. [Introduction](docs/learning/01%20Introduction.ipynb) -1. [Packaging It Up](docs/learning/02%20Packaging%20It%20Up.ipynb) -1. [Writing a Markov Chain Monte Carlo Sampler](docs/learning/03%20Writing%20a%20Markov%20Chain%20Monte%20Carlo%20Sampler.ipynb) -1. [Testing](docs/learning/04%20Testing.ipynb) -1. [Adding Functionality](docs/learning/05%20Adding%20Functionality.ipynb) -1. [Speeding It Up](docs/learning/06%20Speeding%20It%20Up.ipynb) -1. [Producing Research Outputs](docs/learning/07%20Producing%20Research%20Outputs.ipynb) -1. [Doing Reproducible Science](docs/learning/08%20Doing%20Reproducible%20Science.ipynb) -1. 
[Adding Documentation](docs/learning/09%20Adding%20Documentation.ipynb) +- Creating virtual environments using Anaconda +- Plotting data using Matplotlib +- Improving code performance with `numba` and Just-in-time compilation +- Packaging Python projects into modules +- Writing a simple Monte Carlo simulation using `numba` and `numpy` +- Using Test Driven Development (TDD) to test your code +- Creating unit tests with `pytest` +- Calculating the `coverage` of your codebase +- Visualising coarse and detailed views of the `coverage` in your codebase +- Creating property-based tests with `hypothesis` +- Creating regression tests +- Using autoformatters like `black` and other development tools +- Improving performance using `generators` and `yield` +- Making a reproducible Python environment using Anaconda +- Documenting your code using `sphinx` +- Writing docstrings using a standardised format -## How to use this repository +## Requirements + +### Academic + +An entry-level researcher with basic knowledge of Python. + +**Complementary Resources to the exemplar:** + +- [The Turing Way](https://the-turing-way.netlify.app/) has tons of great resources on the topics discussed here. +- [Intermediate Research Software Development in Python](https://carpentries-incubator.github.io/python-intermediate-development/index.html) + +### System + +| Language | Version | +| ---------------------------------------------------------- | ------- | +| [Python](https://www.python.org/downloads/) | >= 3.7 | +| [Anaconda](https://www.anaconda.com/products/distribution) | >= 4.1 | + +## Getting Started  Take a look at the table of contents below and see if there are any topics that might be useful to you. The actual code lives in `src` and the documentation in `docs/learning` in the form of Jupyter notebooks.  When you're ready to dive in you have four options: -### 1. Launch them in Binder (easiest but a bit slow) +### 1. 
Launch them in Binder [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ImperialCollegeLondon/ReCoDE_MCMCFF/HEAD?urlpath=lab%2Ftree%2Fdocs%2Flearning%2F01%20Introduction.ipynb) -### 2. Clone the repo and run the Jupyter notebooks locally. (Faster but requires you have python/jupyter installed) +_NOTE: Performance might be a bit slow_. + +### 2. Clone the repo and run the Jupyter notebooks locally ```bash git clone https://github.com/ImperialCollegeLondon/ReCoDE_MCMCFF mcmc @@ -53,9 +82,18 @@ pip install .[dev] jupyter lab ``` -### 3. View them non-interactively in GitHub via the links in the table of contents +_NOTE: Better performance but requires you have python and Jupyter installed_. -## The map +### 3. View the Jupyter notebooks non-interactively via the online documentation + +You can read all the Jupyter notebooks online and non-interactively in the official **[Documentation](https://recode-mcmcff.readthedocs.io/)**. + +### 4. View the Jupyter notebooks non-interactively on GitHub + +Click [here](https://github.com/ImperialCollegeLondon/ReCoDE_MCMCFF/tree/main/docs/learning) +to view the individual Jupyter notebooks. + +## Project Structure ```bash . @@ -76,13 +114,3 @@ jupyter lab │ └── tests # automated tests for the code ``` - -## External Resources - -- [The Turing Way](https://the-turing-way.netlify.app/) has tons of great resources on the topics discussed here. 
-- [Intermediate Research Software Development in Python](https://carpentries-incubator.github.io/python-intermediate-development/index.html) - -[tdd]: learning/01%20Introduction.ipynb -[intro]: learning/01%20Introduction.ipynb -[packaging]: learning/02%20Packaging%20it%20up.ipynb -[testing]: learning/02%20Testing.ipynb From f5e7e816dda9b74003406bd97fc50b3743011e70 Mon Sep 17 00:00:00 2001 From: gnikit Date: Wed, 20 Jul 2022 00:33:32 +0100 Subject: [PATCH 10/12] Apply suggestions from code review Co-authored-by: Jeremy Cohen --- README.md | 4 ++-- docs/learning/01 Introduction.ipynb | 8 ++++---- docs/learning/02 Packaging It Up.ipynb | 19 +++++++------------ ...g a Markov Chain Monte Carlo Sampler.ipynb | 17 ++++++----------- docs/learning/04 Testing.ipynb | 18 +++++++++--------- docs/learning/05 Adding Functionality.ipynb | 8 ++++---- docs/learning/06 Speeding It Up.ipynb | 2 +- .../07 Producing Research Outputs.ipynb | 8 ++++---- .../08 Doing Reproducible Science.ipynb | 12 ++++++------ docs/learning/09 Adding Documentation.ipynb | 4 ++-- 10 files changed, 45 insertions(+), 55 deletions(-) diff --git a/README.md b/README.md index 670accd..3b95861 100644 --- a/README.md +++ b/README.md @@ -67,7 +67,7 @@ Take a look at the table of contents below and see if there are any topics that When you're ready to dive in you have 4 options: -### 1. Launch them in Binder +### 1. Launch the notebooks in Binder [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ImperialCollegeLondon/ReCoDE_MCMCFF/HEAD?urlpath=lab%2Ftree%2Fdocs%2Flearning%2F01%20Introduction.ipynb) @@ -82,7 +82,7 @@ pip install .[dev] jupyter lab ``` -_NOTE: Better performance but requires you have python and Jupyter installed_. +_NOTE: Better performance but requires you have Python and Jupyter installed_. ### 3. 
View the Jupyter notebooks non-interactively via the online documentation diff --git a/docs/learning/01 Introduction.ipynb b/docs/learning/01 Introduction.ipynb index b418b1f..9ea59ff 100644 --- a/docs/learning/01 Introduction.ipynb +++ b/docs/learning/01 Introduction.ipynb @@ -25,7 +25,7 @@ "\n", "or if for some reason you can't do that (because you are on a phone or tablet for instance) you can instead open this notebook in [binder](https://mybinder.org/v2/gh/TomHodson/ReCoDE_MCMCFF/HEAD)\n", "\n", - "I would also suggest you set up a Python environment just for this. You can use your preferred method to do this, but I will recommend `Anaconda` because it's both what I currently use and what is recommended by Imperial.\n", + "I would also suggest you set up a Python environment just for this project. You can use your preferred method to do this, but I will recommend `Anaconda` because it's both what I currently use and what is recommended by Imperial.\n", "\n", "```bash\n", "#make a new conda environment from the specification in environment.yml\n", @@ -88,7 +88,7 @@ "id": "e52245f1-8ecc-45f1-8d52-337916b0ce7c", "metadata": {}, "source": [ - "We're going to be working with arrays of numbers, so it will make sense to work with `Numpy`, and we'll also want to plot things, the standard choice for this is `matplotlib`, though there are other options, `pandas` and `plotly` being notable ones.\n", + "We're going to be working with arrays of numbers, so it will make sense to work with `NumPy`, and we'll also want to plot things, the standard choice for this is `Matplotlib`, though there are other options, `pandas` and `Plotly` being notable ones.\n", "\n", "Let me quickly plot something to aid the imagination:" ] @@ -409,7 +409,7 @@ "In scientific python like this there are usually two main options for reducing the overhead:\n", "\n", "#### Using Arrays\n", - "One way is we work with arrays of numbers and operations defined over those arrays such as `sum`, `product` 
etc. `Numpy` is the canonical example of this in Python, but many machine learning libraries are essentially doing a similar thing. We rely on the library to implement the operations efficiently and try to chain those operations together to achieve what we want. This imposes some limitations on the way we can write our code.\n", + "One way is we work with arrays of numbers and operations defined over those arrays such as `sum`, `product` etc. `NumPy` is the canonical example of this in Python, but many machine learning libraries are essentially doing a similar thing. We rely on the library to implement the operations efficiently and try to chain those operations together to achieve what we want. This imposes some limitations on the way we can write our code.\n", "\n", "#### Using Compilation\n", "The alternative is that we convert our Python code into a more efficient form that incurs less overhead. This requires a compilation or transpilation step and imposes a different set of constraints on the code.\n", @@ -620,7 +620,7 @@ "## Conclusion\n", "So far we've discussed the problem we want to solve, written a little code, tested it a bit and made some speed improvements.\n", "\n", - "In the next notebook we will package the code up into a little python package, this has two big benefits to use: \n", + "In the next notebook we will package the code up into a little python package, this has two big benefits when using the code: \n", "1. I won't have to redefine the energy function we just wrote in the next notebook \n", "1. 
It will help with testing and documenting our code later" ] diff --git a/docs/learning/02 Packaging It Up.ipynb b/docs/learning/02 Packaging It Up.ipynb index 14d392c..0b450ab 100644 --- a/docs/learning/02 Packaging It Up.ipynb +++ b/docs/learning/02 Packaging It Up.ipynb @@ -30,7 +30,7 @@ "- [Packaging for pytest](https://docs.pytest.org/en/6.2.x/goodpractices.html)\n", "\n", "\n", - "Before we can do any testing, it is best practice to structure and then package your code up as a python project up. You don't have to do it like this, but it carries with it the benefit that many other tutorials _expect_ you to do it like this, and generally you want to reduce friction for yourself later. \n", + "Before we can do any testing, it is best practice to structure and then package your code up as a python project. You don't have to do it like this, but it carries with it the benefit that many other tutorials _expect_ you to do it like this, and generally you want to reduce friction for yourself later. \n", "\n", "Like all things programming, there are many opinions about how python projects should be structured, as I write this the structure of this repository is this: (This is the lightly edited output of the `tree` command if you're interested) \n", "```bash\n", @@ -53,15 +53,15 @@ "└── tests # automated tests for the code\n", "```\n", "\n", - "It's looks pretty intimidating! But let's quickly go through it, at the top level of most projects you'll find on GitHub, and elsewhere you'll find files to do with the project as a whole:\n", + "It looks pretty intimidating! 
But let's quickly go through it: at the top level of most projects you'll find on GitHub (and elsewhere) there are a group of files that describe the project as a whole or provide key project information - not all projects will have all of these files and, indeed, there are a variety of other files that you may also see, so this is an example of some of the more important files:\n",
 "- `README.md` - An intro to the project\n",
 "- `LICENSE` - The software license that governs this project, there are a few standard ones people use.\n",
- "- `environment.yml` (or both) this list what python packages the project needs in a standard format\n",
+ "- `environment.yml` (or alternatives) - this lists what Python packages the project needs in a standard format (other languages have equivalents).\n",
 "- `CITATION.cff` This is the new standard way to describe how a work should be cited, v useful for academic software.\n",
 "\n",
 "Then below that you will usually have directories breaking the project up into main categories, here I have `src/` and `docs/learning/`.\n",
 "\n",
- "Inside `src/` we have a standard python package directory structure.\n",
+ "Inside `src/` we have a standard Python package directory structure.\n",
 "\n",
 "## Packaging\n",
 "There are a few things going on here, our actual code lives in `MCFF/` which is wrapped up inside a `src` folder, the `src` thing is a convention related to pytests, check [Packaging for pytest](https://docs.pytest.org/en/6.2.x/goodpractices.html) if you want the gory details.\n",
@@ -190,9 +190,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3.8.10 ('venv': venv)",
+   "display_name": "Python [conda env:recode]",
    "language": "python",
-   "name": "python3"
+   "name": "conda-env-recode-py"
  },
  "language_info": {
   "codemirror_mode": {
@@ -204,12 +204,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.8.10"
-  },
-  "vscode": {
-   "interpreter": {
-    "hash": 
"f5403acae4671aac0ae5a29dd5903d33d0105a9e9d4148f755d3321f5023d387" - } + "version": "3.9.12" } }, "nbformat": 4, diff --git a/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb b/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb index 17f0ccf..7f1194c 100644 --- a/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb +++ b/docs/learning/03 Writing a Markov Chain Monte Carlo Sampler.ipynb @@ -51,7 +51,7 @@ "There isn't that much more work to do Markov Chain Monte Carlo. I won't go into the details of how MCMC works but put very simply MCMC lets us calculate thermal averages of a physical system at some temperature. For example, the physical system might be \"[$10^{23}$][wa] H20 molecules in a box\" and the thermal average we want is \"Are they organised like a solid or a liquid?\". We can ask that question at different temperatures, and we will get different answers.\n", "\n", "\n", - "For our Ising model the equivalent question would be what's the average color of this system? At high temperatures we expect the pixels to be random and average out out grey, while at low temperatures they will all be either black or while.\n", + "For our Ising model the equivalent question would be what's the average color of this system? At high temperatures we expect the pixels to be random and average out grey, while at low temperatures they will all be either black or white.\n", "\n", "What happens in between? This question is pretty hard to answer using maths, it can be done for the 2D Ising model but for anything more complicated it's pretty much impossible. This is where MCMC comes in.\n", "\n", @@ -139,9 +139,9 @@ "id": "5d1874d4-4585-49ed-bc6f-b11c22231669", "metadata": {}, "source": [ - "These images give a flavour of why physicists find this model useful, it gives window into how thermal noise and spontaneous order interact. 
At low temperatures the energy cost of being different from your neighbours is the most important thing, while at high temperatures, it doesn't matter, and you really just do your own thing.\n", + "These images give a flavour of why physicists find this model useful, it gives a window into how thermal noise and spontaneous order interact. At low temperatures the energy cost of being different from your neighbours is the most important thing, while at high temperatures, it doesn't matter, and you really just do your own thing.\n", "\n", - "There's a special point somewhere in the middle called the critical point $T_c$ where all sorts of cool things happen, but my favourite is that for large system sizes you get a kind of fractal behaviour which I will demonstrate more once we've sped this code up and can simulate larger systems in a reasonable time. You can kinda see it for 50x50 system at T = 5 but not really clearly." + "There's a special point somewhere in the middle called the critical point $T_c$ where all sorts of cool things happen, but my favourite is that for large system sizes you get a kind of fractal behaviour which I will demonstrate more once we've sped this code up and can simulate larger systems in a reasonable time. You can kinda see it for a 50x50 system at T = 5 but not really clearly." 
] }, { @@ -206,9 +206,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3.8.10 ('venv': venv)", + "display_name": "Python [conda env:recode]", "language": "python", - "name": "python3" + "name": "conda-env-recode-py" }, "language_info": { "codemirror_mode": { @@ -220,12 +220,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.10" - }, - "vscode": { - "interpreter": { - "hash": "f5403acae4671aac0ae5a29dd5903d33d0105a9e9d4148f755d3321f5023d387" - } + "version": "3.9.12" } }, "nbformat": 4, diff --git a/docs/learning/04 Testing.ipynb b/docs/learning/04 Testing.ipynb index b4a7fe6..f084b6f 100644 --- a/docs/learning/04 Testing.ipynb +++ b/docs/learning/04 Testing.ipynb @@ -25,7 +25,7 @@ "\n", "Ok we can finally start writing and running some tests!\n", "\n", - "I copied some of the initial tests that we did in chapter 1 into `test_energy.py` installed pytest into my development environment with `pip install pytest`. If you're using conda you need to use `conda install pytest`, and now I can run the `pytest` command in the `mcmc` directory. Pytest will automatically discover our tests and run them, to do this it relies on their being python files with functions named `test_\\*` which it will run.\n", + "I copied some of the initial tests that we did in chapter 1 into `test_energy.py` and installed pytest into my development environment with `pip install pytest`. If you're using conda you need to use `conda install pytest`. I can now run the `pytest` command in the `mcmc` directory. Pytest will automatically discover our tests and run them. To do this it relies on there being Python files with functions named `test_\\*` which it will run. 
It's also a widely used convention to begin the name of Python files containing tests with `test_`.\n",
 "\n",
 "If that doesn't work and complains it can't find MCFF, try `python -m pytest`, this asks python to find a module and run it, which can be useful to ensure you're running pytest inside the correct environment. I ran into this problem because I used `pip install pytest` into a conda environment when I should have done `conda install pytest`.\n",
@@ -92,7 +92,7 @@
 "    assert energy(state) == E_prediction_all_the_same(L)\n",
 "```\n",
 "\n",
- "I will defer to external resources for a full discussion of the philosophy of testing, but I generally think of tests as an aid to my future debugging. If I make a change that breaks something then I want my tests to catch that and to make it clear what has broken. As such I generally put tests that check very basic properties of my code early on in the file and then follow them with tests that probe more subtle things or more obscure edges cases.\n",
+ "I will defer to external resources for a full discussion of the philosophy of testing, but I generally think of tests as an aid to my future debugging. If I make a change that breaks something then I want my tests to catch that and to make it clear what has broken. As such I generally put tests that check very basic properties of my code early on in the file and then follow them with tests that probe more subtle things or more obscure edge cases.\n",
 "\n",
 "`test_exact_energies` checks that the energies of our exact states come out as we calculated they should in chapter 1. This is testing a very limited space of the possible inputs to `energy` so we'd like to find some way to be more confident that our implementation is correct.\n",
 "\n",
@@ -116,7 +116,7 @@
 "source": [
 "## Coverage Testing\n",
 "\n",
- "A useful little trick for testing, are tools like pytest-cov that can measure *coverage*, that is, the amount of your code base that is activated by your tests. 
Unfortunately Numba does not play super well with pytest-cov, so we have to turn off numba to generate the test report using an environment variable.\n",
+ "A useful aspect of testing is *coverage*. This involves looking at how much of your code is actually \"covered\" by the tests you've written. That is, which individual lines of your code are actually being run by your tests. Tools like `pytest-cov` can measure _coverage_. Unfortunately Numba does not play super well with `pytest-cov`, so we have to turn off Numba using an environment variable so that we can run `pytest-cov` and generate the \"test report\".\n",
 "\n",
 "```bash\n",
 "(recode) tom@TomsLaptop ReCoDE_MCMCFF % pip install pytest-cov # install the coverage testing plugin\n",
@@ -144,7 +144,7 @@
 "=================================================== 3 passed in 1.89s ===================================================\n",
 "```\n",
 "\n",
- "Ok so this is telling us that we currently test 86% of the lines in ising_model.py. We can also change `--cov-report=html` to get a really nice `html` output which shows which parts of your code aren't being run.\n",
+ "Ok so this is telling us that we currently test 86% of the lines in ising_model.py. We can also change `--cov-report=html` to get a really nice `html` output which shows which parts of our code aren't being run.\n",
 "\n",
 "A warning though, testing 100% of your lines of code doesn't mean it's correct, you need to think carefully about the data you test on, try to pick the hardest examples you can think of! What edge cases might there be that would break your code? Zero, empty strings and empty arrays are classic examples."
 ]
},
{
@@ -156,7 +156,7 @@
 "source": [
 "## Advanced Testing Methods: Property Based Testing\n",
 "\n",
- "I won't do into huge detail here, but I thought it would be nice to make you aware of a nice library called `Hypothesis` that helps with this problem of finding edge cases. 
`Hypothesis` gives you tools to generate randomised inputs to functions, so as long as you can come up with some way to verify the output is correct or has the correct _properties_ (or just that the code doesn't throw and error!) then this can be a powerful method of testing. \n",
+ "I won't go into huge detail here, but I thought it would be nice to make you aware of a nice library called [`Hypothesis`](https://hypothesis.readthedocs.io) that helps with this problem of finding edge cases. `Hypothesis` gives you tools to generate randomised inputs to functions, so as long as you can come up with some way to verify the output is correct or has the correct _properties_ (or just that the code doesn't throw an error!) then this can be a powerful method of testing. \n",
 "\n",
 "\n",
 "Take a look in `test_energy_using_hypothesis.py`\n",
@@ -169,7 +169,7 @@
 "def test_generated_states(state):\n",
 "    assert np.allclose(energy(state), energy_numpy(state))\n",
 "```\n",
- "You tell Hypothesis how to generate the test data, in this case we use some numpy specific code to generate 2 dimensional arrays with `dtype = int` and entries randomly sampled from `[1, -1]`. We use the same trick as before of checking two implementations against one another."
+ "You tell Hypothesis how to generate the test data. In this case we use some NumPy specific code to generate 2 dimensional arrays with `dtype = int` and entries randomly sampled from `[1, -1]`. We use the same trick as before of checking two implementations against one another."
 ]
},
{
@@ -181,7 +181,7 @@
 "source": [
 "## Testing Stochastic Code\n",
 "\n",
- "We have an interesting problem here, most testing assumes that for the same inputs we will always get the same outputs but our MCMC sampler is a stochastic algorithm. So how can we test it? 
I can see three mains routes we can take:\n",
+ "We have an interesting problem here, most testing assumes that for the same inputs we will always get the same outputs but our MCMC sampler is a stochastic algorithm. So how can we test it? I can see three main routes we can take:\n",
 "\n",
 "- Fix the seed of the random number generator to make it deterministic\n",
 "- Do statistical tests on the output \n",
@@ -383,7 +383,7 @@
 "source": [
 "## Test Driven Development\n",
 "\n",
- "I won't talk about TDD much here, but it's likely a term you will hear at some point. It essentially refers to the practice of writing tests as part of your process of writing code. Rather than writing all your code and then writing tests for them. You could instead write some or all of your tests upfront and then write code that passes them. \n",
+ "I won't talk much about Test Driven Development, or TDD, here, but it's likely a term you will hear at some point. It essentially refers to the practice of writing tests as part of your process of writing code. Rather than writing all your code and then writing tests for them, you could instead write some or all of your tests upfront, describing the expected behaviour of code that doesn't yet exist, and then write the necessary code so that your tests pass. \n",
 "\n",
 "This can be an incredibly productive way to work, it forces you think about the structure and interface of your software before you start writing it. It also gives you nice incremental goals that you can tick off once each test starts to pass, gamification maybe?"
 ]
},
{
@@ -417,7 +417,7 @@
 "  - id: black\n",
 "  - id: black-jupyter\n",
 "```\n",
- "And finally `pre-commit install` will make this run every time you commit to git. It's worth running it manually once the first time to check it works: `pre-commit run --all-files`. Running this I immediately got a cryptic error that, on googling, turned out to be that something broke in version 21.12b0 of `21.12b0`. 
Running `precommit autoupdate` fixed this for me by updated `black` to a later version. Running `pre-commit run --all-files` a second time now gives me:\n",
+ "And finally `pre-commit install` will make this run every time you commit to git. It's worth running it manually once the first time to check it works: `pre-commit run --all-files`. Running this I immediately got a cryptic error that, on googling, turned out to be that something broke in version 21.12b0 of `black`. Running `pre-commit autoupdate` fixed this for me by updating `black` to a later version. Running `pre-commit run --all-files` a second time now gives me:\n",
 "```bash\n",
 "(recode) tom@TomsLaptop ReCoDE_MCMCFF % pre-commit run --all-files\n",
 "trim trailing whitespace.................................................Passed\n",
diff --git a/docs/learning/05 Adding Functionality.ipynb b/docs/learning/05 Adding Functionality.ipynb
index 5930cc3..066c61b 100644
--- a/docs/learning/05 Adding Functionality.ipynb
+++ b/docs/learning/05 Adding Functionality.ipynb
@@ -40,7 +40,7 @@
 "source": [
 "# Adding Functionality\n",
 "\n",
- "The main thing we want to be able to do is to take measurements, the code as I have writing it doesn't really allow that because it only returns the final state in the chain. Let's say we have a measurement called `average_color(state)` that we want to average over the whole chain. 
We could just stick that inside our definition of `mcmc`, but we know that we will likely make other measurements too, and we don't want to keep writing new versions of our core functionality!\n", "\n", "## Exercise 1\n", "Have a think about how you would implement this and what options you have." @@ -56,7 +56,7 @@ "\n", "### Option 1: Just save all the states and return them\n", "\n", - "The problem with this is the states are very big, and we don't want to waste all that memory. For an `NxN` state that uses 8-bit integers (the smallest we can use in numpy) `1000` samples would already use `2.5Gb` of memory! We will see later that we'd really like to be able to go a bit bigger than `50x50` and `1000` samples!\n", + "The problem with this is the states are very big, and we don't want to waste all that memory. For an `NxN` state that uses 8-bit integers (the smallest we can use in NumPy) `1000` samples would already use `2.5GB` (2.5 gigabytes) of memory! We will see later that we'd really like to be able to go a bit bigger than `50x50` and `1000` samples!\n", "\n", "### Option 2: Pass in a function to make measurements\n", "```python\n", @@ -73,7 +73,7 @@ " return measurements\n", "```\n", "\n", - "This could work, but it limits how we can store measurements and what shape and type they can be. What if we want to store our measurements in a numpy array? Or what if your measurement itself is a vector or and object that can't easily be stored in a numpy array? We would have to think carefully about what functionality we want." + "This could work, but it limits how we can store measurements and what shape and type they can be. What if we want to store our measurements in a NumPy array? Or what if your measurement itself is a vector or an object that can't easily be stored in a NumPy array? We would have to think carefully about what functionality we want." 
] }, { @@ -153,7 +153,7 @@ "id": "b74fadbe-80c2-4a20-b651-0e47188b005a", "metadata": {}, "source": [ - "This requires only a very small change to our `mcmc` function, and suddenly we can do whatever we like with the states! While we're at it, I'm going to add an argument `stepsize` that allows us to only sample the state every `stepsize` MCMC steps. You'll see why we would want to set this to value greater than 1 in a moment." + "This requires only a very small change to our `mcmc` function, and suddenly we can do whatever we like with the states! While we're at it, I'm going to add an argument `stepsize` that allows us to only sample the state every `stepsize` MCMC steps. You'll see why we would want to set this to a value greater than 1 in a moment." ] }, { diff --git a/docs/learning/06 Speeding It Up.ipynb b/docs/learning/06 Speeding It Up.ipynb index 03d7180..0f9ea9a 100644 --- a/docs/learning/06 Speeding It Up.ipynb +++ b/docs/learning/06 Speeding It Up.ipynb @@ -40,7 +40,7 @@ "source": [ "# Speeding It Up\n", "\n", - "In order to show you a really big system will still need to make the code a bit faster. Right now we calculate the energy of each state, flip a pixel and then calculate the energy again. It turns out that you can actually directly calculate the energy change instead of doing this subtraction. Let's do this is a sort of test driven development fashion: we want to write a function that when given a state and a pixel to flip, returns how much the energy goes up by (negative if down) upon performing the flip.\n", + "In order to show you a really big system, we will still need to make the code a bit faster. Right now we calculate the energy of each state, flip a pixel, and then calculate the energy again. It turns out that you can actually directly calculate the energy change instead of doing this subtraction. 
Let's do this in a sort of test-driven development fashion: we want to write a function that when given a state and a pixel to flip, returns how much the energy goes up by (negative if down) upon performing the flip.\n", "\n", "I'll first write a slow version of this using the code we already have, and then use that to validate our faster version:" ] diff --git a/docs/learning/07 Producing Research Outputs.ipynb b/docs/learning/07 Producing Research Outputs.ipynb index 85d5b47..8fc557c 100644 --- a/docs/learning/07 Producing Research Outputs.ipynb +++ b/docs/learning/07 Producing Research Outputs.ipynb @@ -40,7 +40,7 @@ "source": [ "# Producing Research Outputs\n", "\n", - "So now that we have the ability to simulate our system let's do a little exploration. First let's take three temperatures at each we'll do `10` runs and see how the systems evolve. I'll also tack on a little histogram at the right-hand side of where the systems spent their time." + "So now that we have the ability to simulate our system let's do a little exploration. First let's take three temperatures. For each we'll do `10` runs and see how the systems evolve. I'll also tack on a little histogram at the right-hand side showing where the systems spent their time." ] }, { @@ -138,9 +138,9 @@ "source": [ "There are a few key takeaways about MCMC in this plot:\n", "\n", - "- It takes a while for MCMC to 'settle in', you can see that for T = 10 the natural state is somewhere around c = 0, which takes about 2000 steps to reach from the initial state with c = 1. In general when doing MCMC we want to throw away some values at the beginning because they're too affected by the initial state.\n", - "- At High and Low temperatures we basically just get small fluctuations about an average value\n", - "- At intermediate temperature the fluctuations occur on much longer time scales! 
Because the systems can only move a little each timestep, it means that the measurements we are making are *correlated* with themselves at previous times. The result of this is that if we use MCMC to draw N samples, we don't get as much information as if we had drawn samples from an uncorrelated variable (like a die roll for instance)."
+ "- It takes a while for MCMC to 'settle in', you can see that for T = 10 the natural state is somewhere around c = 0, which takes about 2000 steps to reach from the initial state with c = 1. In general when doing MCMC we want to throw away some values at the beginning because they're affected too much by the initial state.\n",
+ "- At high and low temperatures we basically just get small fluctuations around an average value\n",
+ "- At intermediate temperatures the fluctuations occur on much longer time scales! Because the systems can only move a little each timestep, it means that the measurements we are making are *correlated* with themselves at previous times. The result of this is that if we use MCMC to draw N samples, we don't get as much information as if we had drawn samples from an uncorrelated variable (like a die roll for instance)."
]
},
{
diff --git a/docs/learning/08 Doing Reproducible Science.ipynb b/docs/learning/08 Doing Reproducible Science.ipynb
index d677a6d..e317760 100644
--- a/docs/learning/08 Doing Reproducible Science.ipynb
+++ b/docs/learning/08 Doing Reproducible Science.ipynb
@@ -15,15 +15,15 @@
 "metadata": {},
 "source": [
 "# Doing Reproducible Science\n",
- "Further Reading on this software reproducibility: [The Turing Way: Guide to producing reproducible research](https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html)\n",
+ "Further reading on the reproducibility of software outputs: [The Turing Way: Guide to producing reproducible research](https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html)\n",
 "\n",
 "In the last chapter we made a nice little graph, let's imagine we wanted to include that in a paper, and we want other researchers to be able to understand and reproduce how it was generated.\n",
 "\n",
 "There are many aspects to this, but I'll list what I think is relevant here:\n",
 "1. We have some code that generates the data and some code that uses it to plot the output, let's split that into two python files.\n",
- "2. Our code has external dependencies on numpy and matplotlib, in the future those packages could change their behaviour in a way that breaks the code, so let's record what version our code is compatible with.\n",
+ "2. Our code has external dependencies on `numpy` and `matplotlib`. In the future those packages could change their behaviour in a way that breaks (or changes the output of) our code, so let's record what version our code is compatible with.\n",
 "3. We also have an internal dependency on other code in this MCFF repository, that could also change so let's record the git hash of the commit where the code works for posterity.\n",
- "4. The data generating process is random, so we'll fix the seed as discussed in the testing section to make it reproducible.\n",
+ "4. 
The data generation process is random, so we'll fix the seed as discussed in the testing section to make it reproducible.\n", "\n" ] }, @@ -156,7 +156,7 @@ "\n", "To avoid you having to go away and find the files, I'll just put them here. Let's start with the file that generates the data. I'll give it what I hope is an informative name and a shebang so that we can run it with `./generate_montecarlo_walkers.py` (after doing `chmod +x generate_montecarlo_walkers.py` just once).\n", "\n", - "I'll set the seed using a large pregenerated seed, you've likely seen me use `42` in some places, but that's not really best practice because it might not be entropy to reliably seed the generator.\n", + "I'll set the seed using a large pregenerated seed; you've likely seen me use `42` in some places, but that's not really best practice because it might not provide enough entropy to reliably seed the generator.\n", "\n", "I've also added some code that gets the commit hash of MCFF and saves it into the data file along with the date. This helps us keep track of the generated data too." ] }, @@ -389,9 +389,9 @@ "\n", "Now that we have a nicely reproducible plot, let's share it with the world. The easiest way is probably to put your code in a hosted git repository like GitHub or GitLab. \n", "\n", - "Next, let's mint a shiny Digital Object Identifier (DOI) for the repository, using something like [Zenodo](https://zenodo.org/). These services archive a snapshot of the repository and assign a DOI to that snapshot, this is really useful for citing a particular version of the software. \n", + "Next, let's mint a shiny Digital Object Identifier (DOI) for the repository, using something like [Zenodo](https://zenodo.org/). These services archive a snapshot of the repository and assign a DOI to that snapshot; this is really useful for citing a particular version of the software, e.g. in a publication (helping to ensure that published results are reproducible by others). 
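The provenance ideas above (a large pre-generated seed, recording package versions, the git commit hash and the date) can be sketched as follows. This is only an illustration: the field names, the particular seed value, and the use of `np.random.SeedSequence().entropy` are my own choices, not necessarily what MCFF's `generate_montecarlo_walkers.py` actually does.

```python
import datetime
import subprocess

import numpy as np

# A large pre-generated seed: run np.random.SeedSequence().entropy once,
# then paste the resulting integer here. This particular value is just
# an illustration, not the seed used by MCFF.
SEED = 236325247065737768686779918816702559895
rng = np.random.default_rng(SEED)

def provenance():
    """Collect metadata that pins down how the data was generated."""
    try:
        # Ask git for the hash of the current commit.
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"], capture_output=True, text=True
        )
        commit = result.stdout.strip() or "unknown"
    except OSError:
        commit = "unknown"  # git not installed or not on PATH
    return {
        "seed": SEED,
        "numpy_version": np.__version__,
        "git_commit": commit,
        "date": datetime.datetime.now().isoformat(),
    }

metadata = provenance()
```

Saving `metadata` into the data file alongside the samples means anyone reading the file later can see exactly which code, library versions, and random seed produced it.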
\n", "\n", - "Finally, let's add a `citation.cff` file to the root of the repository, this makes it easier for people who might cite this software to generate a good citation for it. We can add the Zenodo DOI to it too. You can read more about `citation.cff` files [here](https://citation-file-format.github.io/) and there is a convenient generator tool [here](https://citation-file-format.github.io/cff-initializer-javascript/)." + "Finally, let's add a `CITATION.cff` file to the root of the repository; this makes it easier for people who want to cite this software to generate a good citation for it. We can add the Zenodo DOI to it too. You can read more about `CITATION.cff` files [here](https://citation-file-format.github.io/) and there is a convenient generator tool [here](https://citation-file-format.github.io/cff-initializer-javascript/)." ] }, { diff --git a/docs/learning/09 Adding Documentation.ipynb b/docs/learning/09 Adding Documentation.ipynb index 8de63c1..e10c3cf 100644 --- a/docs/learning/09 Adding Documentation.ipynb +++ b/docs/learning/09 Adding Documentation.ipynb @@ -30,7 +30,7 @@ "source": [ "We'll use sphinx along with a couple plugins: [autodoc][autodoc] allows us to generate documentation automatically from the docstrings in our source code, while [napoleon][napoleon] allows us to use [NUMPYDOC][numpydoc] and Google formats for the docstrings in addition to [reStructuredText][rst]\n", "\n", - "What this means is that we'll be able to write documentation directly into the source code, and it will get rendered into a nice website. This helps keep the documentation up to date because it's right there next to the code!\n", + "What this means is that we'll be able to write documentation directly into the source code and it will get rendered into a nice website. 
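A minimal `CITATION.cff` along the lines described above might look like the following sketch. Every field value here (author, version, DOI, release date) is a placeholder, not this repository's real metadata; the generator tool linked above will produce the correct values.

```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "MCMCFF"
authors:
  - family-names: "Doe"        # placeholder author, not the real one
    given-names: "Jane"
version: "1.0.0"               # placeholder version
doi: "10.5281/zenodo.0000000"  # placeholder Zenodo DOI
date-released: "2022-07-28"    # placeholder date
```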
This helps keep the documentation up to date because it's right there next to the code, and the web-based documentation will be automatically re-generated every time the documentation files are updated!\n", "\n", "[autodoc]: https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html\n", "[napoleon]: https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html\n", @@ -175,7 +175,7 @@ "source": [ "### Documentation Ideas\n", "\n", - "Readthedocs can be a bit tricky to set up, it is also possible to use GitHub pages to accomplish something similar. Another idea is to include some simple copyable code snippets in a quickstart guide. This lets people get up and running your code more quickly than is they need to read the API docs to understand how to interact with your module." + "Readthedocs can be a bit tricky to set up; it is also possible to use [GitHub Pages](https://pages.github.com/) to accomplish something similar. Another idea is to include some simple copyable code snippets in a quickstart guide. This lets people get up and running with your code more quickly than if they need to read the API documentation to understand how to interact with your module." ] }, { From 74ee6ed58edd95c9b401424c6567c2c51c33680b Mon Sep 17 00:00:00 2001 From: gnikit Date: Thu, 21 Jul 2022 10:44:32 +0100 Subject: [PATCH 11/12] chore: renamed table header --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3b95861..07fe171 100644 --- a/README.md +++ b/README.md @@ -56,7 +56,7 @@ Entry level researcher with basic knowledge of Python. 
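The docstring-driven documentation described in the notebook 09 hunk above can be sketched with a small example of what autodoc and napoleon consume. The function below is hypothetical; its nearest-neighbour energy sum is an illustrative choice, not necessarily MCFF's actual implementation.

```python
import numpy as np

def energy(state):
    """Compute the energy of a spin state.

    Parameters
    ----------
    state : numpy.ndarray
        2D array of +1/-1 spins.

    Returns
    -------
    float
        Interaction energy, counting each nearest-neighbour pair
        along rows and columns exactly once.
    """
    # Sum products of horizontally and vertically adjacent spins.
    return -float(
        np.sum(state[:, :-1] * state[:, 1:]) + np.sum(state[:-1, :] * state[1:, :])
    )
```

Because the docstring uses the NumPy sections (`Parameters`, `Returns`), napoleon can translate it for autodoc, and the rendered site stays in sync with the signature sitting directly above it.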
### System -| Language | Version | +| Program | Version | | ---------------------------------------------------------- | ------- | | [Python](https://www.python.org/downloads/) | >= 3.7 | | [Anaconda](https://www.anaconda.com/products/distribution) | >= 4.1 | From 28ac21273d04fd2b2eeb02d2dfccaf25c263c0e2 Mon Sep 17 00:00:00 2001 From: Tom Date: Thu, 28 Jul 2022 13:55:00 +0200 Subject: [PATCH 12/12] Apply suggestions from code review Co-authored-by: Jeremy Cohen --- docs/learning/04 Testing.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/learning/04 Testing.ipynb b/docs/learning/04 Testing.ipynb index f084b6f..5eaa3ab 100644 --- a/docs/learning/04 Testing.ipynb +++ b/docs/learning/04 Testing.ipynb @@ -357,7 +357,7 @@ "\n", "1. We are taking 2000 samples of a random variable X, those samples has some mean $m$ and standard deviation $\sigma_X$, the mean is the center of mass of the above histogram and the standard deviation is a measure of how wide it is.\n", "\n", - "2. However, what we actually want to do is ask \"How close is the mean to 0?\", to answer that we need to know how much we expect the mean to vary by when we rerun the calculation. Turns the mean of N samples of a variable X then the mean varies by \n", + "2. However, what we actually want to do is ask \"How close is the mean to 0?\". To answer that, we need to know how much we expect the mean to vary by when we rerun the calculation. It turns out that given N samples of a variable X, the mean varies by \n", "    $$\sigma_m = \sigma_X / \sqrt{N}$$\n", "    this is usually called the standard error of the mean.\n", "\n",
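The standard-error formula in the hunk above, $\sigma_m = \sigma_X / \sqrt{N}$, can be checked numerically. This sketch uses arbitrary batch sizes (2000 samples per batch, 500 batches, matching nothing in particular from the notebook): it repeats the mean estimate many times and compares the spread of those means against the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_repeats = 2000, 500

# Draw n_repeats independent batches of n_samples draws each,
# and compute the mean of every batch.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_repeats, n_samples))
means = samples.mean(axis=1)

# The spread of the batch means should match sigma_X / sqrt(N),
# here 1 / sqrt(2000) since sigma_X = 1.
predicted_sigma_m = 1.0 / np.sqrt(n_samples)
observed_sigma_m = means.std()
print(f"predicted {predicted_sigma_m:.4f}, observed {observed_sigma_m:.4f}")
```

The observed spread lands within a few percent of the prediction, which is why a test can ask "is the sample mean within a few $\sigma_m$ of 0?" rather than demanding it be exactly zero.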