{
"cells": [
{
"cell_type": "markdown",
"id": "69cf54db-101a-4cd3-8e68-1b7ba93078eb",
"metadata": {},
"source": [
"<h1 align=\"center\">Markov Chain Monte Carlo for fun and profit</h1>\n",
"<h1 align=\"center\"> 🎲 ⛓️ 👉 🧪 </h1>\n",
"\n",
"# Testing"
]
},
{
"cell_type": "markdown",
"id": "7bfb6308-24ec-474d-adbc-b60797d58c29",
"metadata": {
"tags": []
},
"source": [
"Ok we can finally start writing and running some tests! Check out the [pytest website](https://docs.pytest.org/en/7.1.x/getting-started.html) for a tutorial on how to write tests in pytest and head over to the [Turing Way](https://the-turing-way.netlify.app/reproducible-research/testing.html) for a great introduction to testing in general. \n",
"\n",
"I copied some of the initial tests that we did in chapter 1 into `test_energy.py` installed pytest into my development environment with `pip install pytest`. If you're using conda you need to use `conda install pytest` and now I can run the `pytest` command in the ReCoDE_MCFF directory. Pytest will automatically discover our tests and run them, to do this it relies on their being python files with functions named `test_\\*` which it will run.\n",
"\n",
"If that doesn't work and complains it can't find MCFF, try `python -m pytest`, this asks python to find a module and run it, which can be useful to ensure you're running pytest inside the correct environment. I ran into this problem because I used `pip install pytest` into a conda environment when I should have done `conda install pytest`.\n",
"\n",
"But hopefully you can get it working and get a lovely testy output! I've embedded a little video of this below but if it doesn't load, check out the [link](https://asciinema.org/a/498583)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d4082e07-c51f-46ba-9a5e-bf45c2c319ba",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-width:700px;margin:auto;\"><script id=\"asciicast-498583\" src=\"https://asciinema.org/a/498583.js\" async></script></div>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html \n",
"<div style=\"max-width:700px;margin:auto;\"><script id=\"asciicast-498583\" src=\"https://asciinema.org/a/498583.js\" async></script></div>"
]
},
{
"cell_type": "markdown",
"id": "27a83dc3-eaa5-4b40-b61c-0cd969fa8049",
"metadata": {},
"source": [
"## Basic Testing with Pytest\n",
"\n",
"Take a look at `tests/test_energy.py`. You can see that I've done some imports, setup some test states and then defined two testing functions:\n",
"\n",
"```python\n",
"# Note that only functions whose name begins with test_ get run by pytest\n",
"def E_prediction_all_the_same(L): \n",
" \"The exact energy in for the case where all spins are up or down\"\n",
" return -(4*(L - 2)**2 + 12*(L-2) + 8)\n",
"\n",
"def test_exact_energies():\n",
" for state in [all_up, all_down]:\n",
" L = state.shape[0]\n",
" assert energy(state) == E_prediction_all_the_same(L)\n",
"```\n",
"\n",
"I will defer to external resources for a full discussion of the philosphy of testing but I generally think of tests as an aid to my future debugging. If I make a change that breaks something then I want my tests to catch that and to make it clear what has broken. As such I generally put tests that check very basic properties of my code early on in the file and then follow them with tests that probe more subtle things or more obscure edges cases.\n",
"\n",
"`test_exact_energies` checks that the energies of our exact states come out as we calculated they should in chapter 1. This is testing a very limited space of the possible inputs to `energy` so we'd like to find some way to be more confident that our implementation is correct.\n",
"\n",
"One was is to test multiple independant implementations against one another: `test_energy_implementations` checks our numpy implementation against our numba one. This should catch implementation bugs because it's unlikely we will make the same such error in both implementations. \n",
"\n",
"```python\n",
"def test_energy_implementations():\n",
" for state in states:\n",
" assert np.allclose(energy(state), energy_numpy(state))\n",
"```\n",
"\n",
"However if we have made some logical errors in how we've defined the energy, that error will likely appear in both implememtations and thus won't be caught by this. \n",
"\n",
"Generally what we will do now, is that as we write more code or add new functionality we will add tests to check that functionality."
]
},
{
"cell_type": "markdown",
"id": "eeb05f7e-913d-4d9e-9580-9c840e06d410",
"metadata": {},
"source": [
"## Coverage Testing\n",
"\n",
"A useful little trick for testing, are tools like pytest-cov that can measure *coverage*, that is, the amount of your code base that is activate by your tests. Unfortunatley Numba does not play super well with pytest-cov so we have to turn off numba to generate the test report using an environment variable.\n",
"\n",
"```bash\n",
"(recode) tom@TomsLaptop ReCoDE_MCMCFF % pip install pytest-cov # install the coverage testing plugin\n",
"(recode) tom@TomsLaptop ReCoDE_MCMCFF % NUMBA_DISABLE_JIT=1 pytest --cov=MCFF --cov-report=term\n",
"\n",
"================================================== test session starts ==================================================\n",
"platform darwin -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0\n",
"rootdir: /Users/tom/git/ReCoDE_MCMCFF\n",
"plugins: hypothesis-6.46.10, cov-3.0.0\n",
"collected 3 items \n",
"\n",
"code/tests/test_energy.py .. [ 66%]\n",
"code/tests/test_energy_using_hypothesis.py . [100%]\n",
"\n",
"---------- coverage: platform darwin, python 3.9.12-final-0 ----------\n",
"Name Stmts Miss Cover\n",
"--------------------------------------------------\n",
"code/src/MCFF/__init__.py 0 0 100%\n",
"code/src/MCFF/ising_model.py 22 3 86%\n",
"code/src/MCFF/mcmc.py 14 14 0%\n",
"--------------------------------------------------\n",
"TOTAL 36 17 53%\n",
"\n",
"\n",
"=================================================== 3 passed in 1.89s ===================================================\n",
"```\n",
"\n",
"Ok so this is telling us that we currently test 86% of the lines in ising_model.py. We can also change `--cov-report=html` to get a really nice html output which shows which parts of your code aren't being run.\n",
"\n",
"A warning though, testing 100% of your lines of code doesn't mean it's correct, you need to think carefully about the data you test on, try to pick the hardest examples you can think of! What edge cases might there be that would break your code? Zero, empty strings and empty arrays are classic examples."
]
},
{
"cell_type": "markdown",
"id": "d70a8934-a58d-4aa6-afca-35fee23bf851",
"metadata": {},
"source": [
"## Advanced Testing Methods: Hypothesis\n",
"\n",
"\n",
"I won't do into huge detail here but I thought it would be nice to make you aware of a nice library called `Hypothesis` that helps with this problem of finding edge cases. `Hypothesis` gives you tools to generate randomised inputs to functions, so as long as you can come up with some way to verify the output is correct (or just that the code doens't throw and error!) then this can be a powerful method of testing. \n",
"\n",
"\n",
"Take a look in `test_energy_using_hypothesis.py`\n",
"```python\n",
"from hypothesis.extra import numpy as hnp\n",
"\n",
"@given(hnp.arrays(dtype = int,\n",
" shape = hnp.array_shapes(min_dims = 2, max_dims = 2),\n",
" elements = st.sampled_from([1, -1])))\n",
"def test_generated_states(state):\n",
" assert np.allclose(energy(state), energy_numpy(state))\n",
"```\n",
"You tell Hypothesis how to generate the test data, in this case we use some numpy specifc code to generate 2 dimensional arrays with `dtype = int` and entries randomly sampled from `[1, -1]`. We use the same trick as before of checking two implementations against one another."
]
},
{
"cell_type": "markdown",
"id": "91fa6842-f214-4fd3-bbe5-0f20d7d0d2cc",
"metadata": {},
"source": [
"## Autoformaters\n",
"Resources: [](https://the-turing-way.netlify.app/reproducible-research/code-quality/code-quality-style.html)\n",
"\n",
"While we're doing things that will help keep our code clean and tidy in the future, I would recommend installing a code formatter like `black`. This is a program that enforces a particular formatting style on your code by simply doing it for you. At first this sounds a bit weird, but it has a few benefits:\n",
"\n",
"- It makes git diffs as small as possible because formatting changes never show up\n",
"- It means you never have to discuss with your collaborators about code formatting, something which can waste a lot of time!\n",
"\n",
"Here I will show you how to setup `black` as a pre-commit hook, this means it runs before you commit anything to git, which is probably the best time to do it. We'll use a helper tool called [pre-commit](https://pre-commit.com/).\n",
"\n",
"```bash\n",
"pip install pre-commit\n",
"pre-commit sample-config >> .pre-commit-config.yaml # Generate an initial config\n",
"```\n",
"Now we add some additional lines to the `.pre-commit-config.yaml` config file to setup black:\n",
"```yaml\n",
"- repo: https://github.com/psf/black\n",
" rev: 21.12b0\n",
" hooks:\n",
" - id: black\n",
" - id: black-jupyter\n",
"```\n",
"And finally `pre-commit install` will make this run every time you commit to git. It's worth running it manually once the first time to check it works: `pre-commit run --all-files`. Running this I immediatly got a cryptic error that, on googling, turned out to be that something broke in version 21.12b0 of `21.12b0`. Running `precommit autoupdate` fixed this for me by updated `black` to a later version. Running `pre-commit run --all-files` a second time now gives me:\n",
"```bash\n",
"(recode) tom@TomsLaptop ReCoDE_MCMCFF % pre-commit run --all-files\n",
"trim trailing whitespace.................................................Passed\n",
"fix end of files.........................................................Passed\n",
"check yaml...........................................(no files to check)Skipped\n",
"check for added large files..............................................Passed\n",
"black....................................................................Passed\n",
"(recode) tom@TomsLaptop ReCoDE_MCMCFF % \n",
"```\n",
"\n",
"Now whenever you commit anything, `black` will autoformat it before it actually gets commited. I'll test this for myself by putting\n",
"```python\n",
"def ugly_litte_one_liner(a,b,c): return \" \".join([a,b,c/2. +3])\n",
"```\n",
"in a code cell below and we'll see how `black` formats it. The only gotcha here is that you have to reload jupyter notebooks from disk in order to see the changes that `black` makes."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "24874a21-b9b3-4c38-9a79-1a2b4bce88a9",
"metadata": {},
"outputs": [],
"source": [
"def ugly_litte_one_liner(a, b, c):\n",
" return \" \".join([a, b, c / 2.0 + 3])"
]
},
{
"cell_type": "markdown",
"id": "68cccdcc-b82e-4dec-bfc1-54072db8d762",
"metadata": {},
"source": [
"Finally, be aware that if you try to commit code with incorrect syntax then black will just error and prevent it, this is probably a good thing but there may be the occasional time where that's a problem."
]
},
{
"cell_type": "markdown",
"id": "0ba4802e-40e9-4cd5-8877-7a51ad2224b1",
"metadata": {},
"source": [
"## Testing Stochastic Code\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "e255b929-3ae7-4584-a2d4-5386d443a4af",
"metadata": {
"tags": []
},
"source": [
"...\n",
"\n",
"We can use a t-test to check a sample from the MCMC sampler matches our expectations. "
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "64cd4fd1-2a22-41ec-820f-5f5cac4a68b3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# observations = average_color_data.mean(axis = -1)\n",
"\n",
"# from scipy import stats\n",
"# stats.ttest_1samp(observations[0], 0).pvalue < 1 / 50"
]
},
{
"cell_type": "markdown",
"id": "0a38999f-e4c9-4b59-913c-35eb7fbffb4c",
"metadata": {},
"source": [
"## Test Driven Development\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b729c67-246d-4620-8e00-f355432c28a7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:jupyter3.9] *",
"language": "python",
"name": "conda-env-jupyter3.9-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}